School of Information Technologies
Dr. Ying Zhou
COMP5338: Advanced Data Models 2.Sem./2012
Week 11: Cassandra Tutorial
16.10.2012
Cassandra cluster
The Cassandra cluster we use in this course consists of three IBM instances: vhost0029,
vhost0823 and vhost0976, all running Red Hat Enterprise Linux v6. The cluster is
built from the version 1.0.11 tarball released by Apache. Default settings are used for
most of the configuration.
Question 1: Cassandra CLI: Basic Operations
In this exercise, we use the basic interactive command-line interface (CLI) to explore the
Cassandra data model and operations. Cassandra also ships with another interactive client,
cqlsh, which supports the SQL-like Cassandra Query Language (CQL). cqlsh is installed
in the same directory as cassandra-cli; you may try it as well.
a) Connect to one of the nodes in the cluster using your private key. After logging in, set
the CASSANDRA_HOME environment variable:
export CASSANDRA_HOME=/home/idcuser/dev/apache-cassandra-1.0.11
Now start the Cassandra command-line client:
$CASSANDRA_HOME/bin/cassandra-cli
Once you are in the Cassandra CLI environment, you can type in commands to interact
with the Cassandra cluster. Every command must end with a semicolon.
The following command sets up a Thrift connection to a node (vhost0029) in the cluster:
connect vhost0029/9160;
If the connection is successful, you should see a message like this:
Connected to: "COMP5338 Cluster" on vhost0029/9160
You may replace the host name with any other host in the cluster. Type help; to see
a list of commands you can use in the CLI environment.
b) Create a keyspace named after your login name and switch to it:
create keyspace <your_login_name>;
use <your_login_name>;
c) Create a static column family users to store user information. The column family should
have three columns: fullName, state and birthYear. The first two are strings, while
birthYear is an integer (stored as LongType). We also build secondary indexes on the
state and birthYear columns.
create column family users with comparator= AsciiType
and key_validation_class=AsciiType
and column_metadata=[
{column_name: fullName, validation_class: AsciiType},
{column_name: state, validation_class: AsciiType, index_type: KEYS},
{column_name: birthYear, validation_class: LongType, index_type: KEYS}
];
d) Insert some data into the column family using the set command. The general format of
a set command is: set <column_family>[<row_key>][<column_name>] = <value>; The following
commands insert five rows, each with three columns.
set users[bsanderson][fullName] = 'Brandon Sanderson';
set users[bsanderson][birthYear] = 1983;
set users[bsanderson][state] = 'NSW';
set users[prothfuss][fullName] = 'Patrick Rothfuss';
set users[prothfuss][birthYear] = 1983;
set users[prothfuss][state] = 'VIC';
set users[htayler][fullName] = 'Howard Tayler';
set users[htayler][birthYear] = 1978;
set users[htayler][state] = 'NSW';
set users[awonderland][fullName] = 'Alice Wonderland';
set users[awonderland][birthYear] = 1987;
set users[awonderland][state] = 'VIC';
set users[bbuilder][fullName] = 'Bob Builder';
set users[bbuilder][birthYear] = 1984;
set users[bbuilder][state] = 'NSW';
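The effect of the set commands above can be pictured as updates to a map of maps: each row key points to a sorted collection of named columns. The Java sketch below is only an in-memory illustration of that idea. The class ColumnFamilySketch and its methods are hypothetical names, and nothing here talks to Cassandra or uses its API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// In-memory stand-in for a column family; NOT the Cassandra API.
class ColumnFamilySketch {
    // row key -> (column name -> value). The inner TreeMap keeps columns
    // sorted by name, loosely mirroring what the comparator does within a row.
    // Row order in a real cluster depends on the partitioner, so a HashMap is used here.
    private final Map<String, TreeMap<String, Object>> rows =
            new HashMap<String, TreeMap<String, Object>>();

    // Like the CLI "set": creates the row or column if absent, overwrites otherwise.
    void set(String key, String column, Object value) {
        TreeMap<String, Object> row = rows.get(key);
        if (row == null) {
            row = new TreeMap<String, Object>();
            rows.put(key, row);
        }
        row.put(column, value);
    }

    // Like "get users[key][column]"; returns null if the row or column is missing.
    Object get(String key, String column) {
        Map<String, Object> row = rows.get(key);
        return row == null ? null : row.get(column);
    }

    // Like "get users[key]"; returns an empty map for an unknown key.
    Map<String, Object> getRow(String key) {
        TreeMap<String, Object> row = rows.get(key);
        return row == null ? new TreeMap<String, Object>() : row;
    }

    public static void main(String[] args) {
        ColumnFamilySketch users = new ColumnFamilySketch();
        users.set("htayler", "fullName", "Howard Tayler");
        users.set("htayler", "birthYear", 1978L);
        users.set("htayler", "state", "NSW");
        // prints {birthYear=1978, fullName=Howard Tayler, state=NSW}
        System.out.println(users.getRow("htayler"));
    }
}
```

Note that there is no separate insert and update: setting an existing column simply replaces its value, which matches how the CLI set command behaves.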
e) Below are examples of typical query commands you can use to explore data in a
column family in the CLI. The list command shows rows in a column family. By default,
it lists up to 100 rows. A different limit can be specified as a parameter of the
command. The get command is the main querying command. You can use it to get a
row or a column, or to run SQL-like queries if your column family has a secondary index.
list users;
get users[htayler];
get users[htayler][birthYear];
get users where birthYear = 1983;
get users where state = 'NSW' and birthYear > 1980;
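To make the meaning of the last query concrete, the sketch below evaluates the same condition (state equals a given value and birthYear greater than a given year) over an in-memory copy of the five sample rows. It is an illustration of the query semantics only: the class IndexQuerySketch and its methods are hypothetical, no index is actually built, and the order of the results in a real cluster may differ.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// In-memory illustration of "get users where state = ... and birthYear > ...".
// NOT the Cassandra API; a plain scan stands in for the secondary index.
class IndexQuerySketch {
    // Builds the five sample rows: row key -> (column name -> value).
    static Map<String, Map<String, Object>> sampleUsers() {
        Map<String, Map<String, Object>> rows = new LinkedHashMap<String, Map<String, Object>>();
        addUser(rows, "bsanderson", "Brandon Sanderson", 1983L, "NSW");
        addUser(rows, "prothfuss", "Patrick Rothfuss", 1983L, "VIC");
        addUser(rows, "htayler", "Howard Tayler", 1978L, "NSW");
        addUser(rows, "awonderland", "Alice Wonderland", 1987L, "VIC");
        addUser(rows, "bbuilder", "Bob Builder", 1984L, "NSW");
        return rows;
    }

    static void addUser(Map<String, Map<String, Object>> rows, String key,
                        String fullName, long birthYear, String state) {
        Map<String, Object> cols = new HashMap<String, Object>();
        cols.put("fullName", fullName);
        cols.put("birthYear", birthYear);
        cols.put("state", state);
        rows.put(key, cols);
    }

    // Returns the keys of rows whose state matches and whose birthYear exceeds minYear.
    static List<String> byStateBornAfter(Map<String, Map<String, Object>> rows,
                                         String state, long minYear) {
        List<String> keys = new ArrayList<String>();
        for (Map.Entry<String, Map<String, Object>> e : rows.entrySet()) {
            Map<String, Object> cols = e.getValue();
            Object year = cols.get("birthYear");
            boolean stateMatches = state.equals(cols.get("state"));
            boolean yearMatches = year instanceof Long && ((Long) year) > minYear;
            if (stateMatches && yearMatches) {
                keys.add(e.getKey());
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        // prints [bsanderson, bbuilder]
        System.out.println(byStateBornAfter(sampleUsers(), "NSW", 1980L));
    }
}
```

With the sample data, htayler is in NSW but was born in 1978, so only bsanderson and bbuilder satisfy both conditions.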
Question 2: Cassandra Thrift Client: Basic Operations
This exercise demonstrates how to create a column family, insert data and run basic queries
using the Thrift API. The sample code CassandraBasicExample.java can be downloaded
from http://web.it.usyd.edu.au/~comp5338/code/. To compile and run the sample on a
lab PC, download the two required libraries, apache-cassandra-thrift-1.0.11.jar and
libthrift-0.6.jar, and include them in your project’s Java build path. To compile and
run on an IBM instance, include all libraries in the lib folder of the Cassandra home
directory in the classpath.
The CassandraBasicExample class contains several methods. The init method contains
the standard statements for opening a connection to the Cassandra cluster. The cleanUp
method releases the connection. Cassandra stores all data internally as byte arrays. The
Thrift API uses ByteBuffer to wrap the byte array. Any application-specific data needs to
be wrapped in a ByteBuffer before it is sent to the cluster. Results that come back are
also wrapped in ByteBuffer objects.
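As a minimal sketch of the ByteBuffer wrapping described above, the helper below converts a string to a ByteBuffer and back using only the standard java.nio classes. The class ByteBufferStrings and its method names are hypothetical and are not taken from CassandraBasicExample.java, which may do the conversion differently. UTF-8 is assumed as the encoding; for the plain ASCII keys used in this tutorial it produces the same bytes as ASCII.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not part of the course sample code.
class ByteBufferStrings {
    // Wrap a string's bytes so it can be passed as a row key, column name or value.
    static ByteBuffer toByteBuffer(String s) {
        return ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8));
    }

    // Decode a buffer back into a string. Working on a duplicate leaves the
    // original buffer's position untouched, so it can still be read afterwards.
    static String fromByteBuffer(ByteBuffer buf) {
        return StandardCharsets.UTF_8.decode(buf.duplicate()).toString();
    }

    public static void main(String[] args) {
        ByteBuffer key = toByteBuffer("htayler");
        // prints 7 bytes -> htayler
        System.out.println(key.remaining() + " bytes -> " + fromByteBuffer(key));
    }
}
```

Decoding from the buffer's current position up to its limit, rather than from its whole backing array, is the safer choice: a buffer handed back by a client library may be a window onto a larger array.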
To run the sample, use the name of the keyspace you just created as the command-line
argument. If successful, a column family called webGraph will be created in your keyspace
and two rows of data will be inserted. You can use the CLI to inspect the column family and
its data.
Question 3: Cassandra Thrift Client: Secondary Index Query
a) This exercise demonstrates queries based on secondary indexes. Download the sample code
CassandraComplexQueryExample.java from http://web.it.usyd.edu.au/~comp5338/code/.
The sample code assumes that you have finished the first exercise and have created a
column family called users in your keyspace. The queryStateAge method implements
the CLI query get users where state = 'NSW' and birthYear > 1980 using the Thrift API.
It also shows how to encode an integer value into a ByteBuffer and decode it again.
You need to supply your own keyspace name as the command-line argument to run this
sample.
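Because birthYear was declared with the LongType validation class, its value travels as an 8-byte big-endian long. The sketch below shows one way such a value could be packed into and read out of a ByteBuffer with the standard java.nio API. The class LongColumnCodec and its methods are hypothetical names, and the course sample code may encode the value in a different way.

```java
import java.nio.ByteBuffer;

// Hypothetical helper; not taken from CassandraComplexQueryExample.java.
class LongColumnCodec {
    // Pack a long into 8 bytes. ByteBuffer writes big-endian by default.
    static ByteBuffer encodeLong(long value) {
        ByteBuffer buf = ByteBuffer.allocate(8);
        buf.putLong(value);
        buf.flip(); // reset position to 0 so the 8 bytes are readable
        return buf;
    }

    // Read a long back. Working on a duplicate leaves the caller's buffer unchanged.
    static long decodeLong(ByteBuffer buf) {
        return buf.duplicate().getLong();
    }

    public static void main(String[] args) {
        ByteBuffer year = encodeLong(1983L);
        // prints 8 bytes -> 1983
        System.out.println(year.remaining() + " bytes -> " + decodeLong(year));
    }
}
```

Forgetting the flip() call is a common slip: without it the buffer's position stays at the end and the receiver sees zero remaining bytes.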
b) Write two separate methods: one to print out all information about user htayler, and
one to find all users who were born in 1983.
Question 4: Cassandra Schema Design Exercise
In the Week 9 tutorial, you were asked to design an HBase table to store student profile and
transcript information. Design a Cassandra column family for the same data and queries.