We often need to populate HBase tables with test data when recreating an issue. Here is a simple procedure for doing so (from cloudavenue.com):
1) To create a table 'testtable' with a column family 'colfam1':
create 'testtable', 'colfam1'
2) To verify that the table was created:
list 'testtable'
3) To insert test data into the 'testtable' table:
put 'testtable', 'myrow-1', 'colfam1:q1', 'value-1'
put 'testtable', 'myrow-2', 'colfam1:q2', 'value-2'
put 'testtable', 'myrow-2', 'colfam1:q3', 'value-3'
The HBase Shell is (J)Ruby’s IRB with some HBase-related commands added, so anything that can be done in IRB can also be done in the HBase Shell. The following command inserts 1,000 rows into the 'testtable' table:
for i in '0'..'9' do for j in '0'..'9' do \
for k in '0'..'9' do put 'testtable', "row-#{i}#{j}#{k}", \
"colfam1:#{j}#{k}", "#{j}#{k}" end end end
4) To get data from the 'testtable' table:
get 'testtable', 'myrow-1'
scan 'testtable'
5) To delete data from the 'testtable' table:
delete 'testtable', 'myrow-2', 'colfam1:q2'
6) To delete the table (it must be disabled first):
disable 'testtable'
drop 'testtable'
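The 1,000-row loop from step 3 can be checked without a running cluster. The sketch below is plain Ruby, not HBase shell code: it collects the arguments that would be passed to put instead of calling it. The helper name planned_puts is hypothetical and exists only for this illustration.

```ruby
# Dry run of the nested loop from step 3. Instead of calling the HBase shell
# `put` command, it records the arguments each call would receive.
# `planned_puts` is a hypothetical helper, not part of the HBase shell.
def planned_puts(table = 'testtable', family = 'colfam1')
  args_list = []
  for i in '0'..'9' do
    for j in '0'..'9' do
      for k in '0'..'9' do
        args_list << [table, "row-#{i}#{j}#{k}", "#{family}:#{j}#{k}", "#{j}#{k}"]
      end
    end
  end
  args_list
end

rows = planned_puts
puts rows.size                              # number of put calls
puts rows.first.inspect                     # arguments of the first put
puts rows.map { |r| r[2] }.uniq.size        # number of distinct column qualifiers
```

The loop yields 1,000 distinct row keys (row-000 to row-999), each with a single cell, spread over 100 distinct qualifiers (colfam1:00 to colfam1:99), so a scan of the table returns wide, sparse output.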
To test a sample CSV import, use this bash one-liner to generate a CSV of the size you want:
for i in $(seq 1 19); do for j in $(seq 1 9); do for k in $(seq 1 9); do echo "row${i},col${j},value${i}-${j}-${k}"; done; done; done
To generate more data, increase the upper bounds of the loop variables.
The output should look like this:
row1,col1,value1-1-1
row1,col1,value1-1-2
row1,col1,value1-1-3
row1,col1,value1-1-4
row1,col1,value1-1-5
row1,col1,value1-1-6
row1,col1,value1-1-7
row1,col1,value1-1-8
row1,col1,value1-1-9
row1,col2,value1-2-1
row1,col2,value1-2-2
row1,col2,value1-2-3
row1,col2,value1-2-4
row1,col2,value1-2-5
row1,col2,value1-2-6
row1,col2,value1-2-7
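For readers who prefer to stay in Ruby, here is a minimal sketch of the same generator with the loop bounds as parameters. The helper name csv_lines and the file name in the last comment are hypothetical; the defaults mirror the bounds used in the bash one-liner.

```ruby
# Ruby equivalent of the bash one-liner: one CSV line per (i, j, k) triple.
# `csv_lines` is a hypothetical helper; raise the bounds to generate more data.
def csv_lines(max_i = 19, max_j = 9, max_k = 9)
  lines = []
  (1..max_i).each do |i|
    (1..max_j).each do |j|
      (1..max_k).each do |k|
        lines << "row#{i},col#{j},value#{i}-#{j}-#{k}"
      end
    end
  end
  lines
end

lines = csv_lines
puts lines.first(3)          # first three generated lines
puts "#{lines.size} lines"   # total = max_i * max_j * max_k

# To write the data to a file instead (file name is a placeholder):
# File.write('sample.csv', lines.join("\n") + "\n")
```

With the default bounds this produces 19 * 9 * 9 = 1539 lines, starting with row1,col1,value1-1-1.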
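Create a sample table in the HBase shell:
create 'testtable', 'colfam1'
Then load the CSV with importtsv:
hadoop jar /opt/mapr/hbase/hbase-0.94.5/hbase-0.94.5-mapr.jar importtsv -Dimporttsv.columns=colfam1:row,colfam1:col,colfam1:val
As written, this command still needs the table name and the HDFS input path appended. importtsv also expects exactly one entry in -Dimporttsv.columns to be HBASE_ROW_KEY, and comma-separated input needs -Dimporttsv.separator=, because the default separator is a tab.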
Update:
To generate random data and print it on the screen from a Ruby shell (irb):
irb(main):014:0> require 'securerandom'
=> true
irb(main):015:0> for i in '1'..'10' do puts SecureRandom.hex
irb(main):016:1> end
8917ccbb7f0bea0d54d0e98e12b416cf
9cd1865fd43482174b3088c6749075de
1d009056e9fcc0b2ddf4352eb824a97d
1abeb9bb4b0993ad732335818fdc8835
d41cf0ca16be930d0aa3925651a10ec4
732dc0d79e7b7d82e4b5ac21d8b00f5c
519fc21d6d0a76a467dd2f2d14741090
27fb689fd3d9b8f4b17b17535681214b
6454ff61e5ef116688ca172ba13aa80c
83ecb50f1e9ab42d1e320119e24a9a9c
=> "1".."10"
irb(main):017:0>
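As the session shows, SecureRandom.hex called without an argument returns 32 hexadecimal characters (16 random bytes). A minimal sketch, in plain Ruby, for checking the shape and uniqueness of a batch of such values before using them as row keys; the helper name random_row_keys is hypothetical.

```ruby
require 'securerandom'

# Generates `count` random hex strings suitable for use as row keys.
# `random_row_keys` is a hypothetical helper for this illustration.
def random_row_keys(count)
  Array.new(count) { SecureRandom.hex }
end

keys = random_row_keys(1000)
puts keys.first                                        # one sample key
puts keys.all? { |k| k.match?(/\A[0-9a-f]{32}\z/) }    # every key is 32 hex characters
puts keys.uniq.size                                    # number of distinct keys in the batch
```

Random keys spread writes across the key space, but unlike the sequential row-000 style keys they cannot be regenerated later, so keep the list if the same rows need to be read back individually.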
SecureRandom can be used the same way in the HBase shell, here to insert 1,000 rows with random row keys into the table:
hbase(main):001:0> require 'securerandom'; for i in '0'..'9' do for j in '0'..'9' do \
for k in '0'..'9' do put 'testtable', SecureRandom.hex , \
"colfam1:#{j}#{k}", "#{j}#{k}" end end end