root, and the password is the one you set when creating the EMR cluster. Once your credentials are validated, you can enter the command line interface.

    [root@172 ~]# su hadoop
    [hadoop@172 root]$ cd /usr/local/service/hbase/
    [hadoop@172 hbase]$
    [hadoop@172 hbase]$ vim conf/hbase-site.xml

    <property>
        <name>hbase.master.hostname</name>
        <value>$thriftIP</value>
    </property>
    <property>
        <name>hbase.regionserver.thrift.port</name>
        <value>$port</value>
    </property>
$port is the port number of the Thrift server.

    [hadoop@172 hbase]$ jps
    4711 ThriftServer
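As a quick cross-check of the configuration step, the Thrift port can be read back from the file instead of being retyped in every script. The sketch below is only an illustration: `read_hbase_property` is a hypothetical helper, the sample address and port are made-up values, and it assumes the properties sit inside the usual `<configuration>` root element of hbase-site.xml.

```python
import xml.etree.ElementTree as ET


def read_hbase_property(xml_text, name):
    """Return the <value> of the property called `name`, or None if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


# Hypothetical sample; on the cluster you would read conf/hbase-site.xml instead.
SAMPLE = """<configuration>
  <property>
    <name>hbase.master.hostname</name>
    <value>10.0.0.5</value>
  </property>
  <property>
    <name>hbase.regionserver.thrift.port</name>
    <value>9090</value>
  </property>
</configuration>"""

port = int(read_hbase_property(SAMPLE, "hbase.regionserver.thrift.port"))
print(port)  # 9090
```

The returned value is a string, so it is converted with `int()` before being passed to a socket constructor.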
    [hadoop@172 hbase]$ hbase shell
    hbase(main):001:0> create 'thrift_test', 'cf'
    hbase(main):005:0> list
    thrift_test
    1 row(s) in 0.2270 seconds
    hbase(main):001:0> quit
    [hadoop@172 hbase]$ su
    Password: ********
    [root@172 hbase]# yum install python-pip
    [root@172 hbase]# pip install hbase-thrift
Hbase_client.py, and add the following code to it:

    #! /usr/bin/env python
    #coding=utf-8
    from thrift.transport import TSocket, TTransport
    from thrift.protocol import TBinaryProtocol
    from hbase import Hbase

    # Connect to the Thrift server; replace $thriftIP and $port with your values.
    socket = TSocket.TSocket('$thriftIP', $port)
    socket.setTimeout(5000)
    transport = TTransport.TBufferedTransport(socket)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = Hbase.Client(protocol)
    transport.open()

    # List all tables in HBase.
    print client.getTableNames()
    transport.close()
$thriftIP is the IP address of the master node on the private network, and $port is the port number of the Thrift server.

    [hadoop@172 hbase]$ python Hbase_client.py
    ['thrift_test']
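Every script in this walkthrough opens a transport and has to close it again, including when a call fails halfway. One way to make that automatic is a small context manager. This is a sketch, not part of the hbase-thrift package: `opened` is a hypothetical name, and it only assumes that the object passed in has `open()` and `close()` methods, as `TTransport.TBufferedTransport` does.

```python
from contextlib import contextmanager


@contextmanager
def opened(transport):
    """Open the transport, hand it to the with-block, and always close it."""
    transport.open()
    try:
        yield transport
    finally:
        # Runs even if the code inside the with-block raises.
        transport.close()


# Intended use with the objects built in Hbase_client.py (not executed here):
#
#   with opened(transport):
#       print(client.getTableNames())
```

Closing the buffered transport also closes the underlying socket, so a separate `socket.close()` call should not be needed with this pattern.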
Create_table.py and add the following code to it:

    #! /usr/bin/env python
    #coding=utf-8
    from thrift import Thrift
    from thrift.transport import TSocket, TTransport
    from thrift.protocol import TBinaryProtocol
    from hbase import Hbase
    from hbase.ttypes import ColumnDescriptor, Mutation, BatchMutation, TRegionInfo
    from hbase.ttypes import IOError, AlreadyExists

    # Connect to the Thrift server; replace $thriftIP and $port with your values.
    socket = TSocket.TSocket('$thriftIP', $port)
    socket.setTimeout(5000)
    transport = TTransport.TBufferedTransport(socket)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = Hbase.Client(protocol)
    transport.open()

    # Create a table with a single column family 'cf' that keeps one version per cell.
    new_table = ColumnDescriptor(name='cf:', maxVersions=1)
    client.createTable('thrift_test_1', [new_table])
    tables = client.getTableNames()
    socket.close()
    print tables
thrift_test_1 in HBase and output all existing tables:

    [hadoop@172 hbase]$ python Create_table.py
    ['thrift_test', 'thrift_test_1']
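Running Create_table.py a second time makes `createTable` raise `AlreadyExists`, because the table is already there. A minimal sketch of a guard is shown below; `create_table_if_missing` is a hypothetical helper that relies only on the `getTableNames` and `createTable` calls used above. The check and the create are two separate requests, so this is not safe against another client creating the table in between.

```python
def create_table_if_missing(client, table_name, column_families):
    """Create the table unless it is already listed; return True if created."""
    if table_name in client.getTableNames():
        return False
    client.createTable(table_name, column_families)
    return True


# Intended use with the objects built in Create_table.py (not executed here):
#
#   created = create_table_if_missing(client, 'thrift_test_1', [new_table])
```

Catching `AlreadyExists` around `createTable` would be an alternative that also covers the concurrent case.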
Insert.py and add the following code to it:

    #! /usr/bin/env python
    #coding=utf-8
    from thrift import Thrift
    from thrift.transport import TSocket, TTransport
    from thrift.protocol import TBinaryProtocol
    from hbase import Hbase
    from hbase.ttypes import ColumnDescriptor, Mutation, BatchMutation, TRegionInfo
    from hbase.ttypes import IOError, AlreadyExists

    # Connect to the Thrift server; replace $thriftIP and $port with your values.
    socket = TSocket.TSocket('$thriftIP', $port)
    socket.setTimeout(5000)
    transport = TTransport.TBufferedTransport(socket)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = Hbase.Client(protocol)
    transport.open()

    # Write two cells (cf:a and cf:b) into row1.
    mutation1 = [Mutation(column="cf:a", value="value1")]
    client.mutateRow('thrift_test_1', "row1", mutation1)
    mutation2 = [Mutation(column="cf:b", value="value2")]
    client.mutateRow('thrift_test_1', "row1", mutation2)

    # Write two cells (cf:a and cf:b) into row2.
    mutation1 = [Mutation(column="cf:a", value="value3")]
    client.mutateRow('thrift_test_1', "row2", mutation1)
    mutation2 = [Mutation(column="cf:b", value="value4")]
    client.mutateRow('thrift_test_1', "row2", mutation2)
    socket.close()
thrift_test_1 table in HBase, each with two data entries, which can be viewed in HBase Shell:

    hbase(main):005:0> scan 'thrift_test_1'
    ROW     COLUMN+CELL
    row1    column=cf:a, timestamp=1530697238581, value=value1
    row1    column=cf:b, timestamp=1530697238587, value=value2
    row2    column=cf:a, timestamp=1530704886969, value=value3
    row2    column=cf:b, timestamp=1530704886975, value=value4
    2 row(s) in 0.0190 seconds
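Insert.py issues one `mutateRow` call per cell. Since `mutateRow` takes a list of `Mutation` objects, cells that belong to the same row could be sent in a single call. The sketch below shows only the grouping step with the standard library; `group_cells_by_row` is a hypothetical helper, and the commented call shows how its output might be fed to the client.

```python
from collections import OrderedDict


def group_cells_by_row(cells):
    """Group (row, column, value) tuples into {row: [(column, value), ...]}."""
    grouped = OrderedDict()
    for row, column, value in cells:
        grouped.setdefault(row, []).append((column, value))
    return grouped


cells = [
    ("row1", "cf:a", "value1"),
    ("row1", "cf:b", "value2"),
    ("row2", "cf:a", "value3"),
    ("row2", "cf:b", "value4"),
]

for row, pairs in group_cells_by_row(cells).items():
    # With the client from Insert.py this could become one call per row:
    #   client.mutateRow('thrift_test_1', row,
    #                    [Mutation(column=c, value=v) for c, v in pairs])
    print("%s: %d cells" % (row, len(pairs)))
```

For the four cells above this reduces four round trips to two, one per row.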
Scan_table.py and add the following code to it:

    #! /usr/bin/env python
    #coding=utf-8
    from thrift import Thrift
    from thrift.transport import TSocket, TTransport
    from thrift.protocol import TBinaryProtocol
    from hbase import Hbase
    from hbase.ttypes import ColumnDescriptor, Mutation, BatchMutation, TRegionInfo
    from hbase.ttypes import IOError, AlreadyExists

    # Connect to the Thrift server; replace $thriftIP and $port with your values.
    socket = TSocket.TSocket('$thriftIP', $port)
    socket.setTimeout(5000)
    transport = TTransport.TBufferedTransport(socket)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = Hbase.Client(protocol)
    transport.open()

    # Fetch a single row by its row key.
    result1 = client.getRow("thrift_test_1", "row1")
    print result1
    for r in result1:
        print 'the rowname is', r.row
        print 'the first value is', r.columns.get('cf:a').value
        print 'the second value is', r.columns.get('cf:b').value

    # Scan the 'cf' column family from the first row, returning up to 10 rows.
    scanId = client.scannerOpen('thrift_test_1', "", ["cf"])
    result2 = client.scannerGetList(scanId, 10)
    print result2
    client.scannerClose(scanId)
    socket.close()

    [hadoop@172 hbase]$ python Scan_table.py
    [TRowResult(columns={'cf:a': TCell(timestamp=1530697238581, value='value1'), 'cf:b': TCell(timestamp=1530697238587, value='value2')}, row='row1')]
    the rowname is row1
    the first value is value1
    the second value is value2
    [TRowResult(columns={'cf:a': TCell(timestamp=1530697238581, value='value1'), 'cf:b': TCell(timestamp=1530697238587, value='value2')}, row='row1'), TRowResult(columns={'cf:a': TCell(timestamp=1530697238581, value='value1'), 'cf:b': TCell(timestamp=1530697238587, value='value2')}, row='row1'), TRowResult(columns={'cf:a': TCell(timestamp=1530704886969, value='value3'), 'cf:b': TCell(timestamp=1530704886975, value='value4')}, row='row2')]
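The nested `TRowResult` and `TCell` objects in this output can be awkward to work with, and `r.columns.get('cf:a').value` fails with an `AttributeError` when a row has no `cf:a` cell. A minimal sketch of flattening the results into plain dictionaries follows. `rows_to_dicts` is a hypothetical helper, and the two namedtuples are stand-ins that only mimic the `row`, `columns`, `value` and `timestamp` fields visible in the output; real results would come from `getRow` or `scannerGetList`.

```python
from collections import namedtuple

# Hypothetical stand-ins for the Thrift result types, used so the
# example runs without a cluster.
TCell = namedtuple("TCell", ["value", "timestamp"])
TRowResult = namedtuple("TRowResult", ["row", "columns"])


def rows_to_dicts(results):
    """Map each row key to a plain {column: value} dictionary."""
    return {
        r.row: {col: cell.value for col, cell in r.columns.items()}
        for r in results
    }


sample = [
    TRowResult(row="row1", columns={
        "cf:a": TCell("value1", 1530697238581),
        "cf:b": TCell("value2", 1530697238587),
    }),
]

table = rows_to_dicts(sample)
print(table["row1"].get("cf:a"))        # value1
print(table["row1"].get("cf:missing"))  # None
```

Timestamps are dropped in this form, so it suits cases where only the latest values matter.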
Delete_row.py and add the following code to it:

    #! /usr/bin/env python
    #coding=utf-8
    from thrift import Thrift
    from thrift.transport import TSocket, TTransport
    from thrift.protocol import TBinaryProtocol
    from hbase import Hbase
    from hbase.ttypes import *

    # Connect to the Thrift server; replace $thriftIP and $port with your values.
    socket = TSocket.TSocket('$thriftIP', $port)
    socket.setTimeout(5000)
    transport = TTransport.TBufferedTransport(socket)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = Hbase.Client(protocol)
    transport.open()

    # Delete every cell in row2.
    client.deleteAllRow("thrift_test_1", "row2")
    socket.close()
    [hadoop@172 hbase]$ python Delete_row.py
    [hadoop@172 hbase]$ hbase shell
    hbase(main):004:0> scan 'thrift_test_1'
    ROW     COLUMN+CELL
    row1    column=cf:a, timestamp=1530697238581, value=value1
    row1    column=cf:b, timestamp=1530697238587, value=value2
    1 row(s) in 0.2050 seconds