Hbase/HbaseRest

DEPRECATED: This page describes the deprecated o.a.h.h.rest REST server. It has been replaced by Stargate. See the contrib/stargate package in hbase. Its documentation can be found here: stargate description.

Old Documentation

This page describes the hbase REST API. It starts with the specification. Towards the end are examples using curl as a client (the API won't work with a browser) and a description of how to start an instance of the REST server outside of the master web UI.

NOTE: REST was refactored in hbase 0.20.0 (not out as of this writing). While the API was preserved, the implementation now supports XML or JSON serialization. See HBASE-1064 for detail. -- stack on 01/21/2009

System Information

GET /
Retrieve a list of all the tables in HBase.
Returns: XML entity body that contains a list of the tables, like so:

  restest_table1
  restest_table2

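As a sketch of how a client might issue this call (the host and port are illustrative, matching the curl examples later on this page), using Python's urllib:

```python
import urllib.request

# Hypothetical base URL for the REST servlet; matches the
# localhost:60050 used in the curl examples below.
base = "http://localhost:60050/api/"

# GET / with Accept: text/xml returns the XML list of tables.
req = urllib.request.Request(base, headers={"Accept": "text/xml"})

# urllib.request.urlopen(req) would perform the request against a
# running REST server; it is not executed here.
print(req.full_url)
print(req.get_header("Accept"))
```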
POST /
Create a table.
Headers: Content-type: text/xml: The client is sending the table metadata in an XML entity.
Returns: HTTP 200 (OK) if the table could successfully be created.

GET /[table_name]
Retrieve metadata about the table. This includes column family descriptors.
Returns: XML entity body that contains all the metadata about the table:

  restesta:NONENONE32147483647b:NONENONE32147483647
PUT /[table_name]
Update the table schema.
Headers: Content-type: text/xml: The client is sending the table metadata in an XML entity.
Returns: HTTP 200 (OK) if the table could successfully be updated.

DELETE /[table_name]
Delete this table.
Returns: HTTP 202 (Accepted) if the table could successfully be deleted.

POST /[table_name]/disable
Disable this table.
Returns: HTTP 202 (Accepted) if the table could successfully be disabled.

POST /[table_name]/enable
Enable this table.
Returns: HTTP 202 (Accepted) if the table could successfully be enabled.

GET /[table_name]/regions
Retrieve a list of the regions for this table so that you can efficiently split up the work (a la MapReduce).
Options: start_row, stop_row: Only return the list of regions that contain the range start_row...stop_row.
Returns: XML entity body that describes the regions:

  0101

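A sketch of how the regions query string might be assembled on the client side (the table name and row keys here are illustrative, not from the spec):

```python
from urllib.parse import urlencode

# Hypothetical table and key range for a regions query.
table = "restest"
params = urlencode([("start_row", "first_key"), ("stop_row", "last_key")])

# Only regions overlapping first_key...last_key would be returned.
path = "/%s/regions?%s" % (table, params)
print(path)  # /restest/regions?start_row=first_key&stop_row=last_key
```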
Row Interaction

GET /[table_name]/row/[row_key]/timestamps
Retrieve a list of all the timestamps available for this row key.
Returns: XML entity body that describes the list of available timestamps.
St.Ack comment: Currently not supported in the native hbase client, but we should add it.

GET /[table_name]/row/[row_key]/
GET /[table_name]/row/[row_key]/[timestamp]
Retrieve data from a row, constrained by an optional timestamp value.
Headers: Accept: text/xml: The client is expecting an XML entity body that contains the columns and data together. Multipart/related: The client is expecting raw binary data, but organized into a multipart response; the client must be prepared to parse the column values out of the data. Not supported yet.
Parameters: column: specify one or more column parameters (&-separated) to get the content of specific cells. If omitted, the result will contain all columns in the row.

POST/PUT /[table_name]/row/[row_key]/
POST/PUT /[table_name]/row/[row_key]/[timestamp]
Set the value of one or more columns for a given row key, with an optional timestamp.
Headers: Content-type: text/xml: The client is sending one or more columns of data in an XML entity. The column value must be base64 encoded. Multipart/related: The client is sending multiple columns of data encoded with boundaries. Not supported yet.
Parameters: column: specify one or more column parameters (&-separated) naming the cells to set.
Returns: HTTP 200 (OK) if the column(s) could successfully be saved. HTTP 415 (Unsupported Media Type) if the query string column options do not match the Content-type header, or if the binary data of either octet-stream or Multipart/related is unreadable.

DELETE /[table_name]/row/[row_key]/
DELETE /[table_name]/row/[row_key]/[timestamp]
Delete the specified columns from the row. If no columns are specified, then ALL columns are deleted. Optionally, specify a timestamp.
Parameters: column: specify one or more column parameters (&-separated) naming the cells to delete. If omitted, all columns in the row are deleted.
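Since cell values in the XML entities must be base64 encoded, here is a quick sketch of the encoding; the value 'a' and its encoding YQ== are the same ones that appear in the curl examples later on this page:

```python
import base64

# Encode a cell value for a PUT entity body; 'a' becomes 'YQ=='.
value = b"a"
encoded = base64.b64encode(value).decode("ascii")
print(encoded)  # YQ==

# Decode a value pulled out of a GET response body.
decoded = base64.b64decode("YQ==")
print(decoded)  # b'a'
```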
Returns: HTTP 202 (Accepted) if the column(s) were deleted.
sishen comment: Delete based on timestamp is currently not supported, but we need to implement it.

Scanning

POST/PUT /[table_name]/scanner
Request that a scanner be created with the specified options. Returns a scanner ID that can be used to iterate over the results of the scanner.
Parameters: column: specify one or more column parameters (&-separated) to get the content of specific cells. If omitted, the result will contain all columns in the row. start_row, stop_row: Starting and ending keys that enclose the region that should be scanned. timestamp: Timestamp at which to start the scanner.
Returns: HTTP 201 (Created) with a Location header that references the scanner URI.
Example: /first_table/scanner?timestamp=1234348890231890&column=colfam1:name&start_row=first_key&stop_row=last_key
St.Ack comment 11/18/2007: I added the timestamp parameter. Should start_row, stop_row, OR timestamp be on the URL path to sync with how they are specified when GET'ing, etc.?

POST /[table_name]/scanner/[scanner_id]
Return the current item in the scanner and advance to the next one. Think of it as a queue dequeue operation.
Headers: Accept: text/xml: The client is expecting an XML entity body that contains the columns and data together. Multipart/related: The client is expecting raw binary data, but organized into a multipart response; the client must be prepared to parse the column values out of the data. Not supported yet.
Parameters: limit: return N items.
Returns: HTTP 200 (OK) and an entity that describes the current row in the scanner. The entity value of this request depends on the Accept header. See the documentation for getting an individual row for the data format. If the scanner is used up, HTTP 404 (Not Found).

DELETE /[table_name]/scanner/[scanner_id]
Close a scanner. You must call this when you are done using a scanner, to deallocate it.
Returns: HTTP 202 (Accepted) if it can be closed.
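The scanner-creation example URL above can be assembled like so (a sketch; the parameter values are the ones from the example):

```python
from urllib.parse import urlencode

table = "first_table"
# safe=':' keeps the family:qualifier colon unescaped, as in the example.
query = urlencode(
    [
        ("timestamp", "1234348890231890"),
        ("column", "colfam1:name"),
        ("start_row", "first_key"),
        ("stop_row", "last_key"),
    ],
    safe=":",
)
# POSTing this path returns 201 Created plus a Location header
# naming the new scanner.
scanner_url = "/%s/scanner?%s" % (table, query)
print(scanner_url)
```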
HTTP 404 (Not Found) if the scanner id is invalid. HTTP 410 (Gone) if the scanner is already closed or the lease time has expired.

Multiple Columns in Query String

In any case where a request can take multiple column names in the query string, the syntax should be:

  GET http://server:port/first_table/row/row_key?column=fam1:name&column=fam2:address

This avoids the problems with having semicolon separators in a single query string parameter, and is easily read into an array in Java.

Starting the REST Server

By default, an instance of the REST servlet runs in the master UI; just browse to http://MASTER_HOST:MASTER_PORT/api/ (results are returned as XML by default, so you may have to view the page source to see them). If you intend to use the hbase REST API heavily, run an instance of the REST server outside of the master by doing the following:

  cd $HBASE_HOME
  bin/hbase rest start

Pass --help to see REST server usage.

Request Spec

This is the spec for the Hbase-REST API done under the aegis of HADOOP-2068. It was committed 11/30/2007.

Examples using curl

Here is a POST to create a table.

[email protected]:~/Work/Personal/java/apache/hbase-trunk$ curl -v -X POST -T - http://localhost:60050/api/
* About to connect() to localhost port 60050 (#0)
*   Trying ::1... connected
* Connected to localhost (::1) port 60050 (#0)
> POST /api/ HTTP/1.1
> User-Agent: curl/7.16.3 (powerpc-apple-darwin9.0) libcurl/7.16.3 OpenSSL/0.9.7l zlib/1.2.3
> Host: localhost:60050
> Accept: */*
> Transfer-Encoding: chunked
> Expect: 100-continue
>
< HTTP/1.1 100 Continue

tablessubscription2NONEfalsetrue
^D
< HTTP/1.1 200 OK
< Date: Wed, 13 Aug 2008 18:59:38 GMT
< Server: Jetty/5.1.4 (Mac OS X/10.5.4 i386 java/1.5.0_13
< Content-Length: 0
<
* Connection #0 to host localhost left intact
* Closing connection #0

Here is a POST to disable a table.

[email protected]:~/Work/Personal/java/apache/hbase-trunk$ curl -v -X POST http://localhost:60050/api/tables/disable
* About to connect() to localhost port 60050 (#0)
*   Trying ::1... connected
* Connected to localhost (::1) port 60050 (#0)
> POST /api/tables/disable HTTP/1.1
> User-Agent: curl/7.16.3 (powerpc-apple-darwin9.0) libcurl/7.16.3 OpenSSL/0.9.7l zlib/1.2.3
> Host: localhost:60050
> Accept: */*
>
< HTTP/1.1 202 Accepted
< Date: Wed, 13 Aug 2008 18:55:03 GMT
< Server: Jetty/5.1.4 (Mac OS X/10.5.4 i386 java/1.5.0_13
< Content-Length: 0
<
* Connection #0 to host localhost left intact
* Closing connection #0

Here is a POST to enable a table.

[email protected]:~/Work/Personal/java/apache/hbase-trunk$ curl -v -X POST http://localhost:60050/api/tables/enable
* About to connect() to localhost port 60050 (#0)
*   Trying ::1... connected
* Connected to localhost (::1) port 60050 (#0)
> POST /api/tables/enable HTTP/1.1
> User-Agent: curl/7.16.3 (powerpc-apple-darwin9.0) libcurl/7.16.3 OpenSSL/0.9.7l zlib/1.2.3
> Host: localhost:60050
> Accept: */*
>
< HTTP/1.1 202 Accepted
< Date: Wed, 13 Aug 2008 18:56:20 GMT
< Server: Jetty/5.1.4 (Mac OS X/10.5.4 i386 java/1.5.0_13
< Content-Length: 0
<
* Connection #0 to host localhost left intact
* Closing connection #0

Here is a DELETE of a table.

[email protected]:~/Work/Personal/java/apache/hbase-trunk$ curl -v -X DELETE http://localhost:60050/api/tables
* About to connect() to localhost port 60050 (#0)
*   Trying ::1...
connected
* Connected to localhost (::1) port 60050 (#0)
> DELETE /api/tables HTTP/1.1
> User-Agent: curl/7.16.3 (powerpc-apple-darwin9.0) libcurl/7.16.3 OpenSSL/0.9.7l zlib/1.2.3
> Host: localhost:60050
> Accept: */*
>
< HTTP/1.1 202 Accepted
< Date: Wed, 13 Aug 2008 18:57:41 GMT
< Server: Jetty/5.1.4 (Mac OS X/10.5.4 i386 java/1.5.0_13
< Content-Length: 0
<
* Connection #0 to host localhost left intact
* Closing connection #0

Here is a GET of a row. Notice how values are Base64'd.

durruti:~/Documents/checkouts/hadoop-trunk/src/contrib/hbase stack$ curl -v http://XX.XX.XX.151:60010/api/restest/row/y
* About to connect() to XX.XX.XX.151 port 60010
*   Trying XX.XX.XX.151... * connected
* Connected to XX.XX.XX.151 (208.84.6.151) port 60010
> GET /api/restest/row/y HTTP/1.1
User-Agent: curl/7.13.1 (powerpc-apple-darwin8.0) libcurl/7.13.1 OpenSSL/0.9.7l zlib/1.2.3
Host: XX.XX.XX.151:60010
Pragma: no-cache
Accept: */*
< HTTP/1.1 200 OK
< Date: Thu, 29 Nov 2007 00:24:39 GMT
< Server: Jetty/5.1.4 (Mac OS X/10.4.11 i386 java/1.5.0_07
< Content-Type: text/xml;charset=UTF-8
< Transfer-Encoding: chunked

a:YQ==

Here is an example PUT to column 'a:' of row 'y':

durruti:~/Documents/checkouts/hadoop-trunk/src/contrib/hbase stack$ curl -v -T /tmp/y.row http://XX.XX.XX.151:60010/api/restest/row/y?column=a:
* About to connect() to XX.XX.XX.151 port 60010
*   Trying XX.XX.XX.151... * connected
* Connected to XX.XX.XX.151 (208.84.6.151) port 60010
> PUT /api/restest/row/y?column=a: HTTP/1.1
User-Agent: curl/7.13.1 (powerpc-apple-darwin8.0) libcurl/7.13.1 OpenSSL/0.9.7l zlib/1.2.3
Host: XX.XX.XX.151:60010
Pragma: no-cache
Accept: */*
Content-Length: 100
Expect: 100-continue
< HTTP/1.1 100 Continue
< HTTP/1.1 200 OK
< Date: Thu, 29 Nov 2007 00:26:36 GMT
< Server: Jetty/5.1.4 (Mac OS X/10.4.11 i386 java/1.5.0_07
< Content-Length: 0

The file /tmp/y.row has these contents:

a:YQ==

Here is an example that gets a scanner and then does a next to obtain the first row value (the '-T /tmp/y.row' is just to fake curl into doing a POST):

durruti:~/Documents/checkouts/hadoop-trunk/src/contrib/hbase stack$ curl -v -T /tmp/y.row http://XX.XX.XX.151:60010/api/restest/scanner?column=a:
* About to connect() to XX.XX.XX.151 port 60010
*   Trying XX.XX.XX.151... * connected
* Connected to XX.XX.XX.151 (XX.XX.XX.151) port 60010
> PUT /api/restest/scanner?column=a: HTTP/1.1
User-Agent: curl/7.13.1 (powerpc-apple-darwin8.0) libcurl/7.13.1 OpenSSL/0.9.7l zlib/1.2.3
Host: XX.XX.XX.151:60010
Pragma: no-cache
Accept: */*
Content-Length: 100
Expect: 100-continue
< HTTP/1.1 100 Continue
< HTTP/1.1 201 Created
< Date: Thu, 29 Nov 2007 00:20:50 GMT
< Server: Jetty/5.1.4 (Mac OS X/10.4.11 i386 java/1.5.0_07
< Location: /api/restest/scanner/e5e2ce25
< Content-Length: 0
* Connection #0 to host XX.XX.XX.151 left intact
* Closing connection #0

durruti:~/Documents/checkouts/hadoop-trunk/src/contrib/hbase stack$ curl -v -T /tmp/y.row http://208.84.6.151:60010/api/restest/scanner/e5e2ce25
* About to connect() to XX.XX.XX.151 port 60010
*   Trying XX.XX.XX.151...
* connected
* Connected to XX.XX.XX.151 (208.84.6.151) port 60010
> PUT /api/restest/scanner/e5e2ce25 HTTP/1.1
User-Agent: curl/7.13.1 (powerpc-apple-darwin8.0) libcurl/7.13.1 OpenSSL/0.9.7l zlib/1.2.3
Host: XX.XX.XX.151:60010
Pragma: no-cache
Accept: */*
Content-Length: 100
Expect: 100-continue
< HTTP/1.1 100 Continue
< HTTP/1.1 200 OK
< Date: Thu, 29 Nov 2007 00:20:58 GMT
< Server: Jetty/5.1.4 (Mac OS X/10.4.11 i386 java/1.5.0_07
< Content-Type: text/xml;charset=UTF-8
< Transfer-Encoding: chunked

y1196293620892a:YQ==
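In the transcript above, the Location header returned on scanner creation names the scanner URI; a client keeps POSTing to it until a 404 signals the scanner is used up, then DELETEs it. A minimal sketch of pulling the scanner id out of such a header value (the id e5e2ce25 is the one from the transcript):

```python
# Location header value from the 201 Created response above.
location = "/api/restest/scanner/e5e2ce25"

# The last path segment is the scanner id; subsequent next (POST)
# and close (DELETE) calls go to this same path.
scanner_id = location.rsplit("/", 1)[-1]
print(scanner_id)  # e5e2ce25
```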
