Counting Rows in HBase with a Coprocessor

Environment: CDH 5.1.0

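Enabling the coprocessor, method 1.

Enable Coprocessor Aggregation

There are two ways to do this. The first is to enable aggregation globally, so that it can operate on the data of every table. This is done by editing hbase-site.xml and adding the property below; the RegionServers must be restarted for the change to take effect.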

<property>
  <name>hbase.coprocessor.user.region.classes</name>
  <value>org.apache.hadoop.hbase.coprocessor.AggregateImplementation</value>
</property>

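Enabling the coprocessor, method 2.

Enable aggregation on a single table, so that it takes effect only for that table. This is done from the HBase Shell.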

(1) Disable the table: hbase> disable 'mytable'

(2) Add the aggregation coprocessor: hbase> alter 'mytable', METHOD => 'table_att', 'coprocessor' => '|org.apache.hadoop.hbase.coprocessor.AggregateImplementation||'

(3) Re-enable the table: hbase> enable 'mytable'
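The 'coprocessor' value in step (2) is four fields separated by '|': jar path, class name, priority, and arguments. Leaving the jar path empty means the class is loaded from the RegionServer's own classpath, which is the case for AggregateImplementation. The sketch below only shows how that string is put together; the class name CoprocessorSpec and the method build are hypothetical helpers for illustration, not part of HBase.

```java
// Hypothetical helper (not an HBase class): assembles the '|'-separated
// value used for the 'coprocessor' table attribute.
final class CoprocessorSpec {

    /**
     * Joins jar path, class name, priority and arguments with '|'.
     * A null jarPath, priority or args is written as an empty field.
     */
    static String build(String jarPath, String className, Integer priority, String args) {
        if (className == null || className.isEmpty()) {
            throw new IllegalArgumentException("className is required");
        }
        return nullToEmpty(jarPath) + "|"
                + className + "|"
                + (priority == null ? "" : priority.toString()) + "|"
                + nullToEmpty(args);
    }

    private static String nullToEmpty(String s) {
        return s == null ? "" : s;
    }

    public static void main(String[] args) {
        // Same value as in step (2): no jar path, no priority, no arguments.
        System.out.println(build(null,
                "org.apache.hadoop.hbase.coprocessor.AggregateImplementation",
                null, null));
    }
}
```

Running main prints the same string that is passed to alter in step (2), which makes it easier to see which field is which when a priority or a jar path has to be filled in.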

Code:

package com.jamesfen.hbase;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.coprocessor.AggregationClient;
import org.apache.hadoop.hbase.client.coprocessor.LongColumnInterpreter;
import org.apache.hadoop.hbase.util.Bytes;

public class MyAggregationClient {

	private static final TableName TABLE_NAME = TableName.valueOf("bigtable1w");
	private static final byte[] CF = Bytes.toBytes("bd");

	public static void main(String[] args) throws Throwable {
		Configuration customConf = new Configuration();
		customConf.set("hbase.zookeeper.quorum", "192.168.58.101");
		// Raise the RPC timeout (milliseconds) so long-running counts do not time out
		customConf.setInt("hbase.rpc.timeout", 600000);
		// Set the scanner caching (rows fetched per RPC)
		customConf.setInt("hbase.client.scanner.caching", 1000);
		Configuration configuration = HBaseConfiguration.create(customConf);
		AggregationClient aggregationClient = new AggregationClient(configuration);

		Scan scan = new Scan();
		// Specify the column family to scan (a single family)
		scan.addFamily(CF);

		// The count is computed on the RegionServers by AggregateImplementation
		long rowCount = aggregationClient.rowCount(TABLE_NAME, new LongColumnInterpreter(), scan);
		System.out.println("row count is " + rowCount);
	}

}
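
As a rough picture of why this is cheaper than scanning from the client: with the aggregation coprocessor, each region counts its own rows on the server and returns only a number, and the client adds those partial counts together. The following is a minimal plain-Java simulation of that idea with no HBase classes involved; RegionCountSketch, countRegion and the sample row keys are all made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java simulation only: regions are modelled as lists of row keys.
final class RegionCountSketch {

    /** Stands in for the server side: one region counts its own rows. */
    static long countRegion(List<String> rowKeysInRegion) {
        return rowKeysInRegion.size();
    }

    /** Stands in for the client side: sum the partial count of every region. */
    static long rowCount(List<List<String>> regions) {
        long total = 0;
        for (List<String> region : regions) {
            total += countRegion(region);
        }
        return total;
    }

    public static void main(String[] args) {
        // Three made-up regions holding 3, 2 and 1 rows.
        List<List<String>> regions = Arrays.asList(
                Arrays.asList("row-001", "row-002", "row-003"),
                Arrays.asList("row-101", "row-102"),
                Arrays.asList("row-201"));
        System.out.println("row count is " + rowCount(regions));
    }
}
```

Only one number per region crosses the network in this model, which is the point of pushing the count down to the servers.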

Time: 2024-12-22 04:51:46
