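After Pinpoint was hooked up to business monitoring, the data volume surged: HBase grows by roughly 25G per week on average. With that much data, periodic cleanup is required; otherwise the availability of the monitoring system degrades.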
Procedure
Find the HBase tables that hold the most data
[[email protected] worker]# du -sh hbase/data/default/*
2.2M hbase/data/default/AgentEvent
348K hbase/data/default/AgentInfo
2.6M hbase/data/default/AgentLifeCycle
329M hbase/data/default/AgentStatV2
34M hbase/data/default/ApiMetaData
44K hbase/data/default/ApplicationIndex
66M hbase/data/default/ApplicationMapStatisticsCallee_Ver2
60M hbase/data/default/ApplicationMapStatisticsCaller_Ver2
16M hbase/data/default/ApplicationMapStatisticsSelf_Ver2
1.1M hbase/data/default/ApplicationStatAggre
1.1G hbase/data/default/ApplicationTraceIndex
976K hbase/data/default/HostApplicationMap_Ver2
15M hbase/data/default/SqlMetaData_Ver2
848K hbase/data/default/StringMetaData
21G hbase/data/default/TraceV2
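With fifteen tables, picking out the largest ones by eye is error-prone. The following is a minimal Python sketch, not part of the original procedure, that ranks table directories from `du -sh` style output. The helper names `parse_size` and `largest_tables` are hypothetical, and the sample input is a three-line excerpt of the listing above.

```python
# Sketch: rank HBase table directories by size from `du -sh` style output.
# parse_size and largest_tables are hypothetical helper names.

UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}


def parse_size(text):
    """Convert a human-readable size such as '329M' or '1.1G' to bytes."""
    text = text.strip()
    if text and text[-1].upper() in UNITS:
        return int(float(text[:-1]) * UNITS[text[-1].upper()])
    # Assumption: a value without a unit suffix is already in bytes.
    return int(float(text))


def largest_tables(du_output, top=2):
    """Return the `top` (table_name, size_in_bytes) pairs, largest first."""
    rows = []
    for line in du_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        size, path = parts
        table = path.strip().rsplit("/", 1)[-1]
        rows.append((table, parse_size(size)))
    return sorted(rows, key=lambda row: row[1], reverse=True)[:top]


# Excerpt of the du listing shown in this article.
SAMPLE = """\
329M hbase/data/default/AgentStatV2
1.1G hbase/data/default/ApplicationTraceIndex
21G hbase/data/default/TraceV2
"""

if __name__ == "__main__":
    for table, size in largest_tables(SAMPLE):
        print(table, size)
```

On this excerpt the two largest entries are TraceV2 and ApplicationTraceIndex, the same two tables the article goes on to clean up.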
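About 20G of data is produced every 24 hours. TraceV2 and ApplicationTraceIndex turn out to be the largest tables, so their TTLs are set to 7 days and 14 days respectively.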
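HBase expresses a column family's TTL in seconds, so the 7-day and 14-day retention periods have to be converted before they are passed to `alter`. A small sketch of that arithmetic, where `ttl_seconds` is a hypothetical helper name:

```python
# Sketch: convert a retention period in days to the TTL value in seconds.
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def ttl_seconds(days):
    """Return the TTL in seconds for a retention period given in days."""
    return days * SECONDS_PER_DAY


if __name__ == "__main__":
    for table, days in (("TraceV2", 7), ("ApplicationTraceIndex", 14)):
        print(table, ttl_seconds(days))
```

This yields 604800 for 7 days and 1209600 for 14 days. The same conversion gives 5184000 for 60 days, which matches the TTL that `describe` reports before the change.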
Enter the HBase shell and change the table TTL
[[email protected] ~]# /usr/local/hbase-1.0.3/bin/hbase shell
2019-08-19 15:43:20,320 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
hbase(main):002:0> list
TABLE
AgentEvent
AgentInfo
AgentLifeCycle
AgentStatV2
ApiMetaData
ApplicationIndex
ApplicationMapStatisticsCallee_Ver2
ApplicationMapStatisticsCaller_Ver2
ApplicationMapStatisticsSelf_Ver2
ApplicationStatAggre
ApplicationTraceIndex
HostApplicationMap_Ver2
SqlMetaData_Ver2
StringMetaData
TraceV2
15 row(s) in 0.0100 seconds
=> ["AgentEvent", "AgentInfo", "AgentLifeCycle", "AgentStatV2", "ApiMetaData", "ApplicationIndex", "ApplicationMapStatisticsCallee_Ver2", "ApplicationMapStatisticsCaller_Ver2", "ApplicationMapStatisticsSelf_Ver2", "ApplicationStatAggre", "ApplicationTraceIndex", "HostApplicationMap_Ver2", "SqlMetaData_Ver2", "StringMetaData", "TraceV2"]
hbase(main):004:0> describe 'TraceV2'
Table TraceV2 is ENABLED
TraceV2
COLUMN FAMILIES DESCRIPTION
{NAME => 'S', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'PREFIX', TTL => '5184000 SECONDS (60 DAYS)', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s) in 0.1190 seconds
hbase(main):005:0> disable 'TraceV2'
0 row(s) in 4.2190 seconds
hbase(main):006:0> alter 'TraceV2' , {NAME=>'S',TTL=>'604800'}
Updating all regions with the new schema...
256/256 regions updated.
Done.
0 row(s) in 1.0980 seconds
hbase(main):009:0> enable 'TraceV2'
0 row(s) in 4.2370 seconds
hbase(main):010:0> describe 'TraceV2'
Table TraceV2 is ENABLED
TraceV2
COLUMN FAMILIES DESCRIPTION
{NAME => 'S', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'PREFIX', TTL => '604800 SECONDS (7 DAYS)', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s) in 0.0160 seconds
hbase(main):002:0> describe 'TraceV2'
Table TraceV2 is ENABLED
TraceV2
COLUMN FAMILIES DESCRIPTION
{NAME => 'S', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'PREFIX', TTL => '5184000 SECONDS (60 DAYS)', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s) in 0.1000 seconds
Set the TTL of ApplicationTraceIndex to 14 days
hbase(main):011:0> describe 'ApplicationTraceIndex'
Table ApplicationTraceIndex is ENABLED
ApplicationTraceIndex
COLUMN FAMILIES DESCRIPTION
{NAME => 'I', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'PREFIX', TTL => '5184000 SECONDS (60 DAYS)', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s) in 0.0150 seconds
hbase(main):012:0> disable 'ApplicationTraceIndex'
0 row(s) in 1.1660 seconds
hbase(main):013:0> alter 'ApplicationTraceIndex' , {NAME=>'I',TTL=>'1209600'}
Updating all regions with the new schema...
16/16 regions updated.
Done.
0 row(s) in 1.0550 seconds
hbase(main):014:0> enable 'ApplicationTraceIndex'
0 row(s) in 0.3520 seconds
hbase(main):015:0> describe 'ApplicationTraceIndex'
Table ApplicationTraceIndex is ENABLED
ApplicationTraceIndex
COLUMN FAMILIES DESCRIPTION
{NAME => 'I', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'PREFIX', TTL => '1209600 SECONDS (14 DAYS)', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s) in 0.0200 seconds
hbase(main):016:0> major_compact 'ApplicationTraceIndex'
0 row(s) in 0.1660 seconds
Notes
What major_compact is for
Merges files
Removes deleted, expired, and surplus-version data
Improves read and write efficiency
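To make the first two points concrete, here is a deliberately simplified Python model of a compaction that merges several files into one and drops cells older than the TTL. It is a toy illustration under stated assumptions, not HBase's implementation: cells are plain `(row, timestamp_seconds, value)` tuples, `toy_major_compact` is a hypothetical name, and delete markers and surplus versions are left out.

```python
# Toy model: merge several "store files" into one and drop expired cells.
# Not HBase code; delete markers and version limits are not modelled.


def toy_major_compact(store_files, ttl_seconds, now):
    """Return a single-file store holding only unexpired cells, sorted by row."""
    merged = []
    for store_file in store_files:
        for row, timestamp, value in store_file:
            # A cell survives only while its age is within the TTL.
            if now - timestamp <= ttl_seconds:
                merged.append((row, timestamp, value))
    merged.sort()
    return [merged]


if __name__ == "__main__":
    files = [
        [("a", 100, "old"), ("c", 900000, "recent")],
        [("b", 950000, "recent")],
    ]
    # With a 7-day TTL and now=1000000, the cell written at t=100 has expired.
    print(toy_major_compact(files, ttl_seconds=604800, now=1000000))
```

In this model the two input files become one, and the cell for row "a" disappears because it is older than 604800 seconds. That is the effect the article relies on when it runs major_compact after lowering the TTL.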
Command summary
TraceV2: TTL 604800 seconds = 7 days
describe 'TraceV2'
disable 'TraceV2'
alter 'TraceV2' , {NAME=>'S',TTL=>'604800'}
enable 'TraceV2'
describe 'TraceV2'
major_compact 'TraceV2'
ApplicationTraceIndex: TTL 1209600 seconds = 14 days
describe 'ApplicationTraceIndex'
disable 'ApplicationTraceIndex'
alter 'ApplicationTraceIndex' , {NAME=>'I',TTL=>'1209600'}
enable 'ApplicationTraceIndex'
describe 'ApplicationTraceIndex'
major_compact 'ApplicationTraceIndex'
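The two command sequences differ only in table name, column family, and TTL, so they can be generated rather than typed by hand. A minimal sketch, assuming the same six-step order used in this article; `ttl_commands` is a hypothetical helper, and the output is plain text intended to be pasted into the HBase shell.

```python
# Sketch: build the HBase shell command sequence used in this article
# for a given table, column family, and TTL (in seconds).


def ttl_commands(table, family, ttl_seconds):
    """Return the describe/disable/alter/enable/describe/major_compact lines."""
    return [
        f"describe '{table}'",
        f"disable '{table}'",
        f"alter '{table}', {{NAME=>'{family}', TTL=>'{ttl_seconds}'}}",
        f"enable '{table}'",
        f"describe '{table}'",
        f"major_compact '{table}'",
    ]


if __name__ == "__main__":
    targets = [("TraceV2", "S", 604800), ("ApplicationTraceIndex", "I", 1209600)]
    for table, family, ttl in targets:
        print("\n".join(ttl_commands(table, family, ttl)))
```

Note that the generated lines use straight single quotes. The HBase shell does not treat typographic quotes as string delimiters, so commands copied from a web page may need their quotes retyped.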
[[email protected] ~]# du -sh /worker/hbase/data/*
14G /worker/hbase/data/default
348K /worker/hbase/data/hbase
Original article: https://www.cnblogs.com/FireLL/p/11612522.html