Error message:
2017-05-13 15:14:30,035 DEBUG [pool-9-thread-10] dict.DictionaryGenerator:94 : Dictionary class: org.apache.kylin.dict.TrieDictionary
2017-05-13 15:14:30,036 ERROR [pool-9-thread-10] common.HadoopShellExecutable:65 : error execute HadoopShellExecutable{id=2657ff38-35b0-4a33-9f6b-fee48031147f-03, name=Build Dimension Dictionary, state=RUNNING}
java.lang.RuntimeException: Failed to create dictionary on HM_30_1_PROD_20170418_20170512.OUTPATIENT_VISIT_ANTIBIOTICS.VISIT_SN
    at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:325)
    at org.apache.kylin.cube.CubeManager.buildDictionary(CubeManager.java:222)
    at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:50)
    at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:41)
    at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:54)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
    at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
    at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:136)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalArgumentException: Too high cardinality is not suitable for dictionary -- cardinality: 5980217
    at org.apache.kylin.dict.DictionaryGenerator.buildDictionary(DictionaryGenerator.java:96)
    at org.apache.kylin.dict.DictionaryGenerator.buildDictionary(DictionaryGenerator.java:73)
    at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:321)
    ... 14 more
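Solution:
1. The VISIT_SN column has 5,980,217 distinct values, which the log reports as too high a cardinality for dictionary encoding. Use a different encoding for this column, such as "fixed_length" or "integer".
2. For background, see the article "Apache Kylin中对上亿字符串的精确Count_Distinct示例" (an example of exact Count_Distinct over hundreds of millions of strings in Apache Kylin) on the blog lxw的大数据田地.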
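To make the first suggestion concrete, here is a rough Python sketch of how one might pick a replacement encoding for a high-cardinality column. It is not part of Kylin. The 5,000,000 threshold, the helper names, and the sample values are all assumptions for illustration, and the output only imitates the general shape of a rowkey column entry in a cube descriptor, so check the result against your own cube's JSON and Kylin version before using it.

```python
import json

# Assumption: a dictionary cardinality limit of 5,000,000. The log above only
# shows that 5,980,217 was rejected; check the limit configured in your Kylin.
ASSUMED_DICT_MAX_CARDINALITY = 5_000_000


def int_bytes_needed(lo, hi):
    """Return the smallest byte width (1 to 8) whose signed range covers [lo, hi]."""
    for n in range(1, 9):
        bound = 1 << (8 * n - 1)
        if -bound < lo and hi < bound:
            return n
    raise ValueError("value range does not fit in 8 bytes")


def suggest_encoding(cardinality, sample_values):
    """Suggest 'dict', 'integer:N' or 'fixed_length:N' for a column.

    sample_values is a hypothetical, non-empty sample of the column's values
    as strings; a real check should scan the full column.
    """
    if not sample_values:
        raise ValueError("need at least one sample value")
    if cardinality <= ASSUMED_DICT_MAX_CARDINALITY:
        return "dict"
    try:
        ints = [int(v) for v in sample_values]
    except ValueError:
        ints = None
    if ints:
        # All samples parse as integers: size the integer encoding to the range.
        return "integer:%d" % int_bytes_needed(min(ints), max(ints))
    # Otherwise size fixed_length to the longest value, measured in UTF-8 bytes.
    width = max(len(str(v).encode("utf-8")) for v in sample_values)
    return "fixed_length:%d" % width


def rowkey_column(column, cardinality, sample_values):
    """Build a dict imitating a rowkey column entry (assumed shape)."""
    return {
        "column": column,
        "encoding": suggest_encoding(cardinality, sample_values),
        "isShardBy": False,
    }


if __name__ == "__main__":
    # "V000000001" is a made-up VISIT_SN value used only for illustration.
    print(json.dumps(rowkey_column("VISIT_SN", 5980217, ["V000000001"]), indent=2))
```

The width passed to fixed_length has to cover the longest value in the column, so sizing it from a small sample is risky. If the values are purely numeric, the integer encoding is usually the more compact choice.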
Time: 2024-10-25 09:23:23