错误1:发布topologies到远程集群时,出现Nimbus host is not set异常。异常内容如下所示:
[[email protected] bin]# ./storm jar /home/clx/storm-starter.jar storm.starter.WordCountTopology wordcount
Running: export STORM_JAR=/home/clx/storm-starter.jar; java -client -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib -cp /home/clx/storm-0.5.4/storm-0.5.4.jar:/home/clx/storm-0.5.4/lib/log4j-1.2.16.jar:/home/clx/storm-0.5.4/lib/tools.macro-0.1.0.jar:/home/clx/storm-0.5.4/lib/jline-0.9.94.jar:/home/clx/storm-0.5.4/lib/commons-lang-2.5.jar:/home/clx/storm-0.5.4/lib/core.incubator-0.1.0.jar:/home/clx/storm-0.5.4/lib/junit-3.8.1.jar:/home/clx/storm-0.5.4/lib/compojure-0.6.4.jar:/home/clx/storm-0.5.4/lib/zookeeper-3.3.2.jar:/home/clx/storm-0.5.4/lib/clojure-contrib-1.2.0.jar:/home/clx/storm-0.5.4/lib/httpcore-4.0.1.jar:/home/clx/storm-0.5.4/lib/commons-logging-1.1.1.jar:/home/clx/storm-0.5.4/lib/commons-io-1.4.jar:/home/clx/storm-0.5.4/lib/ring-core-0.3.10.jar:/home/clx/storm-0.5.4/lib/httpclient-4.0.1.jar:/home/clx/storm-0.5.4/lib/commons-codec-1.3.jar:/home/clx/storm-0.5.4/lib/jzmq-2.1.0.jar:/home/clx/storm-0.5.4/lib/jvyaml-1.0.0.jar:/home/clx/storm-0.5.4/lib/commons-fileupload-1.2.1.jar:/home/clx/storm-0.5.4/lib/slf4j-log4j12-1.5.8.jar:/home/clx/storm-0.5.4/lib/servlet-api-2.5.jar:/home/clx/storm-0.5.4/lib/json-simple-1.1.jar:/home/clx/storm-0.5.4/lib/ring-jetty-adapter-0.3.11.jar:/home/clx/storm-0.5.4/lib/slf4j-api-1.5.8.jar:/home/clx/storm-0.5.4/lib/jetty-util-6.1.26.jar:/home/clx/storm-0.5.4/lib/joda-time-1.6.jar:/home/clx/storm-0.5.4/lib/libthrift7-0.7.0.jar:/home/clx/storm-0.5.4/lib/commons-exec-1.1.jar:/home/clx/storm-0.5.4/lib/clojure-1.2.0.jar:/home/clx/storm-0.5.4/lib/ring-servlet-0.3.11.jar:/home/clx/storm-0.5.4/lib/clj-time-0.3.0.jar:/home/clx/storm-0.5.4/lib/hiccup-0.3.6.jar:/home/clx/storm-0.5.4/lib/clout-0.4.1.jar:/home/clx/storm-0.5.4/lib/jetty-6.1.26.jar:/home/clx/storm-0.5.4/lib/servlet-api-2.5-20081211.jar:/home/clx/storm-starter.jar:/root/.storm:/home/clx/storm-0.5.4/bin storm.starter.WordCountTopology wordcount
0 [main] INFO backtype.storm.StormSubmitter - Jar not uploaded to master yet. Submitting jar...
Exception in thread "main" java.lang.IllegalArgumentException: Nimbus host is not set
at backtype.storm.utils.NimbusClient.<init>(NimbusClient.java:30)
at backtype.storm.utils.NimbusClient.getConfiguredClient(NimbusClient.java:17)
at backtype.storm.StormSubmitter.submitJar(StormSubmitter.java:78)
at backtype.storm.StormSubmitter.submitJar(StormSubmitter.java:71)
at backtype.storm.StormSubmitter.submitTopology(StormSubmitter.java:50)
at storm.starter.WordCountTopology.main(WordCountTopology.java:81)
解决方法:在~/.storm/目录新建storm.yaml文件,~代表用户主目录。storm.yaml文件内容:nimbus.host: "10.0.0.24"。重启nimbus后台程序,异常消失。
错误2:启动Supervisor时,出现java.lang.UnsatisfiedLinkError: /usr/local/lib/libjzmq.so.0.0.0: libzmq.so.1: cannot open shared object file: No such file or directory异常。异常内容如下所示:
2011-12-01 11:58:47 worker [ERROR] Error on initialization of server mk-worker
java.lang.UnsatisfiedLinkError: /usr/local/lib/libjzmq.so.0.0.0: libzmq.so.1: cannot open shared object file: No such file or directory
at java.lang.ClassLoader$NativeLibrary.load(Native Method)
at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1803)
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1728)
at java.lang.Runtime.loadLibrary0(Runtime.java:823)
at java.lang.System.loadLibrary(System.java:1028)
at org.zeromq.ZMQ.<clinit>(ZMQ.java:34)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:169)
at zilch.mq__init.load(Unknown Source)
at zilch.mq__init.<clinit>(Unknown Source)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:247)
at clojure.lang.RT.loadClassForName(RT.java:1578)
at clojure.lang.RT.load(RT.java:399)
at clojure.lang.RT.load(RT.java:381)
at clojure.core$load$fn__4511.invoke(core.clj:4905)
at clojure.core$load.doInvoke(core.clj:4904)
at clojure.lang.RestFn.invoke(RestFn.java:409)
at clojure.core$load_one.invoke(core.clj:4729)
at clojure.core$load_lib.doInvoke(core.clj:4766)
at clojure.lang.RestFn.applyTo(RestFn.java:143)
at clojure.core$apply.invoke(core.clj:542)
at clojure.core$load_libs.doInvoke(core.clj:4800)
at clojure.lang.RestFn.applyTo(RestFn.java:138)
at clojure.core$apply.invoke(core.clj:542)
at clojure.core$require.doInvoke(core.clj:4869)
at clojure.lang.RestFn.invoke(RestFn.java:422)
at backtype.storm.messaging.zmq$loading__4410__auto__.invoke(zmq.clj:1)
at backtype.storm.messaging.zmq__init.load(Unknown Source)
at backtype.storm.messaging.zmq__init.<clinit>(Unknown Source)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:247)
at clojure.lang.RT.loadClassForName(RT.java:1578)
at clojure.lang.RT.load(RT.java:399)
at clojure.lang.RT.load(RT.java:381)
at clojure.core$load$fn__4511.invoke(core.clj:4905)
at clojure.core$load.doInvoke(core.clj:4904)
at clojure.lang.RestFn.invoke(RestFn.java:409)
at clojure.core$load_one.invoke(core.clj:4729)
at clojure.core$load_lib.doInvoke(core.clj:4766)
at clojure.lang.RestFn.applyTo(RestFn.java:143)
at clojure.core$apply.invoke(core.clj:542)
at clojure.core$load_libs.doInvoke(core.clj:4800)
at clojure.lang.RestFn.applyTo(RestFn.java:138)
at clojure.core$apply.invoke(core.clj:542)
at clojure.core$require.doInvoke(core.clj:4869)
at clojure.lang.RestFn.invoke(RestFn.java:409)
at backtype.storm.messaging.loader$mk_zmq_context.doInvoke(loader.clj:8)
at clojure.lang.RestFn.invoke(RestFn.java:437)
at backtype.storm.daemon.worker$fn__3074$exec_fn__858__auto____3075.invoke(worker.clj:109)
at clojure.lang.AFn.applyToHelper(AFn.java:187)
at clojure.lang.AFn.applyTo(AFn.java:151)
at clojure.core$apply.invoke(core.clj:540)
at backtype.storm.daemon.worker$fn__3074$mk_worker__3216.doInvoke(worker.clj:78)
at clojure.lang.RestFn.invoke(RestFn.java:513)
at backtype.storm.daemon.worker$_main.invoke(worker.clj:247)
at clojure.lang.AFn.applyToHelper(AFn.java:174)
at clojure.lang.AFn.applyTo(AFn.java:151)
at backtype.storm.daemon.worker.main(Unknown Source)
解决方法1:export LD_LIBRARY_PATH=/usr/local/lib
解决方法2:编辑/etc/ld.so.conf文件,增加一行:/usr/local/lib。再执行sudo ldconfig命令,重启Supervisor,异常消失。
错误3:发布drpc类型的topologies到远程集群时,出现空指针异常,连接drpc服务器失败。
解决方法:在conf/storm.yaml文件中增加drpc服务器配置,启动配置文件中指定的所有drpc服务。内容如下所示:
drpc.servers:
- "drpc服务器ip"
错误4:客户端调用drpc服务时,worker的日志中出现Failing message,而bolt都未收到数据。日志如下所示:
2011-12-02 09:59:16 task [INFO] Failing message [email protected]: source: 1:27, stream: 1, id: {-5919451531315711689=-5919451531315711689}, [foo.com/blog/1, {"port":3772,"id":"5","host":"10.0.0.24"}]
解决方法:主机名,域名,hosts文件配置不正确会引起这类错误。检查并修改storm相关机器的主机名,域名,hosts文件。重启网络服务:service network restart。重启storm,再次调用drpc服务,成功。Hosts文件中必须包含如下内容:
【nimbus主机ip】 【nimbus主机名】 【nimbus主机别名】
【supervisor主机ip】 【supervisor主机名】 【supervisor主机别名】
【zookeeper主机ip】 【zookeeper主机名】 【zookeeper主机别名】
错误5:发布topologies时,出现不能序列化log4j.Logger的异常。
解决方法:使用slf4j代替log4j。
错误6:bolt在处理消息时,worker的日志中出现Failing message。
解决方法:提交Topology时设置适当的消息的超时时间,比消息原本的超时时间更长,如下所示:
conf.setMessageTimeoutSecs(60);//默认为30秒。