1. NetworkWordCount (per-batch count within each time interval)
1.1. Start the server that sends data (TCPServer)
cd /home/jianxin/spark
java -jar LoggerSimulation.jar 9999 10
java -jar NetworkWordCount.jar localhost 9999
netstat -nalp | grep 9999
lsof -i:9999
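LoggerSimulation.jar is the author's own data generator and its source is not included in this post. The following Scala sketch shows what such a sender might look like, assuming (hypothetically) that its two arguments are the listening port and the send interval in seconds:

import java.io.PrintWriter
import java.net.ServerSocket
import scala.util.Random

// Hypothetical stand-in for LoggerSimulation.jar: listens on a port and, once a
// client (e.g. NetworkWordCount) connects, pushes a random line of words at a
// fixed interval.
object LoggerSimulation {
  def main(args: Array[String]): Unit = {
    val port = args(0).toInt          // e.g. 9999
    val intervalSec = args(1).toInt   // e.g. 10 (assumed meaning of the second argument)
    val words = Array("spark", "streaming", "kafka", "flume", "hbase")
    val server = new ServerSocket(port)
    while (true) {
      val socket = server.accept()
      new Thread(new Runnable {
        override def run(): Unit = {
          val out = new PrintWriter(socket.getOutputStream, true)
          while (true) {
            val line = (1 to 5).map(_ => words(Random.nextInt(words.length))).mkString(" ")
            out.println(line)
            Thread.sleep(intervalSec * 1000L)
          }
        }
      }).start()
    }
  }
}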
1.2. Submit the NetworkWordCount streaming job
cd /opt/spark/spark
bin/run-example org.apache.spark.examples.streaming.NetworkWordCount localhost 9999    // Spark built-in example
spark-submit --class cn.com.szhcf.streaming.NetworkWordCount --jars /home/jianxin/spark/NetworkWordCount.jar /home/jianxin/spark/NetworkWordCount.jar localhost 9999
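The cn.com.szhcf.streaming.NetworkWordCount class is not listed in this post. A minimal Scala sketch in the style of the built-in example, assuming a 5-second batch interval:

import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Sketch of a per-batch word count over a TCP text stream.
object NetworkWordCount {
  def main(args: Array[String]): Unit = {
    val Array(host, port) = args                     // e.g. localhost 9999
    val conf = new SparkConf().setAppName("NetworkWordCount")
    val ssc = new StreamingContext(conf, Seconds(5)) // 5s batch interval is an assumption
    val lines = ssc.socketTextStream(host, port.toInt, StorageLevel.MEMORY_AND_DISK_SER)
    val counts = lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _)
    counts.print()
    ssc.start()
    ssc.awaitTermination()
  }
}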
2. HDFSWordCount (count words in files read from the file system)
java -jar HdfsWordCount.jar /home/jianxin/spark/sourceDir
spark-submit --master local[*] --class cn.com.szhcf.streaming.HdfsWordCount --jars /home/jianxin/spark/HdfsWordCount.jar /home/jianxin/spark/HdfsWordCount.jar /home/jianxin/spark/sourceDir
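The HdfsWordCount class itself is not shown here. A minimal sketch, assuming a 5-second batch interval: it monitors the given directory with textFileStream and counts words in files that appear after the job starts.

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Sketch: count words in new files that appear under a monitored directory.
object HdfsWordCount {
  def main(args: Array[String]): Unit = {
    val dir = args(0)                                // e.g. /home/jianxin/spark/sourceDir or an hdfs:// path
    val conf = new SparkConf().setAppName("HdfsWordCount")
    val ssc = new StreamingContext(conf, Seconds(5)) // batch interval assumed
    // Only files created/moved into the directory after the job starts are picked up.
    val lines = ssc.textFileStream(dir)
    lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _).print()
    ssc.start()
    ssc.awaitTermination()
  }
}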
3. StatefulNetworkWordCount (cumulative count up to the present)
bin/run-example org.apache.spark.examples.streaming.StatefulNetworkWordCount 192.168.3.21 9999    // Spark built-in example
bin/spark-submit --master spark://bigdata0:7077 --class cn.com.szhcf.streaming.StatefulNetworkWordCount --jars /home/jianxin/spark/StatefulNetworkWordCount.jar /home/jianxin/spark/StatefulNetworkWordCount.jar 192.168.3.21 9999
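The key difference from the per-batch count is updateStateByKey, which folds each batch's counts into state kept across all batches and therefore requires a checkpoint directory. A minimal sketch, with the batch interval and checkpoint path assumed:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Sketch: keep a running total per word across all batches with updateStateByKey.
object StatefulNetworkWordCount {
  def main(args: Array[String]): Unit = {
    val Array(host, port) = args                     // e.g. 192.168.3.21 9999
    val conf = new SparkConf().setAppName("StatefulNetworkWordCount")
    val ssc = new StreamingContext(conf, Seconds(5)) // batch interval assumed
    ssc.checkpoint("/tmp/stateful-wordcount")        // checkpoint dir assumed; required for stateful ops
    // Fold this batch's counts into the running total for each word.
    val updateFunc = (values: Seq[Int], state: Option[Int]) =>
      Some(values.sum + state.getOrElse(0))
    val lines = ssc.socketTextStream(host, port.toInt)
    val totals = lines.flatMap(_.split(" ")).map((_, 1)).updateStateByKey[Int](updateFunc)
    totals.print()
    ssc.start()
    ssc.awaitTermination()
  }
}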
4.1. WindowCounter (count over a time window: 3-second interval, computed once every 3 seconds)
Usage: WindowCounter <master> <hostname> <port> <interval> <windowLength> <slideInterval>
       e.g. XXX 192.168.3.21 9999 3 3 3
bin/spark-submit --master spark://bigdata0:7077 --class cn.com.szhcf.streaming.WindowCounter --jars /home/jianxin/spark/WindowCounter.jar /home/jianxin/spark/WindowCounter.jar XXX 192.168.3.21 9999 3 3 3
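The WindowCounter source is not included in this post. The sketch below matches the usage line above and uses reduceByKeyAndWindow; the <master> argument is ignored on the assumption that the master is supplied via --master on spark-submit:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Sketch: count words over a sliding window (windowLength / slideInterval in seconds).
object WindowCounter {
  def main(args: Array[String]): Unit = {
    val Array(_, host, port, interval, windowLength, slideInterval) = args
    val conf = new SparkConf().setAppName("WindowCounter")
    val ssc = new StreamingContext(conf, Seconds(interval.toInt))
    val lines = ssc.socketTextStream(host, port.toInt)
    val windowed = lines.flatMap(_.split(" ")).map((_, 1))
      .reduceByKeyAndWindow((a: Int, b: Int) => a + b,
        Seconds(windowLength.toInt), Seconds(slideInterval.toInt))
    windowed.print()
    ssc.start()
    ssc.awaitTermination()
  }
}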
4.2. WindowHotWordSort (rank hot words within a time window)
spark-submit --master spark://bigdata0:7077 --class cn.com.szhcf.streaming.WindowHotWordSort --jars /home/jianxin/spark/WindowHotWordSort.jar /home/jianxin/spark/WindowHotWordSort.jar XXX 192.168.3.21 9999 3 3 3
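A sketch of the hot-word ranking, assuming the same argument order as WindowCounter: count words per window, then sort each window's RDD by count inside transform so print() shows the top words:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Sketch: rank the most frequent words within each sliding window.
object WindowHotWordSort {
  def main(args: Array[String]): Unit = {
    val Array(_, host, port, interval, windowLength, slideInterval) = args
    val conf = new SparkConf().setAppName("WindowHotWordSort")
    val ssc = new StreamingContext(conf, Seconds(interval.toInt))
    val counts = ssc.socketTextStream(host, port.toInt)
      .flatMap(_.split(" ")).map((_, 1))
      .reduceByKeyAndWindow((a: Int, b: Int) => a + b,
        Seconds(windowLength.toInt), Seconds(slideInterval.toInt))
    // Sort each window's counts in descending order.
    val ranked = counts.transform(rdd => rdd.sortBy(_._2, ascending = false))
    ranked.print()   // print() shows the first 10 elements, i.e. the top-10 hot words
    ssc.start()
    ssc.awaitTermination()
  }
}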
5. Kafka (count data read from Kafka)
bin/run-example org.apache.spark.examples.streaming.KafkaWordCount bigdata0,bigdata1,bigdata2 console-consumer-56431 flume-kafka-1 1
spark-submit --master spark://bigdata0:7077 --class cn.com.szhcf.streaming.KafkaWordCount --jars /home/jianxin/spark/KafkaWordCount.jar /home/jianxin/spark/KafkaWordCount.jar bigdata0,bigdata1,bigdata2 console-consumer-56431 flume-kafka-1 1
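The arguments match the built-in KafkaWordCount example: <zkQuorum> <group> <topics> <numThreads>. A minimal sketch using the receiver-based spark-streaming-kafka (Kafka 0.8) API, with the batch interval assumed:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

// Sketch: receiver-based Kafka word count; each topic gets numThreads receiver threads.
object KafkaWordCount {
  def main(args: Array[String]): Unit = {
    val Array(zkQuorum, group, topics, numThreads) = args
    val conf = new SparkConf().setAppName("KafkaWordCount")
    val ssc = new StreamingContext(conf, Seconds(2)) // batch interval assumed
    val topicMap = topics.split(",").map((_, numThreads.toInt)).toMap
    val lines = KafkaUtils.createStream(ssc, zkQuorum, group, topicMap).map(_._2)
    lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _).print()
    ssc.start()
    ssc.awaitTermination()
  }
}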
6. Flume (FlumeEventCount)
First start the AvroSink listener (port 6667):
run-example org.apache.spark.examples.streaming.FlumeEventCount bigdata3 6667    // Does this have to be started on bigdata3? No.
spark-submit --master spark://bigdata0:7077 --class cn.com.szhcf.streaming.FlumeEventCount --jars /home/jianxin/spark/FlumeEventCount.jar /home/jianxin/spark/FlumeEventCount.jar bigdata3 6667
Then on bigdata3 start the HTTPSource (6666) and AvroSource (6667):
flume-ng agent -n http_self_to_avro_spark -c conf/ -f /opt/flume/flume/conf/http_self_to_avro_spark
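FlumeEventCount is the push-based Flume integration: the Spark job starts an Avro listener on <host>:<port> and the Flume agent's Avro sink pushes events to it. A minimal sketch, with the batch interval assumed:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.flume.FlumeUtils

// Sketch: report how many Flume events arrived in each batch.
object FlumeEventCount {
  def main(args: Array[String]): Unit = {
    val Array(host, port) = args                     // e.g. bigdata3 6667
    val conf = new SparkConf().setAppName("FlumeEventCount")
    val ssc = new StreamingContext(conf, Seconds(2)) // batch interval assumed
    val stream = FlumeUtils.createStream(ssc, host, port.toInt)
    stream.count().map(cnt => s"Received $cnt flume events.").print()
    ssc.start()
    ssc.awaitTermination()
  }
}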
7. HBase