Spark 阅读官方文档 Spark Quick Start Spark Programming Guide Spark SQL, DataFrames and Datasets Guide Cluster Mode Overview Spark Standalone Mode 重要的概念:resilient distributed dataset (RDD), a collection of elements partitioned across the nodes of the cluste