Spark GraphX

1 Overview

GraphX is a new component in Spark for graphs and graph-parallel computation. At a high level, GraphX extends the Spark RDD by introducing a new Graph abstraction: a directed multigraph with properties attached to each vertex and edge.

Migrating from Spark 1.1

2 Getting Started

3 The Property Graph

Example Property Graph

The EdgeTriplet class extends the Edge class by adding the srcAttr and dstAttr members which contain the source and destination properties respectively.
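To make the triplet view concrete, here is a minimal sketch that builds a small property graph and reads srcAttr and dstAttr from each triplet. The vertex and edge data, the application name, and the local master setting are all hypothetical and chosen only for illustration; the block assumes Spark and GraphX are on the classpath and is written as top-level statements for spark-shell or a Scala script.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

// Assumption: a local context for experimentation; getOrCreate reuses an existing one if present.
val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("graphx-triplets-sketch").setMaster("local[*]"))

// Hypothetical data: vertex property = (name, role), edge property = relationship label.
val users: RDD[(VertexId, (String, String))] = sc.parallelize(Seq(
  (1L, ("alice", "student")),
  (2L, ("bob", "professor")),
  (3L, ("carol", "postdoc"))
))
val relationships: RDD[Edge[String]] = sc.parallelize(Seq(
  Edge(1L, 2L, "advisee"),
  Edge(3L, 2L, "colleague")
))

// A directed multigraph with properties attached to each vertex and edge.
val graph: Graph[(String, String), String] = Graph(users, relationships)

// Each triplet carries the edge attribute (attr) plus the source and
// destination vertex properties (srcAttr, dstAttr).
val facts: Array[String] = graph.triplets
  .map(t => s"${t.srcAttr._1} is the ${t.attr} of ${t.dstAttr._1}")
  .collect()

facts.foreach(println)
```

Because the triplet already joins the edge with both endpoint properties, no explicit join between the vertex and edge collections is needed here.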

4 Graph Operators

The core operators, which have optimized implementations, are defined in Graph, while convenience operators that are expressed as compositions of the core operators are defined in GraphOps. However, thanks to Scala implicits, the operators in GraphOps are automatically available as members of Graph. For example, the in-degree of each vertex can be computed with the inDegrees operator, which is defined in GraphOps but is called directly on a Graph (graph.inDegrees).
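As an illustration of calling a GraphOps operator directly on a Graph, the following sketch computes in-degrees on a tiny graph. The edge list, the default vertex attribute 0, and the local Spark configuration are assumptions made for the example only.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexRDD}

// Assumption: a local context for experimentation; getOrCreate reuses an existing one if present.
val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("graphx-indegrees-sketch").setMaster("local[*]"))

// Hypothetical edge list; Graph.fromEdges creates every referenced vertex
// and assigns it the default attribute (0 here).
val edges = sc.parallelize(Seq(
  Edge(1L, 2L, "follows"),
  Edge(3L, 2L, "follows"),
  Edge(2L, 1L, "follows")
))
val graph: Graph[Int, String] = Graph.fromEdges(edges, 0)

// inDegrees lives in GraphOps, yet it is available on Graph through an implicit conversion.
val inDegrees: VertexRDD[Int] = graph.inDegrees

// Collect to the driver for inspection; vertices with no incoming edges are absent.
val inDegreeById: Map[Long, Int] = inDegrees.collect().toMap
println(inDegreeById)
```

Vertex 3 has no incoming edges in this sample, so it does not appear in the result at all rather than appearing with a count of zero.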

4.1 Summary List of Operators

4.2 Property Operators

mapVertices, mapEdges, mapTriplets

Each of these operators yields a new graph with the vertex or edge properties modified by the user-defined map function.

Note that in each case the graph structure is unaffected. This is a key feature of these operators which allows the resulting graph to reuse the structural indices of the original graph.

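For example, the following snippet produces the same vertex properties as mapVertices, but it rebuilds the graph from a mapped vertex RDD, so it does not preserve the structural indices and would not benefit from the GraphX system optimizations: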
     val newVertices = graph.vertices.map { case (id, attr) => (id, mapUdf(id, attr)) }
     val newGraph = Graph(newVertices, graph.edges)
Instead, use mapVertices to preserve the indices:
     val newGraph = graph.mapVertices((id, attr) => mapUdf(id, attr))

4.3 Structural Operators

reverse, subgraph, mask, groupEdges

reverse: The reverse operator returns a new graph with all the edge directions reversed.
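A short sketch of the reverse operator on a two-edge graph follows. The edge data, the default vertex attribute, and the local Spark configuration are hypothetical and serve only to show the effect on edge directions.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph}

// Assumption: a local context for experimentation; getOrCreate reuses an existing one if present.
val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("graphx-reverse-sketch").setMaster("local[*]"))

// Hypothetical directed path 1 -> 2 -> 3 with labeled edges.
val edges = sc.parallelize(Seq(Edge(1L, 2L, "a"), Edge(2L, 3L, "b")))
val graph: Graph[Int, String] = Graph.fromEdges(edges, 0)

// reverse swaps source and destination on every edge; properties are left as they are.
val reversed: Graph[Int, String] = graph.reverse

val before = graph.edges.map(e => (e.srcId, e.dstId, e.attr)).collect().toSet
val after = reversed.edges.map(e => (e.srcId, e.dstId, e.attr)).collect().toSet
println(s"before: $before")
println(s"after:  $after")
```

The edge labels stay attached to the same pair of vertices; only the direction changes.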

4.4 Join Operators

joinVertices, outerJoinVertices

4.5 Neighborhood Aggregation

mapReduceTriplets, maxInDegree, collectNeighbors

Aggregate Messages (aggregateMessages)
Map Reduce Triplets Transition Guide (Legacy)
Computing Degree Information
Collecting Neighbors

4.6 Caching and Uncaching

cache, unpersistVertices

5 Pregel API

6 Graph Builders

7 Vertex and Edge RDDs

7.1 VertexRDDs

7.2 EdgeRDDs

8 Optimized Representation

9 Graph Algorithms

9.1 PageRank

9.2 Connected Components

9.3 Triangle Counting

10 Examples

Time: 2024-10-12 12:02:42
