Wednesday, February 17, 2021

PRO WORKSHOP: Large Graph Neural Network Learning with Kubernetes Spark
Jintao Zhang
Square, Software Engineer Machine Learning

Graph neural network (GNN) learning on very large graphs has gained great popularity recently, as critical business insights are hidden in huge knowledge graphs with billions of edges, such as social networks and sales transactions. Graph node embedding (e.g. Node2Vec) and inductive graph representation learning (e.g. GraphSAGE) have been widely used for applications such as fraud detection and cross-sell recommendation.

The technical challenges mainly come from scalability and cost-effectiveness. We have developed a highly scalable and reliable Python library for graph neural networks, based on Spark and PyTorch, under the Fugue project. Benchmark tests show that it can handle graphs with billions of edges and hundreds of millions of nodes within a few hours. With the help of Fugue, the library can easily support Kubernetes Spark, and hence delivers a highly cost-effective solution in a flexible and uniform framework.