Shuo Xiang, a data scientist at Pinterest, will deliver a seminar titled “Data Engineering on Spark.”
Date: Friday, March 27
Time: 1 p.m.
Location: Henderson Room, Michigan League
Abstract: We now collect and process vast amounts of data, and we have witnessed how data-driven R&D can fundamentally change various aspects of our lives. In this talk, Shuo Xiang will first introduce common data engineering tasks and the associated challenges faced by both companies and academic researchers. He will then show how Apache Spark, an emerging data processing engine, can be used to deliver and scale up data engineering services such as data preprocessing, machine learning and real-time analytics. Finally, he will give examples of parallelizing specific machine learning algorithms on top of Spark.
Bio: Shuo Xiang is a data scientist at Pinterest, where he works on data pipelines, machine learning services and visualization. He obtained his PhD from Arizona State University in 2014 for research on feature selection modeling and optimization algorithms. He is a contributor to multiple open-source projects, including Apache Spark.
Pinterest is a visual bookmarking tool that helps users discover and save creative ideas. Its mission is “to help people discover the things they love, and inspire them to go do those things in their daily lives.”