Data By the Bay is the first Data Grid conference: a matrix of 6 vertical application areas spanned by multiple horizontal data pipelines, platforms, and algorithms. We are unifying data science and data engineering, showing what really works to run businesses at scale.
Thursday, May 19 • 2:10pm - 2:50pm
Deep dive and best practices of Spark streaming

In this talk, we will start with the internals of how Spark Streaming works and explain how user code is translated and executed by the Spark Streaming engine. Building on these internals, we will then walk through best practices for managing state efficiently, joining streams with historical datasets efficiently, and achieving high throughput while receiving, processing, and writing data. This should help you develop and tune your streaming applications properly and avoid common pitfalls.

Prakash Chockalingam

Solutions Architect, Databricks Inc

Thursday May 19, 2016 2:10pm - 2:50pm
