Data By the Bay is the first Data Grid conference matrix with 6 vertical application areas spanned by multiple horizontal data pipelines, platforms, and algorithms. We are unifying data science and data engineering, showing what really works to run businesses at scale.
Friday, May 20 • 4:50pm - 5:10pm
Protecting data scientists in healthcare with type safety


Healthcare is a veritable zoo of data types, implementations, and formats, posing an unrivaled challenge in data integration. Historically, quality control across disparate data has been done by data scientists, in an ad hoc manner, on their analysis platform of choice: Python or R. At Wellframe, we were looking for a concerted solution to handle both complex integrations and the more general problem of connecting data analysis and feature development. We will share our experience: as the volume and diversity of our data sources exploded, we realized that Python did not provide the guarantees we required. To this end, we have moved all data QC further upstream to our Scala-based infrastructure to let the type system help manage more of the complexity. To accelerate translating insights into features, we have used Spark to provide the DataFrames our data scientists know and love, while still being able to take advantage of our hardware. This has turned out to be a mixed blessing: it has increased our pace, but the loss of type safety during analysis allows bugs to propagate through our system. We will discuss the approaches we are pursuing to improve this from both sides: migrating from RDDs to Datasets, and moving our analysis from Python to Scala.

Speakers

Gopal Ramachandran

Head of Technology, Wellframe
Gopal is the Head of Technology at Wellframe. Previously, he worked at Massachusetts General Hospital, where he was part of the team that worked with Apple on the development of ResearchKit. Gopal received his M.D. from Harvard Medical School and his Ph.D. from MIT.



Friday May 20, 2016 4:50pm - 5:10pm
Markov
