Data By the Bay is the first Data Grid conference matrix, with 6 vertical application areas spanned by multiple horizontal data pipelines, platforms, and algorithms. We are unifying data science and data engineering, showing what really works to run businesses at scale.
Thursday, May 19 • 11:10am - 11:30am
Know the air you are breathing

This talk will demonstrate how to take a publicly available dataset of air quality sensor readings, clean and query the data, and visualize it so that the government and public can take appropriate actions. I will use publicly available datasets from epa.gov and the U.S. Department of State Mission China air quality monitoring program. The set consists of data from devices that are sending measurements for San Francisco and Beijing. The air quality measurements are enriched with weather data from weather.gov to give them additional context. We will then visualize the enriched data and see how the weather data relates to air quality. The technology stack comprises Mesos, Zookeeper, Marathon, Docker, Riak TS, Kafka, Spark, and Zeppelin.
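
The clean-and-enrich step described in the abstract might look roughly like the sketch below. It is not the speaker's implementation: it uses plain Python rather than the Spark and Riak TS stack named above, and the field names (city, time, pm25, temp_c), the negative-value placeholder for missing readings, and the sample values are all assumptions made for illustration, not real measurements.

```python
from datetime import datetime


def hour_key(ts):
    """Truncate an ISO-8601 timestamp string to the hour."""
    return datetime.fromisoformat(ts).strftime("%Y-%m-%dT%H")


def clean(readings):
    """Drop readings whose PM2.5 value is missing or negative.

    Treating negative values as 'missing' placeholders is an assumption.
    """
    return [r for r in readings if r.get("pm25") is not None and r["pm25"] >= 0]


def enrich(readings, weather):
    """Attach the weather observation for the same city and hour, if any."""
    by_key = {(w["city"], hour_key(w["time"])): w for w in weather}
    enriched = []
    for r in clean(readings):
        w = by_key.get((r["city"], hour_key(r["time"])))
        # temp_c stays None when no weather record matches this reading.
        enriched.append({**r, "temp_c": w["temp_c"] if w else None})
    return enriched


# Hypothetical sample rows, invented for this example.
readings = [
    {"city": "Beijing", "time": "2016-05-19T11:10:00", "pm25": 85.0},
    {"city": "Beijing", "time": "2016-05-19T12:10:00", "pm25": -999.0},
    {"city": "San Francisco", "time": "2016-05-19T11:20:00", "pm25": 9.0},
]
weather = [
    {"city": "Beijing", "time": "2016-05-19T11:00:00", "temp_c": 27.0},
]

enriched = enrich(readings, weather)
for row in enriched:
    print(row)
```

Joining on a (city, hour) key keeps the example small; a real pipeline would also need to decide how to handle time zones and gaps between sensor and weather timestamps.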

Seema Jethani

Director of Product Management, Basho Technologies
Hello! I currently lead Product Management at Basho Technologies for Basho's flagship products Riak KV and Riak TS, distributed NoSQL databases. Prior to joining Basho, I held Product Management and Strategy positions at Dell, Enstratius and IBM. I hold an MBA degree from Duke University’s...

Thursday May 19, 2016 11:10am - 11:30am
