Data By the Bay has ended
Data By the Bay is the first Data Grid conference matrix with 6 vertical application areas spanned by multiple horizontal data pipelines, platforms, and algorithms. We are unifying data science and data engineering, showing what really works to run businesses at scale.

Friday, May 20
 

9:50am

Challenges Applying Traditional UI/UX Principles to Machine Learning Products
Software UI and UX design have a few decades of established principles that guide designers in their tasks -- fundamental ideas like predictability, affordance, and graceful recovery from user errors. I have spent almost ten years trying to merge these ideas into the design of novel ML (and ML-assisted) products. Many of the traditional ideas become difficult to apply when you're intrinsically working with poorly understood data. What does it mean for an ML system to be predictable to the user, if the user has a poor understanding of the underlying data? If the ML system is not at least a little bit surprising, it's not doing its job. But how can a user feel confident if a surprising bit of feedback from the ML system is consistent with their own understanding of the world? How can they feel trust in the process when the process is itself usually beyond their understanding? Beyond that, while we have many years of experience on the "psychophysics" of visual perception, we have much less when it comes to user perception of probabilistic or statistical events. What little research we have is often contradictory, e.g. whether people better understand raw probabilities vs odds ratios vs qualitative statements like "very likely / likely / unlikely / etc.". I have spent many years studying people's in-the-field understanding and perceptions of probability in an applied setting, and this has come to shape much of how I design systems today. In many cases, I have found that these concerns motivated changes not just in a product's UX design, but changes in the underlying algorithms themselves. I'll be talking about highlights and best practices from this experience, spanning work from tiny startups and multiple Fortune 50 companies.

Speakers
Demetri Spanos

CEO / ML Product Design, Marft, Inc.
I've been working on ML-driven products, systems, and research for over 10 years, at every scale: from tiny garage startups to Fortune 500s, premier research universities, and Federal Research labs. I'm especially interested in "flipping" the direction of ML design: currently we take...


Friday May 20, 2016 9:50am - 10:30am
Markov

9:50am

Coevolution: UX and the Changing Relationship Between Mind and Machine
We are in the midst of a massive experiment in which the relationship between human intelligence and its computer-based analogs is changing at an accelerating pace. The process of both designing technologies such as machine learning (ML) to best meet people’s needs and to adapt ourselves to the new possibilities opened up by these tools is not entirely clear or predictable. The concept of coevolution from biology offers one framework for thinking about these changes. The essential idea is that two, or more, interacting species reciprocally affect each other's evolution. These relationships can range from symbiotic to predator versus prey. Humans have been coevolving with machines for a long time, but in certain areas, the balance of control is fundamentally shifting with advances in areas such as artificial intelligence. For example, we have literally been in the driver’s seat with cars, but that relationship is in the process of a role reversal. In the environments we’re creating, the ability of technology to evolve, in many cases, seems to be outstripping our own. There are many unknowns about how people evolve and adapt along with their processor-based counterparts, including:

  • How might new generations of data analysis tools transform the ways people think about, and solve, problems?
  • As ML and related technologies advance, how will that reshape, or remove, the role of human analysts?
  • How will human-computer interactions and interfaces change as machines become better at mimicking human behavior? 
  • Coevolution can take many forms from adversarial to symbiotic. Will machines eat the proverbial lunch of many human analysts, propel them to a higher level of ability, or some combination of both?
  • Some pairs of species form highly specialized relationships with each other that can be both a real advantage and a tremendous vulnerability. What are the risks of dependence and overspecialization? 
The talk will examine the idea of coevolution in this context and what UX design and data science can do to help people adapt effectively to changing environments.


Speakers
Hunter Whitney

Sr. Consultant and Author; UX Design and Data Visualization, Hunter Whitney and Associates, Inc.
Hunter Whitney is a consultant, author, and instructor who brings a user experience (UX) design perspective to data visualization. He has advised corporations, start-ups, government agencies, and NGOs to achieve their goals through a strategic design approach to digital products and...


Friday May 20, 2016 9:50am - 10:30am
Gardner

10:40am

Designing Visualizations of Health that Resonate With Users
Despite being a powerful tool in service of insight, data visualizations often fail mobile users who have come to expect transient and utilitarian mobile experiences. While traditional visualization patterns encourage exploration, mobile users often expect answers to be front and center. This talk will cover how we've adapted our visualization process to meet these platform challenges and address the broad needs of our users who have questions about their health and fitness. I'll use a recent product launch, Fitbit's Reminders to Move, as a case study to demonstrate this process.

Speakers
Alan McLean

Senior Data Visualization Designer, Fitbit


Friday May 20, 2016 10:40am - 11:00am
Markov

10:40am

Real Time Machine Learning Visualization with Spark
Training models on massive datasets, even on Spark, can be a lengthy process, during which the data scientist has no visibility into how the model is shaping up. The only way to monitor progress is to view the status of the Spark jobs, which provides no information about convergence or other statistics of interest. In this talk, we will discuss how to visualize and monitor the training of machine learning models in real-time with Spark. With this capability, you can monitor machine learning training from one iteration to the next, observe how the model converges during each iteration, visualize the characteristics of the model in real time, and decide if you wish to continue to train the model. In this talk you will learn:
  • How machine learning algorithms are monitored by adding callbacks to K-Means and other algorithms
  • The Spark task communication infrastructure that has been built, using Akka to deliver messages from the Spark driver to the job submitter
  • How HTML5 SSE helps to generate real-time progress visualizations
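
To make the callback idea concrete, here is a minimal standalone sketch in plain Python. It is not the speaker's implementation and involves neither Spark nor Akka. The names `kmeans`, `on_iteration`, and `to_sse` are hypothetical, chosen for illustration. It shows a K-Means loop that reports its cost after every iteration, lets the callback stop training early, and formats each report in the `event:` / `data:` text framing used by HTML5 Server-Sent Events.

```python
import json
import random
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def kmeans(
    points: Sequence[Point],
    k: int,
    max_iter: int,
    on_iteration: Callable[[int, float, List[Point]], bool],
    seed: int = 0,
) -> List[Point]:
    """Lloyd's algorithm with a per-iteration callback (hypothetical API).

    The callback receives (iteration, cost, centers) and returns True to
    keep training or False to stop early.
    """
    rng = random.Random(seed)
    centers = rng.sample(list(points), k)
    for iteration in range(1, max_iter + 1):
        # Assignment step: attach each point to its nearest center.
        clusters: List[List[Point]] = [[] for _ in range(k)]
        cost = 0.0
        for p in points:
            dists = [sq_dist(p, c) for c in centers]
            j = dists.index(min(dists))
            clusters[j].append(p)
            cost += dists[j]
        # Update step: move each center to the mean of its cluster;
        # an empty cluster keeps its previous center.
        new_centers = [
            (sum(x for x, _ in cl) / len(cl), sum(y for _, y in cl) / len(cl))
            if cl else centers[j]
            for j, cl in enumerate(clusters)
        ]
        moved = max(sq_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        keep_going = on_iteration(iteration, cost, centers)
        if moved == 0.0 or not keep_going:
            break
    return centers


def to_sse(event: str, payload: dict) -> str:
    """Frame a payload as one Server-Sent Events message."""
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (9.0, 0.0), (9.2, 0.3)]

    def report(iteration: int, cost: float, centers: List[Point]) -> bool:
        # In a real service this string would be written to an open HTTP
        # response with Content-Type text/event-stream.
        print(to_sse("progress", {"iteration": iteration, "cost": cost}), end="")
        return True

    kmeans(data, k=3, max_iter=20, on_iteration=report)
```

In a distributed setting the callback would run on the driver and forward each message to whoever submitted the job. This sketch collapses that hop into a single `print`.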

Speakers
Chester Chen

Director of Engineering, Alpine Data
Chester Chen is the Director of Engineering and hands-on architect at Alpine Data Labs. He manages the analytics platform development as well as contributing to some of the major developments. He has been working with Scala on and off since Scala 2.7. He is the founder and organizer...


Friday May 20, 2016 10:40am - 11:00am
Ada

11:10am

Visualizing the silent force of the Bay
Why does data play a vital role in the life of a kiteboarder surfing beneath the Golden Gate Bridge?

The San Francisco Bay waters have a silent force that affects all objects within them. Over the past 9 years, I have gained significant experience struggling to navigate, understand, and predict the tidal current conditions when sailing out of Crissy Field. This talk explores user-centered tidal data visualization on mobile devices, focusing on the San Francisco Bay. Come to learn about next-level archaic and contemporary ways of environmental data visualization, and how technology can be used to help users safely traverse the bay.

Speakers
Boriana Viljoen

Product Designer, Castlight Health
A product designer interested in data visualization, healthcare tech, tidal currents and various watersports :)


Friday May 20, 2016 11:10am - 11:30am
Markov

1:10pm

Designing for Inconsistency
A primary goal of designing any interface is to establish consistent, usable patterns that provide familiar, rapid paths to and from information. Nowhere is this more true than in the world of data products, where the express intent of our work is to deliver the best information, at the right moment, with the greatest possible speed. But data isn’t perfect. Sometimes it isn’t even there. Especially in an early stage startup. How do we get to market while our dataset is still under construction? This talk will share strategies for accommodating variability in data when designing interfaces for data-driven products, highlighting their application in the design process at Ravel Law. We'll take a look at the nuances of legal data, discuss the notion of Primary Task and share lessons learned from the design of a brand new analytics product.

Speakers
Brian Studwell

Product Designer, Ravel Law
I'm a product designer carrying a Master of Human-Computer Interaction + Design from the University of Washington. I practice at the intersection of human behavior, complex information and emerging technologies imagining new ways people can work, play and learn. I bring experience...



Friday May 20, 2016 1:10pm - 1:30pm
Ada

1:10pm

Visualizing big data: live coding an interactive dataviz app with opensource tools
Ever considered making your dataviz interactive by linking it to a database? Yeah, but then you’re entering a world of pain: authenticating users, writing the SQL code, making your widgets interact with each other… In this talk, I build a new dataviz app from scratch. Using source data representing a couple hundred million rows in an Amazon Redshift database, I show which open source tools are used to create an interactive, secure JavaScript app using D3.js and OpenBouquet. I demonstrate how much value is provided by connecting a dataviz to raw source data – even if it’s big. I discuss the choice of tools and, by the end, you’ll feel confident about achieving the same results even without any database skills.

Speakers
Olivier Balbous

Software Architect, OpenBouquet
Olivier is Chief Software Architect at Squid Solutions. Olivier is also the key driver of R&D of the OpenBouquet API and JavaScript SDK. Before joining Squid, he spent 15 years as Engineering Manager designing front and back-end web platforms for companies such as the Banque Populaire...



Friday May 20, 2016 1:10pm - 1:50pm
Markov

2:10pm

Discoveries in using behavioral sensor data for authentication
There are many ways to prove your identity other than typing in your password or holding up your government issued ID. Human behaviors translated into data points highlight the fact that everyone is inherently unique. By utilizing the overabundance of sensor data coming from the growing number of connected devices used today, we are able to gain a deeper understanding of both how sensor data differs from person to person, and how we can use these unique data points to authenticate users both online and in the real world. As technology is rapidly advancing, so too must security. Join us as we take a look at human behavior from a data POV, and highlight some very interesting trends that we’ve discovered along the way.

Speakers
John Whaley

Founder/CEO, UnifyID
I work in the broad area of computer systems, especially operating systems, virtualization, computer security, finding and avoiding software defects, algorithms, performance, parallelization, concurrency, scalability, mobility, compilers, program analysis, programming languages, APIs...



Friday May 20, 2016 2:10pm - 2:50pm
Gardner

2:10pm

Pro-Active: Designing a genuinely helpful SQL interface, that even power-users love
Microsoft’s Clippy showed the world that a computer can use context clues to determine when a person is writing a letter, but that it probably should keep its digital mouth shut. In this talk, we’ll discuss designing a query tool that offers contextualized suggestions (and even warnings) to help users write accurate and performant queries quickly… without making them want to force quit in fury—whether they're less-techy Excel folks or committed command line coders.

Over the course of this fast-paced and fun session we’ll 
  • incorporate learnings from Don Norman, Cliff Nass, and other HCI luminaries (using case studies from self-driving cars and a variety of software examples, both cutting-edge and totally familiar), as well as lessons from user research on beta releases
  • geek out about predictive text (incorporating syntactic, semantic, and social clues) 
  • explore how to provide suggestions or interventions at just the right time and in just the right way to maximize utility and minimize frustration
  • and much more…

Speakers
Aaron Kalb

Head of Product, Alation
Aaron has spent his career crafting delightful and empowering human-computer interactions, especially through natural language interfaces. After leaving Stanford with a BS and an MS in Symbolic Systems and working at Apple on iOS and Siri (doing engineering, research, and design in...


Friday May 20, 2016 2:10pm - 2:50pm
Markov

3:00pm

Data Design Challenges for Enterprise IoT Applications: Semantic Sensor Network Ontologies
This presentation will outline the problem of designing for data-intensive applications and suggest some possible solutions. Enterprise IoT presents a series of non-trivial challenges for designers. Unlike many consumer applications, enterprise IoT systems tend to be unintuitive to non-expert users and designers alike. Designers of these systems are challenged to represent the data that underlies these experiences. Taking a data-first approach to designing these systems will result in better applications and less design angst.

Speakers
Zachary Taschdjian

Interaction/Product Lead, General Electric Digital
Zac makes tools for interacting with data and using data to drive business and user value, typically for enterprise platforms/products. His specialty is time series and graph data visualization for enterprise IoT applications. His skill set blends interaction design/HCI, visualization...



Friday May 20, 2016 3:00pm - 3:20pm
Ada

3:00pm

Remixing Design with Data
Software designers and data scientists have not traditionally been situated together within companies to team up to create great products. But when data scientists do join the fold, designers have an opportunity to tap into insights to create well-informed, data-driven designs. I’ll discuss a number of recent intersections between design and data, including a number of projects at Platfora where we’ve remixed traditional design methods to create a more data-driven process. The talk will cover the role of data in a designer’s process, advice for building a data-driven product culture, and an overview of Platfora research projects that offer a new spin on the tried-and-true research methods:
  • Prioritizing product and usability improvements using the Kano model
  • Understanding usage patterns to gauge feature adoption
  • Validating, challenging, and revising product personas using archetypal behavior derived from telemetry clickstreams
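
As an illustration of the first item, the sketch below applies the commonly published Kano evaluation table to paired survey answers. The abstract does not say how Platfora ran its study, so treat this as an assumption-laden example rather than their method: the answer labels, the function names `classify` and `dominant_category`, and the sample responses are all hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple

# Answer scale shared by the "functional" question (how do you feel if the
# feature is present?) and the "dysfunctional" one (if it is absent?).
ANSWERS = ("like", "expect", "neutral", "tolerate", "dislike")

# Kano evaluation table. Rows: functional answer. Columns: dysfunctional
# answer, in ANSWERS order.
# A = attractive, O = one-dimensional, M = must-be,
# I = indifferent, R = reverse, Q = questionable (contradictory answers).
TABLE = {
    "like":     ("Q", "A", "A", "A", "O"),
    "expect":   ("R", "I", "I", "I", "M"),
    "neutral":  ("R", "I", "I", "I", "M"),
    "tolerate": ("R", "I", "I", "I", "M"),
    "dislike":  ("R", "R", "R", "R", "Q"),
}


def classify(functional: str, dysfunctional: str) -> str:
    """Return the Kano category for one respondent's answer pair."""
    return TABLE[functional][ANSWERS.index(dysfunctional)]


def dominant_category(responses: Iterable[Tuple[str, str]]) -> str:
    """Most frequent category across respondents for a single feature."""
    counts = Counter(classify(f, d) for f, d in responses)
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    # Invented answers for one hypothetical feature.
    survey = [("like", "dislike"), ("like", "dislike"), ("like", "neutral")]
    print(dominant_category(survey))
```

A simple way to prioritize with the result is to address must-be items first, then one-dimensional ones, then attractive ones. That ordering is a common reading of the model, not something the abstract states.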

Speakers
James Mulholland

Manager, UX Design and Research, Platfora
James loves to bring design and user experience to the world of big data and connected information. His background includes human-computer interaction, data visualization, organizational behavior, branding, and even technical illustration. Before joining Platfora as their first...



Friday May 20, 2016 3:00pm - 3:20pm
Markov

3:30pm

How Data helps UX evolve at realtor.com
realtor.com, a News Corp company and the fastest growing online real estate service provider in the US, invites you to take a closer look at the data-driven design process behind a real-time, informed, and interactive online real estate application. Through case studies, this presentation will give you an overview of how insights into user activity are systematically analyzed to drive product renovation. Key learning objectives also include how data visualization techniques enhance the user experience of realtor.com's data-centric products.

Speakers
Ian Lin

Lead Data Visualization Designer, realtor.com
Ian Lin is the Lead Data Visualization Designer at Move, Inc., a News Corp company and the operator of realtor.com. He is a hybrid designer/developer in UX, Front-End Engineering, and Data Viz. He helps visualize insights from Data Science/Analytics, streamlines UI workflows with...


Friday May 20, 2016 3:30pm - 3:50pm
Ada

3:30pm

Urban Heartbeat: Data Experiments with Place
In this talk, I will discuss how I collected, analyzed, and visualized real-time civic data from open data sources and APIs. As part of a fellowship with Stamen Design and Gray Area Arts Foundation last year, I created a project called Urban Heartbeat. My work explored civic, social, and environmental data at the neighborhood level. I collected data from June to August 2015 and analyzed it in a series of experiments. I performed spatial and content analysis of social media to discover the location of people’s activities in the neighborhood. I used data from DataSF.org, Instagram, Twitter, Foursquare, NextBus, Waze, Factual, Weather Underground, Craigslist, and other sources. The technology used in this project includes D3.js, Firebase, CartoDB, and Node.js (including Node libraries for color analysis and image quantization, geospatial analysis, network analysis, natural language processing, sentiment analysis, and machine learning). The resulting artwork was a generative data installation at the Grand Theater in San Francisco’s Mission District. The art allows passersby to explore their neighborhood via visualizations at the urban scale. My project work has been exhibited in Geneva, Bangalore, Pittsburg, and is currently on display in San Francisco. In 2016, my work has continued and I’m partnering with architects and urban planners in the Bay Area to analyze urban space, make planning decisions, and engage with local communities.

Speakers
Steve Pepple

Product Designer and Developer, OpenGov
Steve Pepple is a Bay Area designer and software developer who works to improve city streets and civic information systems. He is a product designer at OpenGov, where he designs software that improves how governments spend money, make decisions, and communicate with citizens. His...




Friday May 20, 2016 3:30pm - 3:50pm
Gardner

3:30pm

When Visualization Best Practices Fall On Deaf Ears
Data Visualization enthusiasts have, by now, listened to a lot of experts and read a million books which teach the importance of using the right visual encodings for effective user perception. These techniques or best practices are often backed by scientific research. But what if your customer asks for something that's straight from the "don'ts" section of your visualization rulebook? What often goes unaccounted for is that the audience of these dashboards is not necessarily trained in data visualization, nor even aware of the close-knit data visualization community active on social media. And that's very natural. The audience of your dashboards is often made up of business domain experts, who have much broader problems to solve than educating themselves about good data visualization techniques. So, what happens when your "Visualization Gyan" falls on deaf ears? Do you build something that your "audience wants"? Or do you decide to use your engineering excellence to give them "what's right"?

Speakers
Akash Mukherjee

Data Products for People Insights, Facebook




Friday May 20, 2016 3:30pm - 3:50pm
Markov

4:00pm

UX Techniques Supporting Varying Levels of Aggregation in Data Selection and Visualization
Some technologies for building data visualizations lend themselves to dynamic applications and interactivity (D3, HighCharts). Other technologies offer a lot of flexibility and precision (SQL), ease-of-use (SAP, Excel), or breadth of visualization types (Tableau, Stata). Doing data exploration at varying levels of aggregation is still a challenge for all of these tools. This talk will explore use cases involving visualizations which require varying levels of aggregation in the same visualization, and some tools, techniques, and technologies to support those visualizations. Examples will include selection techniques in SQL, data preparation scripts to prepare data for D3 visualizations, and using Excel for prototyping and checking conclusions. ClearStory Data has used a combination of Spark, D3, and React to create a web-based application which makes data combination and exploration clear, interactive, and maintainable even for the largest data sets. This talk will also discuss findings specifically relevant to supporting interactivity and clarity in data exploration of varying aggregation levels.
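
As a rough illustration of a data preparation script of the kind mentioned above, the following sketch rolls flat daily records up into a month level while keeping the day-level detail nested underneath, so a single chart can show coarse totals and still drill down. It is not ClearStory's implementation. The function name `rollup`, the sample rows, and the nested `name` / `value` / `children` shape (a convention often fed to D3 hierarchy layouts) are assumptions made for this example.

```python
import json
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def rollup(rows: Iterable[Tuple[str, float]]) -> dict:
    """Aggregate (ISO date 'YYYY-MM-DD', value) rows at two levels.

    Returns a tree: overall total -> per-month totals -> per-day totals.
    """
    by_month: Dict[str, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    for iso_date, value in rows:
        month = iso_date[:7]  # 'YYYY-MM' prefix is the coarser level
        by_month[month][iso_date] += value
    return {
        "name": "all",
        "value": sum(sum(days.values()) for days in by_month.values()),
        "children": [
            {
                "name": month,
                "value": sum(days.values()),
                "children": [
                    {"name": day, "value": total}
                    for day, total in sorted(days.items())
                ],
            }
            for month, days in sorted(by_month.items())
        ],
    }


if __name__ == "__main__":
    # Invented sample data.
    rows = [
        ("2016-05-19", 2.0),
        ("2016-05-20", 3.0),
        ("2016-05-20", 1.0),
        ("2016-06-01", 4.0),
    ]
    # The JSON output can be loaded by a browser-side visualization.
    print(json.dumps(rollup(rows), indent=2))
```

Keeping both levels in one structure means the front end never re-queries the source when the user changes aggregation level, at the price of a larger payload.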

Speakers
Katherine Ahern

Manager, Analysis and Visualizations, ClearStory Data
Katherine Ahern manages the Analysis and Visualizations group at ClearStory Data, where she focuses on usability for complex analytic workflows, including getting accurate results combining diverse data sources. Before coming to ClearStory she worked on a web-based analytics tool...



Friday May 20, 2016 4:00pm - 4:40pm
Markov

4:50pm

Importance of rethinking data visualization
Visualizations of quantitative data typically consist of multiple line, bar, and pie charts displayed in a dashboard, leaving the viewer to aggregate and correlate the data in order to synthesize a meaningful story. The effects can vary from somewhat informative to disastrously misleading. The industry is misled by products with standardized interfaces that aim to present performance information effectively through visualization but often miss the mark due to lack of domain expertise and creativity. This presentation shows how much more powerful quantitative visual data can be with the application of domain expertise. The presentation will kick off with a review of Edward Tufte's analysis of the Challenger space shuttle disaster and move on to typical industry monitoring challenges such as displaying system load, analyzing the efficiency of database requests, and monitoring latency response.

Speakers
Kyle Hailey

Technical Evangelist, Delphix
Kyle has worked on IT performance for over 20 years. He was a principal designer at Oracle on the Enterprise Manager performance monitoring interface which was implemented under waterfall methods. Following that he was the designer and product manager of DB Optimizer at Embarcadero...


Friday May 20, 2016 4:50pm - 5:10pm
Gardner

4:50pm

Towards a virtual reality meta-Earth
We live in an era abundant in data, and data science thrives. While there is sufficient meta-data to reconstruct a representation of the real world, with digital maps being a prime example, data as abstract entities are effectively invisible to the everyday person. In this talk, we show how we use meta-data to build a virtual meta-earth:
  • The data sources we employ as reality's blueprint
  • Visualization technologies that bring these data to life
  • Use of scientific models to simulate daylight cycles and pollution levels
  • Integration with virtual reality, as a prelude to the virtual worlds described in science fiction

Speakers
Bo Huang

Principal Software Engineer, SenseEarth.com
Bo Huang's work has ranged from game programming, empowering mobile devices all over the world to enjoy Pacman and Time Crisis, to simulating photons refracting through paint to produce a wide range of glittering appearances as an engineering scientist. He combines realistic rendering technologies...



Friday May 20, 2016 4:50pm - 5:10pm
Ada