Data By the Bay is the first Data Grid conference, organized as a matrix of 6 vertical application areas spanned by multiple horizontal data pipelines, platforms, and algorithms. We are unifying data science and data engineering, showing what really works to run businesses at scale.
Recent years have seen a surge in image recognition technology thanks to deep learning, and more and more applications and companies now use it to understand and organize their images. This growth has been helped by tools and libraries such as Caffe, Torch, and DL4J, which make it easy to train deep learning systems for image recognition out of the box. Public APIs from Microsoft, Google, and a number of startups have also made it easy for developers to use the technology in their products. These APIs let developers get started with image recognition with minimal knowledge and infrastructure investment, but they can be infeasible when content volume and recurring costs are high, or when the domain is too specific to the application. In this talk we will show how to deploy image recognition in production on a budget using open source tools. Using Trulia as an example, we will show how we developed an end-to-end image recognition pipeline, from images to predictions to applications, with open source tools including Caffe, Celery, Django, Hadoop, Flume, and Redis.
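As a rough illustration of the images-to-predictions flow described above, the sketch below queues images, runs a worker over the queue, and caches predictions so that repeated images are classified only once. It is not the Trulia pipeline: it uses only the Python standard library, with an in-memory queue standing in for a task queue such as Celery and a dict standing in for a store such as Redis. The names `PredictionPipeline` and `fake_classifier` are hypothetical, and the classifier is a placeholder rather than a trained model.

```python
import hashlib
import queue
from typing import Callable, Dict, List, Optional, Tuple


def fake_classifier(image_bytes: bytes) -> List[str]:
    """Placeholder for a trained model (assumption): labels by payload size."""
    return ["large_image"] if len(image_bytes) > 4 else ["small_image"]


class PredictionPipeline:
    """Toy images -> predictions flow with a work queue and a result cache."""

    def __init__(self, classify: Callable[[bytes], List[str]]) -> None:
        # In-memory stand-ins (assumptions) for a task queue and a key-value store.
        self.tasks: "queue.Queue[Tuple[str, bytes]]" = queue.Queue()
        self.cache: Dict[str, List[str]] = {}
        self.classify = classify
        self.model_calls = 0

    def submit(self, image_bytes: bytes) -> str:
        """Enqueue an image unless it is already cached; return its content key."""
        key = hashlib.sha256(image_bytes).hexdigest()
        if key not in self.cache:
            self.tasks.put((key, image_bytes))
        return key

    def run_worker(self) -> None:
        """Drain the queue, classifying each distinct image once."""
        while not self.tasks.empty():
            key, image_bytes = self.tasks.get()
            if key in self.cache:
                continue  # duplicate submission, already classified
            self.cache[key] = self.classify(image_bytes)
            self.model_calls += 1

    def predictions(self, key: str) -> Optional[List[str]]:
        """Return cached labels for a key, or None if not yet processed."""
        return self.cache.get(key)


if __name__ == "__main__":
    pipeline = PredictionPipeline(fake_classifier)
    key = pipeline.submit(b"example image bytes")
    pipeline.run_worker()
    print(key[:8], pipeline.predictions(key))
```

Keying the cache on a hash of the image content is one way to avoid paying for the same prediction twice, which matters when volume is high. In a real deployment the queue and cache would be separate services so that workers can scale independently of the web application.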