Data By the Bay is the first Data Grid conference: a matrix of 6 vertical application areas spanned by multiple horizontal data pipelines, platforms, and algorithms. We are unifying data science and data engineering, showing what really works to run businesses at scale.
Tuesday, May 17 • 1:10pm - 1:50pm
Data & Metadata at the Internet Archive

The Internet Archive has many petabytes of archived webpages, books, videos, and images. Recently we've been making a big effort to make our data and metadata more accessible to outside users. I'll show off some of the methods to download stuff from the Archive, and then I'll show some example projects using this data.
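
The abstract does not say which download methods the talk covers. As one illustration only, here is a minimal Python sketch that uses the Wayback Machine's public availability endpoint to find the archived snapshot closest to a given date. The helper names (`build_availability_url`, `closest_snapshot`, `lookup`) are hypothetical, and the response layout is assumed to match the JSON that endpoint is documented to return.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public Wayback Machine availability endpoint.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_availability_url(page_url, timestamp=None):
    """Build the query URL; timestamp is an optional YYYYMMDD[hhmmss] string."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(response):
    """Return the closest snapshot URL from a decoded response, or None.

    Assumes the layout {"archived_snapshots": {"closest": {...}}}.
    """
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


def lookup(page_url, timestamp=None):
    """Query the endpoint over the network and return the snapshot URL, if any."""
    with urlopen(build_availability_url(page_url, timestamp), timeout=30) as resp:
        return closest_snapshot(json.load(resp))
```

Calling `lookup("example.com", "20160517")` would ask for the capture nearest to the day of this talk; it needs network access, and the result depends on what the Archive holds at the time of the request.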

Greg Lindahl

CTO, Presearch Labs
I'm currently working on adding search to the Internet Archive's "Wayback Machine" web archive, but I'm interested in all kinds of data topics.

Tuesday May 17, 2016 1:10pm - 1:50pm