Peter Warden, best known for his work analyzing the relationships between over 220 million Facebook users, now sets his sights on a more civic-oriented project.
Warden has put together a collection of free, open-source tools for pulling essential information, such as location, out of massive amounts of data. The “Data Science Toolkit” can turn street addresses into coordinates; extract text from PDFs, Word documents, Excel spreadsheets, and various images; filter geographic locations out of news articles; and determine political districts based on neighborhood information.
Warden believes the big data revolution will make these services more affordable and accessible for everyone. Instead of overpaying a company to run these tasks on its massive servers, users can now perform them on a budget. Perhaps most importantly, his toolkit comes as a VM (virtual machine), a “completely isolated operating system installation within your normal operating system,” that requires no network connection and has no API limitations.
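To give a rough sense of what turning a street address into coordinates against a locally running copy of the toolkit might look like, here is a minimal Python sketch. It assumes the VM serves a `street2coordinates` endpoint over HTTP and answers with JSON keyed by the input address, with `latitude` and `longitude` fields. The port, endpoint path, and field names are assumptions to check against the toolkit's own documentation, and the helper function names are made up for illustration.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: address of a locally running toolkit VM; adjust to your setup.
BASE_URL = "http://localhost:8080"


def street2coordinates_url(address, base_url=BASE_URL):
    """Build the request URL for a single street address (assumed endpoint path)."""
    return f"{base_url}/street2coordinates/{quote(address, safe='')}"


def parse_coordinates(payload, address):
    """Return (latitude, longitude) for the address, or None if it was not matched.

    Assumes the response is a JSON object keyed by the input address.
    """
    match = json.loads(payload).get(address)
    if not match:
        return None
    return match["latitude"], match["longitude"]


def geocode(address, base_url=BASE_URL):
    """Look up one address against the local VM; needs the VM to be running."""
    with urlopen(street2coordinates_url(address, base_url), timeout=10) as resp:
        return parse_coordinates(resp.read().decode("utf-8"), address)
```

Because the request goes to a machine on your own computer, a loop over thousands of addresses calling `geocode` would not run into a third-party rate limit, which is the point Warden makes about the VM.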
[via Flowing Data]