NeedleBase, by ITA Software*, seems to be a practical way of building a database from unstructured sources. Besides web scraping (information extraction), it has tools for cleaning and merging data, all while maintaining provenance information (i.e., every datum points to its original source). There is also a video tutorial.
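To make the provenance idea concrete, here is a minimal sketch in Python. It is not how NeedleBase works internally (I have no idea about that); the `Datum` class, the `merge_with_provenance` function, and the example URLs are all hypothetical names made up for illustration. The point is just that merging duplicate values need not throw away where each one came from.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datum:
    """One extracted value plus a pointer back to the page it came from."""
    field: str
    value: str
    source_url: str  # provenance: where this value was scraped from


def merge_with_provenance(records):
    """Group data that agree after simple normalization.

    Returns a dict mapping (field, normalized value) to the list of
    source URLs that asserted it, in the order they were seen.
    """
    merged = {}
    for d in records:
        # Toy cleaning step (an assumption): trim whitespace, ignore case.
        normalized = d.value.strip().lower()
        merged.setdefault((d.field, normalized), []).append(d.source_url)
    return merged


# Hypothetical example data: two pages stating the same name differently.
records = [
    Datum("name", "Charles Kemp", "http://example.org/a"),
    Datum("name", " charles kemp ", "http://example.org/b"),
]
merged = merge_with_provenance(records)
```

After the merge there is a single entry for the name, but it still lists both pages as sources, so any cleaned-up value can be traced back to the originals.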
Dapper's Data Mapper is great for making RSS feeds out of mere URLs, e.g. this one I made for Charles Kemp's publications. Semantify might be useful for webmasters.
Freebase has never impressed me.
Tangentially, has anyone used a Web 2.0 application for socially annotating webpages as you visit them (e.g. leaving Post-it notes for your friends to see), or for chatting with other people who are visiting them at the same time (social browsing)? I've never seen a good one.
Any thoughts on Flock?
* - soon to be merged into Google, as I found by reading the comments to this post.