Semantic Data Extractor
I’m always pleasantly surprised when I hear that what started as a 10-minute demonstrator of the semantics attached to HTML is actually used as a tool by a number of developers.
With a name such as “semantic data extractor”, it was a bit of a shame that the tool didn’t highlight the use of GRDDL or RDFa on pages that rely on either of these technologies; I have just added detection of both to the extractor.
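To give an idea of what such detection can look like, here is a minimal Python sketch. It is not the extractor's actual code. It assumes that GRDDL shows up as the `http://www.w3.org/2003/g/data-view` profile on `<head>` together with `rel="transformation"` links, and that RDFa can be guessed from a handful of RDFa-specific attributes. The names `SemanticHintParser`, `detect_semantic_hints` and `RDFA_ATTRS` are made up for this illustration.

```python
from html.parser import HTMLParser

# Profile URI that marks an XHTML document as a GRDDL source document.
GRDDL_PROFILE = "http://www.w3.org/2003/g/data-view"

# Assumption: these attributes are distinctive enough to hint at RDFa.
# rel, rev, content, href and src are skipped because plain HTML uses them too.
RDFA_ATTRS = {"about", "property", "typeof", "resource", "datatype"}


class SemanticHintParser(HTMLParser):
    """Collects hints of GRDDL and RDFa usage while parsing a page."""

    def __init__(self):
        super().__init__()
        self.grddl_profile = False
        self.grddl_transformations = []
        self.rdfa_attributes = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        attr_map = {name: (value or "") for name, value in attrs}

        # GRDDL: the profile attribute is a space-separated list of URIs.
        if tag == "head" and GRDDL_PROFILE in attr_map.get("profile", "").split():
            self.grddl_profile = True

        # GRDDL: transformations are announced with rel="transformation".
        rel_tokens = attr_map.get("rel", "").lower().split()
        if tag in ("link", "a") and "transformation" in rel_tokens:
            href = attr_map.get("href")
            if href:
                self.grddl_transformations.append(href)

        # RDFa: remember which RDFa-specific attributes were seen.
        self.rdfa_attributes.update(RDFA_ATTRS.intersection(attr_map))


def detect_semantic_hints(html):
    """Return a small report on GRDDL and RDFa usage in an HTML string."""
    parser = SemanticHintParser()
    parser.feed(html)
    parser.close()
    return {
        "grddl": parser.grddl_profile,
        "transformations": parser.grddl_transformations,
        "rdfa": bool(parser.rdfa_attributes),
        "rdfa_attributes": sorted(parser.rdfa_attributes),
    }
```

Calling `detect_semantic_hints(page_source)` on the source of a page returns a dictionary that says whether the GRDDL profile was found, which transformations are linked, and which RDFa attributes appear. This is only a heuristic: a page could use one of these attribute names for something else, and a real checker would want to resolve the transformations too.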
As a bonus, I have also added detection of non-semantic markup: at this time, it will detect purely-wrapping <span> elements, and tables with a single row or a single column (which are likely to be layout tables); if you have suggestions for detecting other non-semantic