Adoption of the Linked Data Best Practices in Different Topical Domains

Friday, 17 October, 2014
ISWC 2014
Max Schmachtenberg, Christian Bizer, Heiko Paulheim (UMA)

The central idea of Linked Data is that data publishers support applications in discovering and integrating data by complying with a set of best practices in the areas of linking, vocabulary usage, and metadata provision. In 2011, the State of the LOD Cloud report analyzed the adoption of these best practices by linked datasets within different topical domains. The report was based on information that was provided by the dataset publishers themselves via the Linked Data catalog. In this paper, we revisit and update the findings of the 2011 State of the LOD Cloud report based on a crawl of the Web of Linked Data conducted in April 2014. We analyze how the adoption of the different best practices has changed and present an overview of the linkage relationships between datasets in the form of an updated LOD cloud diagram, this time based not on information from dataset providers but on data that can actually be retrieved by a Linked Data crawler. Among other findings, we find that the number of linked datasets approximately doubled between 2011 and 2014, that there is increased agreement on common vocabularies for describing certain types of entities, and that provenance and license metadata is still rarely provided by the data sources.