Interlinking DBpedia with other Data Sets


Linked Data is a method for publishing data on the Web and interlinking data between different data sources. Linked Data can be accessed using Semantic Web browsers, just as traditional Web documents are accessed using HTML browsers. However, instead of following document links between HTML pages, Semantic Web browsers let users navigate between different data sources by following RDF links. RDF links can also be followed by robots or Semantic Web search engines in order to crawl the Semantic Web. See Linked Data – The Story So Far and How to Publish Linked Data on the Web for more information about Linked Data.
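
To make the notion of an RDF link concrete: it is an ordinary RDF triple whose subject and object are URIs from different data sources. The following is a minimal Turtle sketch, assuming owl:sameAs as the linking property; the target URI under example.org is a placeholder and does not refer to a real data source.

```turtle
@prefix owl: <http://www.w3.org/2002/07/owl#> .

# Subject: a resource in DBpedia.
# Object: a resource in another data set (placeholder URI, assumed for illustration).
<http://dbpedia.org/resource/Berlin>
    owl:sameAs <http://example.org/places/berlin> .
```

A browser or crawler that dereferences the subject URI and finds this triple can continue by dereferencing the object URI, which is how navigation moves from one data source to the next.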


The DBpedia data set is interlinked with various other data sources (see voiD description). The diagram below (Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/ ) gives an overview of some of these data sources:



Data Set | Description | Number of Links | Example
Amsterdam Museum | Information about cultural heritage objects related to the city of Amsterdam. | 630 |
BBC Wildlife Finder | Information about wildlife biota, habitats, adaptations and ecozones. | 450 |
Book Mashup | Provides information about books. | 9,000 |
Bricklink | Unofficial Lego marketplace. | 10,000 |
CORDIS | Information on all EU programmes and projects. | 300 |
Dailymed | Provides information about drugs. | 900 | Eli Lilly and Company
DBLP Bibliography | Provides information about scientific publications. | 200 | Tim Berners-Lee
DBTune | Provides freely available data concerning music. | 840 |
Diseasome | Provides information about diseases and genes. | 2,300 | Asthma
Drugbank | Provides information about drugs and genes. | 4,800 | ZNF3
EUNIS | Information on species, habitat types and sites. | 11,000 |
Eurostat (Linked Statistics) | Covers a number of areas, from the economy and demographics to trade and transport data. | 250 |
Eurostat (WBSG) | Provides information about European countries and regions. | 140 | France
CIA World Factbook | Provides information about countries. | 550 | France
flickr wrappr | A wrapper around flickr that tries to generate a photo collection for each DBpedia concept. | 4,000,000 | Brandenburg Gate
Freebase | An open-license database about millions of things from various domains. | 3,900,000 | Tetris
GADM | Spatial database of the location of the world's administrative areas. | 39,000 |
GeoNames | Provides information about geographic features. | 425,000 | Cambridge
GeoSpecies | Information on biological orders, families, species as well as species occurrence records and related data. | 16,000 |
Global Health Observatory | Provides access to statistical data about health problems. | 200 |
Project Gutenberg | Provides information about authors and open access to their work. | 2,500 | John Bunyan
Italian Public Schools | Provides information on public schools in Italy. | 5,800 |
LinkedGeoData | Spatial knowledge base. | 104,000 |
LinkedMDB | Provides information on movies. | 14,000 |
MusicBrainz | Provides information about artists and music. | 23,000 | Portishead
New York Times | Links between NYT subject headings and DBpedia concepts. | 9,700 | South Korea
OpenCyc | An open-license version of the Cyc Ontology. | 27,000 | Woody Allen
OpenEI (Open Energy Info) | Provides energy-related information. | 680 |
Revyu | Universal reviews. | 6 |
Sider | Provides information about side effects of drugs. | 2,000 | Claudication
TCMGeneDIT | Information on traditional Chinese medicine, genes and diseases. | 900 |
UMBEL | A lightweight, subject concept reference structure derived from Cyc. | 900,000 | Place
US Census | Provides US Census data. | 12,600 | Los Angeles
WikiCompany | Provides information on companies. | 8,300 |
Wikidata | Structured data related to Wikipedia items. | 5,200,000 |
WordNet | W3C RDF/OWL representation of the WordNet ontology. | 470,000 | Air France
YAGO | Cross-domain knowledge base. | 2,900,000 instance links, 41,000,000 type statements |

The W3C Linking Open Data Community Project



DBpedia is part of the W3C Linking Open Data community project, an effort to publish and interlink various open data sources. As of September 2011, this effort has built a Web of interlinked data sources that amounts to more than 31 billion RDF triples. Please refer to the project's data sets page for a list of all published data sets.

Linking to DBpedia from Your Dataset

The Silk Link Discovery Framework can be used to generate new links to DBpedia from user-provided link specifications, which are expressed in the Silk Link Specification Language (Silk-LSL).
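
As a rough illustration of what generated links into DBpedia might look like once serialized as Turtle, the sketch below assumes that the link specification selects owl:sameAs as the link type. The source URIs under example.org/dataset are hypothetical, and the pairing with DBpedia resources is made up for illustration rather than taken from an actual run of the framework.

```turtle
@prefix owl: <http://www.w3.org/2002/07/owl#> .

# Hypothetical resources in your own data set (left) matched to
# DBpedia resources (right). Link type owl:sameAs is an assumption.
<http://example.org/dataset/city/cambridge>
    owl:sameAs <http://dbpedia.org/resource/Cambridge> .

<http://example.org/dataset/city/los-angeles>
    owl:sameAs <http://dbpedia.org/resource/Los_Angeles> .
```

Publishing such triples alongside your own data is what makes your data set appear as an inlink source for DBpedia.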

Linking to DBpedia from Your FOAF Profile


As Wikipedia contains articles about many general-purpose concepts, DBpedia can also be seen as a huge ontology that assigns URIs to a large number of concepts and backs these URIs with dereferenceable RDF descriptions.


If you have a FOAF profile and need terms for describing your interests or your location, consider using DBpedia URIs. This allows RDF browsers such as Disco, Tabulator, or the OpenLink Data Web Browser to browse from your FOAF profile into DBpedia. The links also allow clients such as the Semantic Web Client Library to answer SPARQL queries over both data sources.


The example below shows an RDF link from Richard Cyganiak's FOAF profile which states that he is based near Berlin.
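
In Turtle, a statement of this kind has the following shape. This is a minimal sketch: the subject URI is a placeholder (the same hypothetical profile URI used in the review example further down this page), not the URI from the profile mentioned above.

```turtle
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

# Placeholder profile URI; substitute the URI that identifies you.
<http://example.com/alice/foaf.rdf#me>
    foaf:based_near <http://dbpedia.org/resource/Berlin> .
```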



You can use the Disco browser to follow this link by clicking here.


DBpedia URIs can also be used to express your interests within your FOAF profile. For example:
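
A minimal sketch, assuming the same placeholder profile URI as in the review example on this page. It uses foaf:topic_interest, which relates a person to a thing of interest; whether to use that property or foaf:interest (which points at a document about the topic) is a modelling choice, and the choice here is an assumption.

```turtle
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

# Placeholder profile URI; the topics are DBpedia resources.
<http://example.com/alice/foaf.rdf#me>
    foaf:topic_interest <http://dbpedia.org/resource/Semantic_Web> ,
                        <http://dbpedia.org/resource/Tetris> .
```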



Another use case for DBpedia URIs could be to categorize or tag blog posts, wiki pages, or other documents. For example:
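
A minimal sketch, assuming the Dublin Core property dcterms:subject as the tagging property and a hypothetical blog post URL under example.com. Other vocabularies could serve the same purpose.

```turtle
@prefix dcterms: <http://purl.org/dc/terms/> .

# Hypothetical blog post tagged with two DBpedia concepts.
<http://example.com/blog/2014/querying-linked-data>
    dcterms:subject <http://dbpedia.org/resource/SPARQL> ,
                    <http://dbpedia.org/resource/Semantic_Web> .
```

Because the tags are URIs rather than free-text keywords, two documents tagged with the same DBpedia resource can be recognized as sharing a topic without any string matching.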



An interesting project that allows you to review anything that has a URI is the Revyu project run by Tom Heath. A Revyu review of a film in DBpedia could look like this:


@prefix rev: <http://purl.org/stuff/rev#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# <> is the URI of the review document itself.
<> a rev:Review ;
   rdfs:label "Review of Cold Mountain, by Alice" ;
   # The film being reviewed, identified by its DBpedia URI.
   foaf:primaryTopic <http://dbpedia.org/resource/Cold_Mountain_%28film%29> ;
   rev:text "This movie sucks. Miss it." ;
   rev:rating 1 ;
   rev:minRating 1 ;
   rev:maxRating 5 ;
   rev:reviewer <http://example.com/alice/foaf.rdf#me> .

Inlinks to DBpedia


DBpedia is being linked to from a variety of datasets. The sum of inlinks is 39,012,034.


Data Set | Number of Links
ACM (RKBExplorer) | 5
AEMET meteorological dataset | 82
AGROVOC | 993
Airports | 9,761
Alpine Ski Racers of Austria | 921
Amsterdam Museum | 43
BBC Music | 23,000
BBC Programmes | 23,237
BBC Things | 4,556
BBC Wildlife Finder | 415
BFS LD | 261
BibBase | 53
Bible Ontology | 371
Brazilian Politicians | 1,500
Bricklink | 25,797
Chronicling America | 10,000
CiteSeer (RKBExplorer) | 1
Classical (DBtune) | 3,000
Climbing | 300
cnr.it | 34,706
CORDIS | 285,256
CORDIS (RKBExplorer) | 16
Courseware (RKBExplorer) | 41
DailyMed | 2,552
DataGovIE | 70
Datos.bcn.cl | 568
datos.bne.es | 36,431
DBLP (FU Berlin) | 100,000
DBLP (RKBExplorer) | 2
DBpedia in Portuguese | 365,839
dbpedia lite | 10,000,000
DBTropes | 6,000
Didactalia (GNOSS) | 8,824
Discogs in RDF | 5,169
Diseasome | 1,943
DrugBank | 4,220
EARTh | 1,862
ECCO-TCP Eighteenth Century Texts Linked Data | 50
ECS Southampton (RKBExplorer) | 11
education.data.gov.uk | 1,697
El Viajero's tourism dataset | 3,093
Enipedia – Energy Industry Data | 1,365
ERA (RKBExplorer) | 543
ESD standards | 25
EU: fintrans.publicdata.eu | 199,168
EUNIS | 5,683
EURES | 2,146
Europeana | 1,304
Eurostat (FU Berlin) | 129
Eurostat (OntologyCentral) | 45
EUTC Productions | 166
EventMedia | 15,420
FAO geopolitical ontology | 195
FAO LD | 673
farmers-markets-geographic-data-united-states | 52
Finnish Municipalities | 336
Fishes of Texas | 15,241
flickr wrappr | 3,400,000
Freebase | 3,348,530
GBA Thesaurus | 100
GEMET | 3,005
GeoLinkedData | 51
GeoSpecies Knowledge Base | 11,805
GESIS | 5,024
gnoss.com | 506
Goodwin Family | 500
GoogleArt wrapper | 1,632
GovTrack | 470
GovWILD | 5,845
Greek DBpedia | 45,000
GTAA | 25,844
Hellenic FBD | 104,117
Hellenic PD | 21,916
Institutions and Bodies of the European Union | 154
ISTAT Immigration (LinkedOpenData.it) | 319
Italian Museums | 2,894
John Peel (DBtune) | 1,143
Klappstuhlclub | 50
Last.FM (rdfize) | 23,000
Lexvo | 2,577
LIBRIS | 4,669
Lichfield District Council – Spending | 17
lingvoj | 215
Linked Clean Energy Data (reegle.info) | 330
Linked Crunchbase (OntologyCentral) | 80
LinkedCT | 25,476
Linked EDGAR (OntologyCentral) | 50
LinkedGeoData | 53,024
LinkedLCCN | 10,911
LinkedMDB | 30,354
Linked Open Colors | 16,000,000
Linked Open Numbers | 320
lobid-organisations | 3,520
lobid-Resources | 5,794
lod.sztaki.hu | 13,034
LODE | 10,000
Lotico | 65
Magnatune (DBtune) | 233
MARC Codes List | 599
meducator | 932
morelab | 38
Mortality (EnAKTing) | 5
Moseley Folk | 18
MusicBrainz (Data Incubator) | 76,171
MusicBrainz (DBTune) | 64,000
myExperiment | 2,586
My Family Lineage | 2,254
NASA (Data Incubator) | 61
New York Times | 10,359
Nomenclator Asturias 2010 | 78,859
Norwegian Medical Subject Headings (MeSH) | 316
NSF (RKBExplorer) | 1
NSZL Catalog | 6,285
NVD | 502
Ocean Drilling – Codices | 3,022
Ontos News Portal | 6,935
OpenCalais | 1,000
Open Corporates | 500
OpenData Thesaurus | 50
OpenEI.org | 52,546
Open Election Data Project | 87
Open Library (Talis) | 1,633
Openly Local | 400
Organisation for Economic Co-operation and Development (OECD) Linked Data | 2,613
OS (RKBExplorer) | 156
P20 | 2,500
PBAC | 1,607
Pleiades | 127
Pokedex (Data Incubator) | 493
Poképédia | 493
Polythematic Structured Subject Heading System | 3,000
ProductDB | 193
Product Types Ontology | 300,000
Public Library of Veroia | 4,197
radatana | 30,346
RAE2001 (RKBExplorer) | 1
RDFohloh | 1,000
Rechtspraak.nl | 575
reference.data.gov.uk | 22
research.data.gov.uk | 3
RESEX (RKBExplorer.com) | 11
Revyu | 29
Scholarometer | 1,000
sears.com | 100
SEC (rdfabout) | 86
Semantic CrunchBase | 250
Semantic XBRL | 63
SIDER | 2,126
smcjournals | 11
Source Code Ecosystem Linked Data | 2,100
SSW Thesaurus | 300
STITCH | 123
STW | 3,000
Surge Radio | 1,000
TaxonConcept | 147,877
TCMGeneDIT Dataset | 1,400
Telegraphis | 651
Thesaurus W | 627
The View From | 31
totl.net | 500
Transparency International LD | 183
transport.data.gov.uk | 3,768
Turismo de Zaragoza | 5,469
Twarql | 981,415
TWC LOGD | 2,039
Uberblic.org | 1,196
UK Legislation | 33
UMBEL | 257
UN/LOCODE (RKBExplorer) | 240
URIBurner | 1,000
VIAF | 10,000
VIVO Cornell | 58
VIVO Indiana | 58
VIVO UF | 58
Weather Stations | 1,123
Wiki (RKBExplorer) | 19
WordNet (RKBExplorer) | 38
World Bank LD | 380
YAGO | 2,625,671
Yahoo Geoplanet RDF | 248
yovisto | 300
Zhishi.me | 193,000


 

Information

Last Modification: 2014-09-09 10:32:11 by Daniel Fleischhacker