Accessing the DBpedia Data Set over the Web
The DBpedia data set can be accessed online via a SPARQL query endpoint and as Linked Data.
- 1 Querying DBpedia
- 1.1 Public SPARQL Endpoint
- 1.2 Public Faceted Web Service Interface
- 1.3 Demo Query Script for Text Search on Virtuoso
- 1.4 Example queries displayed with the Berlin SNORQL query explorer
- 1.5 Examples rendering DBpedia Data with Fluidops Information Workbench
- 1.6 Examples rendering DBpedia Data with Google Map
- 1.7 Example displaying DBpedia Data with Exhibit
- 1.8 Example displaying DBpedia Data with gFacet
- 2 Linked Data
- 3 Web Traffic on DBpedia
- 4 Semantic Web Crawling Sitemap
The DBpedia data set makes it possible to run sophisticated structured queries against data derived from Wikipedia.
There is a public SPARQL endpoint over the DBpedia data set at http://dbpedia.org/sparql. The endpoint is provided using OpenLink Virtuoso as both the back-end database engine and the HTTP/SPARQL server.
There is a list of all DBpedia data sets that are currently loaded into the SPARQL endpoint. This may not always include all available DBpedia data sets.
You can run queries against DBpedia using:
- the Leipzig query builder at http://querybuilder.dbpedia.org;
- the OpenLink Interactive SPARQL Query Builder (iSPARQL) at http://dbpedia.org/isparql;
- the SNORQL query explorer at http://dbpedia.org/snorql (does not work with Internet Explorer); or
- any other SPARQL-aware client.
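For readers who want to use a generic SPARQL-aware client, the following is a minimal sketch of one in Python, using only the standard library. It assumes the endpoint accepts the query as a `query` URL parameter (as the SPARQL Protocol describes) and returns the SPARQL JSON results format on request. The `format` parameter and the helper names `build_query_url`, `parse_bindings`, and `run_query` are assumptions made for illustration, not part of DBpedia itself.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://dbpedia.org/sparql"  # endpoint URL given in the text
JSON_RESULTS = "application/sparql-results+json"


def build_query_url(query, endpoint=ENDPOINT):
    """Encode a SPARQL query as a GET request URL."""
    params = urllib.parse.urlencode({
        "query": query,
        # Assumption: the server honours a Virtuoso-style 'format' parameter.
        "format": JSON_RESULTS,
    })
    return endpoint + "?" + params


def parse_bindings(payload):
    """Flatten a SPARQL JSON results document into a list of plain dicts."""
    doc = json.loads(payload)
    return [
        {var: cell["value"] for var, cell in row.items()}
        for row in doc["results"]["bindings"]
    ]


def run_query(query, endpoint=ENDPOINT, timeout=30):
    """Send the query over HTTP and return the parsed rows (needs network access)."""
    request = urllib.request.Request(
        build_query_url(query, endpoint),
        headers={"Accept": JSON_RESULTS},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_bindings(response.read().decode("utf-8"))


# Illustrative query: the English label of one resource.
EXAMPLE_QUERY = """
SELECT ?label WHERE {
  <http://dbpedia.org/resource/Berlin>
      <http://www.w3.org/2000/01/rdf-schema#label> ?label .
  FILTER (lang(?label) = "en")
}
"""
```

Calling `run_query(EXAMPLE_QUERY)` would return one dict per result row, keyed by variable name. Keeping URL construction and result parsing in separate functions makes both easy to check without contacting the server.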
There is a public Faceted Browser "search and find" user interface at http://dbpedia.org/fct, and a corresponding faceted web service over the DBpedia data set at http://dbpedia.org/fct/service. Usage details can be found in the Virtuoso Facets Web Service documentation.
We published a simple script that was developed as a software study before work on RelFinder started.
It should help you get familiar with combining SPARQL and string search on a Virtuoso server that hosts DBpedia.
The demo is deployed here and you can find the source code here.
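To give a flavour of what combining SPARQL with string search can look like, here is a small sketch that builds such a query in Python. It is not the demo script mentioned above. It assumes the server supports Virtuoso's `bif:contains` full-text predicate and that labels are stored under `rdfs:label`; the function names are hypothetical.

```python
def quote_text_pattern(word):
    """Wrap one search word in the single quotes used inside a bif:contains pattern."""
    if "'" in word or '"' in word or "\\" in word:
        raise ValueError("quotes and backslashes are not supported in this sketch")
    return "'" + word + "'"


def build_text_search_query(words, limit=20):
    """Build a query for resources whose rdfs:label contains all of the given words."""
    if not words:
        raise ValueError("at least one search word is required")
    # Assumption: Virtuoso full-text patterns combine quoted words with AND.
    pattern = " AND ".join(quote_text_pattern(w) for w in words)
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT DISTINCT ?resource ?label WHERE {\n"
        "  ?resource rdfs:label ?label .\n"
        f'  ?label bif:contains "{pattern}" .\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )
```

For example, `build_text_search_query(["Berlin", "Wall"], limit=5)` yields a query that can be pasted into any SPARQL client pointed at a Virtuoso-hosted endpoint. Rejecting quote characters is a deliberately blunt way to avoid malformed patterns; a production script would need proper escaping.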
- People who were born in Berlin before 1900
- German musicians with German and English descriptions
- Musicians who were born in Berlin
- Persons by birthplace (in French; does not work with Internet Explorer). As of 2009-11-09 this demo no longer appears to work, and the link may be removed in the future.
- gFacet is a new approach for browsing RDF data that combines graph-based visualization with faceted filtering techniques. A demo for DBpedia and other Linked Data resources is available online: http://www.visualdataweb.org/gfacet.php
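As an illustration of the kind of query behind an example such as "People who were born in Berlin before 1900", the sketch below assembles one in Python. The property names `dbo:birthPlace`, `dbo:birthDate`, and `foaf:name` are assumptions about the vocabulary in use, and the function `born_in_before` is hypothetical; the queries linked from the examples may differ.

```python
import datetime
import re

PREFIXES = """\
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
"""


def born_in_before(city, cutoff):
    """Build a query for people born in `city` before the date `cutoff`."""
    # Only simple local names are accepted so the prefixed name stays valid.
    if not re.fullmatch(r"[A-Za-z0-9_]+", city):
        raise ValueError("city must be a simple resource name such as 'Berlin'")
    return PREFIXES + f"""SELECT ?person ?name ?birth WHERE {{
  ?person dbo:birthPlace dbr:{city} ;
          dbo:birthDate ?birth ;
          foaf:name ?name .
  FILTER (?birth < "{cutoff.isoformat()}"^^xsd:date)
}}
ORDER BY ?name
"""


query = born_in_before("Berlin", datetime.date(1900, 1, 1))
```

The resulting text can be pasted into a query explorer or sent to the endpoint by any SPARQL client.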
Linked Data is a method of publishing RDF data on the Web and of interlinking data between different data sources.
Linked Data on the Web can be accessed using Semantic Web browsers, just as the traditional Web of documents is accessed using HTML browsers. However, instead of blindly following nondescript links between HTML pages, Semantic Web browsers enable users to navigate between different data sources by following self-described RDF links. This allows the user to start off at one data source, and then move through a potentially endless Web of data sources connected by RDF links. It also allows the robots of Semantic Web search engines to follow these links to crawl the Semantic Web.
The DBpedia data set is served as Linked Data, meaning that all DBpedia URIs are dereferenceable.
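In practice, dereferencing means that an HTTP client can request a DBpedia URI and ask for RDF rather than HTML through the `Accept` header. The sketch below prepares such a request in Python. The `/page/` and `/data/` document URIs reflect a commonly described DBpedia convention and are treated here as an assumption, as are the helper names.

```python
import urllib.request

RESOURCE_PREFIX = "http://dbpedia.org/resource/"


def representation_uris(resource_uri):
    """Return the HTML and RDF document URIs assumed to describe a resource URI."""
    if not resource_uri.startswith(RESOURCE_PREFIX):
        raise ValueError("not a DBpedia resource URI: " + resource_uri)
    local_name = resource_uri[len(RESOURCE_PREFIX):]
    return {
        "html": "http://dbpedia.org/page/" + local_name,  # assumed convention
        "rdf": "http://dbpedia.org/data/" + local_name,   # assumed convention
    }


def rdf_request(resource_uri, media_type="application/rdf+xml"):
    """Prepare (but do not send) a GET request asking for an RDF representation."""
    return urllib.request.Request(resource_uri, headers={"Accept": media_type})


def fetch_rdf(resource_uri, timeout=30):
    """Dereference the URI, following redirects (needs network access)."""
    with urllib.request.urlopen(rdf_request(resource_uri), timeout=timeout) as response:
        # geturl() reports the final URL after any redirect.
        return response.geturl(), response.read()
```

Calling `fetch_rdf("http://dbpedia.org/resource/The_Beatles")` would return the final document URL together with the raw RDF bytes, which can then be handed to an RDF parser.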
Some example Linked Data URIs from the DBpedia data set are listed below. To start surfing the Semantic Web, enter any of these URIs into the navigation bar of one of the Semantic Web browsers named in the table.
|Resource||in OpenLink Data Explorer||in Fluidops IWB||in DISCO||in Marbles||in Tabulator|
|The Lord of the Rings||View||View||View (broken)||View (broken)||View (broken)|
|The Beatles||View||View||View (broken)||View (broken)||View (broken)|
The WebSci'2010 paper Learning from Linked Open Data Usage: Patterns & Metrics reports on an analysis of DBpedia log files covering 2009-06-30 to 2009-10-25 (118 days, almost 4 months). According to this analysis, the average daily traffic for DBpedia URIs and for the SPARQL endpoint during that period was:
- DBpedia URIs: 561,277 hits per day
- DBpedia SPARQL endpoint: 177,734 queries per day
Semantic Web Crawling: a Sitemap Extension defines an extension for the Sitemap protocol targeted at the efficient discovery and use of RDF data. Data publishers can state where RDF is located and provide alternative means to access it. Semantic Web clients and Semantic Web crawlers can use this information to access required RDF data in the most efficient way for the task they have to perform.
The DBpedia project supports this sitemap extension. The DBpedia sitemap, pointing at the SPARQL endpoint, the downloads, and some example instances, is found here. There is also a VoID description of the DBpedia datasets.
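To show how a client might use such a sitemap, the sketch below parses a small, made-up sitemap document and extracts the access points for each data set. The `sc:` namespace URI and element names follow the Semantic Sitemap extension as best recalled and should be treated as assumptions; the sample content uses `example.org` placeholders and is not the actual DBpedia sitemap.

```python
import xml.etree.ElementTree as ET

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    # Assumption: namespace of the Semantic Sitemap extension.
    "sc": "http://sw.deri.org/2007/07/sitemapextension/scschema.xsd",
}

# Illustrative sample only; all URLs are placeholders.
SAMPLE_SITEMAP = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:sc="http://sw.deri.org/2007/07/sitemapextension/scschema.xsd">
  <sc:dataset>
    <sc:datasetLabel>Example data set</sc:datasetLabel>
    <sc:linkedDataPrefix>http://example.org/resource/</sc:linkedDataPrefix>
    <sc:sampleURI>http://example.org/resource/Thing</sc:sampleURI>
    <sc:sparqlEndpointLocation>http://example.org/sparql</sc:sparqlEndpointLocation>
    <sc:dataDumpLocation>http://example.org/dumps/all.nt.gz</sc:dataDumpLocation>
  </sc:dataset>
</urlset>
"""


def _texts(parent, tag):
    """Collect the stripped text of every child element with the given tag."""
    return [(el.text or "").strip() for el in parent.findall(tag, NS)]


def read_datasets(sitemap_xml):
    """Return one dict per sc:dataset entry, listing its access points."""
    root = ET.fromstring(sitemap_xml)
    return [
        {
            "label": dataset.findtext("sc:datasetLabel", default="", namespaces=NS),
            "sparql_endpoints": _texts(dataset, "sc:sparqlEndpointLocation"),
            "data_dumps": _texts(dataset, "sc:dataDumpLocation"),
            "sample_uris": _texts(dataset, "sc:sampleURI"),
        }
        for dataset in root.findall("sc:dataset", NS)
    ]


datasets = read_datasets(SAMPLE_SITEMAP)
```

A crawler could then decide, per data set, whether to download a dump, query the SPARQL endpoint, or dereference individual URIs, depending on how much of the data it needs.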