About

DBpedia is a crowd-sourced community effort to extract structured information from Wikipedia and make this information available on the Web. DBpedia allows you to ask sophisticated queries against Wikipedia, and to link other data sets on the Web to Wikipedia data. We hope that this work will make it easier to use the huge amount of information in Wikipedia in new and interesting ways. Furthermore, it might inspire new mechanisms for navigating, linking, and improving the encyclopedia itself.

The DBpedia Knowledge Base

Knowledge bases are playing an increasingly important role in enhancing the intelligence of Web and enterprise search and in supporting information integration. Today, most knowledge bases cover only specific domains, are created by relatively small groups of knowledge engineers, and are very costly to keep up to date as domains change. At the same time, Wikipedia has grown into one of the central knowledge sources of mankind, maintained by thousands of contributors.

The DBpedia project leverages this gigantic source of knowledge by extracting structured information from Wikipedia and by making this information accessible on the Web under the terms of the Creative Commons Attribution-ShareAlike 3.0 License and the GNU Free Documentation License.

The English version of the DBpedia knowledge base

The English version of the DBpedia knowledge base describes 4.58 million things, out of which 4.22 million are classified in a consistent ontology, including 1,445,000 persons, 735,000 places (including 478,000 populated places), 411,000 creative works (including 123,000 music albums, 87,000 films and 19,000 video games), 241,000 organizations (including 58,000 companies and 49,000 educational institutions), 251,000 species and 6,000 diseases.

In addition, we provide localized versions of DBpedia in 125 languages. All these versions together describe 38.3 million things, out of which 23.8 million are localized descriptions of things that also exist in the English version of DBpedia. The full DBpedia data set features 38 million labels and abstracts in 125 different languages, 25.2 million links to images, 29.8 million links to external web pages, 80.9 million links to Wikipedia categories, and 41.2 million links to YAGO categories. DBpedia is connected with other Linked Datasets by around 50 million RDF links. Altogether, the DBpedia 2014 release consists of 3 billion pieces of information (RDF triples), out of which 580 million were extracted from the English edition of Wikipedia and 2.46 billion were extracted from other language editions. Detailed statistics about the DBpedia datasets in 24 popular languages are provided at Dataset Statistics.

The DBpedia knowledge base has several advantages over existing knowledge bases: it covers many domains, it represents real community agreement, it automatically evolves as Wikipedia changes, and it is truly multilingual. The DBpedia knowledge base allows you to ask quite surprising queries against Wikipedia, for instance “Give me all cities in New Jersey with more than 10,000 inhabitants” or “Give me all Italian musicians from the 18th century”. Altogether, the use cases of the DBpedia knowledge base are widespread, ranging from enterprise knowledge management and Web search to revolutionizing Wikipedia search.
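As a rough illustration, the first example question above could be phrased as a SPARQL query along the following lines. This is only a sketch: the class and property names (dbo:City, dbo:isPartOf, dbo:populationTotal) and the helper build_city_query are assumptions chosen for illustration, and the modelling of a given city in the data may differ, so results can be incomplete.

```python
def build_city_query(state_resource: str, min_population: int, limit: int = 50) -> str:
    """Build a SPARQL query for cities in a state above a population threshold.

    Hypothetical helper: the dbo:/dbr: terms below are assumed to match the
    DBpedia ontology and may need adjusting for a particular release.
    """
    # Allow only simple resource names so the value can be spliced into the query.
    if not state_resource.replace("_", "").isalnum():
        raise ValueError(f"unexpected resource name: {state_resource!r}")
    return f"""\
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>

SELECT ?city ?population WHERE {{
  ?city a dbo:City ;
        dbo:isPartOf dbr:{state_resource} ;
        dbo:populationTotal ?population .
  FILTER (?population > {int(min_population)})
}}
ORDER BY DESC(?population)
LIMIT {int(limit)}
"""


# "Give me all cities in New Jersey with more than 10,000 inhabitants"
QUERY = build_city_query("New_Jersey", 10000)

if __name__ == "__main__":
    print(QUERY)
```

The query text can be pasted into any SPARQL client pointed at a DBpedia endpoint. The LIMIT clause is there only to keep exploratory result sets small.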

The DBpedia Data Provision Architecture

The DBpedia RDF Data Set is hosted and published using OpenLink Virtuoso. The Virtuoso infrastructure provides access to DBpedia's RDF data via a SPARQL endpoint, and it also serves HTML or RDF representations of DBpedia resources in response to standard HTTP GET requests from any Web client.
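The two access paths described above can be sketched with the Python standard library. The endpoint URL https://dbpedia.org/sparql, the resource URI pattern http://dbpedia.org/resource/NAME, and the helper names are assumptions made for illustration. The requests are only constructed here, and fetch sends one when called explicitly.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

# Assumed public endpoint and resource namespace; adjust for your deployment.
SPARQL_ENDPOINT = "https://dbpedia.org/sparql"
RESOURCE_BASE = "http://dbpedia.org/resource/"


def sparql_request(query: str) -> Request:
    """Build a GET request that sends a query to the SPARQL endpoint."""
    params = urlencode({"query": query})
    return Request(
        f"{SPARQL_ENDPOINT}?{params}",
        headers={"Accept": "application/sparql-results+json"},
    )


def resource_request(name: str, rdf: bool = True) -> Request:
    """Build a GET request for a resource, asking for RDF (Turtle) or HTML."""
    accept = "text/turtle" if rdf else "text/html"
    return Request(f"{RESOURCE_BASE}{quote(name)}", headers={"Accept": accept})


def fetch(request: Request, timeout: float = 30.0) -> bytes:
    """Send the request and return the raw response body (needs network access)."""
    with urlopen(request, timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    req = sparql_request("SELECT * WHERE { ?s ?p ?o } LIMIT 1")
    print(req.full_url)
    print(resource_request("Berlin").get_header("Accept"))
```

The Accept header is what selects between the HTML and RDF representations of the same resource URI, so a browser and an RDF client can both use the same address.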

Illustration of Current DBpedia Data Provision Architecture

Architectural Evolution

Though the DBpedia RDF Data has always been housed in Virtuoso, which has supported all desired means of access since the DBpedia project began, early DBpedia releases used Pubby Linked Data Deployment services in front of the Virtuoso SPARQL endpoint.

As the project gained traction, the HTTP demands on Pubby's out-of-process Linked Data Publishing services increased, and the natural option was to take advantage of Virtuoso's SPASQL (SPARQL inside SQL) and other Linked Data Deployment features, by moving these services in-process with Virtuoso.

Illustration of Deprecated Architecture

Learn about DBpedia

If you like what our project does but are still new to DBpedia, there are a few articles that can help you get started:

About DBpedia internationalization: