Facts & Figures

The English version of the DBpedia knowledge base currently describes 4.58 million things, out of which 4.22 million are classified in a consistent ontology (see here), including 1,445,000 persons, 735,000 places (including 478,000 populated places), 411,000 creative works (including 123,000 music albums, 87,000 films and 19,000 video games), 241,000 organizations (including 58,000 companies and 49,000 educational institutions), 251,000 species and 6,000 diseases.

We provide localized versions of DBpedia in 125 languages. All these versions together describe 38.3 million things, out of which 23.8 million are localized descriptions of things that also exist in the English version of DBpedia. The full DBpedia data set features 38 million labels and abstracts in 125 different languages, 25.2 million links to images and 29.8 million links to external web pages; 80.9 million links to Wikipedia categories, and 41.2 million links to YAGO categories. DBpedia is connected with other Linked Datasets by around 50 million RDF links.

Altogether the DBpedia 2014 release consists of 3 billion pieces of information (RDF triples) out of which 580 million were extracted from the English edition of Wikipedia, 2.46 billion were extracted from other language editions.

Detailed statistics about the DBpedia data sets in 28 popular languages are provided at Dataset Statistics page.

You can download the new DBpedia datasets from the download page.  As usual, the new dataset is also available as Linked Data and via the DBpedia SPARQL endpoint from here.

More information about DBpedia is found here as well as in the new overview article about the project or take a look at our the dbpedia2pager.

Credits

Lots of thanks to:

  1. Daniel Fleischhacker (University of Mannheim) and Volha Bryl (University of Mannheim) for improving the DBpedia extraction framework, for extracting the DBpedia 2014 data sets for all 125 languages, for generating the updated RDF links to external data sets, and for generating the statistics about the new release.

  2. All editors that contributed to the DBpedia ontology mappings via the Mappings Wiki.

  3. The whole DBpedia Internationalization Committee for pushing the DBpedia internationalization forward.

  4. Dimitris Kontokostas (University of Leipzig) for improving the DBpedia extraction framework and loading the new release onto the DBpedia download server in Leipzig.

  5. Heiko Paulheim (University of Mannheim) for re-running his algorithm to generate additional type statements for formerly untyped resources and identify and removed wrong statements.

  6. Petar Ristoski (University of Mannheim) for generating the updated links pointing at the GADM database of Global Administrative Areas. Petar will also generate an updated release of DBpedia as Tables soon.

  7. Aldo Gangemi (LIPN University, France & ISTC-CNR, Italy) for providing the links from DOLCE to DBpedia ontology.

  8. Kingsley Idehen, Patrick van Kleef, and Mitko Iliev (all OpenLink Software) for loading the new data set into the Virtuoso instance that serves the Linked Data view and SPARQL endpoint.

  9. OpenLink Software altogether for providing the server infrastructure for DBpedia.

  10. Michael Moore (University of Waterloo, as an intern at the University of Mannheim) for implementing the anchor text extractor and contribution to the statistics scripts.

  11. Ali Ismayilov (University of Bonn) for implementing Wikidata extraction, on which the interlanguage link generation was based.

  12. Gaurav Vaidya (University of Colorado Boulder) for implementing and running Wikimedia Commons extraction.

  13. Andrea Di Menna, Jona Christopher Sahnwaldt, Julien Cojan, Julien Plu, Nilesh Chakraborty and others who contributed improvements to the DBpedia extraction framework via the source code repository on GitHub.

  14. All GSoC mentors and students for working directly or indirectly on this release.