Datasets

Datasets 

description & objective

 

alias

 

 

alias nmw

 

 

anchor text

 

Texts used in links to refer to Wikipedia articles from other Wikipedia articles.

 

article categories

 

Links from concepts to categories using the SKOS vocabulary.

 

article templates

 

Templates used in an article (top-level).

 

category labels

 

Labels for categories.

 

citation data

 

Raw data extracted from Wikipedia citation templates.

 

citation links

 

Links from a citation to a DBpedia article using the dbp:isCitedBy property.

 

description

 

 

description nmw

 

 

disambiguations

 

Links extracted from Wikipedia disambiguation pages. Since Wikipedia has no syntax to distinguish disambiguation links from ordinary links, DBpedia has to use heuristics. This dataset has Wikipedia redirects resolved.

 

disambiguations unredirected

 

Links extracted from Wikipedia disambiguation pages. Since Wikipedia has no syntax to distinguish disambiguation links from ordinary links, DBpedia has to use heuristics.

 

external links

 

Links to external web pages about a concept.

 

file information

 

 

freebase links

 

This file contains the back-links (owl:sameAs) to the Freebase dataset.

 

french population

 

French dataset about population.

 

genders

 

Dataset that attempts to identify the gender of a resource.

 

geo coordinates

 

Geographic coordinates extracted from Wikipedia.

 

geo coordinates mappingbased

 

Geographic coordinates extracted from Wikipedia originating from mapped infoboxes in the mappings wiki.

 

homepages

 

Links to homepages of persons, organizations, etc.

 

image annotations

 

Annotations of image regions.

 

image galleries

 

An image gallery for a resource.

 

images

 

Main image and corresponding thumbnail from the Wikipedia article.

 

infobox properties

 

Information that has been extracted from Wikipedia infoboxes. Note that this data is in the less clean 'property' namespace. The mapping-based properties in the 'ontology' namespace should always be preferred over this data.

 

infobox properties unredirected

 

Information that has been extracted from Wikipedia infoboxes. This variant does not have Wikipedia redirects resolved. Note that this data is in the less clean 'property' namespace. The mapping-based properties in the 'ontology' namespace should always be preferred over this data.

 

infobox property definitions

 

All properties (predicates) used in infoboxes.

 

infobox test

 

 

instance types

 

Contains triples of the form $object rdf:type $class from the mapping-based extraction.

 

instance types sdtyped dbo

 

The SDType heuristic infers probable type information from noisy data in large, cross-domain databases. This dataset holds its results for DBpedia, restricted to inferred types that have an equivalent in the DBpedia ontology. It supplements the regularly extracted instance types.

 

instance types transitive

 

Contains transitive triples of the form $object rdf:type $class, derived from the class hierarchy of the DBpedia ontology.
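A minimal sketch of how transitive types could be derived from a direct type and a subclass hierarchy. The function name `ancestor_types` and the hierarchy fragment are illustrative assumptions, not the extraction framework's actual code. Whether the dataset itself repeats the direct type is not stated here, so this sketch returns ancestors only.

```python
def ancestor_types(direct_type, subclass_of):
    """Return all superclasses of direct_type, nearest first.

    subclass_of maps a class to its single parent class (assumption:
    one parent per class, as in a simple tree-shaped hierarchy).
    """
    ancestors = []
    seen = {direct_type}  # guards against accidental cycles
    current = subclass_of.get(direct_type)
    while current is not None and current not in seen:
        seen.add(current)
        ancestors.append(current)
        current = subclass_of.get(current)
    return ancestors


# Illustrative fragment of a class hierarchy (child -> parent).
SUBCLASS_OF = {
    "dbo:City": "dbo:Settlement",
    "dbo:Settlement": "dbo:PopulatedPlace",
    "dbo:PopulatedPlace": "dbo:Place",
}

for cls in ancestor_types("dbo:City", SUBCLASS_OF):
    # Emit one transitive type statement per ancestor class.
    print(f"$object rdf:type {cls}")
```

A resource typed directly as the first class would thus receive one additional rdf:type statement for each class above it in the hierarchy.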

 

interlanguage links

 

Dataset linking a DBpedia resource to the same resource in other languages and in Wikidata.

 

kml files

 

Description of KML files from Commons.

 

labels

 

Titles of all Wikipedia articles in the corresponding language.

 

labels nmw

 

 

long abstracts

 

Full abstracts of Wikipedia articles, usually the first section.

 

mappingbased literals

 

High-quality data extracted from infoboxes using the mapping-based extraction (literal properties only). The predicates in this dataset are in the 'ontology' namespace. Note that this data is of much higher quality than the raw infobox properties in the 'property' namespace.

 

mappingbased objects

 

High-quality data extracted from infoboxes using the mapping-based extraction (object properties only). The predicates in this dataset are in the 'ontology' namespace. Note that this data is of much higher quality than the raw infobox properties in the 'property' namespace.

 

mappingbased objects disjoint domain

 

Errors detected in the mapping-based properties (disjoint domain).

 

mappingbased objects disjoint range

 

Errors detected in the mapping-based properties (disjoint range).

 

mappingbased properties reified

 

 

mappingbased properties reified qualifiers

 

 

ontology subclassof

 

 

out degree

 

Number of links emerging from a Wikipedia article and pointing to another Wikipedia article.

 

page ids

 

Dataset linking a DBpedia resource to the page ID of the Wikipedia article the data was extracted from.

 

page length

 

Number of characters contained in a Wikipedia article's source.

 

page links

 

Dataset containing internal links between DBpedia instances, created from the internal links between Wikipedia articles. The dataset might be useful for structural analysis, data mining, or for ranking DBpedia instances using PageRank or similar algorithms.
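As an illustration of the ranking use case, here is a minimal PageRank sketch using power iteration over an in-memory link graph. The function name, the damping factor of 0.85, the iteration count, and the tiny example graph are all assumptions chosen for illustration. A real run over the page links dataset would first need to parse the dump into such a mapping.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Compute PageRank scores for a graph given as {source: [targets]}."""
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for source, targets in links.items():
            if not targets:
                continue
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical link graph between four resources.
LINKS = {
    "dbr:A": ["dbr:B", "dbr:C"],
    "dbr:B": ["dbr:C"],
    "dbr:C": ["dbr:A"],
    "dbr:D": [],
}

scores = pagerank(LINKS)
for node, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(node, round(score, 4))
```

The scores sum to 1, so they can be read as a probability distribution over resources.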

 

page links unredirected

 

Dataset containing internal links between DBpedia instances, created from the internal links between Wikipedia articles. This variant does not have Wikipedia redirects resolved. The dataset might be useful for structural analysis, data mining, or for ranking DBpedia instances using PageRank or similar algorithms.

 

persondata

 

Information about persons (date and place of birth, etc.), extracted from the English and German Wikipedia and represented using the FOAF vocabulary.

 

persondata unredirected

 

Information about persons (date and place of birth, etc.), extracted from the English and German Wikipedia and represented using the FOAF vocabulary. This variant does not have Wikipedia redirects resolved.

 

pnd

 

 

properties

 

 

raw

 

 

raw reified

 

 

raw reified qualifiers

 

 

raw unredirected

 

 

redirects

 

Dataset containing redirects between articles in Wikipedia.

 

references

 

 

revision ids

 

Dataset linking a DBpedia resource to the revision ID of the Wikipedia article the data was extracted from.

 

revision meta

 

Dataset containing additional revision information.

 

revision uris

 

Dataset linking a DBpedia resource to the specific Wikipedia article revision used in this DBpedia release.

 

sameas all wikis

 

 

sameas external

 

 

sameas wikidata

 

 

short abstracts

 

Short abstracts (max. 500 characters long) of Wikipedia articles.

 

skos categories

 

Information about which concepts are categories and how categories are related, using the SKOS vocabulary.

 

specific mappingbased properties

 

Infobox data from the mapping-based extraction, using units of measurement more convenient for the resource type, e.g. square kilometres instead of square metres for the area of a city.

 

template parameters

 

Dataset describing names of template parameters.

 

topical concepts

 

Resources that describe a category.

 

topical concepts unredirected

 

Resources that describe a category. This variant does not have Wikipedia redirects resolved.

 

transitive redirects

 

Dataset containing transitively resolved redirects between articles in Wikipedia.
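A minimal sketch of what transitive resolution means: each redirect source is followed through the chain until a page that is not itself a redirect is reached. The function name and the sample data are hypothetical, and skipping redirect cycles is an assumption of this sketch, not a documented behaviour of the extraction.

```python
def resolve_redirects(redirects):
    """Map every redirect source to the final, non-redirect target.

    redirects maps a page to the page it redirects to. Sources caught
    in a redirect cycle are left out of the result (assumption).
    """
    resolved = {}
    for source, target in redirects.items():
        seen = {source}
        # Follow the chain until it ends or loops back on itself.
        while target in redirects and target not in seen:
            seen.add(target)
            target = redirects[target]
        if target not in redirects:
            resolved[source] = target
    return resolved


# Hypothetical redirect pairs, including a two-step chain and a cycle.
REDIRECTS = {
    "A": "B",
    "B": "C",
    "X": "Y",
    "Y": "X",
}

print(resolve_redirects(REDIRECTS))
```

In the sample data, both pages of the chain end up pointing directly at its last page, while the two pages that redirect to each other are dropped.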

 

uri same as iri

 

The owl:sameAs links between the IRI and URI format of DBpedia resources. Only extracted when IRI and URI are actually different.
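The "only when different" rule can be sketched as follows: percent-encode the non-ASCII characters of an IRI to obtain its URI form, and emit a link only if the two strings differ. The function names, the set of characters left unescaped, the direction of the link, and the example resource names are assumptions for illustration. The exact encoding rules used by DBpedia are not given here.

```python
from urllib.parse import quote

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Characters kept as-is in addition to letters, digits and "_.-~"
# (assumption: URI reserved characters and existing escapes stay untouched).
SAFE_CHARS = ":/?#[]@!$&'()*+,;=%"


def iri_to_uri(iri):
    """Percent-encode characters outside the safe set as UTF-8 bytes."""
    return quote(iri, safe=SAFE_CHARS)


def same_as_triple(iri):
    """Return an N-Triples style line, or None if IRI and URI are equal."""
    uri = iri_to_uri(iri)
    if uri == iri:
        return None
    return f"<{iri}> <{OWL_SAME_AS}> <{uri}> ."


print(same_as_triple("http://dbpedia.org/resource/Zürich"))
print(same_as_triple("http://dbpedia.org/resource/Berlin"))
```

An identifier made up only of ASCII characters yields no link, which mirrors the condition described above.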

 

wikidata duplicate iri split

 

 

wikidata r2r mapping errors

 

 

wikipedia links

 

Dataset linking a DBpedia resource to the corresponding article in Wikipedia.