ChangeLog
DBpedia 3.3 (07/2009)
- update to MediaWiki dumps generated in May 2009
- more accurate abstract extraction
- labels and abstracts in 80 languages (see http://downloads.dbpedia.org/3.3/)
- infobox extraction bugfixes
- new links to Dailymed, Diseasome, Drugbank, Sider, TCM
- updated OpenCyc links
DBpedia 3.2 (11/2008)
- update to MediaWiki dumps generated in October 2008
- new DBpedia Ontology introduced.
- new infobox extraction framework to populate the ontology.
- added initial infobox to ontology mappings.
- datatype extraction code improved.
- abstract extraction code improved.
- new Freebase to DBpedia RDF links.
- updated OpenCyc to DBpedia RDF links.
DBpedia 3.1 (08/2008)
- update to MediaWiki dumps generated in June 2008
- YAGO mapping improvements:
- YAGO itself has improved its algorithms
- the DBpedia-YAGO mapping is now generated by a YAGO converter and should have much better quality compared to the previous release
- Geo Extractor improvements:
- Geo-coordinates are now also provided in the W3C Geospatial Vocabulary using GeoRSS Simple encoding. The Basic Geo (WGS84 lat/long) Vocabulary is still supported due to its ease of use in SPARQL queries.
- International wikis are now included in the geo-coordinate extraction
- Support for further geo templates such as the proposed 'Coordinate' format
- Bugs fixed:
- #1871653 Too long URIs by infoboxes extractor cause import problems
- #1964434 illegal \ char in URIs
- #1970387 Filter out references
- #1964632 Illegal URIs
- #1947512 timespan extraction
- Fixed unwanted MySQL connection pooling and corrected database names for infobox and image extractors
- Fixed internal encoding of international page IDs
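As a rough illustration of the two geo-coordinate encodings named in the 3.1 notes, the sketch below writes one point both ways as N-Triples: GeoRSS Simple puts latitude and longitude into a single space-separated georss:point literal, while the Basic Geo vocabulary uses separate geo:lat and geo:long properties. The helper names and the sample coordinates are illustrative assumptions, not taken from the DBpedia extraction code, and the literal datatypes used in the released dumps may differ.

```python
# Illustrative sketch only: helper names and sample values are assumptions,
# not part of the DBpedia extraction framework.

GEO = "http://www.w3.org/2003/01/geo/wgs84_pos#"   # W3C Basic Geo vocabulary
GEORSS = "http://www.georss.org/georss/"           # GeoRSS namespace


def basic_geo_triples(subject, lat, lon):
    """Encode a point as two triples using geo:lat and geo:long."""
    return [
        f'<{subject}> <{GEO}lat> "{lat}" .',
        f'<{subject}> <{GEO}long> "{lon}" .',
    ]


def georss_simple_triple(subject, lat, lon):
    """Encode a point as one georss:point literal ("lat long")."""
    return f'<{subject}> <{GEORSS}point> "{lat} {lon}" .'


def parse_georss_point(literal):
    """Split a GeoRSS Simple point literal back into (lat, lon) floats."""
    lat_text, lon_text = literal.split()
    return float(lat_text), float(lon_text)


if __name__ == "__main__":
    # Approximate coordinates, used here only as sample input.
    berlin = "http://dbpedia.org/resource/Berlin"
    for line in basic_geo_triples(berlin, 52.5167, 13.3833):
        print(line)
    print(georss_simple_triple(berlin, 52.5167, 13.3833))
```

The Basic Geo form keeps latitude and longitude as separate values, which is what makes numeric filtering in SPARQL straightforward; the GeoRSS form needs the literal to be split first, as parse_georss_point does.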
DBpedia 3.0 (02/2008)
DBpedia 3.0 comes with the following changes (including the changes made between DBpedia 2.0 and DBpedia 3.0RC):
- multi-language improvements: extractors now applied to up to 14 different languages (not all extractors work on all languages)
- redirects data set available
- image copyright issues:
- the image extractor now tries to skip non-free images (however, we cannot guarantee that none will be extracted)
- most of the extracted image URLs now come with an additional triple: $image dc:rights $wikiPageDescribingRights; always link back to the corresponding wiki page if you use images in your DBpedia-based applications
- experimental (and still buggy) alternative DBpedia class hierarchy system:
- close to the Wikipedia category system but with several filters applied to it (categories which are bad candidates for OWL classes are to some degree filtered out, cycles in the hierarchy are removed, administrative categories are removed, etc.)
- improvements in extraction code:
- package structure in extraction code improved
- new Global Extractor Interface for non-article dependent extractions
- URI Exception for erroneous URIs
- new Linked Data Sets available:
- Links to Cyc
- Links to the flickr wrappr
- Links to Wikicompany
- Bugs fixed:
- #1818011: Labels for resources with colon character
- #1793163: HTML linebreaks are lost
- #1829160: Incorrect assignment of pages to categories
- #1819301: Missing plural redirects
- #1814938: Duplicates in pagelinks
- #1797810: Persondata dump should be labeled as German
- #1813011: Extra label in category wiki links
- #1871653: Too long URIs by infoboxes extractor cause import problems
- #1817019: Incorrect capitalization for XML Schema Datatypes
- #1730445: DBpedia browser page title = "テレビプロデューサー"
- #1724322: rudi völler – 404 links
- #1722279: Language code within Chinese Abstracts
- URIs with leading digit escaped by _
- Person Data Extractor: wrong date format (leading 0)
- Triples with over-sized erroneous URIs will not be extracted
- ... and many more ...
- Feature Requests incorporated:
- Extraction from Disambiguation Pages
- Extraction from Redirect Pages
- #1860862 Ordering of given- and surname in Personendaten Extractor
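The image rights triple described in the 3.0 notes ($image dc:rights $wikiPageDescribingRights) could be consumed roughly as follows. This is a minimal sketch under stated assumptions: it assumes the dc prefix expands to the Dublin Core elements 1.1 namespace, it only handles simple N-Triples lines whose three terms are all URIs, and the function name and sample URLs are hypothetical.

```python
import re

# Assumption: "dc:" stands for the Dublin Core elements 1.1 namespace.
DC_RIGHTS = "http://purl.org/dc/elements/1.1/rights"

# Matches only the simple "<s> <p> <o> ." form where all three terms are URIs.
TRIPLE = re.compile(r'^<([^>]+)>\s+<([^>]+)>\s+<([^>]+)>\s*\.\s*$')


def rights_pages(ntriples_lines):
    """Map each image URL to the wiki page describing its rights."""
    pages = {}
    for line in ntriples_lines:
        match = TRIPLE.match(line.strip())
        if match and match.group(2) == DC_RIGHTS:
            pages[match.group(1)] = match.group(3)
    return pages


if __name__ == "__main__":
    sample = [
        # Hypothetical URLs, for illustration only.
        '<http://example.org/images/Example.jpg> '
        '<http://purl.org/dc/elements/1.1/rights> '
        '<http://example.org/wiki/Image:Example.jpg> .',
    ]
    for image, page in rights_pages(sample).items():
        print(f"{image} -> rights described at {page}")
```

An application that displays an image can then look up its rights page in the returned mapping and link back to it, as the notes ask.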
DBpedia 2.0 (09/2007)
- Improved the Data Quality
- Third Classification Schema Added: concepts are now also classified by associating them to WordNet synsets
- Geo-Coordinates: data set contains geo-coordinates for geographic locations using the W3C Basic Geo Vocabulary
- RDF Links to other Open Data Sets: The data set now contains 440,000 external RDF links into the
- Geonames,
- Musicbrainz,
- WordNet,
- World Factbook,
- Eurostat,
- Book Mashup,
- DBLP Bibliography, and
- Project Gutenberg data sets.
DBpedia 1.0 (03/2007)
Initial Release of the DBpedia Data Sets, including:
- better short abstracts (clutter such as unnecessary brackets has been removed from the abstracts)
- new extended abstracts for each concept (up to 3000 characters long)
- abstracts in 10 languages (German, French, Spanish, Italian, Portuguese, Polish, Swedish, Dutch, Japanese, and Chinese)
- 2.8 million new links to external Web pages
- Cleaner infobox data
- 10,000 additional RDF links into the Geonames database.
- 9,000 new RDF links between books in DBpedia and data about them provided by the RDF Book Mashup
- 200 RDF links between computer scientists in DBpedia and their publications in the DBLP database
- New classification information for geographic places using DBpedia terms and Geonames feature codes
Information
Last Modification:
2009-07-02 20:26:31 by Georgi Kobilarov
