DBpedia Data ID Unit
You can support DBpedia Groups via Donations to the DBpedia Association
GitHub: https://github.com/dbpedia/dataId
Mailing list: Subscribe | mailto:email@example.com
The DBpedia Data ID Unit is a DBpedia Group whose goal is to describe LOD datasets via RDF files, to host and deliver these metadata files together with the dataset in a uniform way, to create and validate such files, and to deploy the results for DBpedia and its local chapters. Established vocabularies such as DCAT, VoID, Prov-O and SPARQL Service Description are reused for maximum compatibility. In this way, we hope to establish a uniform and accepted way to describe and deliver dataset metadata for arbitrary LOD datasets and to put existing standards into practice.
How to participate
You can join the Data ID Unit by writing your name and affiliation under members. For now, discussion takes place on the DBpedia discussion mailing list.
You can contribute in many ways, e.g. by implementing a service for Data ID, generating statistics with your own tool and linking them, or adding custom properties such as dataid:lodStatsLink or dataid:sparqlesLink (or whatever you might need).
A simple but important contribution is to add a Data ID Turtle file to your data, so that you describe your data yourself.
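To give a rough idea of what such a Turtle file could contain, the following Python sketch renders a minimal dataset description using only terms from DCAT, Dublin Core Terms and VoID. It is an illustration under assumptions, not the Data ID format itself: the function name build_description, the "#dump" distribution URI and all example.org URIs are hypothetical, and a real Data ID file may need further properties defined by the Data ID model.

```python
# Minimal sketch: render dataset metadata as a Turtle document.
# Only well-known DCAT, DCT and VoID terms are used; the helper names
# and all example.org URIs are hypothetical placeholders.

PREFIXES = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
    "void": "http://rdfs.org/ns/void#",
}


def escape_literal(text):
    """Escape characters that may not appear raw in a Turtle string literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n"))


def build_description(dataset_uri, title, license_uri, download_url, example_uri):
    """Return a Turtle string describing one dataset and one download."""
    lines = [f"@prefix {prefix}: <{ns}> ." for prefix, ns in PREFIXES.items()]
    lines.append("")
    lines += [
        f"<{dataset_uri}>",
        "    a dcat:Dataset, void:Dataset ;",
        f'    dct:title "{escape_literal(title)}"@en ;',
        f"    dct:license <{license_uri}> ;",
        f"    void:exampleResource <{example_uri}> ;",
        f"    dcat:distribution <{dataset_uri}#dump> .",
        "",
        f"<{dataset_uri}#dump>",
        "    a dcat:Distribution ;",
        f"    dcat:downloadURL <{download_url}> .",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_description(
        dataset_uri="http://example.org/dataset/demo",
        title="Demo dataset",
        license_uri="http://example.org/license",
        download_url="http://example.org/demo.nt",
        example_uri="http://example.org/resource/A",
    ))
```

The output can be saved next to the data (for example as a .ttl file) so that the description travels with the dataset.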
A number of established vocabularies for describing datasets exist and are recommended by the W3C. They can be used to indicate where and how the dataset is distributed, what category it belongs to, what other datasets it links to, where example resources can be found, who published it under which license and much more. However, there is no best practice on where this metadata should be published, how it should be maintained and what it is supposed to contain. Distributing this metadata with the dataset can greatly ease the maintenance of dataset entries in data repositories like http://datahub.io/, as well as semantic search and dataset usage.
DBpedia Groups report to other relevant community groups, e.g. W3C groups, OKFN or Wikimedia, to get feedback.
Furthermore, summary reports are sent to associated industry partners of DBpedia (sign-up via firstname.lastname@example.org).
This group will report to:
- Government Linked Data Working Group
- Public LOD
- Datahub and Discussion list
- LOD2 Project
- LIDER Project
- Creating a Data ID file for the DBpedia project as a whole: In the process of creating this file, upcoming development and modeling questions will be solved iteratively on the go. The result will be deployed for the main DBpedia project, as well as the local chapters.
- Data ID generator: A generator app will be developed that can be used to generate a Data ID file from metadata entered into a form.
- Validator Service for the File: RDFUnit will be used to establish a validator service for Data ID files. This service will consume a dataset URI, access the Data ID file for this dataset and validate it for compliance with the established format.
- Compliance with Data Hub: We will try either to establish a service that automatically transfers the DBpedia Data ID metadata to http://datahub.io/ or, preferably, to get the datahub.io team to allow automatic retrieval of Data ID files by datahub.io at regular intervals.
- Statistical module: A statistical module will be developed that, given a Data ID file as input, automatically generates statistical data about the dataset (like triple count, SPARQL service uptime, ontology usage and links to other datasets). LODStats and SparqlES will be used to facilitate this task.
- Spread the word: Implement the resulting practice by establishing Data ID for as many datasets as possible to finally have a universally accepted way of dataset description that is delivered by the datasets themselves.
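As an illustration of the kind of figures the statistical module is meant to produce, the following Python sketch computes a triple count, vocabulary usage and a crude count of links to other hosts from N-Triples lines. It is a stand-in under assumptions and does not reproduce LODStats or SparqlES: the function names, the simplified line pattern and the rule "an object IRI on a different host counts as a link to another dataset" are all hypothetical choices made for this example.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Simplified N-Triples pattern (assumption): subject, predicate IRI,
# object, final dot. It is not a complete N-Triples parser.
TRIPLE = re.compile(r"^\s*(<[^>]*>|_:\S+)\s+<([^>]*)>\s+(.+?)\s*\.\s*$")


def namespace_of(iri):
    """Return the IRI up to and including its last '#' or '/'."""
    cut = max(iri.rfind("#"), iri.rfind("/"))
    return iri[: cut + 1] if cut >= 0 else iri


def dataset_statistics(lines, local_host):
    """Count triples, predicate namespaces and object IRIs on foreign hosts."""
    triple_count = 0
    vocabularies = Counter()    # predicate namespace -> usage count
    external_links = Counter()  # foreign host -> number of object IRIs
    for line in lines:
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        match = TRIPLE.match(line)
        if match is None:
            continue  # skip lines the simplified pattern cannot read
        _subject, predicate, obj = match.groups()
        triple_count += 1
        vocabularies[namespace_of(predicate)] += 1
        if obj.startswith("<") and obj.endswith(">"):
            host = urlparse(obj[1:-1]).netloc
            if host and host != local_host:
                external_links[host] += 1
    return {
        "triple_count": triple_count,
        "vocabularies": dict(vocabularies),
        "external_links": dict(external_links),
    }


if __name__ == "__main__":
    sample = [
        '<http://example.org/resource/A> <http://www.w3.org/2000/01/rdf-schema#label> "A"@en .',
        '<http://example.org/resource/A> <http://www.w3.org/2002/07/owl#sameAs> <http://other.example.com/A> .',
    ]
    print(dataset_statistics(sample, local_host="example.org"))
```

Counting by host is only a rough proxy for links to other datasets, since vocabulary terms used as objects also live on foreign hosts; a real module would need a more careful definition.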
You can take a look at the data model of the Data ID here:
The model integrates DCAT, VoID, Prov-O and SPARQL Service Description. Extensions can be made for typical use cases. Please refer to the mailing lists for more information.
Data ID generator
A visual, easy-to-use tool to create Data IDs can be found here:
Last Modification: 2014-09-03 10:48:36 by Martin Bruemmer