Monday, July 17, 2017 - 1:42pm
After our 2nd Community Meeting in the US, we delighted the Irish DBpedia Community with the 9th DBpedia Community Meeting, which was co-located with the Language, Data and Knowledge Conference 2017 in Galway, on the premises of NUI Galway.
First and foremost, we would like to thank John McCrae (Insight Centre for Data Analytics, NUI Galway) and the LDK Conference for co-hosting and supporting the event.
The focus of this Community Meeting was the Irish DBpedia and Linked Data Community in Ireland. Therefore, we invited local data scientists as well as European DBpedia enthusiasts to discuss the state of Irish Linked Data.
The meeting started with two compelling keynotes by Brian Ó Raghallaigh, Dublin City University and Logainm.ie, and Sharon Flynn, NUI Galway and Wikimedia Ireland. Brian presented Logainm.ie, a data use case about placenames in Ireland with a special focus on linked Logainm and machine-readable data.
His insightful presentation was followed by Sharon Flynn talking about Wikimedia in Ireland and the challenges of “this monumental undertaking” with particular reference to the Wikimedia Community in Ireland.
For more details on the content of the presentations, follow the links to the slides.
Eoin McCuirc started the DBpedia Showcase Session with “MY sweet LOD”, an insightful presentation on Linked Open Data in Ireland from the perspective of a statistics office.
Shortly after, Ronald Stamper, Chairman of Measur Ltd., elaborated on semantic normal form, ontologies and the perils of paradigm change.
Ben De Meester, from Ghent University, presented the first DBpedia Showcase about Declarative Data Transformation for Linked Data Generation.
This was followed by another showcase by Alan Meehan, who presented the SUMMR Interlink Validation tool, which validates interlinks from a source dataset to multiple targets.
Closing the Showcase Session, Frederico Araujo Durao, Insight Centre for Data Analytics – University College Cork (UCC), presented a demo of his linked data browser.
For further details of the presentations follow the links to the slides.
As a regular part of the DBpedia Community Meeting we have two parallel sessions in the afternoon where DBpedia newbies can learn about what DBpedia is and how to use the DBpedia data sets.
Participants who wanted to learn DBpedia basics joined the DBpedia Tutorial Session by Markus Freudenberg (DBpedia Release Manager). The DBpedia Association Hour provided a platform for the community to discuss and give feedback.
Additionally, Sebastian Hellmann and Julia Holze, members of the DBpedia Association, updated the participants on the growing number of DBpedia Association members, the formalized DBpedia language chapters and the established DBpedia Community Committee, and reported on technical developments such as the DBpedia API.
The afternoon session started with the DBpedia 2016-10 release update by Markus Freudenberg (DBpedia Release Manager). Following this, Kevin Chekov Feeney (Trinity College Dublin) presented the software alignment in the ALIGNED project. He talked about “Generating correct-by-construction semantic datasets from unstructured, semi-structured and badly structured data sources”.
At this point, we would also like to thank the ALIGNED project for the development of DBpedia as a project use case and for covering parts of the travel cost.
Chaired by Rob Brennan and Bianca Pereira, the speakers in the last session presented new Irish Linked Data Projects, for example GeoHive, BIOOPENER and the TCD Open Linked Data Engagement Fund Project. The following panel session gave DBpedia and Linked Data enthusiasts a platform for exchange and discussion. The outcome of this session was a roadmap for Irish Linked Data, created together with all participants.
Following, you find a list of all presentations of this session:
Closing this session, John McCrae announced that the next edition of the Language, Data and Knowledge (LDK) Conference is scheduled for 2019 in Germany. We at the DBpedia Association are now looking forward to welcoming the LDK Community in Leipzig!
The Community Meeting slowly came to an end with our social evening event, which was held at the PorterShed in Galway. The evening session revolved around the topic “How to exploit data commercially?” and featured two short impulse talks. Paul Buitelaar started the session by presenting “Kibi”, an Open Source platform for Data Intelligence based on the search engine Elasticsearch. Finally, Sebastian Hellmann talked about “Improving the Utility of DBpedia by co-designing a public and commercial DBpedia API” (slides).
Summing up, the 9th DBpedia Community Meeting brought together more than 45 DBpedia enthusiasts from Ireland and Europe who engaged in vital discussions about Linked Data, DBpedia use cases and services.
You can find feedback about the event on Twitter via #DBpediaGalway17.
We would like to thank Bianca Pereira and Caoilfhionn Lane from Insight Centre for Data Analytics, NUI Galway, as well as Rob Brennan from ADAPT Research Centre, Trinity College Dublin, for devoting their time to curating the program and organizing the meeting.
Special thanks go to LDK 2017 for hosting the meeting.
We are looking forward to the next DBpedia Community Meeting, which will be held in Amsterdam, Netherlands. It will be co-located with SEMANTiCS 2017, and the Community will get together on the 14th of September for the DBpedia Day.
Your DBpedia Association
Tuesday, July 11, 2017 - 11:46am
Sören Auer and the DBpedia Board members prepared a survey to assess the direction of the DBpedia Association. We wanted to know what the DBpedia Community thinks about DBpedia’s strategic priorities and how the funds of the DBpedia Association should be spent. Between February 2017 and April 2017, a total of 40 members of the DBpedia Community actively participated in the survey and voted as follows:
1. What should be the priorities of the DBpedia Association in the next year?
To give an overview of the various priorities that were mentioned, the following digest groups the answers into four categories. The most frequent answer was to increase the data quality, followed by the enlargement of the DBpedia Community through broader dissemination.
2. What should be the priorities of the DBpedia Association in the next three years?
In contrast to question one, this question concerns the priorities the DBpedia Association should focus on during the next three years. As in the previous overview, the specified priorities are divided into four categories.
The chart above depicts the participants’ main interests in DBpedia. The majority of participants have an “academic & professional” (45.7%) interest in DBpedia, followed by “professional” (28.6%) and “academic” (20.0%) interests. Only 2.9% of the answers are student-related interests.
4. How should the funds of the association be used?
With respect to “How should the funds of the association be used?”, most participants chose “service provisioning”. The “development of new DBpedia features” was the second most popular choice. Nevertheless, “Community building” and “release production” also received many votes.
5. How should the DBpedia Association collaborate with national/language chapters?
6. Should DBpedia open itself to contain and curate more data not directly extracted from Wikipedia?
As the chart above clearly depicts, more than half of the participants are in favor of DBpedia comprising datasets not directly derived or extracted from Wikipedia. In contrast, 34.3% hold the opposite opinion and would prefer DBpedia to focus solely on data extraction from Wikipedia.
7. Which of the following features do you consider most important?
The following diagram gives an overview of particular features and their importance from the participants’ point of view. As the results of question one already suggested, data quality is considered the most important issue by the survey participants (23.7%). Three features share second place with 17.2% each: the provision of datasets extracted from the Wikipedia article text, substantial collaboration/integration with Wikidata, and the provision of better search and exploration user interfaces.
8. Any other question, feedback, opinion, ideas or suggestion you would like to send to the association.
Thank you for your input and your participation! Your priorities and opinions are of vital importance for the success of DBpedia in the future. We will discuss the implementation of your answers during our next DBpedia Board Meetings in order to find a reasonable strategic direction of the DBpedia Association for the next years.
Tuesday, July 4, 2017 - 1:53pm
We are happy to announce the new DBpedia Release.
This release is based on updated Wikipedia dumps dating from October 2016.
This release took us longer than expected. We had to deal with multiple issues and included new data. Most notable is the addition of the NIF annotation datasets for each language, recording the whole wiki text, its basic structure (sections, titles, paragraphs, etc.) and the included text links. We hope that researchers and developers, working on NLP-related tasks, will find this addition most rewarding. The DBpedia Open Text Extraction Challenge (next deadline Mon 17 July for SEMANTiCS 2017) was introduced to instigate new fact extraction based on these datasets.
We want to thank everyone who has contributed to this release by adding mappings, new datasets, extractors or issue reports, helping us to increase coverage and correctness of the released data. We also thank the European Commission and the ALIGNED H2020 project for funding and general support.
You want to read more about the New Release? Click below for further details.[expander_maker id=”1″ more=”Read more” less=”Read less”]
Altogether the DBpedia 2016-10 release consists of 13 billion (2016-04: 11.5 billion) pieces of information (RDF triples) out of which 1.7 billion (2016-04: 1.6 billion) were extracted from the English edition of Wikipedia, 6.6 billion (2016-04: 6 billion) were extracted from other language editions and 4.8 billion (2016-04: 4 billion) from Wikipedia Commons and Wikidata.
In addition, adding the large NIF datasets for each language edition (see details below) increased the number of triples further by over 9 billion, bringing the overall count up to 23 billion triples.
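As a quick sanity check on the figures quoted above, the per-source counts can be summed in a few lines of Python. This is only a sketch: the dictionary keys and helper names are made up for illustration, and the inputs are the rounded numbers from this post, so the totals are approximate.

```python
# Triple counts in billions, copied from the rounded figures in this post.
# The key names are illustrative labels, not official dataset identifiers.
RELEASE_2016_10 = {"english": 1.7, "other_languages": 6.6, "commons_and_wikidata": 4.8}
RELEASE_2016_04 = {"english": 1.6, "other_languages": 6.0, "commons_and_wikidata": 4.0}


def total_billions(parts):
    """Sum the per-source counts and round to one decimal place."""
    return round(sum(parts.values()), 1)


def growth_percent(old, new):
    """Relative growth from old to new, in percent, rounded to one decimal."""
    return round((new - old) / old * 100, 1)


core_2016_10 = total_billions(RELEASE_2016_10)
core_2016_04 = total_billions(RELEASE_2016_04)

print("2016-10 core triples (billions):", core_2016_10)
print("2016-04 core triples (billions):", core_2016_04)
print("growth (%):", growth_percent(core_2016_04, core_2016_10))
```

The sums agree with the stated totals of 13 and 11.5 billion up to rounding. Adding the NIF datasets ("over 9 billion") on top of the 2016-10 figure lands in the neighbourhood of the 23 billion quoted above.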
In case you missed it, what we changed in the previous release (2016-04)
South Azerbaijani (azb), Upper Sorbian (hsb), Limburgan (li), Minangkabau (min), Western Mari (mrj), Oriya (or), Ossetian (os)
The DBpedia community added new classes and properties to the DBpedia ontology via the mappings wiki. The DBpedia 2016-04 ontology encompasses:
The editor community of the mappings wiki also defined many new mappings from Wikipedia templates to DBpedia classes. For the DBpedia 2016-10 extraction, we used a total of 5887 template mappings (DBpedia 2015-10: 5800 mappings). The top language, gauged by the number of mappings, is Dutch (648 mappings), followed by the English community (606 mappings).[/expander_maker]
The work on the DBpedia 2016-10 release was financially supported by the European Commission through the project ALIGNED – quality-centric, software and data engineering.
Have fun with the new DBpedia 2016-10 release!
Wednesday, June 7, 2017 - 1:57pm
We are happy to announce that the 9th DBpedia Community meeting will be held in Galway, Ireland on June 21st 2017. DBpedia will be part of the Language, Data and Knowledge conference (LDK) in Galway. This new biennial conference series aims at bringing together researchers from across disciplines. The DBpedia Meeting is part of the conference and is scheduled for the last day.
Only a few seats are left, so come and get your ticket to be part of the 9th DBpedia Community meeting in Galway.
Please check our schedule for the 9th DBpedia Community meeting here: http://wiki.dbpedia.org/meetings/Galway2017
The social event will be held in the evening (starting at 6pm) at the PorterShed around the topic “How to exploit data commercially?”, featuring several short impulse talks. We still have some remaining slots and would welcome you to present your success stories and use cases, or to tell us about your problems regarding the commercialisation of data. If you are interested in presenting, please email email@example.com.
Sponsors and Acknowledgments
|LDK2017||For hosting the meeting.|
|Institute for Applied Informatics||For supporting the DBpedia Association.|
|OpenLink Software||For continuous hosting of the main DBpedia Endpoint.|
|ADAPT research centre||For supporting the DBpedia Association.|
|ALIGNED – Software and Data Engineering||For funding the development of DBpedia as a project use-case and covering part of the travel cost.|
|PorterShed||For hosting the evening event.|
In case you want to sponsor the 9th DBpedia Community Meeting, please contact the DBpedia Association via firstname.lastname@example.org.
We are looking forward to meeting you in Galway!
Your DBpedia Association
Thursday, June 1, 2017 - 1:31pm
In conjunction with Springer Nature, DBpedia offers a three-month internship at Springer Nature in London, UK and at DBpedia in Leipzig, Germany.
|Main Employer||DBpedia Association|
|Deadline||June 30th, 2017|
|Duration||3 months/full-time, internship will start in the second half of 2017|
|Location||50% in London (UK) and 50% in Leipzig (GER)|
|Type of students desired||Undergraduate, Graduate (Junior role)|
|Compensation||You will receive a stipend of 1300€ per month and additional reimbursement of your travel and visa costs (total up to 1000€)|
The student intern will be responsible for assisting with mappings for DBpedia at Springer Nature. Your tasks include, but are not restricted to, improving the quality of the extraction mechanism of DBpedia scholarly references/Wikipedia citations to Springer Nature URIs and text mining of DBpedia entities from Springer Nature publication content.
We are looking forward to meeting all the whiz kids out there.
Friday, May 5, 2017 - 10:16am
We are very excited to announce this year’s final students for our projects at the Google Summer of Code program (GSoC).
Google Summer of Code is a global program focused on bringing more student developers into open source software development. Stipends are awarded to students to work on a specific DBpedia related project together with a set of dedicated mentors during summer 2017 for the duration of three months.
For the past 5 years DBpedia has been a vital part of the GSoC program. Since our very first participation, many DBpedia projects have been successfully completed.
In this year’s GSoC edition, DBpedia received more than 20 submissions for selected DBpedia projects. Our mentors read many promising proposals and evaluated them, and now the crème de la crème of students have snatched a spot for this summer. In the end, 7 students from around the world were selected and will work together with their assigned mentors on their projects. DBpedia developers and mentors are really excited about these 7 promising student projects.
List of students and projects:
You want to read more about their specific projects? Just click below… or check the GSoC pages for details.[expander_maker id=”1″ more=”Read more” less=”Read less”]
Ismael Rodriguez – Project Description: Although the DBpedia Extraction Framework was adapted to support RML mappings thanks to last year’s GSoC project, mappings are still created through a MediaWiki installation, which does not support RML mappings and requires Semantic Web expertise. The goal of the project is to create a front-end application that provides a user-friendly interface so the DBpedia community can easily view, create and administrate DBpedia mapping rules using RML. Moreover, it should also facilitate data transformations and overall DBpedia dataset generation.
Mentors: Anastasia Dimou, Dimitris Kontokostas, Wouter Maroy
Ram Ganesan Athreya – Project Description: The requirement of the project is to build a conversational Chatbot for DBpedia which would be deployed in at least two social networks. There are three main challenges in this task. First is understanding the query presented by the user, second is fetching relevant information based on the query through DBpedia and finally tailoring the responses based on the standards of each platform and developing subsequent user interactions with the Chatbot. Based on my understanding, the process of understanding the query would be undertaken by one of the mentioned QA Systems (HAWK, QANARY, openQA). Based on the response from these systems we need to query the DBpedia dataset using SPARQL and present the data back to the user in a meaningful way. Ideally, both the presentation and interaction flow need to be tailored for the individual social network. I would like to stress that although the primary medium of interaction is text, platforms such as Facebook insist that a proper mix between chat and interactive elements such as images, buttons etc. would lead to better user engagement. So I would like to incorporate these elements as part of my proposal.
Mentor: Ricardo Usbeck
Nausheen Fatma – Project Description: Knowledge base embeddings have been an active area of research. In recent years a lot of research work such as TransE, TransR, RESCAL, SSP, etc. has been done to get knowledge base embeddings. However, none of these approaches have used DBpedia to validate their approach. In this project, I want to achieve the following tasks: i) Run the existing techniques for KB embeddings for standard datasets. ii) Create an equivalent standard dataset from DBpedia for evaluations. iii) Evaluate across domains. iv) Compare and analyse the performance and consistency of various approaches for the DBpedia dataset along with other standard datasets. v) Report any challenges that may come up while implementing the approaches for DBpedia. Along the way, I would also try my best to come up with any new research approach for the problem.
Mentors: Sandro Athaide Coelho, Tommaso Soru
Akshay Jagatap – Project Description: The project aims at defining embeddings to represent classes, instances and properties. Such a model tries to quantify semantic similarity as a measure of distance in the vector space of the embeddings. I believe this can be done by implementing Random Vector Accumulators with additional features in order to better encode the semantic information held by the Wikipedia corpus and DBpedia graphs.
Mentors: Pablo Mendes, Sandro Athaide Coelho, Tommaso Soru
Luca Virgili – Project Description: In Wikipedia a lot of data are hidden in tables. What we want to do is to correctly read all tables in a page. First of all, we need a tool that allows us to capture the tables represented in a Wikipedia page. After that, we have to understand what we have read. Both these operations seem easy, but there are many problems that could arise. The main issue that we have to solve is due to how people build tables. Everyone has a particular style for representing information, so in some tables we can read something that doesn’t appear in another structure. In this proposal I aim to improve last year’s project and to create a general way of reading data from Wikipedia tables. I want to review the parser for Wikipedia pages so that it understands as many types of tables as possible. Furthermore, I’d like to build an algorithm that can compare the column’s elements (that have been read previously by the parser) to an ontology so it can work out how the user wrote the information. In this way we need to define only a few mapping rules, and we can make the software more general.
Mentors: Emanuele Storti, Domenico Potena
Shashank Motepalli – Project Description: DBpedia tries to extract structured information from Wikipedia and make information available on the Web. In this way, the DBpedia project develops a gigantic source of knowledge. However, the current system for building the DBpedia Ontology relies on Infobox extraction. Infoboxes, being human curated, limit the coverage of DBpedia. This occurs either due to a lack of Infoboxes in some pages or due to over-specific or very general taxonomies. These factors have motivated the need for DBTax. DBTax follows an unsupervised approach to learning a taxonomy from the Wikipedia category system. It applies several inter-disciplinary NLP techniques to assign types to DBpedia entities. The primary goal of the project is to streamline and improve the approach which was proposed, making it easy to run on a new DBpedia release. In addition to this, the project will also work on extending DBTax taxonomy learning to other Wikipedia languages.
Mentors: Marco Fossati, Dimitris Kontokostas
Krishanu Konar – Project Description: Wikipedia, being the world’s largest encyclopedia, has a humongous amount of information present in the form of text. While key facts and figures are encapsulated in the resource’s infobox, and some detailed statistics are present in the form of tables, there’s also a lot of data present in the form of lists, which are quite unstructured and hence difficult to turn into semantic relationships. The project focuses on the extraction of relevant but hidden data which lies inside lists in Wikipedia pages. The main objective of the project would be to create a tool that can extract information from Wikipedia lists and form appropriate RDF triples that can be inserted in the DBpedia dataset.
Mentor: Marco Fossati [/expander_maker]
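One of the project descriptions above mentions TransE as a knowledge base embedding technique. As a rough illustration of the idea (not the student's implementation), TransE treats a relation as a translation in vector space and scores a triple by the distance between head + relation and tail, where a smaller distance means a more plausible triple. The vectors below are toy values invented for this example; real embeddings are learned from data.

```python
import math


def transe_score(head, relation, tail):
    """Euclidean (L2) distance between (head + relation) and tail.

    Lower scores mean the triple fits the translation model better.
    """
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


# Toy 2-dimensional embeddings, chosen by hand purely for illustration.
head = [1.0, 0.0]
relation = [0.0, 1.0]
plausible_tail = [1.0, 1.0]    # exactly head + relation
implausible_tail = [4.0, 5.0]  # far away from head + relation

good_score = transe_score(head, relation, plausible_tail)
bad_score = transe_score(head, relation, implausible_tail)

print("plausible triple score:", good_score)     # 0.0
print("implausible triple score:", bad_score)    # 5.0
```

An evaluation along the lines the proposal describes would rank candidate tails by this score and check how often the correct entity ends up near the top.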
Congrats to all selected students! We will keep our fingers crossed now and patiently wait until early September, when final project results are published.
The competition for GSoC slots is always at a very high level and DBpedia only has a limited number of slots available for students. In case you weren’t among the selected, do not give up on DBpedia just yet. There are plenty of opportunities to prove your abilities and be part of the DBpedia experience. You, above all, know DBpedia by heart. Hence, contributing to our support system is not only a great way to be part of the DBpedia community but also an opportunity to be vital to DBpedia’s development. Above all, it is a chance for current DBpedia mentors to get to know you better. It will give your future mentors a chance to support you and help you to develop your ideas from the very beginning.
Go on you smart brains, dare to become a top DBpedia expert and provide good support for other DBpedia Users. Sign up to our support page or check out the following ways to contribute:
We are looking forward to working with you!
Have a great weekend!
Thursday, March 9, 2017 - 1:36pm
Do you want to stay informed about upcoming DBpedia events, releases and technical developments? Through the DBpedia newsletter you get the possibility to be always up to date and to provide feedback to us. Four times per year we will inform the DBpedia community about meetings, new collaborations and other topics related to DBpedia.
Friday, March 3, 2017 - 1:34pm
DBpedia will participate for a fifth time in the Google Summer of Code program (GSoC) and now we are looking for students who will share their ideas with us. We are regularly growing our community through GSoC and can deliver more and more opportunities to you. We are excited about our new ideas, and we hope you will get excited too!
Google Summer of Code is a global program focused on bringing more student developers into open source software development. Funds will be given to students (BSc, MSc, PhD) to work for three months on a specific task. First, open source organizations announce their student projects; then students contact the mentor organizations they want to work with and write up a project proposal for the summer. After a selection phase, students are matched with a specific project and a set of mentors to work on the project during the summer.
If you are a GSoC student who wants to apply to our organization, please check our guideline here: http://wiki.dbpedia.org/gsoc2017
Here you can see the Google Summer of Code 2017 timeline:
|March 20th, 2017||Student applications open (Students can register and submit their applications to mentor organizations.)|
|April 3rd, 2017||Student application deadline|
|May 4th, 2017||Accepted students are announced and paired with a mentor.|
|May 30th, 2017||Coding officially begins!|
|August 21st, 2017||Final week: Students submit their final work product and their final mentor evaluation|
|September 6th, 2017||Final results of Google Summer of Code 2017 announced|
We are looking forward to your input.
Your DBpedia Association
Wednesday, February 22, 2017 - 10:11am
Sören Auer and the DBpedia Board members prepared a survey to assess the direction of the DBpedia Association. We would like to know what you think should be our priorities and how you would like the funds of the association to be used.
Your opinion counts – so please contribute actively in developing a better DBpedia. If you use DBpedia and want us to keep going forward, we kindly invite you to vote here: https://goo.gl/forms/rDqLcwL823Ok09Uw2
We will publish the results in anonymized, aggregated form on the DBpedia website.
Your DBpedia Association
Thursday, January 19, 2017 - 4:35pm
As in previous years, we would like your input for DBpedia-related project ideas for GSoC 2017.
For those who are unfamiliar with GSoC (Google Summer of Code), Google pays students (BSc, MSc, PhD) to work for 3 months on an open source project. Open source organizations announce their student projects and students apply for projects they like. After a selection phase, students are matched with a specific project and a set of mentors to work on the project during the summer.
Here you can see the Google Summer of Code 2017 timeline: https://developers.google.com/open-source/gsoc/timeline
or please check: http://wiki.dbpedia.org/gsoc2016
If you have a cool idea for DBpedia or want to co-mentor an existing cool idea, go here. (All mentors get a free Google T-shirt and the chance to go to Google HQ in November.)
DBpedia applied for the fifth time to participate in the Google Summer of Code program. Here you will find a list of all projects and students from GSoC 2016: http://blog.dbpedia.org/2016/04/26/dbpedia-google-summer-of-code-2016/
Looking forward to your input.
Your DBpedia Association