Lookup Linking

Lookup linking is a text enrichment method involving the DBpedia Lookup service.

The DBpedia Lookup service is an entity retrieval service for DBpedia entities that resolves keywords to resource identifiers. Thus it can be used to enrich plain text documents or tables with DBpedia URIs. Linking your labels to resource identifiers opens the door to the linked data world for the task at hand.

Let’s have a look at the following example: 

We have an imaginary CSV table with employee data, but we forgot to collect the area codes of the addresses. Without the area codes we are unable to call our employees! Note that this table might just as well have 1000 or more entries. Prepare for a Google marathon this weekend!

Name   | Phone | City       | Area Code
John   | 45678 | Berlin     | ???
Mary   | 98765 | New York   | ???
Mike   | 45645 | Manchester | ???
Steve  | 56578 | Moscow     | ???
Debbie | 67832 | Barcelona  | ???

The DBpedia Lookup service accepts keywords and returns resource identifiers in the DBpedia knowledge graph. This graph contains a vast amount of city data, including area codes. Linking our city labels to DBpedia resources and retrieving the codes from the knowledge graph can be done with a few lines of code.
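The linking step can be sketched roughly as follows. This is a minimal sketch, not a reference client: the endpoint URL, the query parameters (query, format, maxResults) and the response layout (a "docs" list whose entries carry a "resource" list) are assumptions about a typical Lookup deployment, and lookup_uri and enrich are hypothetical helper names. Retrieving the area code for each linked resource from the knowledge graph is left out here.

```python
import csv
import io
import json
import urllib.parse
import urllib.request

# Assumed endpoint of a public Lookup instance; adjust to your deployment.
LOOKUP_URL = "https://lookup.dbpedia.org/api/search"


def lookup_uri(keyword):
    """Return the top-ranked resource URI for a keyword, or None."""
    # Parameter names (query, format, maxResults) are assumptions.
    params = urllib.parse.urlencode(
        {"query": keyword, "format": "JSON", "maxResults": 1}
    )
    with urllib.request.urlopen(f"{LOOKUP_URL}?{params}", timeout=10) as resp:
        docs = json.load(resp).get("docs", [])
    # Assumed response shape: {"docs": [{"resource": ["<uri>"], ...}]}
    if docs and docs[0].get("resource"):
        return docs[0]["resource"][0]
    return None


def enrich(csv_text, lookup=lookup_uri, city_column="City"):
    """Return the CSV text with an extra 'City URI' column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames) + ["City URI"]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    cache = {}  # look up each distinct city only once
    for row in reader:
        city = row[city_column]
        if city not in cache:
            cache[city] = lookup(city)
        # Mark unresolved cities the same way the table marks missing data.
        row["City URI"] = cache[city] or "???"
        writer.writerow(row)
    return out.getvalue()
```

Because the lookup function is passed in as a parameter, the enrichment logic can be tried out with a plain dictionary before any request is sent to the service.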

After retrieving the data and writing the results back into our CSV the result could look like this:

Name   | Phone | City       | City URI                                  | Area Code
John   | 45678 | Berlin     | http://dbpedia.org/resource/Berlin        | 030
Mary   | 98765 | New York   | http://dbpedia.org/resource/New_York_City | 212
Mike   | 45645 | Manchester | http://dbpedia.org/resource/Manchester    | 0161
Steve  | 56578 | Moscow     | http://dbpedia.org/resource/Moscow        | ???
Debbie | 67832 | Barcelona  | http://dbpedia.org/resource/Berlin        | +43(E) 93(B)

Problem almost solved! Data from the DBpedia Knowledge Graph can still be incomplete or inconsistent in some cases. However, even if our script only solves 95% of our entries correctly, it will still save us hours of manual search.

Time to enjoy the weekend!

Result Formats

The DBpedia Lookup Service can return search results in either XML or JSON. Most Java-based and especially JavaScript-based applications work more easily with JSON-formatted data, so it is recommended to choose the result format that fits your use case.
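As an illustration of working with JSON results, the sketch below builds a request URL that asks for JSON and extracts label and URI pairs from a response body. The endpoint, the name of the format parameter and the response layout ("docs" entries holding "label" and "resource" lists) are assumptions and may differ on your Lookup instance; build_search_url and parse_json_results are hypothetical helper names.

```python
import json
import urllib.parse

BASE_URL = "https://lookup.dbpedia.org/api/search"  # assumed endpoint


def build_search_url(keyword, result_format="JSON"):
    """Build a search URL that requests the given result format."""
    # 'format' as the parameter name is an assumption; check your instance.
    query = urllib.parse.urlencode({"query": keyword, "format": result_format})
    return f"{BASE_URL}?{query}"


def parse_json_results(payload):
    """Extract (label, uri) pairs from a JSON response body."""
    # Assumed shape: {"docs": [{"label": ["..."], "resource": ["..."]}]}
    pairs = []
    for doc in json.loads(payload).get("docs", []):
        labels = doc.get("label") or [None]
        uris = doc.get("resource") or [None]
        pairs.append((labels[0], uris[0]))
    return pairs
```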

Minimum Relevance Filtering

Each result of a DBpedia Lookup search is given a score based on label matches and other factors. Sometimes a label cannot be matched to a resource properly and only a few results are returned. Such results may have matched only because their labels are similar, not because they describe the entity you were actually looking for. In that case it is better to reject them, as they would lead to a bad link. The DBpedia Lookup Service lets you specify a minimum score (or minimum relevance): all results that score below it are discarded and not suggested as a result.
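The effect of a minimum score can be mirrored on the client side, as in the sketch below. It is not the service's own implementation: the result dictionaries with "uri" and "score" keys are a hypothetical, already-parsed representation, and filter_by_min_score and best_link are illustrative names.

```python
def filter_by_min_score(results, min_score):
    """Keep only results whose score is at least min_score."""
    return [r for r in results if r["score"] >= min_score]


def best_link(results, min_score):
    """Return the URI of the highest-scoring accepted result, or None."""
    accepted = filter_by_min_score(results, min_score)
    if not accepted:
        # No candidate is relevant enough: prefer no link over a bad link.
        return None
    return max(accepted, key=lambda r: r["score"])["uri"]
```

A higher threshold leaves more entries unlinked but lowers the risk of linking a label to the wrong resource, so a suitable value depends on how costly a bad link is for the task at hand.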