DBpedia and (Open-)Cyc

May 19, 2015, Categories: Annotation and/or Information Extraction




Compared to DBpedia, Cyc seems to follow a rather top-down approach: it first represented the more abstract concepts and entities and only later started to include more domain knowledge. This seems reasonable, since there is much more domain knowledge and it changes faster. On the other hand, domain knowledge is usually what people need to solve real problems within their domains. DBpedia contains primarily domain knowledge, so a combination of the two, Cyc and DBpedia, could really be a winning team.

We plan to work on how DBpedia and OpenCyc can be fruitfully combined. For now, you can download an OWL/RDF version of OpenCyc from: http://sw.cyc.com/2006/07/opencyc_1_0.zip

A first relation between OpenCyc and DBpedia concepts was established by the committed OpenCyc community (especially Vijay Alilaghatta) with the following dataset: http://sw.cyc.com/2006/07/wikipedia_links.csv.zip If you have any questions regarding these links, please contact Vijay Alilaghatta (vijay@cycfoundation.org), who kindly contributed them under the terms of the GFDL.

The dataset contains three columns: the Cyc term, the Wikipedia title, and an integer confidence value between 1 and 999 (with 999 representing the highest confidence).
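To make the column layout concrete, here is a minimal Python sketch that reads such rows and keeps only the highest-confidence ones. It assumes that every field is double-quoted and comma-separated and that the file has no header row (the field separator in the awk script in this post suggests the quoting, but the file itself is not shown here). The names `CycLink` and `read_links` and the sample rows are made up for illustration.

```python
import csv
import io
from typing import Iterable, Iterator, NamedTuple


class CycLink(NamedTuple):
    """One row of the link file (hypothetical helper type)."""
    cyc_term: str
    wikipedia_title: str
    confidence: int  # 1..999, where 999 is the highest confidence


def read_links(lines: Iterable[str]) -> Iterator[CycLink]:
    """Yield one CycLink per well-formed three-column CSV row."""
    for row in csv.reader(lines):
        if len(row) != 3:
            continue  # skip blank or malformed rows
        term, title, confidence = row
        yield CycLink(term, title, int(confidence))


# Illustrative rows only; they are NOT taken from wikipedia_links.csv.
SAMPLE = '"Dog","Dog","999"\n"Cat","Felis_catus","500"\n'

links = list(read_links(io.StringIO(SAMPLE)))
best = [link for link in links if link.confidence == 999]
print(best)
```

For the real file, an open file handle (for example `open("wikipedia_links.csv", newline="")`) can be passed to `read_links` in place of the in-memory sample.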

From the relations with confidence 999, we generated a DBpedia dataset of owl:sameAs links between Cyc and DBpedia using the following small awk script:

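# Needs an awk that accepts a multi-character RS (e.g. gawk).
awk -F'","' -v RS='"\n"' '$3 + 0 == 999 {
      sub(/^"/, "", $1)   # the first record still carries the opening quote
      print "<http://dbpedia.org/resource/" $2 "> <http://www.w3.org/2002/07/owl#sameAs> <http://sw.cyc.com/2006/07/27/cyc/" $1 "> ."
}' wikipedia_links.csv > links_cyc.nt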

The resulting DBpedia dataset can be downloaded from: http://wiki.dbpedia.org/files/links_cyc.nt.bz2
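As a rough illustration of how the resulting links might be consumed, the following Python sketch extracts (DBpedia URI, Cyc URI) pairs from N-Triples lines. It assumes each line is a simple triple of three URIs, as produced by the awk script in this post; it is not a general N-Triples parser. The function name `parse_same_as` and the sample line are hypothetical.

```python
import re
from typing import Iterable, List, Tuple

SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Matches "<subject> <predicate> <object> ." where all three terms are URIs.
TRIPLE = re.compile(r"^<([^>]+)>\s+<([^>]+)>\s+<([^>]+)>\s*\.\s*$")


def parse_same_as(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (dbpedia_uri, cyc_uri) pairs for every owl:sameAs triple."""
    pairs = []
    for line in lines:
        match = TRIPLE.match(line)
        if match and match.group(2) == SAME_AS:
            pairs.append((match.group(1), match.group(3)))
    return pairs


# Illustrative line only; it is not copied from links_cyc.nt.
sample = [
    "<http://dbpedia.org/resource/Dog> "
    "<http://www.w3.org/2002/07/owl#sameAs> "
    "<http://sw.cyc.com/2006/07/27/cyc/Dog> . \n",
    "this line is not a triple and is ignored\n",
]

pairs = parse_same_as(sample)
print(pairs)
```

For the compressed download, `bz2.open("links_cyc.nt.bz2", "rt", encoding="utf-8")` from the Python standard library returns a line iterator that can be passed to `parse_same_as` directly.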

More information about OpenCyc can be found at: http://www.opencyc.org