Encoding scheme of DBpedia URIs


The encoding scheme of DBpedia URIs follows the Wikipedia title encoding scheme as closely as possible:


  • The alphanumeric characters “a” through “z”, “A” through “Z” and “0” through “9” remain the same.
  • The following special characters remain the same:
    • . (dot)
    • - (dash)
    • * (star)
    • / (slash)
    • : (colon)
    • _ (underscore)
    • , (comma)
    • & (ampersand)
  • The space character ' ' is converted into an underscore character '_'.
    • Multiple underscores are collapsed into one.
  • All other characters are first converted into one or more bytes using UTF-8 encoding. Then each byte is represented by the 3-character string "%xy", where xy is the two-digit hexadecimal representation of the byte.
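
The rules above can be sketched in Python as follows. This is a minimal illustration, not the DBpedia extraction code: the function name encode_dbpedia_title and the constant SAFE_SPECIALS are hypothetical, uppercase hexadecimal digits are an assumption (the rules only say "two-digit hexadecimal"), and the sketch assumes that underscores already present in the title are collapsed together with those produced from spaces.

```python
import re

# Special characters that the rules above leave unchanged.
SAFE_SPECIALS = ".-*/:_,&"


def encode_dbpedia_title(title: str) -> str:
    """Encode a page title following the rules listed above (sketch)."""
    # Spaces become underscores; runs of underscores collapse into one.
    # Assumption: pre-existing underscores take part in the collapsing.
    text = re.sub(r"_+", "_", title.replace(" ", "_"))

    encoded = []
    for ch in text:
        # ASCII letters and digits, plus the listed specials, stay as-is.
        if (ch.isascii() and ch.isalnum()) or ch in SAFE_SPECIALS:
            encoded.append(ch)
        else:
            # Everything else: UTF-8 bytes, each written as %XY.
            # Assumption: hexadecimal digits are uppercase.
            encoded.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(encoded)
```

Under these assumptions, encode_dbpedia_title("Foo (bar)") gives "Foo_%28bar%29", and a non-ASCII character such as "ã" is written as its two UTF-8 bytes, "%C3%A3".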

Please note that the encoding scheme might change as part of the internationalization efforts.


 

Information

Last Modification: 2011-08-12 13:40:56 by Max Jakob