Description
Background
There are two ways of declaring namespace prefixes in RDF 1.1 Turtle:
(a) @prefix bla: <...> .
(as in RDF 1.0, dot at the end!)
(b) PREFIX bla: <...>
(introduced in RDF 1.1, no dot at the end!)
At the moment, CoNLLRDFUpdater seems to support (a) only. This is not a problem as long as it only processes data produced by CoNLLStreamExtractor or Apache Jena, but it becomes one when it processes CoNLL-RDF data produced by other converters.
Action
Support syntax (b) in CoNLLRDFUpdater.
Note that the same difference in syntax also applies to @base and BASE, so these need to be updated as well. (tbc: Are these currently included in prefix preprocessing?)
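The rewrite from syntax (b) to syntax (a) can be illustrated with a minimal, line-based sketch in Python. It is not taken from the CoNLLRDFUpdater code base; the names normalize_directive and normalize_lines are hypothetical, and the sketch assumes that every directive sits on a line of its own and never occurs inside a multi-line string literal. Per the Turtle 1.1 grammar, the keywords PREFIX and BASE are matched case-insensitively, while @prefix and @base lines are passed through untouched.

```python
import re

# SPARQL-style directives (RDF 1.1 Turtle): case-insensitive keyword, no trailing dot.
# Groups: indentation, (prefix name,) IRI reference, optional trailing comment.
_SPARQL_PREFIX = re.compile(
    r'^(\s*)PREFIX\s+([^\s:<>]*:)\s*(<[^<>\s]*>)\s*(#.*)?$', re.IGNORECASE)
_SPARQL_BASE = re.compile(
    r'^(\s*)BASE\s+(<[^<>\s]*>)\s*(#.*)?$', re.IGNORECASE)


def normalize_directive(line: str) -> str:
    """Rewrite one PREFIX/BASE line to @prefix/@base; return any other line unchanged."""
    body = line.rstrip("\r\n")
    eol = line[len(body):]  # keep the original line ending
    m = _SPARQL_PREFIX.match(body)
    if m:
        indent, name, iri, comment = m.groups()
        out = f"{indent}@prefix {name} {iri} ."
    else:
        m = _SPARQL_BASE.match(body)
        if not m:
            return line  # triples, comments, empty lines, @prefix/@base
        indent, iri, comment = m.groups()
        out = f"{indent}@base {iri} ."
    if comment:
        out += " " + comment
    return out + eol


def normalize_lines(lines):
    """Apply normalize_directive to each line, preserving order and all other lines."""
    for line in lines:
        yield normalize_directive(line)


sample = [
    "PREFIX bla: <http://example.org/>\n",
    "# a comment that must survive\n",
    "\n",
    "bla:s1_1 a bla:Word .\n",
]
print("".join(normalize_lines(sample)), end="")
```

Because comments and empty lines pass through unchanged, a filter of this kind would not interfere with the way CoNLLRDFUpdater splits its input. Whether the actual fix should normalize the input like this or extend the existing prefix handling directly is left open here.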
Preliminary workaround
Until this is solved, it is possible to convert all input data to nt notation before processing it, e.g., using rapper:
$> cat my-file.ttl | rapper -i turtle - '#' | run.sh CoNLLRDFUpdater ...
However, note that rapper will emit triples only, without preserving comments or spaces, so CoNLLRDFUpdater will not split the input. Also note that the simple RDF 1.1 to RDF 1.0 conversion that rapper provides with -o turtle will not work with CoNLLRDFUpdater because it will insert empty lines between groups of triples with the same subject.