
RDF 1.1 Turtle prefixes in CoNLLRDFUpdater #80

Open
@chiarcos

Background

There are two ways of declaring namespace prefixes in RDF 1.1 Turtle:
(a) @prefix bla: <...> . (as in RDF 1.0, dot at the end!)
(b) PREFIX bla: <...> (introduced in RDF 1.1, no dot at the end!)

At the moment, CoNLLRDFUpdater seems to support only (a). This is not a problem as long as it processes data produced by CoNLLStreamExtractor or Apache Jena, but it becomes one when it processes CoNLL-RDF data produced by other converters.

Action

Support syntax (b) in CoNLLRDFUpdater.
Note that the same difference in syntax also applies to @base and BASE, so these need to be handled as well. (tbc: Are these currently included in prefix preprocessing?)

Preliminary workaround

Until this is solved, it is possible to convert all input data to N-Triples before processing it, e.g., using rapper (reading from stdin via - and passing '#' as the base URI):

 $> cat my-file.ttl | rapper -i turtle - '#' | run.sh CoNLLRDFUpdater ...

Note, however, that rapper emits triples only and preserves neither comments nor whitespace, so CoNLLRDFUpdater will not split the input. Also note that the simple RDF 1.1 to RDF 1.0 conversion that rapper provides with -o turtle will not work with CoNLLRDFUpdater, because it inserts empty lines between groups of triples that share a subject.
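
As a possible alternative to the rapper workaround, a line-based filter could rewrite syntax (b) into syntax (a) while leaving comments and blank lines untouched. The following is only a minimal sketch, not tested against CoNLLRDFUpdater: the function name `normalize_directive` is hypothetical, and it assumes that each directive sits alone on its own line, without a trailing comment, and that no multi-line string literal contains a line that looks like a directive.

```python
import re

# SPARQL-style directives (RDF 1.1 Turtle): keyword is case-insensitive, no final dot.
# Assumption: one directive per line, nothing else on that line.
PREFIX_RE = re.compile(r'^(\s*)PREFIX\s+([^\s:]*:)\s*(<[^>]*>)\s*$', re.IGNORECASE)
BASE_RE = re.compile(r'^(\s*)BASE\s+(<[^>]*>)\s*$', re.IGNORECASE)


def normalize_directive(line: str) -> str:
    """Rewrite a PREFIX/BASE line as @prefix/@base; return any other line unchanged."""
    body = line.rstrip('\r\n')
    eol = line[len(body):]  # keep the original line ending, if any
    m = PREFIX_RE.match(body)
    if m:
        return f'{m.group(1)}@prefix {m.group(2)} {m.group(3)} .{eol}'
    m = BASE_RE.match(body)
    if m:
        return f'{m.group(1)}@base {m.group(2)} .{eol}'
    # Lines already using @prefix/@base start with "@" and fall through here,
    # as do comments, blank lines, and triples.
    return line
```

To use this as a filter in front of CoNLLRDFUpdater, one would apply `normalize_directive` to each line read from `sys.stdin` and write the result to `sys.stdout`. Directives that carry a trailing comment or span several lines are passed through unchanged, so this is not a substitute for proper parser support.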
