Need to improve GIS/gazetteer encoding support in TEI #1474

Open
@martindholmes

The support for GIS in the TEI Guidelines is rudimentary at best, and there is an increasing need for TEI files to incorporate GIS information in a way that allows interchange; this means clarifying how geographic information encoded in TEI relates to other geographic encoding systems. The <geoDecl> element definition is long outdated, and its description in the Guidelines is cursory:

http://www.tei-c.org/release/doc/tei-p5-doc/en/html/HD.html#HDGDECL

We need to be able to do much more than specify "a commonly used code name for the datum employed". Nowadays, I think we need to be able to:

  • Encode location (feature) geometries, such as (using GeoJSON terminology for example purposes):
    • Point
    • LineString
    • Polygon
    • MultiPoint
    • MultiLineString
    • MultiPolygon
    • GeometryCollection
  • Specify how these geometries have been encoded (GeoJSON, WKT, GML, KML, etc.)
  • Specify the Coordinate Reference System (CRS, also known as SRS, or Spatial Reference System) used. For this I believe we should use a data.pointer attribute called @crs pointing to OGC CRS URNs, just as GeoJSON does. The most commonly used URNs should be provided as suggested values for the attribute.

We need to provide recommended ways of encoding this information using the <location> and <geo> elements. I believe we should also enable default values for them to be encoded at higher levels, such as the <place> and <listPlace> elements; descendants would inherit values from their ancestors, so that (for instance) a CRS defined on a <listPlace> would be deemed to apply to any descendant <place>, <location> and <geo> elements unless overridden. This suggests an attribute class, perhaps att.geo. We could retain the use of @decls to point to <geoDecl> elements in the header too; the old geoDecl/@datum could be retained for backwards compatibility, but the new attributes suggested below could be added.
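
As a rough illustration of the inheritance rule described above, the following Python sketch resolves the effective CRS for each <geo> by walking up to the nearest ancestor that carries @crs. The @crs attribute is the one proposed in this ticket, not an existing TEI attribute; the sample document, its coordinates, and the helper name effective_attr are hypothetical.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical sample: @crs is the attribute proposed in this ticket,
# set as a default on <listPlace> and overridden on the second <place>.
SAMPLE = """
<listPlace xmlns="http://www.tei-c.org/ns/1.0" crs="urn:ogc:def:crs:OGC:1.3:CRS84">
  <place xml:id="p1">
    <location><geo>[-121.99, 49.05]</geo></location>
  </place>
  <place xml:id="p2" crs="urn:ogc:def:crs:EPSG::3857">
    <location><geo>[-13580000, 6282000]</geo></location>
  </place>
</listPlace>
"""


def effective_attr(elem, name, parents):
    """Return the value of `name` on elem or its nearest ancestor, else None."""
    node = elem
    while node is not None:
        if name in node.attrib:
            return node.attrib[name]
        node = parents.get(node)
    return None


root = ET.fromstring(SAMPLE)
# ElementTree has no parent pointers, so build a child -> parent map once.
parents = {child: parent for parent in root.iter() for child in parent}

geos = root.findall(f".//{{{TEI_NS}}}geo")
resolved = [effective_attr(geo, "crs", parents) for geo in geos]
for geo, crs in zip(geos, resolved):
    print(geo.text, "->", crs)
```

The same lookup would serve for @geoEncoding or @featureType if those were also allowed at higher levels.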

The Guidelines provide one example of the use of a specific encoding standard (GML) for a feature:

<place xml:id="locLyon" type="city">
 <placeName notBefore="1400">Lyon</placeName>
 <placeName notAfter="0056">Lugdunum</placeName>
 <location>
  <geo>
   <gml:Polygon>
    <gml:exterior>
     <gml:LinearRing> 45.256 -110.45 46.46 -109.48 43.84 -109.86 45.8 -109.2
           45.256 -110.45 </gml:LinearRing>
    </gml:exterior>
   </gml:Polygon>
  </geo>
 </location>
</place>

But other than the GML elements appearing in their own namespace, nothing explicitly states that this is GML. The GML itself also appears to be incomplete: unlike GeoJSON, which defaults to WGS 84 (EPSG:4326), GML has no default CRS, so it's not clear what this encoding actually means; at the very least, @srsName should appear on <gml:Polygon> (if my understanding of GML is correct).
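
A minimal sketch of how a processor might flag this gap: it lists GML geometries directly inside <geo> that carry no @srsName. The GML namespace URI used here and the function name gml_children_missing_srs are assumptions for illustration (the Guidelines example does not show a namespace declaration), and the sample is a trimmed copy of the polygon above.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI; GML 3.2 documents use a different one.
GML_NS = "http://www.opengis.net/gml"

SAMPLE = """
<geo xmlns:gml="http://www.opengis.net/gml">
  <gml:Polygon>
    <gml:exterior>
      <gml:LinearRing> 45.256 -110.45 46.46 -109.48 43.84 -109.86 45.8 -109.2
            45.256 -110.45 </gml:LinearRing>
    </gml:exterior>
  </gml:Polygon>
</geo>
"""


def gml_children_missing_srs(geo):
    """Return local names of GML child elements of <geo> that lack @srsName."""
    prefix = f"{{{GML_NS}}}"
    return [
        child.tag[len(prefix):]
        for child in geo
        if child.tag.startswith(prefix) and child.get("srsName") is None
    ]


missing = gml_children_missing_srs(ET.fromstring(SAMPLE))
print(missing)
```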

I would like to see a system where, through the addition of att.geo, we are able to do something like this:

<place xml:id="place01">
  <placeName>My house</placeName>
  <desc>The house where I live</desc>
  <location crs="urn:ogc:def:crs:OGC:1.3:CRS84" featureType="Point" geoEncoding="GeoJSON">
    <geo>[-121.99939727783202,49.05722008695]</geo>
  </location>
</place>

The attributes @crs and @geoEncoding would be available on <listPlace>, <place>, <location> and <geo>, while @featureType might be more constrained (or perhaps not; if a gazetteer contained only points, why not specify this at the <listPlace> level?). Values for @featureType and @geoEncoding would be drawn from curated lists of suggested values, based on a survey of what is widely used and likely to be stable; we would want to support (for instance) all the GeoJSON feature types, as well as types such as LinearRing, used explicitly in OpenLayers and KML, and perhaps Track and MultiTrack, used in OGC KML.

The overall objective would be to provide encodings which are capable of unambiguous interpretation and transformation to and from standard GIS file formats.
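
To make the transformation goal concrete, here is a hedged Python sketch that turns the proposed "My house" encoding into a GeoJSON Feature. It assumes the proposed @crs, @featureType and @geoEncoding attributes sit on <location>, as in the example above, and that the content of <geo> is a JSON coordinate array; the function name place_to_feature and the property names "name" and "description" are hypothetical choices, and the sample omits the TEI namespace just as the example does.

```python
import json
import xml.etree.ElementTree as ET

# The proposed example from this ticket, without a TEI namespace declaration.
SAMPLE = """
<place xml:id="place01">
  <placeName>My house</placeName>
  <desc>The house where I live</desc>
  <location crs="urn:ogc:def:crs:OGC:1.3:CRS84" featureType="Point" geoEncoding="GeoJSON">
    <geo>[-121.99939727783202,49.05722008695]</geo>
  </location>
</place>
"""

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
CRS84 = "urn:ogc:def:crs:OGC:1.3:CRS84"


def place_to_feature(place):
    """Build a GeoJSON Feature dict from a <place> using the proposed attributes."""
    location = place.find("location")
    if location.get("geoEncoding") != "GeoJSON":
        raise ValueError("this sketch only handles GeoJSON-encoded <geo> content")
    if location.get("crs") != CRS84:
        # Other CRSs would need reprojection, which is out of scope here.
        raise ValueError("this sketch only handles CRS84 (WGS 84 longitude, latitude)")
    return {
        "type": "Feature",
        "id": place.get(XML_ID),
        "geometry": {
            "type": location.get("featureType"),
            "coordinates": json.loads(location.findtext("geo")),
        },
        "properties": {
            "name": place.findtext("placeName"),
            "description": place.findtext("desc"),
        },
    }


feature = place_to_feature(ET.fromstring(SAMPLE))
print(json.dumps(feature, indent=2))
```

Because all three pieces of information (CRS, geometry type, encoding) are stated explicitly, the conversion needs no guessing; a real converter would additionally resolve inherited values and reproject non-CRS84 coordinates.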

I would recommend starting a small TEI SIG to develop this work, with the objective of creating pilot ODDs and a small Guidelines chapter where this can be described in detail outside of the Names and Dates context.
