Better Translation Technology
Andrzej Zydroń, CTO, XTM International
DITA Localization
In the beginning
Technical documentation was without form, and darkness was upon the face of the page:
– Manual typesetting
– RTF
– WordPerfect
– MS Word
– FrameMaker
– Ventura Publisher
– PageMaker
– SGML
In the beginning
Lack of standards
•Proprietary solutions
•Problems with character encoding
•Expensive to design
•Expensive to build
•Expensive to maintain
•Expensive to localize
Along came XML
Let there be light:
– XML grew out of SGML/HTML; XML 1.0 became a W3C Recommendation in 1998
– Review of lessons learned from SGML
– Easier to implement
– Removed unnecessary complexity
– Declared a standard encoding: Unicode
DITA
Standards, Standards, Standards
DITA:
The advent of standards in technical documentation
DITA is not perfect!
DITA - the good
Extremely well thought out XML document architecture:
– modularity
– fine level of granularity
– reuse
– bookmap
– standardized elements
– Write once, translate once, reuse many times
– Multiple output formats, multiple places, multiple docs:
• PDF, HTML, mobile, web, paper etc.
DITA Localization
Practical considerations:
– Controlled Authoring:
• Consistency
• Terminology
– Delivery for localization:
• All at once in one big heap
• JIT - individual topics when ready
– Translation Consistency:
• Translation Memory
• Terminology
DITA Localization - the good
Modularity:
– Translate a topic once
– Reuse many times!
• No need to retranslate
– Just in time translation
• Translate as soon as source is ready
• Dramatic improvement in time to market
• All documentation in all languages is ready concurrently
DITA Localization - the good
• Decide how you want to translate:
– Whole document as one using bookmap
– Individual topics navigated according to bookmap
– Individual topics as and when ready
• Handling last minute engineering changes
– JIT translation
– Many TMS products are not good at handling this
– Automatically update already-translated segments
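
Handling a last-minute change can be pictured as a lookup against segments that are already translated. The following is a minimal sketch, assuming exact-match reuse only; the function name `split_for_translation`, the sample segments, and the placeholder target strings are hypothetical and do not represent any particular TMS.

```python
def split_for_translation(segments, memory):
    """Reuse stored translations; return what still needs a translator.

    segments: source segments of the updated topic, in order.
    memory:   dict mapping source segment -> existing translation.
    """
    reused = {}
    pending = []
    for segment in segments:
        if segment in memory:
            # Unchanged segment: reuse the stored translation as-is.
            reused[segment] = memory[segment]
        else:
            # New or edited segment: only this goes out for translation.
            pending.append(segment)
    return reused, pending


# Hypothetical data: placeholder targets stand in for real translations.
memory = {
    "Remove the cover.": "<target 1>",
    "Tighten the four bolts.": "<target 2>",
}
updated_topic = [
    "Remove the cover.",
    "Tighten the six bolts.",  # engineering change: four -> six
]

reused, pending = split_for_translation(updated_topic, memory)
print(pending)
```

Only the edited sentence ends up in `pending`; the untouched sentence keeps its existing translation. A real system would typically add fuzzy matching on top of this.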
DITA Localization - the <bad/><ugly/>
The bad and downright ugly (the three villains!):
– Word Substitution
• CONREF
• KEYREF
• DITAVAL
– Specialization
– Conditional processing
DITA: square peg, round hole
• Do not try to force DITA to do what it was not designed for!
• DITA = Modular technical documentation
• Small, discrete topics
• No more than one page of text per topic
• Use the Open Toolkit
• Do not get overambitious with substitutions
– What works for English and Mandarin will not work for other languages
DITA: Object Oriented Documentation
• DITA is an attempt to use OO design for XML documentation
• Very tempting for computer scientists
• We did it for computer programming
• Why not documentation?
• Problems arise with the nature of documentation
• Problems arise with the nature of human language
Language – why humans mess things up!
What language is this?
What is he saying?
Understanding the nature of English
• Why is English different from most other languages?
• English is a fusion language: a creole
– 60% Old Chaucerian English + 40% French
• Other Creoles with a high number of speakers:
– French (Vulgar Latin + Frankish)
– Swahili (Bantu + Arabic)
– Urdu (Hindi + Arabic)
– Mandarin
• (Many Sino-Tibetan languages)
Understanding the nature of English
• Primitive morphology
– Nouns:
• Singular, plural, possessive
– ship, ships, ship’s, ships’
– No Gender
• a ship, the ship, the ships
– No adjectival agreement
• green ship, green ships
• We can substitute nouns and noun phrases without causing grammatical errors
• This is not true of most other languages
• English does not work like most other languages
• Your documentation WILL be translated sooner or later
DITA Localization
Avoid word substitution (CONREF, KEYREF, DITAVAL):
– Linguistic issues
– Adjectival agreement
– Grammatical case
• Presenting the new Ford <keyword keyref="model"/> for 2014.
– very bad idea!
• Focus, Fiesta, Mondeo
• Nowy Focus, Nowa Fiesta, Nowe Mondeo
• Akin to saying ‘Presenting the Ford new Focus’
• Nowym Focus’em, Nową Fiestą, Nowym Mondeo
– May work for alphanumeric words
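
The agreement problem can be shown with the Polish forms quoted above. This is a rough sketch, assuming the translator delivers one translated phrase and the model name is dropped in afterwards; the names `PL_NEW`, `naive_polish_phrase`, and `mismatches` are illustrative only.

```python
MODELS = ["Focus", "Fiesta", "Mondeo"]

EN_TEMPLATE = "Presenting the new Ford {model} for 2014."

# Correct Polish adjective forms, taken from the slide:
# the adjective must agree with the gender of each model name.
PL_NEW = {"Focus": "Nowy", "Fiesta": "Nowa", "Mondeo": "Nowe"}


def substitute(template, model):
    """Plain keyref-style substitution: drop the word into the sentence."""
    return template.format(model=model)


def naive_polish_phrase(model, adjective="Nowy"):
    """One translated adjective reused for every substituted model name."""
    return f"{adjective} {model}"


def mismatches(models):
    """Models for which the single translated phrase is grammatically wrong."""
    return [m for m in models if naive_polish_phrase(m) != f"{PL_NEW[m]} {m}"]


# English never notices: the adjective "new" has a single form.
for model in MODELS:
    print(substitute(EN_TEMPLATE, model))

# Polish does: two of the three substitutions need a different adjective.
print(mismatches(MODELS))
```

The sketch covers only gender agreement in one case. The instrumental forms on the slide (Nowym Focus'em, Nową Fiestą) show that the substituted word itself may also need to change.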
DITA Localization
Only use substitution for linguistically complete sentences
– Warnings
– Cautions
– Notes
Avoid substitution for individual words or noun phrases
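
One way to apply this rule is to scan topics for references that sit in the middle of running text. The sketch below is a heuristic, not a DITA validator: it assumes a substitution is risky when its parent element also carries text, and the topic content, file name `warnings.dita`, and element ids are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical topic: one inline keyref and one whole-note conref.
TOPIC = """<topic id="launch">
  <title>Launch</title>
  <body>
    <p>Presenting the new Ford <keyword keyref="model"/> for 2014.</p>
    <note conref="warnings.dita#warnings/hot-surface"/>
  </body>
</topic>"""


def has_mixed_text(parent):
    """True if the element holds text alongside its child elements."""
    if (parent.text or "").strip():
        return True
    return any((child.tail or "").strip() for child in parent)


def inline_substitutions(xml_text):
    """List (tag, reference) pairs for references embedded in a sentence."""
    root = ET.fromstring(xml_text)
    hits = []
    for parent in root.iter():
        mixed = has_mixed_text(parent)
        for child in parent:
            ref = child.get("keyref") or child.get("conref")
            if ref and mixed:
                hits.append((child.tag, ref))
    return hits


print(inline_substitutions(TOPIC))
```

The inline `keyword` is reported because it is surrounded by sentence text; the `note` pulled in as a complete unit is not.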
Specialization
• Specialize at your peril!
– A double edged sword
• Exponentially increases the difficulty of:
– Authoring
– Publishing
– Localization
• New elements/attributes
– How are they to be treated?
– For localization: effectively a completely new document type
DITA and OAXAL
• OAXAL - Open Architecture for XML Authoring and Localization
• DITA Authoring and Localization in a Standards context:
– DITA is an Open Standard
– Why use proprietary software for Authoring and Localization of DITA?
OAXAL
http://wiki.oasis-open.org/oaxal/FrontPage
OAXAL Stack
OAXAL Interaction
OAXAL Source Lifecycle
OAXAL Translation Lifecycle
DITA Localization - considerations
• Choosing the right TMS/CAT System
– Can it handle XML properly?
• Entity references e.g. ‘&amp;’
• Encoding
• Validation
– Does it understand DITA?
– Does it understand ditamap/bookmap?
– Can you navigate using the bookmap?
– Can it handle specialization?
– Does it handle JIT?
– Can it handle last-minute changes?
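
A quick way to probe the "handles XML properly" question is a round trip through a real XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`; upper-casing stands in for translation, and the helper names `is_well_formed` and `pseudo_translate` are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def pseudo_translate(xml_text):
    """Parse, change the text (upper-case as a stand-in), and serialize."""
    elem = ET.fromstring(xml_text)
    # After parsing, the entity reference is a plain '&' in elem.text.
    elem.text = (elem.text or "").upper()
    # Serializing escapes it again, so the output stays well-formed.
    return ET.tostring(elem, encoding="unicode")


source = "<p>Research &amp; Development</p>"
print(pseudo_translate(source))

# A tool that writes the raw character back produces broken XML.
print(is_well_formed("<p>Research & Development</p>"))
```

The first call keeps `&amp;` intact in the output; the second prints `False`, which is the failure to watch for when evaluating a tool.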
How to reduce your translation costs
• Write less!
– Ford of Europe reduced translation costs by 50% in 2005
– It costs as much to translate into one language as it does to write the original
• Use more graphics
– Integrate with CAD/CAM systems
– But beware text in graphics – use callouts
• People may actually start using your documentation
• KISS
• Manage your own translation assets: e.g. invest in your own TMS
– Save an additional 20% on cost, on average, and 50% on turnaround time
Less is More
Contact Details
• Postal address:
– PO Box 2167
– Gerrards Cross
– Bucks SL9 8XF
– United Kingdom
• Phone: +44 1753 480 467
• Fax: +44 1753 480 465
• Andrzej Zydroń – azydron@xtm-intl.com
