The XML Localisation Interchange File Format

Post on 17-Jan-2016

Page 1: The XML Localisation Interchange File Format

An Introduction to XLIFF

Tony Jewtushenko
Oracle Corporation - Principal Product Manager

Chair – OASIS XLIFF TC

The XML Localisation Interchange File Format

Slide 2

Agenda

• Overview of XLIFF
– Definition, goals, and benefits of XLIFF
– Brief history of XLIFF

• Architecture
– Main features of XLIFF

• The Real World
– Use cases and tools support for XLIFF

• Current State of Affairs
– Post XLIFF 1.1: what’s next…

Slide 3

XLIFF Overview

A glance at the definitions, goals and benefits of the XML Localisation Interchange File Format.

Slide 4

What is XLIFF?

A specification

for the lossless interchange of localizable data and its related information,

which is tool-neutral,

has been formalized as an XML vocabulary,

and features an extensibility mechanism.

Slide 5

XLIFF TC’s Charter

“The purpose of the OASIS XLIFF TC is to define, through XML vocabularies, an extensible specification for the interchange of localization information. The specification will provide the ability to mark up and capture localizable data and interoperate with different processes or phases without loss of information. The vocabularies will be tool-neutral, support the localization-related aspects of internationalization and the entire localization process. The vocabularies will support common software and content data formats. The specification will provide an extensibility mechanism to allow the development of tools compatible with an implementer's own proprietary data formats and workflow requirements.”

Slide 6

Why is XLIFF Needed?

Localization poses the following challenges:

• Insufficient interoperability between tools.

• Lack of support for the overall localization workflow.

• Localization tool developers have to deal with many formats.

• Large number of proprietary intermediate formats.

Slide 7

Advantages – Localization Customer

• Single format for adjunct processing (e.g. quality control in terms of spell checking).

• Less dependence on vendors that are able to work with special formats.

• Tighter control over what goes to localization (pre-filtering of what to translate or not).

• Controlled information flow (author/developer notes, item properties, etc.).

• ID-based leveraging.

• All advantages of XML-based processing.

Slide 8

Advantages – Tools Vendor

• Focus on development of core functionality rather than treatment of source format.

• Allow usage of tools in new contexts.

• All advantages of XML-based processing.

Slide 9

Advantages – Service Provider

• Single format for adjunct processing (e.g. quality control in terms of spell checking).

• Less dependency on specific localization tools.

• Controlled information flow (author/developer notes, item properties, etc.).

• Allow usage of tools in new contexts.

• All advantages of XML-based processing.

• Open and standard solution for proprietary formats.

Slide 10

Advantages – Technology (1/2)

• For a given utility, only one implementation is necessary (e.g. not one spell checker for RTF, and another one for HTML).

• Increases usability of utilities (i.e. all formats with XLIFF filters can be used with XLIFF-enabled utilities).

Slide 11

Advantages – Technology (2/2)

• All advantages of XML-based processing:
– Use of its internationalization features.
– Better interoperability and cross-platform support.
– Powerful rendering options (XSL-FO, CSS).
– Powerful transformation options (XSLT).
– Greater integration with Web services.

• Access to existing, and often open-source, XML implementations (lower costs).

Slide 12

Genesis of XLIFF

• Founded: Sept 2000

• Founding Members: Novell, Oracle and Sun

• Initially named “DataDefinition” group

Slide 13

XLIFF Timeline

• September 2000 - DataDefinition Kickoff

• December 2000 - first face to face

• March 2001 - second face to face

• End March 2001 - draft 1.0 spec and DTD published

• June 2001 - White Paper published

• December 2001 - OASIS XLIFF Technical Committee Proposal submitted

• April 2002 – XLIFF 1.0 Specification approved by formal vote as an OASIS Committee Specification

• May 2003 – XLIFF 1.1 Specification approved by formal vote as an OASIS Committee Specification

• August/Sept 2003 – XLIFF 1.1 Peer Review

• November 2003 – Revised XLIFF 1.1 Specification approved as OASIS Committee Specification

• November 2003 – XLIFF 1.1 Specification submitted for public review

Slide 14

OASIS: Standards Body Home of XLIFF

• OASIS: Organization for the Advancement of Structured Information Standards

• World’s largest independent, non-profit organization dedicated to the standardisation of XML applications and Web Services

• More than 150 member companies plus individuals

• Operates XML.ORG Registry, the open community clearinghouse of XML application schemas

• Technical work on XML interoperability includes XML conformance and XML Registries/Repositories

• General XML technical resource

Slide 15

Drivers Behind XLIFF

• Alchemy Software
• Bowne Global Solutions
• Convey Software
• Ektron, Inc
• ENLASO Corp (RWS)
• Globalsight
• HP
• Lotus/IBM
• Lionbridge
• LRC
• Moravia IT
• Novell
• Oracle
• PASS Engineering
• Microsoft
• SAP
• SDL International
• Sun Microsystems
• Tektronix
• TRADOS

Slide 16

Present OASIS XLIFF TC

• TC Officers:
– TC Chair: Tony Jewtushenko, Oracle Corporation
– TC Secretary: Peter Reynolds, Bowne Global Solutions
– TC Editor: Yves Savourel

• Current Members of TC:
– Mat Lovatt, Oracle
– Enda McDonnell, Individual
– Eiju Akahane, IBM
– Gerard Cattin des Bois, Microsoft Corporation
– Doug Domeny, Individual
– Milan Karasek, Moravia IT
– Christian Lieske, SAP
– David Pooley, SDL International
– John Reid, Novell
– Reinhard Schaler, Limerick Localisation Research Centre
– Bryan Schnabel, Individual
– Shigemichi Yazawa, Individual
– Andrzej Zydron, Individual
– Magnus Martikainen, TRADOS Inc.
– Florian Sachse, Individual

Slide 17

XLIFF TC in the Community

• Shared interests with OASIS Translation Web Services Technical Committee
– XLIFF may be used as data container for WS

• Shared interests with the OSCAR SIG at LISA
– Segmentation and word-count.
– Content markup (inline codes).

• Shared interests with the W3C i18n WG
– Localization directives.
– Best practices.
– Localization aspects of the W3C recommendations.
– Web services.

Slide 18

Architecture

A look at XLIFF’s main features and how they work together.

Slide 19

Extract-Localize-Merge Paradigm

• Separate data related to localization from parts not related to localization.

• Merge translated data with codes at the end of the process to create the final document.

• Skeleton file is optional, so this paradigm is also optional
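
As a rough illustration of the paradigm, the sketch below assumes a hypothetical HTML page, page.htm, whose non-translatable markup has been set aside in a hypothetical skeleton file, page.htm.skl. The file names, ids, and text are assumptions made for the example, not taken from the presentation.

```xml
<xliff version='1.1' xmlns='urn:oasis:names:tc:xliff:document:1.1'>
  <file original='page.htm' source-language='en' target-language='fr'
        datatype='html'>
    <header>
      <!-- Optional skeleton: the non-localizable parts of page.htm,
           kept outside the XLIFF document (hypothetical file name) -->
      <skl>
        <external-file href='page.htm.skl'/>
      </skl>
    </header>
    <body>
      <!-- Extracted, localizable text only -->
      <trans-unit id='1'>
        <source>Welcome to our site</source>
        <target>Bienvenue sur notre site</target>
      </trans-unit>
    </body>
  </file>
</xliff>
```

At merge time, a filter would combine the skeleton with the <target> contents to rebuild the translated page.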

Slide 20

A Bird's-Eye View

An XLIFF document can capture anything needed for a localization project:

1. Localizable objects (e.g. text strings) in source and target languages.

2. Supplementary information (e.g. glossaries, or material to recreate the original format).

3. Administrative information (e.g. workflow data).

4. Custom data (e.g. initialization information for tools).

Slide 21

The XLIFF Document

• An XLIFF document is designed to store the extracted data related to localization.

• Each given source container (e.g. a file, a database table, and so forth) corresponds to a <file> element in XLIFF.

• Each XLIFF document can include several <file> elements.

• A whole localization project can possibly be stored in a single XLIFF document.
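
A minimal sketch of one XLIFF document holding two source containers, assuming a hypothetical Windows resource file and a hypothetical Java properties file; the file names and strings are made up for illustration.

```xml
<xliff version='1.1' xmlns='urn:oasis:names:tc:xliff:document:1.1'>
  <!-- One <file> element per source container -->
  <file original='menu.rc' source-language='en' datatype='winres'>
    <body>
      <trans-unit id='1'>
        <source>Open</source>
      </trans-unit>
    </body>
  </file>
  <file original='messages.properties' source-language='en'
        datatype='javapropertyresourcebundle'>
    <body>
      <trans-unit id='1'>
        <source>File not found</source>
      </trans-unit>
    </body>
  </file>
</xliff>
```

Each <file> carries its own original and datatype attributes, so a tool can tell which source container every unit came from.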

Slide 22

Bilingual Model

• Each <file> element is designed to store one source language and one target language.

• The rationale is that translation into different target languages is done by different people most of the time.

• However, the languages in <alt-trans> elements can be different from the target language. For example, proposed matches in Portuguese from Portugal when translating into Brazilian Portuguese.
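
The bilingual model might look like the following sketch, which assumes a hypothetical help page translated from English into Brazilian Portuguese, with one proposed match taken from a Portuguese (Portugal) translation. File name and strings are illustrative only.

```xml
<file original='help.htm' source-language='en' target-language='pt-BR'
      datatype='html'>
  <body>
    <trans-unit id='1'>
      <source xml:lang='en'>Save the file</source>
      <target xml:lang='pt-BR'>Salve o arquivo</target>
      <!-- Proposed match in a different language variant -->
      <alt-trans>
        <target xml:lang='pt-PT'>Guarde o ficheiro</target>
      </alt-trans>
    </trans-unit>
  </body>
</file>
```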

Slide 23

Localizable Objects

• XLIFF allows not only text strings as localizable objects but also other object types, such as graphics.

• Supplementary information can be represented in a generic way through inline codes (e.g. formatting of text).

• Relationships between objects can be captured (e.g. all items in a menu).
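
A small sketch of two kinds of localizable objects, assuming a hypothetical bold-formatted sentence and a hypothetical image file, logo.gif. The ids and file name are illustrative.

```xml
<body>
  <!-- Text with an inline code standing for bold formatting -->
  <trans-unit id='1'>
    <source>Click <g id='1' ctype='bold'>Start</g> to begin.</source>
  </trans-unit>
  <!-- A non-text object: a graphic carried as a binary unit -->
  <bin-unit id='2' mime-type='image/gif'>
    <bin-source>
      <external-file href='logo.gif'/>
    </bin-source>
  </bin-unit>
</body>
```

The <g> element hides the native formatting code from the translator while keeping its position in the sentence.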

Slide 24

An XLIFF Snippet…

A simple menu represented as XLIFF
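
A sketch of how such a menu might be written; this is not the slide's original snippet, and the resource names, ids, and file name are hypothetical.

```xml
<xliff version='1.1' xmlns='urn:oasis:names:tc:xliff:document:1.1'>
  <file original='app.rc' source-language='en' datatype='winres'>
    <body>
      <!-- The group ties the menu items together -->
      <group restype='menu' resname='IDM_FILE'>
        <trans-unit id='1' restype='menuitem' resname='IDM_FILE_OPEN'>
          <source>Open</source>
        </trans-unit>
        <trans-unit id='2' restype='menuitem' resname='IDM_FILE_SAVE'>
          <source>Save</source>
        </trans-unit>
        <trans-unit id='3' restype='menuitem' resname='IDM_FILE_EXIT'>
          <source>Exit</source>
        </trans-unit>
      </group>
    </body>
  </file>
</xliff>
```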

Slide 25

Supplementary Info

• XLIFF provides “hooks” for storing supplementary information (for example, references to glossaries or translation memories that should be used).

• The supplementary information can be referenced (i.e. reside outside of the document), or embedded within the document.
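
The two options can be sketched as follows, assuming a hypothetical external glossary file and a short embedded style note; the path and the note text are invented for the example.

```xml
<header>
  <!-- Referenced: the glossary resides outside the document -->
  <glossary>
    <external-file href='glossaries/ui-terms.tbx'/>
  </glossary>
  <!-- Embedded: reference material stored inside the document -->
  <reference>
    <internal-file form='text'>Style guide: address the user informally.</internal-file>
  </reference>
</header>
```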

Slide 26

Administrative Info

XLIFF provides mechanisms for capturing administrative information:

• For relating source material to XLIFF documents.

• For storing workflow data.

• For providing pre-translation entries.

• For keeping track of changes.
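
A sketch of how some of this administrative information might appear in practice, assuming a hypothetical properties file, vendor name, and phase name. The values are illustrative, not prescribed by the presentation.

```xml
<!-- The original attribute relates the XLIFF data to its source material -->
<file original='app.properties' source-language='en' target-language='de'
      datatype='javapropertyresourcebundle'>
  <header>
    <!-- Workflow data: which process ran, by whom, and when -->
    <phase-group>
      <phase phase-name='trans-1' process-name='translation'
             company-name='Example Vendor' date='2004-01-15T09:30:00Z'/>
    </phase-group>
  </header>
  <body>
    <trans-unit id='greeting'>
      <source>Hello</source>
      <!-- state and phase-name record the progress of this unit -->
      <target state='translated' phase-name='trans-1'>Hallo</target>
      <note from='developer'>Shown on the login screen.</note>
    </trans-unit>
  </body>
</file>
```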

Slide 27

Administrative Info – Pre-Leveraging

A set of proposed translations can be included for each <trans-unit> element, using the <alt-trans> element.

<trans-unit id='1'>
  <source xml:lang='en'>The text</source>
  <alt-trans match-quality='high' origin='MTsystem'>
    <target xml:lang='fr'>Le texte</target>
  </alt-trans>
</trans-unit>

Slide 28

Custom Data in XLIFF 1.0

In XLIFF 1.0, we use the <prop> element and the ts attribute to store user-defined information. (Note: these features are deprecated in XLIFF 1.1.)

<trans-unit id='1' ts='ctx:23a7'>
  <prop-group>
    <prop prop-type='myType'>Some property data</prop>
  </prop-group>
  <source>Text</source>
</trans-unit>

Slide 29

XLIFF 1.1 Custom Data

In XLIFF 1.1, we have the ability to customise XLIFF by extending:
– Elements
– Attributes
– Attribute Values

Slide 30

Extending Elements

– Extension points in the following elements:
• <header>, <group>, <tool>, <trans-unit>, <alt-trans>, and <bin-unit>.

– Content of each custom element can be any valid XML content:
• empty content, PCDATA, mixed content, and so forth

– Custom elements are defined in a private namespace schema

Slide 31

Example of Extending Elements in XLIFF 1.1

<xliff version='1.1'
       xmlns='urn:oasis:names:tc:xliff:document:1.1'
       xmlns:sup='http://www.ChaucerState.ac.pg/Frm/XLFSup-v1'>
  <file original='passus-1.doc' source-language='enm' datatype='plaintext'>
    <body>
      <group>
        <sup:SourceInfo>
          <sup:Book>Piers Plowman, Passus 1</sup:Book>
          <sup:Author>William Langland</sup:Author>
        </sup:SourceInfo>
        <sup:WorkInfo Task='transcription' Context='Middle-English:1360'/>
        <trans-unit id='1'>
          <source xml:lang='enm'>What this mountaigne bymeneth</source>
          <target xml:lang='en'>What this mountain means</target>
          <sup:Reference Type='strophe'>1-a</sup:Reference>
        </trans-unit>
      </group>
    </body>
  </file>
</xliff>

Slide 32

Extending Attributes

• Attributes from a namespace other than XLIFF's can be included in these XLIFF elements:
– <file>, <group>, <trans-unit>, <source>, <target>, <tool>, <bin-unit>, <bin-source>, <bin-target>, <alt-trans>, <mrk>, <g>, <x/>, <bx/>, <ex/>, <bpt>, <ept>, <ph>, and <it>

• No specific position is required for the non-XLIFF attributes

• No limit to the number of non-XLIFF attributes that can be used in an XLIFF document

Slide 33

Example of Extending Attributes

Attributes from the HTML vocabulary extend the <group> and <trans-unit> elements.

<xliff version='1.1'
       xmlns='urn:oasis:names:tc:xliff:document:1.1'
       xmlns:htm='http://www.w3.org/TR/REC-html40'>
  <file original='table.htm' source-language='en' datatype='html'>
    <body>
      <group restype='table' htm:border='1' htm:cellpadding='5'
             htm:cellspacing='0' htm:width='100%'>
        <group restype='row'>
          <trans-unit id='1' htm:valign='top' htm:width='30%'>
            <source>Text of row 1 column 1</source>
          </trans-unit>
          <trans-unit id='2' htm:valign='top' htm:width='30%'>
            <source>Text of row 1 column 2</source>
          </trans-unit>
        </group>
        <group restype='row'>
          <trans-unit id='3' htm:valign='top' htm:width='30%'>
            <source>Text of row 2 column 1</source>
          </trans-unit>
          <trans-unit id='4' htm:valign='top' htm:width='30%'>
            <source>Text of row 2 column 2</source>
          </trans-unit>
        </group>
      </group>
    </body>
  </file>
</xliff>

Slide 34

Extending Attribute Values

• Attributes where the list of values can be extended are the following: context-type, count-type, ctype, datatype, mtype, restype, size-unit, state, unit, priority, and purpose

• User-defined values must start with a “x-” prefix

• There is no specified mechanism to validate individual user-defined values, beyond starting with “x-”

Slide 35

Example of Extending Attribute Values

• The following excerpt shows how the user-defined value x-for-engineers can be utilized in a document:

...
<group>
  <context-group name='EngineersData'>
    <context context-type='x-for-engineers'>Data...</context>
  </context-group>
</group>
...

Slide 36

Embedding XLIFF (XLIFF 1.1)

• All or part of an XLIFF document can be embedded in another XML document

• The host XML vocabulary is defined by an XML Schema (XSD) that includes an <any> element in the definition of the element where the XLIFF data can be inserted
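
A rough sketch of the idea, assuming a hypothetical host vocabulary in the namespace http://example.org/host with a <document> root element. Neither the namespace nor the element names come from the presentation. The host schema opens a slot for XLIFF content with <any>:

```xml
<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'
            targetNamespace='http://example.org/host'
            xmlns='http://example.org/host'
            elementFormDefault='qualified'>
  <xsd:element name='document'>
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name='title' type='xsd:string'/>
        <!-- Extension point: elements from the XLIFF namespace may appear here -->
        <xsd:any namespace='urn:oasis:names:tc:xliff:document:1.1'
                 processContents='lax' minOccurs='0' maxOccurs='unbounded'/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

A host document could then carry part of an XLIFF document, here a single <trans-unit>:

```xml
<document xmlns='http://example.org/host'
          xmlns:xlf='urn:oasis:names:tc:xliff:document:1.1'>
  <title>Release notes</title>
  <xlf:trans-unit id='1'>
    <xlf:source>New in this version</xlf:source>
  </xlf:trans-unit>
</document>
```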

Slide 37

Deprecated or Changed since 1.0

• reformat – feature changed

• tool attribute becomes tool element

• new tool-id attribute

• ts, prop / prop-group - deprecated

• header was required, now optional

• default – can specify default values for a given scope

Slide 38

Data Validation

• In 1.0, validation by DTD

• In 1.1, validation by XML Schema – XSD

• XSD provides better control over XML document:
– Structure: structured order can be specified
– Content: support for standard datatypes like date
– Semantics: can specify range of valid values or pattern
– Support for namespace
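
To make these points concrete, here is a simplified, hypothetical schema fragment. It is not the actual XLIFF 1.1 schema; the type and element names are invented, and it only illustrates the kinds of control that XSD offers and a DTD does not.

```xml
<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>
  <!-- Semantics: a fixed range of valid values -->
  <xsd:simpleType name='StandardState'>
    <xsd:restriction base='xsd:string'>
      <xsd:enumeration value='new'/>
      <xsd:enumeration value='translated'/>
      <xsd:enumeration value='final'/>
    </xsd:restriction>
  </xsd:simpleType>
  <!-- Semantics: a pattern for user-defined values starting with "x-" -->
  <xsd:simpleType name='UserDefined'>
    <xsd:restriction base='xsd:string'>
      <xsd:pattern value='x-[^\s]+'/>
    </xsd:restriction>
  </xsd:simpleType>
  <!-- Either a standard or a user-defined value is accepted -->
  <xsd:simpleType name='State'>
    <xsd:union memberTypes='StandardState UserDefined'/>
  </xsd:simpleType>
  <xsd:element name='unit'>
    <xsd:complexType>
      <!-- Structure: child order is fixed by the sequence -->
      <xsd:sequence>
        <xsd:element name='source' type='xsd:string'/>
        <xsd:element name='target' type='xsd:string' minOccurs='0'/>
      </xsd:sequence>
      <!-- Content: a standard datatype -->
      <xsd:attribute name='date' type='xsd:dateTime'/>
      <xsd:attribute name='state' type='State'/>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```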

Slide 39

The Real World

A look at some concrete examples on how XLIFF can be used in localization projects.

Slide 40

Streamlining L10n Files Exchanges

[Diagram: file exchange between a Localization Customer and a Localization Vendor, shown against a cloud of file formats (.INI, .TXT, HTML, .XSL, XML, PROP, .JAVA, C++, RC, HLP, .EXE, .DLL, XLIFF and others). Labels in the diagram: Localization Preprocessor; Pre-translated Proprietary Format File; Customer Supported Localization Tool; Vendor Localization Process; Any tools based on XLIFF Industry Standard; XLIFF Localization Preprocessor.]

Slide 41

Basic Use Case – without XLIFF

[Diagram labels: Developer Applications; Native File 1 (e.g., HTML); Native File 2 (e.g., Java Files); Native File 3 (e.g., Java Properties); Native File n; Tool Resource Filters; Customer Specific Tool(s); Translator; Publisher/Customer Domain; Localisation Domain.]

Slide 42

Basic Use Case – with XLIFF

[Diagram labels: XLIFF compliant Developer Applications (direct to XLIFF authoring) - OR - Non XLIFF compliant Developer Applications (HTML, Java Properties, RC Data, followed by pre-processing); XLIFF file(s) containing HTML, Java, Properties, etc. translatable resources; Translator; XLIFF Compliant Editor; Publisher/Customer Domain; Localisation Domain.]

Slide 43

Simple Automated Localisation Use Case

[Workflow diagram. Roles: Developer, Localization Engineer, Translator. Steps and artifacts: Generate XLIFF; Pseudo Translate / Test; Defect Report; Leverage; Translation Repository; XLIFF Translation Kit; Translate; XLIFF Editor; Update. Status labels: 0% Translated; Requires Translation; 100% Translated.]

Slide 44

Automated Localisation with CAT Use Case

[Workflow diagram. Roles: Developer, Localization Engineer, Translator. Steps and artifacts: Generate XLIFF; Pseudo Translate / Test; Defect Report; Translation Repository; Translation Memory; Machine Translation; Machine Translate; XLIFF Translation Kit; Translate; XLIFF Editor; Update. Match and status labels: 100% match; Fuzzy match; 0% Translated; Requires Translation; 100% Translated.]

Slide 45

Benefits: Use of XML Technologies

• XSL can be used to perform many tasks on XLIFF documents, for example:
– Display translatable content in Web browser.
– Generate statistics (e.g. number of localizable objects).

• Availability of many XML engines makes using XLIFF easy.
– Content-related checks (e.g. that certain characters do not appear as textual contents) can be performed with ordinary Web browsers.
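
As one possible sketch of the statistics use, the XSLT 1.0 stylesheet below counts the translation units in an XLIFF 1.1 document and how many of them still lack a <target>. The xlf prefix and the wording of the report are choices made for this example.

```xml
<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
    xmlns:xlf='urn:oasis:names:tc:xliff:document:1.1'>
  <!-- Plain-text report rather than XML output -->
  <xsl:output method='text'/>
  <xsl:template match='/'>
    <xsl:text>Translation units: </xsl:text>
    <!-- Count every trans-unit, at any depth -->
    <xsl:value-of select='count(//xlf:trans-unit)'/>
    <xsl:text>&#10;Units without a target: </xsl:text>
    <!-- Count trans-units that have no target child -->
    <xsl:value-of select='count(//xlf:trans-unit[not(xlf:target)])'/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Any XSLT 1.0 processor should be able to apply it to an XLIFF file.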

Slide 46

XML-Enabled Translation Tools

• Any XML-enabled translation tool can work with an XLIFF document, as long as the text to translate is initially copied into the <target> elements. This does not mean the tool supports all XLIFF features; it only permits translation of the <target> content.

• Many tools cannot handle conditional translation (for example: <trans-unit translate="no">). In that case, you need to add extra elements temporarily.
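
A sketch of an XLIFF body prepared for a generic XML-enabled tool; the ids and strings are hypothetical.

```xml
<body>
  <!-- Source copied into <target> so that a generic XML tool can overwrite it -->
  <trans-unit id='1'>
    <source>Print the report</source>
    <target state='needs-translation'>Print the report</target>
  </trans-unit>
  <!-- Not to be translated: a tool unaware of translate='no'
       needs this unit protected in some other way -->
  <trans-unit id='2' translate='no'>
    <source>XLIFF</source>
  </trans-unit>
</body>
```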

Slide 47

3rd Party Tools Support for XLIFF

• ENLASO Corp (formerly “RWS Group”):
– Extraction Utility for RC Data and Java Properties to XLIFF 1.1: http://dotnet.goglobalnow.net/
– Various Utilities: http://www.translate.com/shared/tools

• XML-Intl: XLIFF Editor: http://www.xml-intl.com

• Heartsome XLIFF Editor: http://www.heartsome.net

• Alchemy Software - Catalyst 5.0 – Visual XLIFF 1.1 Editor: http://www.alchemysoftware.ie

• PASS: Passolo: Visual XLIFF Editor: http://www.passolo.com

• Trados: No direct XLIFF support, but can edit XLIFF files using modified INI

Slide 48

More Tools Support for XLIFF

• Bowne Global Solutions: Elcano, Online Translation Service has a web service based connector for XLIFF files: http://elcano.bowneglobal.com

• IBM:
– Domino Global Workbench Version 6 (http://www6.software.ibm.com/devcon/devcon/docs/dwkbbet6.htm)
– I18n Components for Unicode (ICU): http://oss.software.ibm.com/developerworks/oss/icu/project/userguide/ResourceManagement.html#XLIFF_usage

• Macromedia Flash: http://livedocs.macromedia.com/flash/mx2004/main/wwhelp/wwhimpl/common/html/wwhelp.htm?context=Flash_MX_2004_Documentation&file=13_mul19.htm

Slide 49

More Tools Support for XLIFF

• Sun: Internal XLIFF Editor as described in this article: http://www.sun.com/developers/gadc/technicalpublications/articles/xliff.html

• Novell: XMsgTool: http://labs.novell.de/ndk/doc/msgtool/index.html?page=/ndk/doc/msgtool/msg__enu/data/aec0nh0.html

• Open Source XSLT Tools: http://sourceforge.net/project/showfiles.php?group_id=42949&release_id=67485

• Oracle:
– HTMLDB: a rapid web application development tool for the Oracle database: http://otn.oracle.com/products/database/htmldb/index.html
– HyperHub: Internal tool for editing XLIFF based translation archives

Slide 50

Future Support for XLIFF Announced:

• Apple Corp: Apple’s resource editor AppleGlot

• Idiom: Worldserver V.6.0

• SDL International: SDLX support for XLIFF currently in development. See http://www.sdlx.com for more information.

Slide 51

Current State of Affairs

A look at the work under way at the OASIS XLIFF TC, the future, etc.

Slide 52

Current State of Affairs – To Do

• Specification of a canonical representation in XLIFF of common formats (e.g. Windows resources, Java properties), so that all XLIFF representations are the same regardless of which tool created the document.

• Translation/Localization tools that support XLIFF out-of-the-box (not just as another XML format).

• Open Source filters: https://sourceforge.net/projects/xliffroundtrip/

• Segmentation Sub Committee: Representing Segmentation metadata in XLIFF content to improve effectiveness of Translation Memory.

Slide 53

More Information

• The XLIFF TC Web Site: http://www.xliff.org

• Presenter:
– XLIFF TC Chair: Tony Jewtushenko (Oracle) ([email protected])

• Significant Contributors to this Presentation:
– Christian Lieske (SAP) ([email protected])
– Yves Savourel (RWS Group) ([email protected])

Slide 54

Thank You...

Questions?