

Copyright (c) [2000]. Roger L. Costello. All Rights Reserved.

XML Schemas

http://www.w3.org/TR/xmlschema-1/ http://www.w3.org/TR/xmlschema-2/

Roger L. Costello

XML Technologies


Thanks

• Special thanks to the following people for their help in answering my endless barrage of questions:
  – Henry Thompson
  – Andrew Layman
  – Noah Mendelsohn
  – David Beech
  – Rick Jelliffe

• Many thanks to the following people who have carefully reviewed this tutorial and notified me of bugs, typos, etc.:
  – Oliver Becker
  – Kit Lueder
  – David Wang


Motivation

• People are dissatisfied with DTDs
  – A DTD is not written in XML
    • So, you write your XML document in one language and you specify the grammar of that document using another language (DTD) --> bad, inconsistent
  – Limited datatype capability
    • DTDs support a very limited capability for specifying datatypes. You can't, for example, say "I want this <elevation> element to be of datatype int".
    • Desire a set of datatypes compatible with those found in databases
      – DTD supports 10 datatypes; XML Schemas supports 37+ datatypes


Highlights of XML Schemas

• XML Schemas are a tremendous advancement over DTDs:
  – Enhanced datatypes
    • 37+ versus 10
    • Can create your own datatypes
    • Can define the lexical representation
      – Example: "This element can contain strings of this form: ddd-dddd, where 'd' represents a 'digit'."
  – Written in XML
    • Enables use of XML tools
  – Object-oriented
    • Can extend or restrict a type (derive new type definitions on the basis of old ones)
  – Can express sets - the child elements may occur in any order
  – Can specify element content as being unique (keys on content) and uniqueness within a region
  – Can define multiple elements with the same name but different content
  – Can define elements with null content
  – Can create equivalent elements - e.g., the "subway" element is equivalent to the "train" element.


Problem

• Convert BookCatalogue.dtd to the XML Schema notation
  – For this first example we will make a straight, one-to-one conversion, i.e., Title, Author, Date, ISBN, and Publisher will hold strings, just as is done in the DTD (next slide)
  – We will gradually modify the example to give more custom types to these elements


BookCatalogue.dtd

<!-- A book catalogue contains zero or more books -->
<!ELEMENT BookCatalogue (Book)*>
<!-- A Book has a Title, Author, Date, ISBN, and a Publisher -->
<!ELEMENT Book (Title, Author, Date, ISBN, Publisher)>
<!ELEMENT Title (#PCDATA)>
<!ELEMENT Author (#PCDATA)>
<!ELEMENT Date (#PCDATA)>
<!ELEMENT ISBN (#PCDATA)>
<!ELEMENT Publisher (#PCDATA)>


<?xml version="1.0"?>
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[
<!ATTLIST schema xmlns:cat CDATA #IMPLIED>
]>
<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.somewhere.org/BookCatalogue"
        xmlns:cat="http://www.somewhere.org/BookCatalogue">
    <element name="BookCatalogue">
        <annotation>
            <info>A book catalogue contains zero or more books</info>
        </annotation>
        <type>
            <element ref="cat:Book" minOccurs="0" maxOccurs="*"/>
        </type>
    </element>
    <element name="Book">
        <annotation>
            <info>A Book has a Title, Author, Date, ISBN, and a Publisher</info>
        </annotation>
        <type>
            <element ref="cat:Title"/>
            <element ref="cat:Author"/>
            <element ref="cat:Date"/>
            <element ref="cat:ISBN"/>
            <element ref="cat:Publisher"/>
        </type>
    </element>
    <element name="Title" type="string"/>
    <element name="Author" type="string"/>
    <element name="Date" type="string"/>
    <element name="ISBN" type="string"/>
    <element name="Publisher" type="string"/>
</schema>

BookCatalogue1.xsd (xsd = XML Schema Definition)

(explanations on succeeding pages)


BookCatalogue1.xsd (same listing as on the previous slide)

xmlns="http://www.w3.org/1999/XMLSchema"

This (default) namespace declaration says that all the elements in the listing (schema, element, annotation, info, type) come from this namespace (the XML Schema namespace).

The unprefixed attributes (name, ref, type, minOccurs, maxOccurs, ...) are not themselves in any namespace, because a default namespace doesn't apply to attributes. They are interpreted as XML Schema attributes by virtue of the fact that they appear on elements that come from the XML Schema namespace.
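The remark that a default namespace doesn't apply to attributes can be checked with any namespace-aware parser. The sketch below uses Python's standard xml.etree.ElementTree on a cut-down fragment of the schema; the fragment and the variable names are illustrative assumptions, not part of the tutorial.

```python
import xml.etree.ElementTree as ET

# Cut-down schema fragment; only the default namespace declaration matters here.
SCHEMA = """\
<schema xmlns="http://www.w3.org/1999/XMLSchema">
  <element name="Title" type="string"/>
</schema>"""

root = ET.fromstring(SCHEMA)
title_decl = root[0]

# ElementTree writes a namespaced name as "{namespace-uri}local-name".
print(root.tag)           # the element name carries the default namespace
print(title_decl.tag)     # so does the child element
print(title_decl.attrib)  # attribute names stay plain: no namespace is applied
```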


BookCatalogue1.xsd (same listing as before)

targetNamespace="http://www.somewhere.org/BookCatalogue"

The targetNamespace attribute says that the elements and types defined in this schema are in this namespace. Instance documents use this namespace to indicate that the elements they contain come from this schema.


BookCatalogue1.xsd (same listing as before)

<element ref="cat:Book" minOccurs="0" maxOccurs="*"/>

Here we are referencing a Book element. Where is that Book element defined? In what namespace? The cat: prefix indicates what namespace this element is defined in. cat: has been set to be the same as the targetNamespace.
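To make the prefix lookup concrete, here is a small sketch in Python that resolves ref="cat:Book" to a namespace and a local name, using the xmlns declarations on the schema element. The helper names prefix_map and resolve are hypothetical, and the schema fragment is cut down for brevity; a real schema processor does considerably more than this.

```python
import io
import xml.etree.ElementTree as ET

# Cut-down fragment of BookCatalogue1.xsd (DOCTYPE and most declarations omitted).
SCHEMA = b"""\
<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.somewhere.org/BookCatalogue"
        xmlns:cat="http://www.somewhere.org/BookCatalogue">
  <element name="BookCatalogue">
    <type>
      <element ref="cat:Book" minOccurs="0" maxOccurs="*"/>
    </type>
  </element>
</schema>"""

def prefix_map(xml_bytes):
    """Collect prefix -> namespace URI from the xmlns declarations."""
    events = ET.iterparse(io.BytesIO(xml_bytes), events=("start-ns",))
    return {prefix: uri for _event, (prefix, uri) in events}

def resolve(qname, prefixes):
    """Split 'prefix:local' and look up the prefix ('' is the default namespace)."""
    prefix, _sep, local = qname.rpartition(":")
    return prefixes[prefix], local

prefixes = prefix_map(SCHEMA)
root = ET.fromstring(SCHEMA)
namespace, local = resolve("cat:Book", prefixes)

print(namespace, local)
# The cat: prefix is bound to the same URI as targetNamespace.
print(namespace == root.get("targetNamespace"))
```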


xmlns:cat

• The schema element has an attribute xmlns:cat. The cat: namespace prefix, as we noted, is used for one element referencing another element within the schema.

• Clearly, this attribute will be schema-dependent, e.g., if I write a schema for cars I might have xmlns:car.

• Obviously, there is no way for xml-schema.dtd to account for all the possibilities. Thus, in the schema we supplement xml-schema.dtd with the internal subset declaration:

<!DOCTYPE schema SYSTEM "xml-schema.dtd"[
<!ATTLIST schema xmlns:cat CDATA #IMPLIED>
]>


BookCatalogue1.xsd (same listing as before)

<annotation>
    <info>A book catalogue contains zero or more books</info>
</annotation>

The annotation/info elements are optional. Their content is intended for human consumption, i.e., it's a comment.


Referencing a schema in an XML instance document

<?xml version="1.0"?>
<BookCatalogue xmlns="http://www.somewhere.org/BookCatalogue"
               xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
               xsi:schemaLocation="http://www.somewhere.org/BookCatalogue
                                   http://www.somewhere.org/BookCatalogue/BookCatalogue1.xsd">
    <Book>
        <Title>My Life and Times</Title>
        <Author>Paul McCartney</Author>
        <Date>July, 1998</Date>
        <ISBN>94303-12021-43892</ISBN>
        <Publisher>McMillin Publishing</Publisher>
    </Book>
    ...
</BookCatalogue>

In the BookCatalogue element (the root element) I declare that the schemaLocation attribute comes from the XML Schema Instance namespace (xsi). The value of schemaLocation is a pair of values: a namespace and the URI of a schema. When the XML parser processes this XML document it will use the schemaLocation pair of values to determine the XML Schema that the document conforms to. It will retrieve the schema at the URI specified in schemaLocation (in this example, BookCatalogue1.xsd) and then open this schema document to confirm that its targetNamespace value matches the namespace value shown in schemaLocation. In this case it does. I also declare (using a default namespace) that all the elements in this XML instance document come from that same namespace.
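The lookup described above can be sketched in a few lines of Python. This is only an illustration of how the schemaLocation value splits into a namespace and a schema URI, and of how the namespace can be compared with the root element's namespace; it does not fetch or validate anything, and the variable names are assumptions.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/1999/XMLSchema/instance"

# Root element of the instance document, with the Book children omitted.
INSTANCE = """\
<BookCatalogue xmlns="http://www.somewhere.org/BookCatalogue"
    xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
    xsi:schemaLocation="http://www.somewhere.org/BookCatalogue
                        http://www.somewhere.org/BookCatalogue/BookCatalogue1.xsd">
</BookCatalogue>"""

root = ET.fromstring(INSTANCE)

# The attribute value is whitespace-separated: a namespace followed by a schema URI.
tokens = root.get(f"{{{XSI}}}schemaLocation").split()
locations = dict(zip(tokens[0::2], tokens[1::2]))

# Namespace of the root element, taken from its "{uri}local" tag.
root_namespace = root.tag[1:].partition("}")[0]

# Schema URI that the hint associates with the root element's namespace.
print(locations[root_namespace])
```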


Referencing a schema in an XML instance document

BookCatalogue.xml
  schemaLocation="A A/BookCatalogue.xsd"
  - uses elements from namespace A

BookCatalogue.xsd
  targetNamespace="A"
  - defines elements for namespace A


Note about schemaLocation

• schemaLocation is just a hint to the XML Parser

• "The choice of which schema to use ultimately lies with the consumer. If you as a consumer wish to rely on the schemaLocation idiom, then you should purchase/use processors that will honor that for you. The reason that some other processors might not provide that service to you is that they are designed to run in environments where it is impractical or undesirable to allow the document author to force reference to and use of some particular schema document." (Noah Mendelsohn)

• For this tutorial I will assume that we are using an XML Parser which uses the schemaLocation idiom.


Note multiple levels of checking

BookCatalogue.xml --> BookCatalogue1.xsd --> xml-schema.dtd

– BookCatalogue.xml against BookCatalogue1.xsd: Does the xml document conform to the rules laid out in the xml-schema?
– BookCatalogue1.xsd against xml-schema.dtd: Is the xml-schema a valid xml document, i.e., does it conform to the rules laid out in the xml-schema DTD?

Do Lab 1


Alternate Schema

• On the following slide is an alternate (equivalent) way of representing the schema shown previously.


<?xml version="1.0"?>
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[
<!ATTLIST schema xmlns:cat CDATA #IMPLIED>
]>
<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.somewhere.org/BookCatalogue"
        xmlns:cat="http://www.somewhere.org/BookCatalogue">
    <element name="BookCatalogue">
        <type>
            <element name="Book" minOccurs="0" maxOccurs="*">
                <type>
                    <element name="Title" type="string"/>
                    <element name="Author" type="string"/>
                    <element name="Date" type="string"/>
                    <element name="ISBN" type="string"/>
                    <element name="Publisher" type="string"/>
                </type>
            </element>
        </type>
    </element>
</schema>

BookCatalogue2.xsd


BookCatalogue2.xsd (same listing as on the previous slide)

<element name="Book" minOccurs="0" maxOccurs="*">
    <type>
        <element name="Title" type="string"/>
        <element name="Author" type="string"/>
        <element name="Date" type="string"/>
        <element name="ISBN" type="string"/>
        <element name="Publisher" type="string"/>
    </type>
</element>

Anonymous type (no name): the <type> element has no name attribute and is defined inline, inside the element that uses it.

Do Lab 2


Named Types

• The following slide shows an alternate (equivalent) schema which uses named types.


<?xml version="1.0"?>
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[
<!ATTLIST schema xmlns:cat CDATA #IMPLIED>
]>
<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.somewhere.org/BookCatalogue"
        xmlns:cat="http://www.somewhere.org/BookCatalogue">
    <element name="BookCatalogue">
        <type>
            <element name="Book" minOccurs="0" maxOccurs="*" type="cat:CardCatalogueEntry"/>
        </type>
    </element>
    <type name="CardCatalogueEntry">
        <element name="Title" type="string"/>
        <element name="Author" type="string"/>
        <element name="Date" type="string"/>
        <element name="ISBN" type="string"/>
        <element name="Publisher" type="string"/>
    </type>
</schema>

BookCatalogue3.xsd


Problem

• Defining the Date element to be of type string is unsatisfactory (it allows any string value to be input as the content of the Date element, including non-date strings). We would like to constrain the allowable content that Date can have. Modify the XML-Schema to restrict the content of the Date element to just date values.

• Similarly, constrain the content of the ISBN element to content of this form: ddddd-ddddd-ddddd or d-ddd-ddddd-d or d-dd-dddddd-d, where 'd' stands for 'digit'


<?xml version="1.0"?>
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[
<!ATTLIST schema xmlns:cat CDATA #IMPLIED>
]>
<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.somewhere.org/BookCatalogue"
        xmlns:cat="http://www.somewhere.org/BookCatalogue">
    <datatype name="ISBNType" source="string">
        <pattern value="\d{5}-\d{5}-\d{5}"/>
        <pattern value="\d{1}-\d{3}-\d{5}-\d{1}"/>
        <pattern value="\d{1}-\d{2}-\d{6}-\d{1}"/>
    </datatype>
    <element name="BookCatalogue">
        <type>
            <element name="Book" minOccurs="0" maxOccurs="*">
                <type>
                    <element name="Title" type="string"/>
                    <element name="Author" type="string"/>
                    <element name="Date" type="date"/>
                    <element name="ISBN" type="cat:ISBNType"/>
                    <element name="Publisher" type="string"/>
                </type>
            </element>
        </type>
    </element>
</schema>

BookCatalogue4.xsd


<datatype name="ISBNType" source="string">
    <pattern value="\d{5}-\d{5}-\d{5}"/>
    <pattern value="\d{1}-\d{3}-\d{5}-\d{1}"/>
    <pattern value="\d{1}-\d{2}-\d{6}-\d{1}"/>
</datatype>

"Elements of this datatype will have string content. The string must conform to one of the three patterns listed.
 - The first pattern is: 5 digits followed by a dash followed by 5 digits followed by another dash followed by 5 more digits.
 - The second pattern is: 1 digit followed by a dash followed by 3 digits followed by another dash followed by 5 digits followed by another dash followed by 1 more digit.
 - The third pattern is: 1 digit followed by a dash followed by 2 digits followed by another dash followed by 6 digits followed by another dash followed by 1 more digit."

These patterns are specified using Regular Expressions. In a few slides we will see more of the Regular Expression syntax.
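To try the three patterns out, the sketch below checks a few strings against them with Python's re module. This is only an approximation of what a schema processor does: re.fullmatch is used because a pattern facet has to match the whole value, and the function name is_isbn is an assumption made for the example.

```python
import re

# The three pattern facets of the ISBN datatype, copied from the listing.
ISBN_PATTERNS = [
    r"\d{5}-\d{5}-\d{5}",
    r"\d{1}-\d{3}-\d{5}-\d{1}",
    r"\d{1}-\d{2}-\d{6}-\d{1}",
]

def is_isbn(value: str) -> bool:
    """True if the whole string matches at least one of the patterns."""
    return any(re.fullmatch(pattern, value) for pattern in ISBN_PATTERNS)

# ISBN values taken from the tutorial's instance documents, plus one non-ISBN string.
for candidate in ["94303-12021-43892", "0-440-34319-4", "0-06-064831-7", "July, 1998"]:
    print(candidate, is_isbn(candidate))
```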


Built-in Datatypes

• Primitive Datatypes      • Atomic, built-in (example values)
  – string                   "Hello World"
  – boolean                  {true, false}
  – float                    12.56E3, 12, 12560, 0, -0, INF, -INF, NAN
  – double                   12.56E3, 12, 12560, 0, -0, INF, -INF, NAN
  – decimal                  7.08
  – timeInstant              1999-12-04T20:12:00.000
  – timeDuration             1Y2M3DT10H30M12.3S
  – recurringInstant         same format as timeInstant
  – binary                   010010111001
  – uri                      http://www.somewhere.org

Note: 'T' is the date/time separator
      INF = infinity
      NAN = not-a-number


Built-in Datatypes (cont.)

• Generated datatypes      • Subtype of primitive datatype (example or description)
  – language                 any valid xml:lang value, e.g., EN, FR, ...
  – NMTOKEN                  "house"
  – NMTOKENS                 "house barn yard"
  – Name                     "hello-there"
  – NCName                   part (no namespace qualifier)
  – QNAME                    book:part
  – ID                       Token, must be unique
  – IDREF                    Token which matches an ID
  – IDREFS                   List of IDREF
  – ENTITY
  – ENTITIES
  – NOTATION
  – integer                  456
  – non-negative-integer     zero to infinity
  – positive-integer         one to infinity
  – non-positive-integer     negative infinity to zero
  – negative-integer         negative infinity to negative one
  – date                     1999-05-21
  – time                     13:20:00.000

Do Lab 3


Creating your own Datatypes

• We can create new datatypes from existing datatypes (called source or basetype) by specifying values for one or more of the optional facets

• Example. The string primitive datatype has nine optional facets - pattern, enumeration, length, maxlength, maxInclusive, maxExclusive, minlength, minInclusive and minExclusive.


Example

<datatype name="TelephoneNumber" source="string">
    <length value="8"/>
    <pattern value="\d{3}-\d{4}"/>
</datatype>

This creates a new datatype called 'TelephoneNumber'. Elements of this type can hold string values, but the string length must be exactly 8 characters long and the string must follow the pattern: ddd-dddd, where 'd' represents a 'digit'.
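The combined effect of the two facets can be mimicked in Python as a rough sketch: every facet must hold, so the checks are joined with "and". The function name is_telephone_number is hypothetical, and Python's re module is only standing in for the schema processor's own regular expression engine.

```python
import re

def is_telephone_number(value: str) -> bool:
    """Apply both facets of TelephoneNumber: length = 8 and pattern ddd-dddd."""
    has_length_8 = len(value) == 8
    fits_pattern = re.fullmatch(r"\d{3}-\d{4}", value) is not None
    return has_length_8 and fits_pattern

for candidate in ["555-1234", "5551234", "555-12345", "abc-defg"]:
    print(candidate, is_telephone_number(candidate))
```

For this particular datatype the pattern already implies a length of 8, so the length facet is redundant but harmless.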


Facets of the Integer Datatype

• Facets:
  – maxInclusive
  – maxExclusive
  – minInclusive
  – minExclusive


Example

<datatype name="EarthSurfaceElevation" source="integer">
    <minInclusive value="-1246"/>
    <maxInclusive value="29028"/>
</datatype>

This creates a new datatype called 'EarthSurfaceElevation'. Elements of this type can hold an integer. However, the integer must have a value between -1246 and 29028, inclusive.
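As a rough illustration of what minInclusive and maxInclusive mean, the Python sketch below accepts a string only if it parses as an integer and lies within the closed range. The function name is hypothetical, and Python's int() is slightly more lenient about lexical form than a schema processor would be (for example, it tolerates surrounding whitespace).

```python
MIN_INCLUSIVE = -1246   # minInclusive facet
MAX_INCLUSIVE = 29028   # maxInclusive facet

def is_earth_surface_elevation(text: str) -> bool:
    """True if text is an integer within [MIN_INCLUSIVE, MAX_INCLUSIVE]."""
    try:
        value = int(text)
    except ValueError:
        return False  # not an integer at all, e.g. "12.5" or "high"
    return MIN_INCLUSIVE <= value <= MAX_INCLUSIVE

for candidate in ["0", "-1246", "29028", "29029", "-1247", "12.5"]:
    print(candidate, is_earth_surface_elevation(candidate))
```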


General Form of Datatype/Facet Usage

<datatype name="name" source="source">
    <facet value="value"/>
    <facet value="value"/>
    ...
</datatype>

Facets:
 - minInclusive
 - maxInclusive
 - minExclusive
 - maxExclusive
 - length
 - minlength
 - maxlength
 - pattern
 - enumeration
 ...

Sources:
 - string
 - boolean
 - float
 - double
 - decimal
 - timeInstant
 - timeDuration
 - recurringInstant
 - binary
 - uri
 ...

Do Lab 4


Regular Expressions

• Recall that the string datatype has a pattern facet. The value of a pattern facet is a regular expression. Below are some examples of regular expressions:

Regular Expression       Example
 - Chapter \d             Chapter 1
 - a*b                    b, ab, aab, aaab, …
 - [xyz]b                 xb, yb, zb
 - a?b                    b, ab
 - a+b                    ab, aab, aaab, …
 - [a-c]x                 ax, bx, cx


Regular Expressions (cont.)

• Regular Expression       • Example
  – [a-c]x                   ax, bx, cx
  – [-ac]x                   -x, ax, cx
  – [ac-]x                   ax, cx, -x
  – [^0-9]x                  any non-digit char followed by x
  – \Dx                      any non-digit char followed by x
  – Chapter\s\d              Chapter followed by a blank followed by a digit
  – (ho){2} there            hoho there
  – (ho\s){2} there          "ho ho  there" (two blanks before "there": one matched by the second \s, one literal)
  – .abc                     any char followed by abc
  – (a|b)+x                  ax, bx, aax, bbx, abx, bax,...


Regular Expressions (cont.)

• Regular Expression       • Example
  – a{1,3}x                  ax, aax, aaax
  – a{2,}x                   aax, aaax, aaaax, …
  – \w\s\w                   word (alphanumeric plus dash) followed by a space followed by a word
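The example tables can be checked mechanically. The sketch below runs a selection of the expressions through Python's re module, using re.fullmatch because a pattern facet has to match the whole value. Python's regular expression dialect is not identical to the one in XML Schema (\w, for instance, is defined differently), so only constructs that behave the same way in both are used; the names EXAMPLES and matches are assumptions.

```python
import re

# Regular expression -> strings that should match it, taken from the tables.
EXAMPLES = {
    r"Chapter \d": ["Chapter 1"],
    r"a*b": ["b", "ab", "aab", "aaab"],
    r"[xyz]b": ["xb", "yb", "zb"],
    r"a?b": ["b", "ab"],
    r"a+b": ["ab", "aab", "aaab"],
    r"[a-c]x": ["ax", "bx", "cx"],
    r"[-ac]x": ["-x", "ax", "cx"],
    r"[^0-9]x": ["ax", "-x"],
    r"(ho){2} there": ["hoho there"],
    # Two spaces before "there": one matched by \s, one literal.
    r"(ho\s){2} there": ["ho ho  there"],
    r"(a|b)+x": ["ax", "bx", "aax", "bbx", "abx", "bax"],
    r"a{1,3}x": ["ax", "aax", "aaax"],
    r"a{2,}x": ["aax", "aaax", "aaaax"],
}

def matches(pattern: str, text: str) -> bool:
    """True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

for pattern, texts in EXAMPLES.items():
    for text in texts:
        print(pattern, repr(text), matches(pattern, text))
```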


Example R.E.

[1-9]?[0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5]

 - [1-9]?[0-9]    0 to 99
 - 1[0-9][0-9]    100 to 199
 - 2[0-4][0-9]    200 to 249
 - 25[0-5]        250 to 255

This regular expression restricts a string to have values between 0 and 255. Such a R.E. might be useful in describing an IP address ...
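One way to convince yourself that the four alternatives cover exactly 0 to 255 is to test every number in a larger range. The sketch below does that in Python; re.fullmatch stands in for the whole-value matching of a pattern facet, and the names OCTET and is_octet are assumptions.

```python
import re

# The regular expression from the slide.
OCTET = r"[1-9]?[0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5]"

def is_octet(text: str) -> bool:
    """True if the whole string matches the 0..255 expression."""
    return re.fullmatch(OCTET, text) is not None

# Try every number from 0 to 999 and keep the ones that are accepted.
accepted = [n for n in range(1000) if is_octet(str(n))]
print(accepted[0], accepted[-1], len(accepted))
```

Note that the expression also rejects leading zeros: "7" is accepted, "007" is not.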


IP Datatype Definition

<datatype name="IP" source="string">
    <pattern value="(([1-9]?[0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\.){3}
                    ([1-9]?[0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])">
        <annotation>
            <info>
                Datatype for representing IP addresses. Examples,
                129.83.64.255, 64.128.2.71, etc.
                This datatype restricts each field of the IP address
                to have a value between zero and 255, i.e.,
                [0-255].[0-255].[0-255].[0-255]
                Note: in the value attribute (above) the regular
                expression has been split over two lines. This is
                for readability purposes only. In practice the R.E.
                would all be on one line.
            </info>
        </annotation>
    </pattern>
</datatype>
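The IP pattern is the 0-to-255 expression followed by a dot, three times, and then the 0-to-255 expression once more. The Python sketch below builds it from that piece (on one line, as the note in the listing says it would be in practice) and tries it on the two example addresses. The names OCTET, IP and is_ip are assumptions, and Python's re module stands in for the schema processor.

```python
import re

OCTET = r"[1-9]?[0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5]"

# ((octet)\.){3}(octet), assembled on a single line.
IP = rf"(({OCTET})\.){{3}}({OCTET})"

def is_ip(text: str) -> bool:
    """True if the whole string has the form octet.octet.octet.octet."""
    return re.fullmatch(IP, text) is not None

for candidate in ["129.83.64.255", "64.128.2.71", "256.1.1.1", "1.2.3"]:
    print(candidate, is_ip(candidate))
```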


Regular Expression Parser

• Want to test your skill in writing regular expressions? Go to: http://www.xfront.org/xml-schema/
  – Dan Potter has created a nice tool which allows you to enter a regular expression and then enter a string. The parser will then determine if your string conforms to your regular expression.

Do Lab 5


Derived Types

• We can do a form of subclassing on type definitions. We call this "derived types"
  – derive by extension: extend the parent type with more elements
  – derive by restriction: constrain the parent type by constraining some of the elements to have a more restricted range of values, or a more restricted number of occurrences.


<?xml version="1.0"?>
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[
<!ATTLIST schema xmlns:cat CDATA #IMPLIED>
]>
<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.somewhere.org/BookCatalogue"
        xmlns:cat="http://www.somewhere.org/BookCatalogue">
    <type name="Publication">
        <element name="Title" type="string" maxOccurs="*"/>
        <element name="Author" type="string" maxOccurs="*"/>
        <element name="Date" type="date"/>
    </type>
    <type name="Book" source="cat:Publication" derivedBy="extension">
        <element name="ISBN" type="string"/>
        <element name="Publisher" type="string"/>
    </type>
    <element name="BookCatalogue">
        <type>
            <element name="CatalogueEntry" minOccurs="0" maxOccurs="*" type="cat:Book"/>
        </type>
    </element>
</schema>

BookCatalogue5.xsd


<type name="Publication">
    <element name="Title" type="string" maxOccurs="*"/>
    <element name="Author" type="string" maxOccurs="*"/>
    <element name="Date" type="date"/>
</type>
<type name="Book" source="cat:Publication" derivedBy="extension">
    <element name="ISBN" type="string"/>
    <element name="Publisher" type="string"/>
</type>

Elements of type Book will have 5 child elements - Title, Author, Date, ISBN, and Publisher.
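For readers who know object-oriented languages, derivation by extension behaves much like subclassing. The Python sketch below is only an analogy, not schema processing: the classes and field names are invented for the illustration, with the parent's fields coming first and the extension's fields appended after them.

```python
from dataclasses import dataclass, fields
from typing import List

@dataclass
class Publication:
    title: List[str]    # Title, maxOccurs="*"
    author: List[str]   # Author, maxOccurs="*"
    date: str           # Date

@dataclass
class Book(Publication):   # analogous to source="cat:Publication" derivedBy="extension"
    isbn: str
    publisher: str

# The derived class has the parent's three fields followed by its own two.
print([field.name for field in fields(Book)])

book = Book(
    title=["Illusions The Adventures of a Reluctant Messiah"],
    author=["Richard Bach"],
    date="1977",
    isbn="0-440-34319-4",
    publisher="Dell Publishing Co.",
)
# A Book can stand in wherever a Publication is expected.
print(isinstance(book, Publication))
```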


Using Derived Types

• If we declare an element to be of type Publication then in the XML instance document that element's content can be either a Publication or a Book (since Book is a Publication).


<?xml version="1.0"?>
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[
<!ATTLIST schema xmlns:cat CDATA #IMPLIED>
]>
<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.somewhere.org/Catalogue"
        xmlns:cat="http://www.somewhere.org/Catalogue">
    <type name="Publication">
        <element name="Title" type="string" maxOccurs="*"/>
        <element name="Author" type="string" maxOccurs="*"/>
        <element name="Date" type="date"/>
    </type>
    <type name="Book" source="cat:Publication" derivedBy="extension">
        <element name="ISBN" type="string"/>
        <element name="Publisher" type="string"/>
    </type>
    <element name="Catalogue">
        <type>
            <element name="CatalogueEntry" minOccurs="0" maxOccurs="*" type="cat:Publication"/>
        </type>
    </element>
</schema>

BookCatalogue6.xsd


<?xml version="1.0"?>
<Catalogue xmlns="http://www.somewhere.org/Catalogue"
           xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
           xsi:schemaLocation="http://www.somewhere.org/Catalogue
                               http://www.somewhere.org/Catalogue/BookCatalogue6.xsd">
    <CatalogueEntry>
        <Title>Staying Young Forever</Title>
        <Author>Karin Granstrom Jordan, M.D.</Author>
        <Date>December, 1999</Date>
    </CatalogueEntry>
    <CatalogueEntry xsi:type="Book">
        <Title>Illusions The Adventures of a Reluctant Messiah</Title>
        <Author>Richard Bach</Author>
        <Date>1977</Date>
        <ISBN>0-440-34319-4</ISBN>
        <Publisher>Dell Publishing Co.</Publisher>
    </CatalogueEntry>
    <CatalogueEntry xsi:type="Book">
        <Title>The First and Last Freedom</Title>
        <Author>J. Krishnamurti</Author>
        <Date>1954</Date>
        <ISBN>0-06-064831-7</ISBN>
        <Publisher>Harper &amp; Row</Publisher>
    </CatalogueEntry>
</Catalogue>

BookCatalogue2.xml


<CatalogueEntry xsi:type="Book">
    <Title>Illusions The Adventures of a Reluctant Messiah</Title>
    <Author>Richard Bach</Author>
    <Date>1977</Date>
    <ISBN>0-440-34319-4</ISBN>
    <Publisher>Dell Publishing Co.</Publisher>
</CatalogueEntry>

"The CatalogueEntry element is declared to be of type Publication. Book is derived from Publication. Therefore, Book is a Publication. Thus, the content of CatalogueEntry can be a Book. However, to indicate that the content is not the source type, but rather a derived type, we need to specify the derived type that is being used. The attribute 'type' comes from the XML Schema Instance (xsi) namespace."


Why is xsi:type Needed?

• Why, in an instance document, do we need to indicate the derived type being used?

• Suppose that there are several types derived (by extension) from Publication, and some of the extended element declarations are optional. In the instance document, if xsi:type was not used, it might get very difficult for an XML Parser to determine which derived type is being used.
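With xsi:type present, a processor does not have to guess: it can read the type straight off each element and fall back to the declared type when the attribute is absent. The Python sketch below illustrates that lookup on a shortened version of the catalogue instance; it performs no validation, and the names DECLARED_TYPE and used_types are assumptions.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/1999/XMLSchema/instance"
CAT = "http://www.somewhere.org/Catalogue"

# Shortened instance document: one plain entry, one entry marked as a Book.
INSTANCE = """\
<Catalogue xmlns="http://www.somewhere.org/Catalogue"
           xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance">
  <CatalogueEntry>
    <Title>Staying Young Forever</Title>
  </CatalogueEntry>
  <CatalogueEntry xsi:type="Book">
    <Title>Illusions The Adventures of a Reluctant Messiah</Title>
    <ISBN>0-440-34319-4</ISBN>
  </CatalogueEntry>
</Catalogue>"""

# Type given to CatalogueEntry in the schema's element declaration.
DECLARED_TYPE = "Publication"

root = ET.fromstring(INSTANCE)
used_types = []
for entry in root.findall(f"{{{CAT}}}CatalogueEntry"):
    # xsi:type, when present, names the derived type actually used.
    used = entry.get(f"{{{XSI}}}type", DECLARED_TYPE)
    used_types.append(used)
    print(entry.findtext(f"{{{CAT}}}Title"), "->", used)
```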

Do Lab 6


Derive by Restriction

<type name="Publication">
    <element name="Title" type="string" maxOccurs="*"/>
    <element name="Author" type="string" maxOccurs="*"/>
    <element name="Date" type="date"/>
</type>
<type name="SingleAuthorPublication" source="cat:Publication" derivedBy="restriction">
    <restrictions>
        <element name="Author" type="string" maxOccurs="1"/>
    </restrictions>
</type>

Elements of type SingleAuthorPublication will have three kinds of child elements - Title, Author, and Date. There must be exactly one Author element.


Restricting Derivations

• Sometimes we may want to create a type and disallow all derivations of it, or just disallow extensions of it, or restrictions of it.

<type name="Publication" final="#all" …> This type can be neither extended nor restricted

<type name="Publication" final="restriction" …> This type cannot be restricted

<type name="Publication" final="extension" …> This type cannot be extended


exact Types

• If you define a type to be exact, then other types may derive from it. However, in the instance document derived types may not be used in its stead.


Schema:

<type name="Publication">
    <element name="Title" type="string" maxOccurs="*"/>
    <element name="Author" type="string" maxOccurs="*"/>
    <element name="Date" type="date"/>
</type>
<type name="Book" source="cat:Publication" derivedBy="extension">
    <element name="ISBN" type="string"/>
    <element name="Publisher" type="string"/>
</type>
<element name="Catalogue">
    <type>
        <element name="CatalogueEntry" minOccurs="0" maxOccurs="*" type="cat:Publication"/>
    </type>
</element>

Instance doc:

<CatalogueEntry xsi:type="Book">
    <Title>Illusions The Adventures of a Reluctant Messiah</Title>
    <Author>Richard Bach</Author>
    <Date>1977</Date>
    <ISBN>0-440-34319-4</ISBN>
    <Publisher>Dell Publishing Co.</Publisher>
</CatalogueEntry>

This permits Publication, as well as types derived from Publication (e.g., Book), to be used as the type of a CatalogueEntry child of Catalogue.


Schema:

<type name="Publication" exact="extension">
    <element name="Title" type="string" maxOccurs="*"/>
    <element name="Author" type="string" maxOccurs="*"/>
    <element name="Date" type="date"/>
</type>
<type name="Book" source="cat:Publication" derivedBy="extension">
    <element name="ISBN" type="string"/>
    <element name="Publisher" type="string"/>
</type>
<element name="Catalogue">
    <type>
        <element name="CatalogueEntry" minOccurs="0" maxOccurs="*" type="cat:Publication"/>
    </type>
</element>

Instance doc:

<CatalogueEntry xsi:type="Book">
    <Title>Illusions The Adventures of a Reluctant Messiah</Title>
    <Author>Richard Bach</Author>
    <Date>1977</Date>
    <ISBN>0-440-34319-4</ISBN>
    <Publisher>Dell Publishing Co.</Publisher>
</CatalogueEntry>

This prohibits types derived by extension from being used as the content of CatalogueEntry, e.g., this instance is not allowed.


exact Types

• exact="extension"

– Prohibits types derived by extension from being used in its stead in instance documents

• exact="restriction"

– Prohibits types derived by restriction from being used in its stead in instance documents

• exact="#all"

– Prohibits all derived types from being used in its stead in instance documents
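The three settings can be summarised in a few lines of Python. This is only a sketch: may_substitute is a hypothetical helper, not part of any schema processor, and it assumes that exact may also hold a space-separated list of derivation methods.

```python
def may_substitute(exact, derived_by):
    """Return True if a type derived via `derived_by` may stand in for the base type.

    `exact` is the value of the base type's exact attribute, or None if the
    attribute is absent. `derived_by` is "extension" or "restriction".
    """
    if exact is None:          # no exact attribute: every derived type is allowed
        return True
    if exact == "#all":        # all derived types are blocked
        return False
    # Assumption: the attribute lists the blocked derivation methods.
    return derived_by not in exact.split()


for exact in (None, "extension", "restriction", "#all"):
    print(exact, may_substitute(exact, "extension"))
```

With exact="extension" on Publication, the call for Book (derived by extension) returns False, which matches the prohibited instance shown earlier.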


Equivalence

• Oftentimes in daily conversation there are several ways to express something.

– In Boston we use the words "T" and "subway" interchangeably. For example, "we took the T into town", or "we took the subway into town".

• Thus, "T" and "subway" are equivalent. Which one is used may depend upon what part of the state you live in, what mood you're in, or any number of factors.

• We would like to be able to express this "equivalence" capability in XML Schemas.

– We would like to be able to declare in the schema an element called "subway", an element called "T", and state that "T" is equivalent to "subway". Thus, instance documents can use either "subway" or "T", depending on their preference.


equivClass

• We can create an element (called the exemplar) and then create other elements which state that they are equivalent to the exemplar.

<element name="subway" type="string"/>
<element name="T" equivClass="boston:subway" type="string"/>

subway is the exemplar. T is equivalent.

So what's the big deal? - Wherever the exemplar can be used in an instance document, any equivalent element can also be used!


Schema:

<element name="subway" type="string"/>
<element name="T" equivClass="boston:subway" type="string"/>
<element name="transportation">
    <type>
        <element ref="boston:subway"/>
    </type>
</element>

Instance doc:

<transportation>
    <subway>Red Line</subway>
</transportation>

Alternative Instance doc:

<transportation>
    <T>Red Line</T>
</transportation>

Note: the type of every element of an equivalence class must be the same as or derived from the type of the exemplar element.


Enabling Interoperability

• One very hard problem in today's world is how to enable different domains to communicate effectively when they don't all speak the same language.

• The equivClass is a good step towards enabling interoperability.

– It makes explicit the equivalence of one element with another element.


Enabling Interoperability

• Imagine that you create an application. The application is designed to input XML documents. Suppose that within the XML document the application expects to see a <subway> element.

• Now suppose that it receives an XML document, but it contains a <T> element.

• The application can go to the XML Schema and see that <T> is equivalent to <subway>. Thus, it accepts and understands the XML document, even though it is using a different vocabulary than what the application was written to. Wow!

• Further, over time new element declarations can be added to the schema. For example, <element name="train" equivClass="boston:subway" type="string"/>

• Without any change to the application, the application will be able to process XML documents with a <train> element by simply consulting the schema.

• Mike Los coined the term "interoperability schema" to describe schemas used in the fashion described above.
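The "consult the schema" step described above might be sketched as follows in Python. This is an illustration under stated assumptions, not how any particular processor works: the namespace URI bound to the boston prefix is made up (the tutorial never gives it), exemplar_map and canonical_name are hypothetical helpers, and namespaces in the instance document are ignored to keep the example short.

```python
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/1999/XMLSchema"  # draft-era schema namespace used in this tutorial

# Assumption: the boston namespace URI below is invented for the example.
SCHEMA = f"""<schema xmlns="{XSD}" xmlns:boston="http://www.somewhere.org/Boston">
  <element name="subway" type="string"/>
  <element name="T" equivClass="boston:subway" type="string"/>
  <element name="train" equivClass="boston:subway" type="string"/>
</schema>"""


def exemplar_map(schema_text):
    """Map each top-level element name to its exemplar's local name (or to itself)."""
    mapping = {}
    for decl in ET.fromstring(schema_text).findall(f"{{{XSD}}}element"):
        exemplar = decl.get("equivClass", decl.get("name"))
        mapping[decl.get("name")] = exemplar.split(":")[-1]  # drop the prefix
    return mapping


def canonical_name(tag, mapping):
    """Translate an instance element name into the name the application expects."""
    return mapping.get(tag, tag)


mapping = exemplar_map(SCHEMA)
doc = ET.fromstring("<transportation><T>Red Line</T></transportation>")
for child in doc:
    print(canonical_name(child.tag, mapping), "=", child.text)
```

Adding the train declaration to the schema changes the mapping but not the application code, which is the point of the slide.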

Do Lab 7


Abstract Elements

• You can declare an element to be abstract

– Example. <element name="Publication" type="cat:Pub" abstract="true"/>

• An abstract element is a kind of template/placeholder.

• If an element is abstract then in the XML instance document that element may not appear.

– Example. <Publication> may not appear in an instance document.

• However, elements that are equivalent to the abstract element may appear in its place.


<?xml version="1.0"?>
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[
<!ATTLIST schema xmlns:cat CDATA #IMPLIED>
]>
<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.somewhere.org/Catalogue"
        xmlns:cat="http://www.somewhere.org/Catalogue">
    <type name="Pub">
        <element name="Title" type="string" maxOccurs="*"/>
        <element name="Author" type="string" maxOccurs="*"/>
        <element name="Date" type="date"/>
    </type>
    <element name="Publication" type="cat:Pub" abstract="true"/>
    <element name="Book" equivClass="cat:Publication">
        <type source="cat:Pub" derivedBy="extension">
            <element name="ISBN" type="string"/>
            <element name="Publisher" type="string"/>
        </type>
    </element>
    <element name="Magazine" equivClass="cat:Publication">
        <type source="cat:Pub" derivedBy="restriction">
            <restrictions>
                <element name="Author" type="string" maxOccurs="0"/>
            </restrictions>
        </type>
    </element>
    <element name="Catalogue">
        <type>
            <element ref="cat:Publication" minOccurs="0" maxOccurs="*"/>
        </type>
    </element>
</schema>

BookCatalogue7.xsd

Since the Publication element is abstract, only equivalent elements can appear as children of Catalogue.

The Book and Magazine elements are equivalent to the Publication element.


<?xml version="1.0"?>
<Catalogue xmlns="http://www.somewhere.org/Catalogue"
           xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
           xsi:schemaLocation="http://www.somewhere.org/Catalogue
                               http://www.somewhere.org/Catalogue/BookCatalogue7.xsd">
    <Magazine>
        <Title>Natural Health</Title>
        <Date>December, 1999</Date>
    </Magazine>
    <Book>
        <Title>Illusions The Adventures of a Reluctant Messiah</Title>
        <Author>Richard Bach</Author>
        <Date>1977</Date>
        <ISBN>0-440-34319-4</ISBN>
        <Publisher>Dell Publishing Co.</Publisher>
    </Book>
    <Book>
        <Title>The First and Last Freedom</Title>
        <Author>J. Krishnamurti</Author>
        <Date>1954</Date>
        <ISBN>0-06-064831-7</ISBN>
        <Publisher>Harper &amp; Row</Publisher>
    </Book>
</Catalogue>

BookCatalogue3.xml

An XML Instance Document of BookCatalogue7.xsd


Attributes

• On the next slide I show a version of the BookCatalogue DTD that uses attributes. On the following slide I show how this is implemented using XML Schemas.


<!-- A book catalogue contains zero or more books -->
<!ELEMENT BookCatalogue (Book)*>
<!-- A Book has a Title, one or more Authors, a Date, an ISBN, and a Publisher -->
<!ELEMENT Book (Title, Author+, Date, ISBN, Publisher)>
<!-- A Book has three attributes - Category, InStock, and Reviewer.
     Category must be either "autobiography", "non-fiction", or "fiction".
     A value must be supplied for this attribute whenever a Book element
     is used within a document.
     InStock can be either "true" or "false". If no value is supplied it defaults to "false".
     Reviewer contains the name of the reviewer. It defaults to "" if no value is supplied -->
<!ATTLIST Book
    Category (autobiography | non-fiction | fiction) #REQUIRED
    InStock (true | false) "false"
    Reviewer CDATA "">
<!ELEMENT Title (#PCDATA)>
<!ELEMENT Author (#PCDATA)>
<!-- A Date may have a Month. It must have a Year. -->
<!ELEMENT Date (Month?, Year)>
<!ELEMENT ISBN (#PCDATA)>
<!ELEMENT Publisher (#PCDATA)>
<!ELEMENT Month (#PCDATA)>
<!ELEMENT Year (#PCDATA)>

BookCatalogue2.dtd


<?xml version="1.0"?>
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[
<!ATTLIST schema xmlns:cat CDATA #IMPLIED>
]>
<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.somewhere.org/BookCatalogue"
        xmlns:cat="http://www.somewhere.org/BookCatalogue">
    <element name="BookCatalogue">
        <annotation>
            <info>A book catalogue contains zero or more books</info>
        </annotation>
        <type>
            <element ref="cat:Book" minOccurs="0" maxOccurs="*"/>
        </type>
    </element>
    <element name="Book">
        <annotation>
            <info>A Book has a Title, one or more Authors, a Date, an ISBN, and a Publisher</info>
        </annotation>
        <type>
            <element ref="cat:Title"/>
            <element ref="cat:Author" maxOccurs="*"/>
            <element ref="cat:Date"/>
            <element ref="cat:ISBN"/>
            <element ref="cat:Publisher"/>
            <attributeGroup ref="cat:BookAttributes"/>
        </type>
    </element>

cont. --->


    <attributeGroup name="BookAttributes">
        <annotation>
            <info>
                A Book has three attributes - Category, InStock, and Reviewer.
                Category must be either "autobiography", "non-fiction", or "fiction".
                A value must be supplied for this attribute whenever a Book element
                is used within a document.
                InStock can be either "true" or "false". If no value is supplied it defaults to "false".
                Reviewer contains the name of the reviewer. It defaults to "" if no value is supplied.
            </info>
        </annotation>
        <attribute name="Category" minOccurs="1">
            <datatype source="string">
                <enumeration value="autobiography"/>
                <enumeration value="non-fiction"/>
                <enumeration value="fiction"/>
            </datatype>
        </attribute>
        <attribute name="InStock" type="boolean" default="false"/>
        <attribute name="Reviewer" type="string" default=""/>
    </attributeGroup>
    <element name="Title" type="string"/>
    <element name="Author" type="string"/>
    <element name="Date" type="string"/>
    <element name="ISBN" type="string"/>
    <element name="Publisher" type="string"/>
</schema>

BookCatalogue8.xsd
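To make the three attribute rules concrete, here is a small Python sketch of what they mean for a single Book element. It is an illustration only: book_attributes is a hypothetical helper, and a real schema processor would derive these rules from the schema rather than hard-code them.

```python
import xml.etree.ElementTree as ET

# The enumerated values of Category, copied from the schema above.
CATEGORIES = {"autobiography", "non-fiction", "fiction"}


def book_attributes(book):
    """Return the effective attribute values of a Book, applying the schema defaults."""
    category = book.get("Category")
    if category is None:
        raise ValueError("Category is required")
    if category not in CATEGORIES:
        raise ValueError(f"Category {category!r} is not an enumerated value")
    return {
        "Category": category,
        "InStock": book.get("InStock", "false"),  # default="false"
        "Reviewer": book.get("Reviewer", ""),     # default=""
    }


print(book_attributes(ET.fromstring('<Book Category="fiction"/>')))
```

A Book with no Category, or with a value outside the enumeration, raises ValueError in this sketch; the two defaulted attributes may simply be left out.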


<attribute name="Category" minOccurs="1">
    <datatype source="string">
        <enumeration value="autobiography"/>
        <enumeration value="non-fiction"/>
        <enumeration value="fiction"/>
    </datatype>
</attribute>

"minOccurs is set to 1, and the default value for maxOccurs is 1. Thus, this attribute is REQUIRED."


Alternate Schema

• On the next slide is another way of expressing the last example.


<?xml version="1.0"?>
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[
<!ATTLIST schema xmlns:cat CDATA #IMPLIED>
]>
<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.somewhere.org/BookCatalogue"
        xmlns:cat="http://www.somewhere.org/BookCatalogue">
    <element name="BookCatalogue">
        <type>
            <element name="Book" minOccurs="0" maxOccurs="*">
                <type>
                    <element name="Title" type="string"/>
                    <element name="Author" type="string"/>
                    <element name="Date" type="string"/>
                    <element name="ISBN" type="string"/>
                    <element name="Publisher" type="string"/>
                    <attribute name="Category" minOccurs="1">
                        <datatype source="string">
                            <enumeration value="autobiography"/>
                            <enumeration value="non-fiction"/>
                            <enumeration value="fiction"/>
                        </datatype>
                    </attribute>
                    <attribute name="InStock" type="boolean" default="false"/>
                    <attribute name="Reviewer" type="string" default=""/>
                </type>
            </element>
        </type>
    </element>
</schema>

BookCatalogue9.xsd


Note about Attributes

• The attribute declarations always come last, after the element declarations.

Do Lab 8


group Element

• The group element enables you to group together element declarations.

• Note: the group element is just for grouping together element declarations, no attribute declarations allowed.


<?xml version="1.0"?>
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[
<!ATTLIST schema xmlns:cat CDATA #IMPLIED>
]>
<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.somewhere.org/BookCatalogue"
        xmlns:cat="http://www.somewhere.org/BookCatalogue">
    <element name="BookCatalogue">
        <type>
            <element name="Book" minOccurs="0" maxOccurs="*">
                <type>
                    <group ref="cat:BookElements"/>
                    <attribute name="Category" minOccurs="1">
                        <datatype source="string">
                            <enumeration value="autobiography"/>
                            <enumeration value="non-fiction"/>
                            <enumeration value="fiction"/>
                        </datatype>
                    </attribute>
                    <attribute name="InStock" type="boolean" default="false"/>
                    <attribute name="Reviewer" type="string" default=""/>
                </type>
            </element>
        </type>
    </element>
    <group name="BookElements" order="seq">
        <element name="Title" type="string"/>
        <element name="Author" type="string"/>
        <element name="Date" type="string"/>
        <element name="ISBN" type="string"/>
        <element name="Publisher" type="string"/>
    </group>
</schema>

BookCatalogue10.xsd


Expressing Alternates

DTD:

<!ELEMENT signature (name | (name, date))>

XML Schema:

<?xml version="1.0"?>
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[
<!ATTLIST schema xmlns:sig CDATA #IMPLIED>
]>
<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.somewhere.org/Examples"
        xmlns:sig="http://www.somewhere.org/Examples">
    <element name="signature">
        <type>
            <group order="choice">
                <element ref="sig:name"/>
                <group order="seq">
                    <element ref="sig:name"/>
                    <element name="date" type="date"/>
                </group>
            </group>
        </type>
    </element>
    <element name="name" type="string"/>
</schema>


Expressing Repeats

DTD:

<!ELEMENT binary-digit (zero | one)*>

XML Schema:

<?xml version="1.0"?>
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[
<!ATTLIST schema xmlns:bi CDATA #IMPLIED>
]>
<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.somewhere.org/Examples"
        xmlns:bi="http://www.somewhere.org/Examples">
    <element name="binary-digit">
        <type>
            <group order="choice" maxOccurs="*">
                <element name="zero" type="bi:zero-digit"/>
                <element name="one" type="bi:one-digit"/>
            </group>
        </type>
    </element>
    <datatype name="zero-digit" source="string">
        <enumeration value="0"/>
    </datatype>
    <datatype name="one-digit" source="string">
        <enumeration value="1"/>
    </datatype>
</schema>


Expressing Any Order

Problem: create an element, Book, which contains Author, Title, Date, ISBN, and Publisher, in any order (Note: this cannot be done with DTDs).

XML Schema:

<?xml version="1.0"?>
<!DOCTYPE schema SYSTEM "xml-schema.dtd">
<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.somewhere.org/Examples">
    <element name="Book">
        <type>
            <group order="all">
                <element name="Title" type="string"/>
                <element name="Author" type="string"/>
                <element name="Date" type="date"/>
                <element name="ISBN" type="string"/>
                <element name="Publisher" type="string"/>
            </group>
        </type>
    </element>
</schema>

order="all" means that Book must contain all five child elements, but they may occur in any order.

Note: minOccurs, maxOccurs are both fixed to a value of "1".
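The rule for order="all" can be stated in one line of Python. The sketch below is illustrative only (satisfies_all_group is a hypothetical name, and element content and namespaces are ignored): it checks that each of the five children occurs exactly once, in whatever order, with nothing extra.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# The five children of Book, each with minOccurs = maxOccurs = 1.
REQUIRED = ("Title", "Author", "Date", "ISBN", "Publisher")


def satisfies_all_group(book):
    """True if each required child occurs exactly once, in any order, with no extras."""
    return Counter(child.tag for child in book) == Counter(REQUIRED)


shuffled = ET.fromstring(
    "<Book><ISBN/><Title/><Publisher/><Date/><Author/></Book>")
missing = ET.fromstring("<Book><Title/><Author/><Date/><Publisher/></Book>")
print(satisfies_all_group(shuffled), satisfies_all_group(missing))
```

Comparing multisets rather than sequences is what makes the order irrelevant while still rejecting missing or repeated children.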

Do Lab 9


Empty Element

Schema:

<?xml version="1.0"?>
<!DOCTYPE schema SYSTEM "xml-schema.dtd">
<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.somewhere.org/Examples">
    <element name="image">
        <type content="empty">
            <attribute name="href" type="uri"/>
        </type>
    </element>
</schema>

Instance doc:

<image href="http://www.xfront.org/Rog.gif"/>

Do Lab 10


Uniqueness

• DTDs provided the ID attribute datatype for uniqueness (i.e., an ID value must be unique throughout the entire document, and the XML parser enforces this).

• XML Schema has much enhanced uniqueness capabilities:

– enables you to define element content to be unique.

– enables you to define non-ID attributes to be unique.

– enables you to define a combination of element content and attributes to be unique.

– enables you to distinguish between unique and key.

– enables you to declare the range of the document over which something is unique


unique vs key

• Key: an element/attribute (or combination thereof) which is defined to be a key must

– always be present (minOccurs must be greater than zero)

– be non-nullable (i.e., nullable="false")

– be unique

• Key implies unique, but unique does not imply key
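The difference between the two can be sketched in Python as follows. This is a simplified model, not a schema processor: violations is a hypothetical function, each row stands for the field values of one selected element, None marks an absent field, and it is assumed that rows with an absent field simply fall outside a unique constraint.

```python
def violations(rows, kind):
    """Check field tuples against a 'unique' or 'key' constraint.

    Each row is a tuple of field values; None marks a field that is absent.
    """
    seen, errors = set(), []
    for row in rows:
        if None in row:
            if kind == "key":
                errors.append(f"missing field in {row}")
            continue  # assumed: incomplete rows are outside a unique constraint
        if row in seen:
            errors.append(f"duplicate {row}")
        seen.add(row)
    return errors


rows = [("Illusions", "fiction"), (None, "fiction"), ("Illusions", "fiction")]
print(len(violations(rows, "unique")), len(violations(rows, "key")))
```

Every violation of unique is also a violation of key, but key reports the absent field as well.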


<?xml version="1.0"?>
<Library xmlns="http://www.somewhere.org/Library">
    <BookCatalogue>
        <Book Category="fiction" InStock="true" Reviewer="John Doe">
            <Title>Illusions The Adventures of a Reluctant Messiah</Title>
            <Author>Richard Bach</Author>
            <Date>1977</Date>
            <ISBN>0-440-34319-4</ISBN>
            <Publisher>Dell Publishing Co.</Publisher>
        </Book>
        <Book Category="non-fiction" InStock="true" Reviewer="John Doe">
            <Title>The First and Last Freedom</Title>
            <Author>J. Krishnamurti</Author>
            <Date>1954</Date>
            <ISBN>0-06-064831-7</ISBN>
            <Publisher>Harper &amp; Row</Publisher>
        </Book>
    </BookCatalogue>
    <CheckoutRegister>
        <Person>Roger Costello</Person>
        <Book titleRef="Illusions The Adventures of a Reluctant Messiah" categoryRef="fiction"/>
    </CheckoutRegister>
</Library>

Title + Category = key

titleRef + categoryRef = keyref


<?xml version="1.0"?>
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[
<!ATTLIST schema xmlns:lib CDATA #IMPLIED>
]>
<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.somewhere.org/Library"
        xmlns:lib="http://www.somewhere.org/Library">
    <element name="Library">
        <type>
            <element name="BookCatalogue">
                <type>
                    <element name="Book" minOccurs="0" maxOccurs="*">
                        <type>
                            <element name="Title" type="string"/>
                            <element name="Author" type="string"/>
                            <element name="Date" type="string"/>
                            <element name="ISBN" type="string"/>
                            <element name="Publisher" type="string"/>
                            <attribute name="Category" minOccurs="1">
                                <datatype source="string">
                                    <enumeration value="autobiography"/>
                                    <enumeration value="non-fiction"/>
                                    <enumeration value="fiction"/>
                                </datatype>
                            </attribute>
                            <attribute name="InStock" type="boolean" default="false"/>
                            <attribute name="Reviewer" type="string" default=""/>
                        </type>
                    </element>
                </type>
            </element>


            <element name="CheckoutRegister">
                <type>
                    <element name="Person" type="string"/>
                    <element name="Book">
                        <type content="empty">
                            <attribute name="titleRef" type="string"/>
                            <attribute name="categoryRef" type="string"/>
                        </type>
                    </element>
                </type>
            </element>
        </type>
    </element>
    <key name="bookKey">
        <selector>Library/BookCatalogue/Book</selector>
        <field>Title</field>
        <field>@Category</field>
    </key>
    <keyref name="bookRef" refer="bookKey">
        <selector>Library/CheckoutRegister/Book</selector>
        <field>@titleRef</field>
        <field>@categoryRef</field>
    </keyref>
</schema>

BookCatalogue11.xsd


Defining a Key

<key name="bookKey">
    <selector>Library/BookCatalogue/Book</selector>
    <field>Title</field>
    <field>@Category</field>
</key>

"I define that the combination of the contents of Title plus the value of the attribute Category is to be unique throughout an XML instance document. The Title element and Category attribute lie within the XPath expression shown in the selector element."

Note: you can have one or more field elements. The key will be the combination of all the fields.


Defining a Reference to a Key

<keyref name="bookRef" refer="bookKey">
    <selector>Library/CheckoutRegister/Book</selector>
    <field>@titleRef</field>
    <field>@categoryRef</field>
</keyref>

"I define that the values in the two attributes, titleRef and categoryRef, combined, are a reference to a key defined by bookKey. The two attributes are located at the XPath expression found in the selector element."

Note: if there are 2 fields in the key, then there must be 2 fields in the keyref, if there are 3 fields in the key, then there must be 3 fields in the keyref, etc. Further, the fields in the keyref must match in type and position to the key.
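What a processor has to verify for bookKey and bookRef can be sketched in Python with the standard library. The sketch is an assumption-laden illustration: check_keys is a hypothetical function, the selectors and fields are hard-coded rather than read from the schema, and the instance is a trimmed copy of the Library document shown earlier.

```python
import xml.etree.ElementTree as ET

NS = {"lib": "http://www.somewhere.org/Library"}

# Trimmed copy of the Library instance (assumption: unrelated children omitted).
DOC = """<Library xmlns="http://www.somewhere.org/Library">
  <BookCatalogue>
    <Book Category="fiction"><Title>Illusions The Adventures of a Reluctant Messiah</Title></Book>
    <Book Category="non-fiction"><Title>The First and Last Freedom</Title></Book>
  </BookCatalogue>
  <CheckoutRegister>
    <Person>Roger Costello</Person>
    <Book titleRef="Illusions The Adventures of a Reluctant Messiah" categoryRef="fiction"/>
  </CheckoutRegister>
</Library>"""


def check_keys(root):
    """Return a list of key/keyref violations (empty if the document is consistent)."""
    errors, keys = [], set()
    # key bookKey: selector Library/BookCatalogue/Book, fields Title and @Category
    for book in root.findall("lib:BookCatalogue/lib:Book", NS):
        key = (book.findtext("lib:Title", namespaces=NS), book.get("Category"))
        if None in key:
            errors.append(f"incomplete key {key}")
        elif key in keys:
            errors.append(f"duplicate key {key}")
        else:
            keys.add(key)
    # keyref bookRef: fields @titleRef and @categoryRef, matched by position
    for ref in root.findall("lib:CheckoutRegister/lib:Book", NS):
        target = (ref.get("titleRef"), ref.get("categoryRef"))
        if target not in keys:
            errors.append(f"dangling keyref {target}")
    return errors


print(check_keys(ET.fromstring(DOC)))
```

The tuples are compared position by position, which is why the keyref must list its fields in the same order as the key.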


Defining Uniqueness

<unique name="bookUnique">
    <selector>Library/BookCatalogue/Book</selector>
    <field>Title</field>
    <field>@Category</field>
</unique>

"I define that the combination of the contents of Title plus the value of the attribute Category is to be unique throughout an XML instance document. The Title element and Category attribute lie within the XPath expression shown in the selector element."


Specifying scope of uniqueness in XML Schemas?

- The key/keyref/unique elements may be placed anywhere in your schema.

- Where you place them determines the scope of the uniqueness.

- In our example we placed the key/keyref elements at the top-level (direct child of the schema element). Thus, we are stating that in an instance document the uniqueness is with respect to the entire document.

- On the other hand, if we were to place the key/keyref elements as a child of the Book element then the uniqueness will have a scope of just the Book element. Thus, over the entire instance document there may be repeats, but within any Book element it will be unique.


any Element

• The any element allows any well-formed XML.

<element name="free-form">
    <type>
        <any/>
    </type>
</element>

The free-form element can contain any well-formed XML (WFXML).

<free-form>
    <comment source="Roger">This is great!</comment>
</free-form>


anyAttribute

• The anyAttribute allows any attribute

<element name="free-form">
    <type content="empty">
        <anyAttribute/>
    </type>
</element>

The free-form element can have any attribute.

<free-form comment="This is great"/>


(Almost) any Element/Attribute

<any namespace="##other"/> allows any well-formed XML element, provided the element is in another namespace than the one we're defining.

<any namespace="http://www.somewhere.com"/> allows any well-formed XML element, provided it's from the specified namespace.

<any namespace="##targetNamespace"/> allows any well-formed XML element, provided it's from the namespace that we're defining.

<anyAttribute namespace="##other"/> allows any attribute, provided the attribute is in another namespace than the one we're defining.

<anyAttribute namespace="http://www.somewhere.com"/> allows any attribute, provided it's from the specified namespace.

<anyAttribute namespace="##targetNamespace"/> allows any attribute, provided it's from the namespace that we're defining.
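The namespace tests listed above are the same for any and anyAttribute, and can be written as a small Python function. This is a sketch of the rules as stated on this slide only; wildcard_allows is a hypothetical name, and the finer points of the specification (for example, items in no namespace) are deliberately left out.

```python
def wildcard_allows(namespace, target_ns, item_ns):
    """Decide whether an element/attribute in item_ns matches an any/anyAttribute wildcard.

    namespace -- value of the wildcard's namespace attribute, or None if absent
    target_ns -- targetNamespace of the schema being defined
    item_ns   -- namespace of the element or attribute in the instance document
    """
    if namespace is None:                  # plain <any/> or <anyAttribute/>
        return True
    if namespace == "##other":             # anything but the namespace we're defining
        return item_ns != target_ns
    if namespace == "##targetNamespace":   # only the namespace we're defining
        return item_ns == target_ns
    return item_ns == namespace            # only the namespace that is spelled out


TARGET = "http://www.somewhere.org/Examples"
print(wildcard_allows("##other", TARGET, "http://www.somewhere.com"))
```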


Validating an XML Instance Document

• Validation can apply to the entire XML instance document, or to a single element.


<Library xmlns:book="http://www.somewhere.org/Examples"
         xmlns:person="http://www.somewhere-else.org/Person"
         xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
         xsi:schemaLocation="http://www.somewhere.org/Examples
                             http://www.somewhere.org/Examples/Book.xsd
                             http://www.somewhere-else.org/Person
                             http://www.somewhere-else.org/Person/Person.xsd">
    <BookCatalogue>
        <book:Book>
            <book:Title>Illusions The Adventures of a Reluctant Messiah</book:Title>
            <book:Author>Richard Bach</book:Author>
            <book:Date>1977</book:Date>
            <book:ISBN>0-440-34319-4</book:ISBN>
            <book:Publisher>Dell Publishing Co.</book:Publisher>
        </book:Book>
        <book:Book>
            <book:Title>The First and Last Freedom</book:Title>
            <book:Author>J. Krishnamurti</book:Author>
            <book:Date>1954</book:Date>
            <book:ISBN>0-06-064831-7</book:ISBN>
            <book:Publisher>Harper &amp; Row</book:Publisher>
        </book:Book>
    </BookCatalogue>
    <Employees>
        <person:Employee>
            <person:Name>John Doe</person:Name>
            <person:SSN>123-45-6789</person:SSN>
        </person:Employee>
        <person:Employee>
            <person:Name>Sally Smith</person:Name>
            <person:SSN>000-11-2345</person:SSN>
        </person:Employee>
    </Employees>
</Library>

Library.xml

Validating using two schemas


Assembling a Schema from Multiple Schema Documents

• The include element allows you to bring in schema definitions from other schema documents

– They must all have the same namespace

– The net effect of include is as though you had typed all the definitions directly into the containing schema

<schema …>
    <include schemaLocation="Book.xsd"/>
    <include schemaLocation="Person.xsd"/>
    …
</schema>

[Figure: Library.xsd includes Book.xsd and Person.xsd]


<?xml version="1.0"?>
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[
<!ATTLIST schema xmlns:lib CDATA #IMPLIED>
]>
<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.somewhere.org/Examples"
        xmlns:lib="http://www.somewhere.org/Examples">
    <include schemaLocation="http://www.somewhere.org/Examples/Book.xsd"/>
    <include schemaLocation="http://www.somewhere.org/Examples/Person.xsd"/>
    <element name="Library">
        <type>
            <element name="BookCatalogue">
                <type>
                    <element ref="lib:Book" minOccurs="0" maxOccurs="*"/>
                </type>
            </element>
            <element name="Employees">
                <type>
                    <element ref="lib:Employee" minOccurs="0" maxOccurs="*"/>
                </type>
            </element>
        </type>
    </element>
</schema>

Library.xsd
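
The "as though you had typed all the definitions directly into the containing schema" effect of include can be sketched in a few lines of Python. This is only an illustration, not how a schema processor is implemented: the function name resolve_includes is made up, the documents dictionary stands in for fetching the files, and Book.xsd and Person.xsd are reduced to one hypothetical declaration each.

```python
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/1999/XMLSchema"   # schema namespace used on the slides
NS = "http://www.somewhere.org/Examples"   # target namespace used on the slides


def resolve_includes(containing, documents):
    """Return a schema element in which every <include> is replaced by the
    top-level children of the schema document it points to.

    `documents` maps a schemaLocation string to schema text (an assumption
    of this sketch; a real processor would retrieve the documents).
    """
    root = ET.fromstring(containing)
    merged = ET.Element(root.tag, root.attrib)
    for child in root:
        if child.tag != f"{{{XSD}}}include":
            merged.append(child)
            continue
        included = ET.fromstring(documents[child.get("schemaLocation")])
        # include only works when both documents share one target namespace
        if included.get("targetNamespace") != root.get("targetNamespace"):
            raise ValueError("included schema has a different targetNamespace")
        merged.extend(list(included))
    return merged


# Stand-ins for the real Book.xsd and Person.xsd (contents are hypothetical)
BOOK = f'<schema xmlns="{XSD}" targetNamespace="{NS}"><element name="Book"/></schema>'
PERSON = f'<schema xmlns="{XSD}" targetNamespace="{NS}"><element name="Employee"/></schema>'
LIBRARY = (
    f'<schema xmlns="{XSD}" targetNamespace="{NS}">'
    '<include schemaLocation="Book.xsd"/>'
    '<include schemaLocation="Person.xsd"/>'
    '<element name="Library"/>'
    '</schema>'
)

merged = resolve_includes(LIBRARY, {"Book.xsd": BOOK, "Person.xsd": PERSON})
names = [el.get("name") for el in merged]
print(names)
```

After the merge, Book, Employee and Library all sit side by side at the top level of one schema, which is why lib:Book and lib:Employee can be referenced from Library.xsd.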

import Element

• The import element allows you to reference elements in another namespace

<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.somewhere.org/Examples"
        xmlns:html="http://www.w3.org/1999/XHTML">
    <import namespace="http://www.w3.org/1999/XHTML"
            schemaLocation="http://www.w3.org/XHTML/xhtml.xsd"/>
    ...
    <element ref="html:table"/>
    …
</schema>
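
To see how a prefixed reference such as html:table relates to the import element, the following Python sketch expands each ref into a (namespace, local name) pair and reports whether that namespace was imported. It is a rough illustration only: the helper name check_refs and the wrapping type PageType are hypothetical, and a real schema processor does far more than this.

```python
import io
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/1999/XMLSchema"

# Trimmed-down version of the slide's schema; PageType is a made-up wrapper
SCHEMA = """<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.somewhere.org/Examples"
        xmlns:html="http://www.w3.org/1999/XHTML">
  <import namespace="http://www.w3.org/1999/XHTML"
          schemaLocation="http://www.w3.org/XHTML/xhtml.xsd"/>
  <type name="PageType">
    <element ref="html:table"/>
  </type>
</schema>"""


def check_refs(schema_text):
    """Return (namespace, local name, was_imported) for every prefixed ref."""
    prefixes = {}     # prefix -> namespace URI, from xmlns:* declarations
    imported = set()  # namespaces named by <import> elements
    refs = []
    source = io.BytesIO(schema_text.encode("utf-8"))
    for event, item in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            prefix, uri = item
            prefixes[prefix] = uri
        elif item.tag == f"{{{XSD}}}import":
            imported.add(item.get("namespace"))
        elif item.get("ref") and ":" in item.get("ref"):
            prefix, local = item.get("ref").split(":", 1)
            refs.append((prefixes[prefix], local))
    return [(ns, local, ns in imported) for ns, local in refs]


result = check_refs(SCHEMA)
print(result)
```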

Declaration vs Definition

• You make declarations of things that will be used in an XML instance document.

• You make definitions of things that are just used in the schema document; a definition creates a new type

Declarations:
- element declarations
- attribute declarations

Definitions:
- type definitions
- attribute group definitions
- model group definitions

Symbol Space

• Each type definition creates a new symbol space
• Top-level element declarations are in the top-level symbol space

<?xml version="1.0"?>
<!DOCTYPE schema SYSTEM "xml-schema.dtd">
<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.somewhere.org/Examples">
    <element name="title" type="string"/>
    <type name="ChapterType">
        <element name="title" type="string"/>
    </type>
    <type name="SectionType">
        <element name="title" type="string"/>
    </type>
</schema>

3 different titles
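
One way to picture the symbol spaces is as separate dictionaries, one for the top level and one per type definition. The Python sketch below groups the element declarations of the schema above that way; the function name symbol_spaces and the "(top level)" label are inventions of this sketch, and the DOCTYPE line is left out for brevity.

```python
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/1999/XMLSchema"

SCHEMA = """<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.somewhere.org/Examples">
  <element name="title" type="string"/>
  <type name="ChapterType">
    <element name="title" type="string"/>
  </type>
  <type name="SectionType">
    <element name="title" type="string"/>
  </type>
</schema>"""


def symbol_spaces(schema_text):
    """Map each symbol space to the element names declared directly in it."""
    root = ET.fromstring(schema_text)
    spaces = {"(top level)": []}
    for child in root:
        if child.tag == f"{{{XSD}}}element":
            # top-level declarations share the top-level symbol space
            spaces["(top level)"].append(child.get("name"))
        elif child.tag == f"{{{XSD}}}type":
            # each type definition opens a symbol space of its own
            spaces[child.get("name")] = [
                el.get("name") for el in child.findall(f"{{{XSD}}}element")
            ]
    return spaces


spaces = symbol_spaces(SCHEMA)
for space, names in spaces.items():
    print(space, names)
```

The name title appears three times without conflict because each occurrence lives in a different symbol space.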

Version Management

• The schema element has an optional attribute, version, which you may use to indicate the version of your schema (for private version documentation of the schema)

<?xml version="1.0"?>
<!DOCTYPE schema SYSTEM "xml-schema.dtd">
<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.somewhere.org/Examples"
        version="1.0">
    ...
</schema>

<type> or <datatype>?

• When do you use the type element and when do you use the datatype element?
– Use the type element when there are elements and/or attributes
– Use the datatype element when it is a primitive type (string, integer, etc.)

null Content

• You can indicate in a schema that an element may be null in the instance document. Empty content vs null:

– Empty: an element with a type of content="empty" is constrained to have no content.

– null: an instance document element may indicate no value is available by setting an attribute - xsi:null - equal to 'true'

XML Schema:

<element name="PersonName">
    <type>
        <element name="forename" type="NMTOKEN"/>
        <element name="middle" type="NMTOKEN" nullable="true"/>
        <element name="surname" type="NMTOKEN"/>
    </type>
</element>

XML instance document:

<PersonName>
    <forename>John</forename>
    <middle xsi:null="true"/>
    <surname>Doe</surname>
</PersonName>

The content of middle can be an NMTOKEN value or we can indicate that its content is undefined.
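
From the point of view of an application reading the instance document, the difference between "null" and ordinary content might be handled as in the Python sketch below. It is an illustration only: the helper name value_of is hypothetical, and the xsi prefix is bound to the instance namespace used in Library.xml earlier in this tutorial.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/1999/XMLSchema/instance"  # as declared in Library.xml

INSTANCE = f"""<PersonName xmlns:xsi="{XSI}">
  <forename>John</forename>
  <middle xsi:null="true"/>
  <surname>Doe</surname>
</PersonName>"""


def value_of(element):
    """Return None when xsi:null="true", otherwise the element's text."""
    if element.get(f"{{{XSI}}}null") == "true":
        return None           # no value is available
    return element.text or ""  # an element with no text yields ""


person = ET.fromstring(INSTANCE)
values = {child.tag: value_of(child) for child in person}
print(values)
```

Here middle maps to None (no value available), which an application can tell apart from an element that is present but has no text.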

ur-type

• The ur-type is the source for all types which do not specify a value for the source attribute. It is the type for all elements which do not specify a type.
– Example: <element name="foo"/>

<type name="ur-type" content="mixed">
    <any min="0" max="*"/>
    <anyAttribute/>
</type>