lecture #2 xml

Lecture #2: XML Introduction. Dr. Adil Alpkocak, Dokuz Eylul University, Department of Computer Engineering. CME2002 Data Organization and Management.

Upload: adil-alpkocak

Post on 19-Mar-2017


TRANSCRIPT

Page 1: Lecture #2 xml

Department of Computer Engineering
Dokuz Eylül University

Lecture #2
XML Introduction

Dr. Adil Alpkocak
Dokuz Eylul University
Dept of Computer Engineering

CME2002 Data Organization and Management

Page 2: Lecture #2 xml

Today

• Field and record organization
• XML introduction

Page 3: Lecture #2 xml

Files

A file can be seen as
1. A stream of bytes (no structure), or
2. A collection of records with fields

Page 4: Lecture #2 xml

#1. File as a stream of bytes

• File is viewed as a sequence of bytes:

87359CarrollAlice in wonderland38180FolkFile Structures ...

• Data semantics is lost: there is no way to tell where one field ends and the next begins.

Page 5: Lecture #2 xml

#2: File as a Collection of Records

Definitions
File: a collection of records.
Record: a collection of related fields.
Field: the smallest logically meaningful unit of information in a file.
Key: a subset of the fields in a record used to identify (uniquely) the record.

e.g. In the example file of books:
– Each line corresponds to a record.
– Fields in each record: ISBN, Author, Title

Page 6: Lecture #2 xml

Different Notation

Column      Row      Table
Field       Record   File
Attribute   Tuple    Relation

Terms in the same column are synonyms, and can be used interchangeably.

Page 7: Lecture #2 xml

Record Keys

• Primary key: a key that uniquely identifies a record.
• Secondary key: other keys that may be used for search
– Author name
– Book title
– Author name + book title

• Note that in general not every field is a key (keys correspond to fields, or a combination of fields, that may be used in a search).
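As a minimal sketch of the distinction above, the Python snippet below keeps the two example book records in memory, looks one up by its primary key (ISBN), and searches by a secondary key (author). The helper names `find_by_isbn` and `find_by_author` are illustrative and do not come from the lecture.

```python
# Two book records from the running example; each dict is one record.
books = [
    {"isbn": "87359", "author": "Carroll", "title": "Alice in wonderland"},
    {"isbn": "38180", "author": "Folk", "title": "File Structures"},
]


def find_by_isbn(records, isbn):
    """Primary-key lookup: at most one record can match."""
    for record in records:
        if record["isbn"] == isbn:
            return record
    return None


def find_by_author(records, author):
    """Secondary-key search: any number of records may match."""
    return [record for record in records if record["author"] == author]


print(find_by_isbn(books, "38180")["title"])
print(len(find_by_author(books, "Carroll")))
```

A primary-key lookup returns a single record or nothing, while a secondary-key search returns a list because several records may share the same author.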

Page 8: Lecture #2 xml

Record Structures

1. Fixed-length records.
2. Fixed number of fields.
3. Begin each record with a length indicator.
4. Use an index to keep track of addresses.
5. Place a delimiter at the end of the record.

Page 9: Lecture #2 xml

1. Fixed-length records

Two ways of making fixed-length records:

1. Fixed-length records with fixed-length fields.

87359 Carroll Alice in wonderland
38180 Folk    File Structures

2. Fixed-length records with variable-length fields.

87359|Carroll|Alice in wonderland| unused
38180|Folk|File Structures| unused

Page 10: Lecture #2 xml

Fixed-Length Records

• e.g.

struct PERSON {
    char last[11];
    char first[11];
    char address[16];
    char city[16];
    char state[3];
    char zip[10];
} person;

Will produce a fixed-size record of 67 bytes.
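The 67-byte figure is the sum of the six array sizes (11 + 11 + 16 + 16 + 3 + 10). As a rough cross-check, the sketch below mirrors the struct with Python's `struct` module; the format string `PERSON_FORMAT` and the sample values are assumptions made for this illustration, not part of the lecture.

```python
import struct

# One "Ns" code per char array in struct PERSON; "s" fields are never padded.
PERSON_FORMAT = "11s11s16s16s3s10s"

record_size = struct.calcsize(PERSON_FORMAT)
print(record_size)

# Pack one record; short values are padded with NUL bytes to the field width.
packed = struct.pack(
    PERSON_FORMAT, b"Ames", b"Mary", b"123 Maple", b"StillWater", b"OK", b"74075"
)
last, first, *_ = struct.unpack(PERSON_FORMAT, packed)
print(last.rstrip(b"\x00"), first.rstrip(b"\x00"))
```

Every packed record has the same length regardless of how long the actual values are, which is what makes direct access by record number possible.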

Page 11: Lecture #2 xml

Fixed-Length Records

• The fixed-length record structure, however, does NOT imply a fixed-length field structure.
• Fixed-length records are frequently used as “containers” to hold variable numbers of variable-length fields.
• Fixed-length record structures are among the most commonly used methods for organizing files.

Page 12: Lecture #2 xml

Variable-length records

2. Fixed number of fields:

87359|Carroll|Alice in wonderland|38180|Folk|File Structures| ...

3. Record beginning with length indicator:

3387359|Carroll|Alice in wonderland|2638180|Folk|File Structures| ..

4. Use an index file to keep track of record addresses:
– The index file keeps the byte offset for each record; this allows us to search the index (which has fixed-length records) in order to discover the beginning of the record.

5. Placing a delimiter: e.g. end-of-line char
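Methods 4 and 5 can be combined: records end with a newline delimiter, and a separate index remembers where each one starts. The sketch below uses an in-memory `io.BytesIO` object in place of a real file; `build_offset_index` and `read_record` are hypothetical helper names introduced only for this example.

```python
import io

# Newline-delimited, variable-length records (method 5 above).
data = b"87359|Carroll|Alice in wonderland|\n38180|Folk|File Structures|\n"


def build_offset_index(stream):
    """Return the byte offset at which each record starts."""
    offsets = []
    stream.seek(0)
    while True:
        position = stream.tell()
        line = stream.readline()
        if not line:
            break
        offsets.append(position)
    return offsets


def read_record(stream, offsets, record_number):
    """Seek straight to a record using the index, then read it."""
    stream.seek(offsets[record_number])
    return stream.readline().rstrip(b"\n").decode("ascii")


stream = io.BytesIO(data)
index = build_offset_index(stream)
print(index)
print(read_record(stream, index, 1))
```

Because each index entry is a single fixed-size number, the index itself can be stored as fixed-length records even though the data records vary in length.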

Page 13: Lecture #2 xml

Field Structures

1. Fixed-length fields
87359Carroll Alice in wonderland38180Folk File Structures

2. Begin each field with a length indicator
058735907Carroll19Alice in wonderland053818004Folk15File Structures

3. Place a delimiter at the end of each field
87359|Carroll|Alice in wonderland|38180|Folk|File Structures|

4. Store field as keyword = value
ISBN=87359|AU=Carroll|TI=Alice in wonderland|ISBN=38180|AU=Folk|TI=File Structures

Page 14: Lecture #2 xml

Field Structures: 1. Fixing the Length of Fields

This method relies on creating fields of predictable fixed size.

struct PERSON {
    char last[11];
    char first[11];
    char address[16];
    char city[16];
    char state[3];
    char zip[10];
} person;

Page 15: Lecture #2 xml

Field Structures: 2. Beginning Each Field with a Length Indicator

This method requires that each field's data be preceded by an indicator of its length (in bytes).

E.g. 04Ames04Mary09123 Maple10StillWater02OK0574075

One of the disadvantages of this method is that it is more complex, since it requires extracting numbers and strings from a single string representing a record.
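The extra parsing work mentioned above can be sketched in a few lines. The function below assumes every length indicator is exactly two decimal digits, as in the slide's example, and that the data is plain ASCII so that characters and bytes coincide; `parse_length_prefixed` is an illustrative name.

```python
def parse_length_prefixed(record, width=2):
    """Split a record whose fields are each preceded by a `width`-digit length."""
    fields = []
    position = 0
    while position < len(record):
        # Read the length indicator, then take that many characters as the field.
        length = int(record[position:position + width])
        position += width
        fields.append(record[position:position + length])
        position += length
    return fields


record = "04Ames04Mary09123 Maple10StillWater02OK0574075"
print(parse_length_prefixed(record))
```

A two-digit indicator limits each field to 99 characters; a wider indicator trades space for longer fields.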

Page 16: Lecture #2 xml

Field Structures: 3. Separating Fields with Delimiters

• This method requires that the fields be separated by a selected special character or a sequence of characters called a delimiter.
• E.g. if “|” is used as a delimiter then a sample record would look like this:

Ames|Mary|123 Maple|StillWater|OK|574075|

Page 17: Lecture #2 xml

Field Structures: 4. Using a “keyword = value” expression

This method requires that each field's data be preceded by the field identifier (keyword).

last=Ames first=Mary address=123 Maple city=StillWater state=OK zip=574075

Can be used with the delimiter method to mark the field ends.

last=Ames|first=Mary|address=123 Maple|city=StillWater|state=OK|zip=574075
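A keyword=value record combined with the "|" delimiter maps naturally onto a dictionary. The following is a minimal sketch; `parse_keyword_record` is a hypothetical helper, and it assumes neither keywords nor values contain "|" or a leading "=".

```python
def parse_keyword_record(record, delimiter="|"):
    """Turn 'key=value|key=value' into a dict; empty pieces are skipped."""
    fields = {}
    for piece in record.split(delimiter):
        if not piece:
            continue
        keyword, _, value = piece.partition("=")
        fields[keyword] = value
    return fields


person = parse_keyword_record(
    "last=Ames|first=Mary|address=123 Maple|city=StillWater|state=OK|zip=574075"
)
print(person["address"])

# A record with missing fields parses without any special handling.
partial = parse_keyword_record("last=Ames|state=OK|")
print(partial.get("city"))
```

Since every field names itself, field order does not matter and absent fields simply do not appear in the result.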

Page 18: Lecture #2 xml

Field Structures: 4. Using a “keyword = value” expression

• Advantages:
  • each field provides information about itself
  • good format for dealing with missing fields
• Disadvantages:
  • In some applications a lot of space may be wasted on field keywords (up to 50%).

Page 19: Lecture #2 xml

XML
EXtensible Markup Language

Page 20: Lecture #2 xml


XML

• is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable.

Page 21: Lecture #2 xml

What is XML?

• XML stands for EXtensible Markup Language
• XML is a markup language much like HTML
• XML was designed to describe data
• XML tags are not predefined. You must define your own tags
• XML uses a Document Type Definition (DTD) or an XML Schema (XSD) to describe the data
• XML with a DTD or XSD is designed to be self-descriptive
• XML is a W3C Recommendation.

Page 22: Lecture #2 xml

Simple XML example

<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

Page 23: Lecture #2 xml

The Main Difference Between XML and HTML

XML was designed to carry data.

• XML and HTML were designed with different goals:
– XML was designed to describe data and to focus on what data is.
– HTML was designed to display data and to focus on how data looks.
• HTML is about displaying information, while XML is about describing data.
• XML is not a replacement for HTML.

Page 24: Lecture #2 xml


XML is Used to Exchange Data

With XML, data can be exchanged between incompatible systems

• In the real world, computer systems and databases contain data in incompatible formats.

• One of the most time-consuming challenges for developers has been to exchange data between such systems over the Internet.

• Converting the data to XML can greatly reduce this complexity and create data that can be read by many different types of applications.

Page 25: Lecture #2 xml


XML Can be Used to Share Data

With XML, plain text files can be used to share data

• Since XML data is stored in plain text format, XML provides a software- and hardware-independent way of sharing data.

• This makes it much easier to create data that different applications can work with.

• It also makes it easier to expand or upgrade a system to new operating systems, servers, applications, and new browsers

Page 26: Lecture #2 xml


XML Can be Used to Store Data

With XML, plain text files can be used to store data

• XML can also be used to store data in files or in databases.

• Applications can be written to store and retrieve information from the store, and generic applications can be used to display the data

Page 27: Lecture #2 xml


XML Can Make your Data More Useful

With XML, your data is available to more users.

• Since XML is independent of hardware, software and application, you can make your data available to more than just standard HTML browsers.

• Other clients and applications can access your XML files as data sources, like they are accessing databases.

• Your data can be made available to all kinds of "reading machines" (agents), and it is easier to make your data available for blind people, or people with other disabilities.

Page 28: Lecture #2 xml


XML Can be Used to Create New Languages

XML is the mother of WAP and WML

• The Wireless Markup Language (WML), used to markup Internet applications for handheld devices like mobile phones, is written in XML.

• For a list of XML based languages refer to wikipedia pages.

Page 29: Lecture #2 xml


XML Syntax Rules

• The syntax rules of XML are very simple and very strict.

• The rules are very easy to learn, and very easy to use.

• Because of this, creating software that can read and manipulate XML is very easy.

Page 30: Lecture #2 xml

An Example XML Document

XML documents use a self-describing and simple syntax

<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
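To see how software reads such a document, here is a small sketch using Python's standard `xml.etree.ElementTree` module. The choice of parser is an assumption for illustration; the lecture does not prescribe one.

```python
import xml.etree.ElementTree as ET

# Bytes input lets the parser honour the encoding named in the XML declaration.
document = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>"""

note = ET.fromstring(document)
print(note.tag)

# Each child element carries its own name (tag) next to its value (text).
for child in note:
    print(child.tag, "=", child.text)
```

The tag names play the same role as the keywords in the keyword=value field structure: each value arrives labelled with what it means.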

Page 31: Lecture #2 xml

All XML Elements Must Have a Closing Tag

With XML, it is illegal to omit the closing tag

• In XML all elements must have a closing tag, like this:

<p>This is a paragraph</p>
<p>This is another paragraph</p>

Page 32: Lecture #2 xml

XML Tags are Case Sensitive

Unlike HTML, XML tags are case sensitive

• With XML, the tag <Letter> is different from the tag <letter>.
• Opening and closing tags must therefore be written with the same case:

<Message>This is incorrect</message>
<message>This is correct</message>

Page 33: Lecture #2 xml

XML Elements Must be Properly Nested

Improper nesting of tags makes no sense to XML

• In HTML some elements can be improperly nested within each other like this:

<b><i>This text is bold and italic</b></i>

• In XML all elements must be properly nested within each other like this:

<b><i>This text is bold and italic</i></b>

Page 34: Lecture #2 xml

XML Documents Must Have a Root Element

All XML documents must contain a single tag pair to define a root element

• All other elements must be within this root element.
• All elements can have sub elements (child elements). Sub elements must be correctly nested within their parent element:

<root>
  <child>
    <subchild>.....</subchild>
  </child>
</root>

Page 35: Lecture #2 xml

XML Attribute Values Must be Quoted

With XML, it is illegal to omit quotation marks around attribute values

Incorrect (the date attribute value is not quoted):

<?xml version="1.0" encoding="ISO-8859-1"?>
<note date=12/11/2002>
  <to>Tove</to>
  <from>Jani</from>
</note>

Correct:

<?xml version="1.0" encoding="ISO-8859-1"?>
<note date="12/11/2002">
  <to>Tove</to>
  <from>Jani</from>
</note>

Page 36: Lecture #2 xml

With XML, White Space is Preserved

• With XML, the white space in your document is not truncated.
• This is unlike HTML. With HTML, a sentence like this:

Hello              my name is Tove,

will be displayed like this:

Hello my name is Tove,

because HTML reduces multiple, consecutive white space characters to a single white space.

Page 37: Lecture #2 xml

With XML, CR / LF is Converted to LF

With XML, a new line is always stored as LF

• In Windows applications, a new line is normally stored as a pair of characters: carriage return (CR) and line feed (LF). The character pair bears some resemblance to the typewriter actions of setting a new line.
• In Unix applications, a new line is normally stored as a LF character.
• Classic Mac OS applications use only a CR character to store a new line.
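The conversion named in the slide title can be observed with a parser. The sketch below feeds text with Windows-style line endings to Python's standard `xml.etree.ElementTree` (an assumed choice of parser) and inspects what the application receives.

```python
import xml.etree.ElementTree as ET

# The same two-line text stored with Windows-style CR LF line endings.
windows_style = "<body>line one\r\nline two</body>"

text = ET.fromstring(windows_style).text
print(repr(text))
```

The application sees a single LF between the two lines, so code that processes XML text does not need a separate case for each platform's line-ending convention.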

Page 38: Lecture #2 xml


Comments in XML

• The syntax for writing comments in XML is similar to that of HTML.

<!-- This is a comment -->

Page 39: Lecture #2 xml


There is Nothing Special About XML

• It is just plain text with the addition of some XML tags enclosed in angle brackets.

• Software that can handle plain text can also handle XML. In a simple text editor, the XML tags will be visible and will not be handled specially.

• In an XML-aware application however, the XML tags can be handled specially.

• The tags may or may not be visible, or have a functional meaning, depending on the nature of the application.

Page 40: Lecture #2 xml


XML Elements

• XML Elements are extensible and they have relationships.

• XML Elements have simple naming rules.

Page 41: Lecture #2 xml

XML Elements are Extensible

• XML documents can be extended to carry more information.

<note>
  <to>Tove</to>
  <from>Jani</from>
  <body>Don't forget me this weekend!</body>
</note>

An application that extracts the <to>, <from>, and <body> elements could produce this output:

MESSAGE
To: Tove
From: Jani

Don't forget me this weekend!

Page 42: Lecture #2 xml

XML Elements are Extensible

• Suppose some extra information is added to the XML document:

<note>
  <date>2002-08-01</date>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

An application that only looks for the <to>, <from>, and <body> elements still produces the same output:

MESSAGE
To: Tove
From: Jani

Don't forget me this weekend!
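The point of extensibility is that existing software keeps working when new elements appear. A minimal sketch, assuming an application that reads only `<to>`, `<from>` and `<body>`; the function name `render_message` and the exact output layout are illustrative.

```python
import xml.etree.ElementTree as ET

ORIGINAL = """<note>
  <to>Tove</to><from>Jani</from>
  <body>Don't forget me this weekend!</body>
</note>"""

EXTENDED = """<note>
  <date>2002-08-01</date>
  <to>Tove</to><from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>"""


def render_message(xml_text):
    """Build the MESSAGE display from <to>, <from> and <body> only."""
    note = ET.fromstring(xml_text)
    return "MESSAGE\nTo: {}\nFrom: {}\n\n{}".format(
        note.findtext("to"), note.findtext("from"), note.findtext("body")
    )


print(render_message(EXTENDED))
```

Because the elements are found by name rather than by position, the added `<date>` and `<heading>` elements are simply ignored.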

Page 43: Lecture #2 xml

XML Elements have Relationships

• Imagine that this is a description of a book:

My First XML

Introduction to XML
· What is HTML
· What is XML

XML Syntax
· Elements must have a closing tag
· Elements must be properly nested

Page 44: Lecture #2 xml

XML Elements have Relationships

• Imagine that this XML document describes the book:

<book>
  <title>My First XML</title>
  <prod id="33-657" media="paper"></prod>
  <chapter>Introduction to XML
    <para>What is HTML</para>
    <para>What is XML</para>
  </chapter>
  <chapter>XML Syntax
    <para>Elements must have a closing tag</para>
    <para>Elements must be properly nested</para>
  </chapter>
</book>
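The parent/child relationships in the book document can be walked programmatically. The sketch below uses Python's standard `xml.etree.ElementTree` (an assumed choice of parser) to list the children of the root and to rebuild the chapter outline.

```python
import xml.etree.ElementTree as ET

BOOK = """<book>
  <title>My First XML</title>
  <prod id="33-657" media="paper"></prod>
  <chapter>Introduction to XML
    <para>What is HTML</para>
    <para>What is XML</para>
  </chapter>
  <chapter>XML Syntax
    <para>Elements must have a closing tag</para>
    <para>Elements must be properly nested</para>
  </chapter>
</book>"""

book = ET.fromstring(BOOK)

# <book> is the root (parent); title, prod and chapter are its children.
print([child.tag for child in book])

# Each <chapter> has text of its own plus <para> children (mixed content).
outline = {
    chapter.text.strip(): [para.text for para in chapter.findall("para")]
    for chapter in book.findall("chapter")
}
print(outline)
```

The two `<chapter>` elements are siblings, and each `<para>` is a child of exactly one chapter, which is how the outline on the previous slide is recovered.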

Page 45: Lecture #2 xml


Elements have Content

Elements can have different content types.

• An XML element is everything from (including) the element's start tag to (including) the element's end tag.

• An element can have element content, mixed content, simple content, or empty content. An element can also have attributes.

Page 46: Lecture #2 xml

Element Naming

XML elements must follow these naming rules
• Names can contain letters, numbers, and other characters
• Names must not start with a number or punctuation character
• Names must not start with the letters xml (or XML, or Xml, etc)
• Names cannot contain spaces
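The naming rules above can be approximated with a short check. This is a deliberately simplified, ASCII-only sketch: the actual XML specification permits many more characters (for example non-ASCII letters), and the pattern and function name below are assumptions made for illustration.

```python
import re

# ASCII-only approximation: a letter or underscore first, then letters,
# digits, hyphens, underscores or periods. Spaces never match.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def is_acceptable_name(name):
    """Apply the slide's naming rules to a candidate element name."""
    if name.lower().startswith("xml"):
        return False
    return NAME_PATTERN.fullmatch(name) is not None


for candidate in ["first_name", "2nd_item", "xmlData", "first name"]:
    print(candidate, is_acceptable_name(candidate))
```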

Page 47: Lecture #2 xml


Element Naming

Take care when you "invent" element names and follow these simple rules

• Any name can be used, no words are reserved, but the idea is to make names descriptive. Names with an underscore separator are nice.

Page 48: Lecture #2 xml


XML Attributes

• XML elements can have attributes in the start tag, just like HTML.

• Attributes are used to provide additional information about elements.

Page 49: Lecture #2 xml

XML Attributes

• XML elements can have attributes.
• From HTML you will remember this: <IMG SRC="computer.gif">. The SRC attribute provides additional information about the IMG element.
• In HTML (and in XML) attributes provide additional information about elements:

<img src="computer.gif">
<a href="demo.asp">

Page 50: Lecture #2 xml

XML Attributes

• Attributes often provide information that is not a part of the data.
• In the example below, the file type is irrelevant to the data, but important to the software that wants to manipulate the element:

<file type="gif">computer.gif</file>

Page 51: Lecture #2 xml

Quote Styles, "female" or 'female'?

• Attribute values must always be enclosed in quotes, but either single or double quotes can be used.
• For a person's sex, the person tag can be written like this:

<person sex="female">

or

<person sex='female'>

Page 52: Lecture #2 xml

Quote Styles in Attributes

• If the attribute value itself contains double quotes, the simplest option is to enclose it in single quotes (the alternative is to escape each double quote as &quot;), like in this example:

<gangster name='George "Shotgun" Ziegler'>

• If the attribute value itself contains single quotes, the simplest option is to enclose it in double quotes (the alternative is to escape each single quote as &apos;), like in this example:

<gangster name="George 'Shotgun' Ziegler">

Page 53: Lecture #2 xml

Use of Elements vs. Attributes

• Data can be stored in child elements or in attributes.

<person sex="female">
  <firstname>Anna</firstname>
  <lastname>Smith</lastname>
</person>

<person>
  <sex>female</sex>
  <firstname>Anna</firstname>
  <lastname>Smith</lastname>
</person>

Page 54: Lecture #2 xml

Use of Elements vs. Attributes

• The following XML documents contain exactly the same information:

• Alternative #1

<note date="12/11/2002">
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

Page 55: Lecture #2 xml

Use of Elements vs. Attributes

• Alternative #2

<note>
  <date>12/11/2002</date>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
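From a program's point of view, the two placements of the date are read through different calls. The sketch below uses Python's standard `xml.etree.ElementTree` (an assumed choice of parser) on shortened versions of Alternative #1 and Alternative #2.

```python
import xml.etree.ElementTree as ET

as_attribute = ET.fromstring(
    '<note date="12/11/2002"><to>Tove</to><from>Jani</from></note>'
)
as_element = ET.fromstring(
    "<note><date>12/11/2002</date><to>Tove</to><from>Jani</from></note>"
)

# An attribute is read from the element itself ...
date_from_attribute = as_attribute.get("date")
# ... while a child element is looked up and its text is read.
date_from_element = as_element.findtext("date")

print(date_from_attribute, date_from_element)
```

Both yield the same string, which is why the choice between the alternatives is a design decision rather than a question of what information can be stored.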

Page 56: Lecture #2 xml

Use of Elements vs. Attributes

• Alternative #3

<note>
  <date>
    <day>12</day>
    <month>11</month>
    <year>2002</year>
  </date>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

Page 57: Lecture #2 xml

Use of Elements vs. Attributes

• Alternative #4

<note day="12" month="11" year="2002" to="Tove" from="Jani" heading="Reminder" body="Don't forget me this weekend!"></note>

Page 58: Lecture #2 xml


Some of the problems with using attributes

1. attributes cannot contain multiple values (child elements can)

2. attributes are not easily expandable (for future changes)

3. attributes cannot describe structures (child elements can)

4. attributes are more difficult to manipulate by program code

5. attribute values are not easy to test against a Document Type Definition (DTD) - which is used to define the legal elements of an XML document

Page 59: Lecture #2 xml

Use of Elements vs. Attributes

Rule of thumb
• If you use attributes as containers for data, you end up with documents that are difficult to read and maintain.
• Try to use elements to describe data.
• Use attributes only to provide information that is not relevant to the data.

Page 60: Lecture #2 xml

An Exception to Attribute Rule

The ID in this example is just a counter, or a unique identifier, to identify the different notes in the XML file, and not a part of the note data.

<messages>
  <note id="p501">
    <to>Tove</to>
    <from>Jani</from>
    <heading>Reminder</heading>
    <body>Don't forget me this weekend!</body>
  </note>
  <note id="p502">
    <to>Jani</to>
    <from>Tove</from>
    <heading>Re: Reminder</heading>
    <body>I will not!</body>
  </note>
</messages>
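An identifier stored as an attribute works much like the primary key of a record: it lets software pick out one note without looking at the note's data. A minimal sketch with Python's standard `xml.etree.ElementTree` (an assumed choice of parser):

```python
import xml.etree.ElementTree as ET

MESSAGES = """<messages>
  <note id="p501">
    <to>Tove</to><from>Jani</from>
    <heading>Reminder</heading>
    <body>Don't forget me this weekend!</body>
  </note>
  <note id="p502">
    <to>Jani</to><from>Tove</from>
    <heading>Re: Reminder</heading>
    <body>I will not!</body>
  </note>
</messages>"""

messages = ET.fromstring(MESSAGES)

# ElementTree's limited XPath support includes [@attribute='value'] predicates.
reply = messages.find("note[@id='p502']")
print(reply.findtext("heading"))
```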

Page 61: Lecture #2 xml


XML Validation

• XML with correct syntax is Well Formed XML.

• XML validated against a DTD is Valid XML.

Page 62: Lecture #2 xml

Well Formed XML Documents

A "Well Formed" XML document has correct XML syntax.

• XML documents must have a root element
• XML elements must have a closing tag
• XML tags are case sensitive
• XML elements must be properly nested
• XML attribute values must always be quoted
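Each of these rules is enforced by the parser, so a practical well-formedness test is simply to try parsing. The sketch below is one way to do that with Python's standard `xml.etree.ElementTree`; the function name `is_well_formed` and the sample documents are illustrative, with one sample per rule.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


samples = {
    "correct": "<note><to>Tove</to></note>",
    "two roots": "<to>Tove</to><from>Jani</from>",
    "missing closing tag": "<note><to>Tove</note>",
    "case mismatch": "<Message>text</message>",
    "bad nesting": "<b><i>text</b></i>",
    "unquoted attribute": "<note date=12/11/2002></note>",
}

for label, text in samples.items():
    print(label, is_well_formed(text))
```

Note that this checks syntax only; whether the document also conforms to a DTD is a separate question that this parser does not answer.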

Page 63: Lecture #2 xml

Valid XML Documents

A "Valid" XML document also conforms to a DTD

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE note SYSTEM "InternalNote.dtd">
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

Page 64: Lecture #2 xml


XML DTD

A DTD defines the legal elements of an XML document

• The purpose of a DTD is to define the legal building blocks of an XML document.

• It defines the document structure with a list of legal elements.

Page 65: Lecture #2 xml


XML Schema (XSD) 

• XML Schema is an XML based alternative to DTD

• W3C supports an alternative to DTD called XML Schema.

Page 66: Lecture #2 xml

Displaying XML with XSL

• XSL is the preferred style sheet language of XML
• XSL (the eXtensible Stylesheet Language) is far more sophisticated than CSS.
• One way to use XSL is to transform XML into HTML before it is displayed by the browser as demonstrated in these examples:
• XML file, the XSL style sheet, and View the result.

Page 67: Lecture #2 xml

XML File

<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/xsl" href="simple.xsl"?>
<breakfast_menu>
  <food>
    <name>Belgian Waffles</name>
    <price>$5.95</price>
    <description>two of our famous Belgian Waffles</description>
    <calories>650</calories>
  </food>
</breakfast_menu>
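The transcript does not include the contents of simple.xsl, and Python's standard library has no XSLT processor, so the sketch below is a hand-written stand-in rather than XSL. It only illustrates the idea of transforming the menu XML into HTML; the HTML layout (an `<h3>` per food plus a paragraph) and the function name `menu_to_html` are assumptions.

```python
import xml.etree.ElementTree as ET
from html import escape

MENU = """<breakfast_menu>
  <food>
    <name>Belgian Waffles</name>
    <price>$5.95</price>
    <description>two of our famous Belgian Waffles</description>
    <calories>650</calories>
  </food>
</breakfast_menu>"""


def menu_to_html(xml_text):
    """Render each <food> as a heading plus a paragraph of HTML."""
    menu = ET.fromstring(xml_text)
    parts = ["<html><body>"]
    for food in menu.findall("food"):
        # Escape the values so that characters like < or & stay valid in HTML.
        name = escape(food.findtext("name").strip())
        price = escape(food.findtext("price").strip())
        description = escape(food.findtext("description").strip())
        calories = escape(food.findtext("calories").strip())
        parts.append("<h3>{} - {}</h3>".format(name, price))
        parts.append("<p>{} ({} calories)</p>".format(description, calories))
    parts.append("</body></html>")
    return "\n".join(parts)


print(menu_to_html(MENU))
```

An XSL style sheet expresses the same mapping declaratively, as templates matched against the XML, instead of as program code.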
