XML
JERRY KURIAN. OVER 20 YEARS EXPERIENCE.
TECHNOLOGY INNOVATOR & ENTREPRENEUR
Started coding with an Intel 486 machine more than 25 years
back and enjoying it ever since. Developed using VB, Pascal,
C++, Java Enterprise and OSS, Scala, Node JS and the saga
continues. Started using Spring, hibernate before it became hip.
Started using Scala when it was in its infancy.
After spending 8 years working in various software
companies like Huawei Tech, Quidnunc across UK, US,
China and India, the entrepreneurship bug bit in 2006
(before it was hip!!). Built one of the pioneers in SMS
social network called CellZapp, I developed the
product on my own and sold it to marquee customers
like ESPN and Hungama Digital. Recently launched a
product in field informatics www.isense-tech.co.in.
Successfully launched across 3 pilot customers and on
track to sign up more.
A family man with two kids, I am a passionate weekend
cricketer and an involved dad. I urge my two sons to
follow their dreams, which they do by staying out of the
conventional schooling system and exploring their passion
at a democratic free school called BeMe. Check it out at
http://beme.org.in
ORIGIN OF XML
 XML (Extensible Markup Language) is a derivative of
SGML (Standard Generalized Markup Language), an
earlier and more complex markup standard
 XML is not a programming language, but a set of
rules for structuring data as text
 XML rules are standard and allow easy eXtensibility
as per business needs
WHY XML
 Most applications have domain specific data that
needs to be shared across components
 With the service orientation of new-age applications,
data has to be shared across different applications
too
 Applications can share data in a format that can be
parsed and understood by a program
WHY XML
 Data has traditionally been shared by defining
protocols and arranging data as per the protocol
 Every protocol needs development of a parser for
understanding the protocol and extracting data out of
it
 Developing a parser is not an easy undertaking
and adds no value to the overall application in
terms of its actual business goals
WHY XML
 XML provides an easy alternative to creating
proprietary protocols
 By following XML rules, a new domain specific
language (protocol) can be defined without the
need to create a custom parser for it
 Any well formed XML document can be parsed by any
conforming XML parser
WHY XML
 XML allows application developers to define a
business specific protocol which is easy to read for
humans as well as easy to parse for applications
 Parsers are available in most programming
languages to parse any XML document
ADVANTAGES
 XML allows definition of data in a format
understandable to both humans and computers
 Standard rules of XML allow a standard parser to be
used for parsing any XML document
 XML represents data as plain text,
allowing easy transfer over any type of
communication medium
XML DOCUMENT
 An XML document is made up of a set of tags. A start
tag of the form <name> denotes the start of a 'node'
 The node ends with an end tag of the form </name>
 The XML nodes are made up of
 Element
 Attribute
 Entity
 Comment
XML USAGE PROBLEM
DEFINITION
 Consider a multi user gaming platform where each
user plays a game on his own machine and makes a
move
 Data about each move is sent to the other user in the
form of XML
 The game requires each player to send a challenge
question to another player with a choice of at least 3
answers, one of which is correct
XML USAGE PROBLEM
DEFINITION
 Whenever a move is sent by player 1 to player 2, the
details of player 1 along with current points should
also be sent
XML DEFINITION
 In the problem definition, the various elements are
 Player
 Player Name
 Player Address
 Player Points
 Questions
 Question
 Answer
XML DEFINITION
 The various elements identified in the previous slide
can provide almost all the information about a move
made by a player
 These elements will be arranged in an XML document
in the following manner
<game>
<person>
<name>Jerry</name>
<address>Bangalore</address>
<points>10</points>
</person>
<questions>
<question>
<query>What is XML</query>
<answers>
<answer>A tree</answer>
<answer>An automobile</answer>
<answer>A markup language</answer>
</answers>
</question>
<question>
<query>Where is Bangalore located</query>
<answers>
<answer>Maharashtra</answer>
<answer>Karnataka</answer>
<answer>UP</answer>
</answers>
</question>
</questions>
</game>
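As a rough illustration of the point that a standard parser can read any such document, here is a minimal sketch using Python's built-in xml.etree.ElementTree module. The function name read_move and the shortened document are assumptions made for this example; they are not part of the original slides.

```python
import xml.etree.ElementTree as ET

# A shortened copy of the game document (one question only).
GAME_XML = """<game>
  <person>
    <name>Jerry</name>
    <address>Bangalore</address>
    <points>10</points>
  </person>
  <questions>
    <question>
      <query>What is XML</query>
      <answers>
        <answer>A tree</answer>
        <answer>An automobile</answer>
        <answer>A markup language</answer>
      </answers>
    </question>
  </questions>
</game>"""


def read_move(xml_text):
    """Return the player's name, points, and a list of (query, answers) pairs."""
    game = ET.fromstring(xml_text)
    name = game.findtext("person/name")
    points = int(game.findtext("person/points"))
    questions = [
        (q.findtext("query"), [a.text for a in q.findall("answers/answer")])
        for q in game.findall("questions/question")
    ]
    return name, points, questions


name, points, questions = read_move(GAME_XML)
print(name, points, questions)
```

No game-specific parsing code is needed: the same library calls would work for any other well formed document.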
How would you create an XML document representing the answers from the player who
is being questioned?
<game>
<person>
<name>Hari</name>
<address>Bangalore</address>
<points>50</points>
</person>
<questions>
<question>
<query>What is XML</query>
<answer>A markup language</answer>
</question>
<question>
<query>Where is Bangalore located</query>
<answer>Karnataka</answer>
</question>
</questions>
</game>
The XML above is one way to represent the answers from the other player
ELEMENT
 The element is the basic building block of an XML document
 Every aspect of the domain is described through
elements
 In our example, nodes like <person>, <question>
etc. are elements
 As seen above, an element can contain one or more
elements as its child elements
ATTRIBUTE
 If an element has some additional characteristic
that is not an element in itself, it can be
denoted using an attribute
 The attribute is placed within the start tag of the element
and is written as a name="value" pair
 In our example, the list of answers should contain one
correct answer. The correctness of an answer can be
denoted using an attribute
<game>
<person>
<name>Jerry</name>
<address>Bangalore</address>
<points>10</points>
</person>
<questions>
<question>
<query>What is XML</query>
<answers>
<answer>A tree</answer>
<answer>An automobile</answer>
<answer correct="true">A markup language</answer>
</answers>
</question>
<question>
<query>Where is Bangalore located</query>
<answers>
<answer>Maharashtra</answer>
<answer correct="true">Karnataka</answer>
<answer>UP</answer>
</answers>
</question>
</questions>
</game>
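To show how a program might read the correct attribute, here is a small sketch with Python's xml.etree.ElementTree. The helper name correct_answer is hypothetical and only one question from the example is used.

```python
import xml.etree.ElementTree as ET

QUESTION_XML = """<question>
  <query>Where is Bangalore located</query>
  <answers>
    <answer>Maharashtra</answer>
    <answer correct="true">Karnataka</answer>
    <answer>UP</answer>
  </answers>
</question>"""


def correct_answer(question):
    """Return the text of the first answer marked correct="true", or None."""
    for answer in question.iter("answer"):
        # get() returns None when the attribute is absent.
        if answer.get("correct") == "true":
            return answer.text
    return None


question = ET.fromstring(QUESTION_XML)
print(correct_answer(question))
```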
ROOT ELEMENT
 The XML elements can be represented in the form of
a tree
 The topmost element of the XML document is the
root element, and each of its children is the root of its own
subtree
 In our case, the <game> element is the root element
of the document.
EMPTY ELEMENTS
 There could be elements that have no content,
that is, no child elements and no text
 These elements can still carry attributes
 Such elements are called empty elements
 Empty elements are usually denoted as
<element_name/>. This is the same as
<element_name></element_name> with no content
between
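The equivalence of the two notations can be checked with a short sketch in Python's xml.etree.ElementTree. The helper is_empty is a hypothetical name introduced for this example.

```python
import xml.etree.ElementTree as ET

short_form = ET.fromstring('<answer correct="true"/>')
long_form = ET.fromstring('<answer correct="true"></answer>')


def is_empty(element):
    """True when the element has no child elements and no text."""
    return len(element) == 0 and not element.text


# Both notations produce an element with the same tag and attributes.
print(is_empty(short_form), is_empty(long_form))
print(short_form.attrib == long_form.attrib)
```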
COMMENTS
 Comments can be added into an XML document to
give more information about tags
 Comments will be ignored by the parser
 Comments are written between the delimiters <!--
and -->
<!-- Your comment here -->
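A quick way to see a parser skipping comments is the sketch below. It assumes Python's xml.etree.ElementTree with its default settings, which leave comments out of the parsed tree.

```python
import xml.etree.ElementTree as ET

DOC = """<person>
  <!-- The player's display name -->
  <name>Jerry</name>
</person>"""

person = ET.fromstring(DOC)

# Only the element survives; the comment is not part of the tree.
child_tags = [child.tag for child in person]
print(child_tags)
```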
ENTITY
 Entities can be used to substitute a value for a data
item
 Entities behave like macros: they are place
holders for something else
 Entities start with & and end with ;
 A predefined entity like &quot; will be replaced by a
double quote (") when parsed
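The substitution of predefined entities can be observed with a small sketch in Python. It uses xml.etree.ElementTree for reading and xml.sax.saxutils.escape for the reverse direction; the sample strings are made up for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Reading: the parser replaces each predefined entity with its character.
note = ET.fromstring("<note>&lt;20 &amp; &quot;quoted&quot; &apos;single&apos;</note>")
print(note.text)

# Writing: escape() turns &, < and > in plain text into entities.
escaped = escape("points < 20 & rising")
print(escaped)
```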
CDATA
 As seen in the example, most elements contain
text between their tags, which is the value of the element
 The XML parser returns the value of an element by
getting the content between its tags
 If the content contains special characters like
'<' or '&', it will lead to errors in parsing
CDATA
 Such characters can be escaped by using entities as
explained earlier
 But if you want to avoid entities, then a CDATA section
can be used
 When a CDATA section is encountered, the parser will
not interpret markup inside it and passes the text through unchanged
 CDATA can be defined in the following format
<![CDATA[ content ]]>.
CDATA
 Example
<points><![CDATA[<20]]></points>
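As a sketch of how the CDATA form and the entity form compare, the snippet below parses both with Python's xml.etree.ElementTree. Note that this particular library does not remember which form was used and writes the value back with entities.

```python
import xml.etree.ElementTree as ET

with_cdata = ET.fromstring("<points><![CDATA[<20]]></points>")
with_entity = ET.fromstring("<points>&lt;20</points>")

# Both forms give the application the same text value.
print(with_cdata.text, with_entity.text)

# On output, ElementTree escapes the special character as an entity.
serialized = ET.tostring(with_cdata, encoding="unicode")
print(serialized)
```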
DTD
DOCUMENT TYPES
 There are two types of XML documents
 Well Formed
 Well formed and valid
 A well formed document is any XML document that
follows the general XML rules
 The XML documents above are examples of well
formed XML documents
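Well-formedness can be tested by simply trying to parse a document. The sketch below uses Python's xml.etree.ElementTree; the helper name is_well_formed and the two sample strings are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


good = "<game><questions></questions></game>"
# The second <questions> is never closed, so the tags do not nest properly.
bad = "<game><questions><questions></game>"

print(is_well_formed(good), is_well_formed(bad))
```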
DOCUMENT TYPES
 Well formed and valid XML documents are ones that
not only follow general XML rules, but also conform to
a domain specific grammar
 The domain specific grammar can be defined using a DTD
(Document Type Definition)
 DTDs define rules for a domain specific XML
document
DTD
 A DTD is made up of declarations that define the various nodes
allowed in an XML document
 The DTD can be used to define the various aspects of
XML document like
 Element
 Attribute
 Entities
DTD
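 A document can refer to a DTD using the
<!DOCTYPE> declaration. The name after <!DOCTYPE must
match the root element of the document
<!DOCTYPE game [
<!-- DTD goes here -->
]>
<game>
<person>
<name>Jerry</name>
<address>Bangalore</address>
<points>10</points>
</person>
<!-- remaining elements omitted -->
</game>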
DTD
 An XML document can also refer to an external DTD
file instead of defining it as part of the XML document
itself
<!DOCTYPE game SYSTEM "game.dtd">
 The SYSTEM keyword specifies that the DTD is located by the
URI that follows, which is typical for a private DTD
PUBLIC DTDS
 DTDs can be created by a public body and can be
referenced by any XML document
 <!DOCTYPE root_element PUBLIC "FPI" "URI">
 The DTD is identified using a formal
public identifier (FPI), followed by a URI where the DTD can be found
 FPI Example:
 -//W3C//DTD XHTML 1.0 Transitional//EN
FPI RULES
 The first field indicates whether the DTD is for a formal standard.
For DTDs you create on your own, this field should be - (an
unregistered owner). A + indicates an owner that is registered with a
standards body. For formal standards, this field is a reference to the
standard itself (such as ISO/IEC 19775:2003).
 The second field holds the name of the group or person
responsible for the DTD. You should use a name that is unique
(for example, W3C just uses W3C).
 The third field specifies the type of the document the DTD is for
and should be followed by a unique version number of some kind
(such as Version 1.0).
 The fourth field specifies the language in which the DTD is
written (for example, EN for English).
DECLARING ELEMENT
 The XML elements are declared in DTD using the
following syntax
<!ELEMENT name content_model >
 The name indicates the name of the element
 The content_model indicates the content that the
element is allowed to have as its children
 The content_model cannot be left out. An element that must
have no content is declared with the keyword EMPTY, as in <!ELEMENT name EMPTY>
DECLARING ELEMENT
 In our example, the game element can be declared in
the following way
<!ELEMENT game (person,questions)>
 The above element definition specifies that the game
element must have exactly one person element followed by
exactly one questions element as its children
 If an element declares its content_model as ANY then
that element can contain text and any declared elements in any order,
effectively telling the parser not to check the content of the
element
<!ELEMENT name ANY>
CHILD ELEMENTS
 The DTD can specify the number of children allowed
for each element
<!ELEMENT game (person)>
Specifies that the game element has exactly one person
child element
<!ELEMENT questions (question)*>
Specifies that the questions element can have zero or
more question elements as children
CHILD ELEMENTS
Notation    Description
*           There can be zero or more occurrences of the element
+           There can be one or more occurrences of the element
?           There can be zero or one occurrence of the element
x , y       Element x should be followed by element y
x | y       Element x or y can be present, but not both
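The occurrence notation reads much like regular expression syntax. The sketch below is not a DTD validator (Python's standard library does not include one); it only illustrates the idea by hand-translating two content models into regular expressions over the sequence of child tag names. The helper names and patterns are hypothetical.

```python
import re
import xml.etree.ElementTree as ET


def child_sequence(element):
    """Child tag names in document order, each followed by a space."""
    return "".join(child.tag + " " for child in element)


def matches(element, pattern):
    """True when the element's children fit the hand-written pattern."""
    return re.fullmatch(pattern, child_sequence(element)) is not None


# Hand-translated content models (assumed for illustration):
GAME_MODEL = r"person questions "    # <!ELEMENT game (person,questions)>
QUESTIONS_MODEL = r"(question )*"    # <!ELEMENT questions (question)*>

game = ET.fromstring(
    "<game><person/><questions><question/><question/></questions></game>"
)
incomplete = ET.fromstring("<game><questions/></game>")

print(matches(game, GAME_MODEL))
print(matches(game.find("questions"), QUESTIONS_MODEL))
print(matches(incomplete, GAME_MODEL))
```

A validating parser performs this kind of check for every element, driven by the DTD itself rather than by patterns written by hand.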
ATTRIBUTE
 Attributes provide additional details for an element
 Attributes can be defined in a DTD using the following
notation
<!ATTLIST element_name attribute_name type default_value>
ATTRIBUTE DEFINITION
 In our example, the element answer has an attribute correct
 <!ATTLIST answer correct CDATA #IMPLIED>
Default value   Description
value           Specifies default value for attribute
#REQUIRED       Mandates the attribute
#FIXED value    Sets attribute's value to value
#IMPLIED        Attribute is optional
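The effect of a default value can be sketched with Python's xml.etree.ElementTree. The parser behind it (Expat) is non-validating: it reads declarations in the internal DTD and fills in declared defaults, but it does not reject documents that break the DTD. The small document below is an assumption made for the example and uses a default of "false" instead of #IMPLIED.

```python
import xml.etree.ElementTree as ET

DOC = """<!DOCTYPE answers [
  <!ELEMENT answers (answer)+>
  <!ELEMENT answer (#PCDATA)>
  <!ATTLIST answer correct CDATA "false">
]>
<answers>
  <answer>A tree</answer>
  <answer correct="true">A markup language</answer>
</answers>"""

answers = ET.fromstring(DOC)

# The first answer has no attribute in the document, so the declared
# default is supplied by the parser.
flags = [answer.get("correct") for answer in answers.findall("answer")]
print(flags)
```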
ATTRIBUTE TYPES
 Attributes can have the following types
 CDATA
 Enumerated type
 NMTOKEN
 NMTOKENS
 ID
ATTRIBUTE TYPES
 CDATA allows any character data; the characters < and &
must still be written as entities
 Enumerated types provide a list of allowed values
separated by |
 <!ATTLIST answer correct (true | false) #REQUIRED>
 NMTOKEN is any name token that conforms to the XML
standard
 NMTOKENS is a list of NMTOKEN values separated by
white space
ENTITY
 An entity in XML is just a named data item
 Entities usually hold text that is used repeatedly
across the document
 Entities can also refer to binary data
 Entities are declared in the DTD like
<!ENTITY name definition>
ENTITY
 Example
<!ENTITY copyright "(c) XML Power Corp. 2005">
 An entity can then be used in the document as
follows
<copy>&copyright;</copy>
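Here is a minimal sketch of the same entity being expanded by a parser, using Python's xml.etree.ElementTree. The entity is declared in an internal DTD so that the example is self-contained.

```python
import xml.etree.ElementTree as ET

DOC = """<!DOCTYPE copy [
  <!ENTITY copyright "(c) XML Power Corp. 2005">
]>
<copy>&copyright;</copy>"""

copy = ET.fromstring(DOC)

# The application sees the replacement text, not the entity reference.
print(copy.text)
```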
SCHEMAS
XML SCHEMAS
 XML schemas are an alternative way of defining the
structure of an XML document
 Schemas are a much more comprehensive and detailed
way of specifying the structure of a document
 Like DTDs, schemas specify the elements and attributes of an
XML document
SCHEMA EXAMPLE
 The game XML document can be defined as
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:element name="game" type="gameType"/>
<xsd:complexType name="gameType">
<xsd:sequence>
<xsd:element name="person" minOccurs="1"/>
<xsd:element name="questions" type="questionType" minOccurs="1"/>
</xsd:sequence>
</xsd:complexType>
<!-- As written, questionType describes a single query with its answers;
a complete schema would wrap it in a repeating question element -->
<xsd:complexType name="questionType">
<xsd:sequence>
<xsd:element ref="query"/>
<xsd:element name="answers" type="answersType"/>
</xsd:sequence>
</xsd:complexType>
<xsd:element name="query" type="xsd:string"/>
<xsd:complexType name="answersType">
<xsd:sequence>
<xsd:element name="answer" type="answerType" maxOccurs="unbounded"/>
</xsd:sequence>
</xsd:complexType>
<!-- simpleContent lets answer hold text as well as the correct attribute -->
<xsd:complexType name="answerType">
<xsd:simpleContent>
<xsd:extension base="xsd:string">
<xsd:attribute name="correct" type="xsd:string" use="optional"/>
</xsd:extension>
</xsd:simpleContent>
</xsd:complexType>
</xsd:schema>
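A schema is itself an XML document, so it can be read with an ordinary parser. The sketch below uses Python's xml.etree.ElementTree to list the declarations in a shortened copy of the schema. It does not validate anything; schema validation is not part of Python's standard library.

```python
import xml.etree.ElementTree as ET

# Prefix-to-namespace map used in the search paths below.
XSD_NS = {"xsd": "http://www.w3.org/2001/XMLSchema"}

SCHEMA = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="game" type="gameType"/>
  <xsd:complexType name="gameType">
    <xsd:sequence>
      <xsd:element name="person" minOccurs="1"/>
      <xsd:element name="questions" type="questionType" minOccurs="1"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>"""

schema = ET.fromstring(SCHEMA)

# Elements declared directly under the schema element.
top_level = [e.get("name") for e in schema.findall("xsd:element", XSD_NS)]

# Child elements declared inside the gameType sequence.
game_children = [
    e.get("name")
    for e in schema.findall(
        "xsd:complexType[@name='gameType']/xsd:sequence/xsd:element", XSD_NS
    )
]

print(top_level, game_children)
```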
SCHEMA ELEMENT
 Schema element can be defined in the following
manner
<xsd:element name="query" type="xsd:string"/>
 Any element that contains child elements or attributes
needs to be defined as a complexType
 Elements that enclose only simple data such as
numbers, strings or dates are simple types
SCHEMA ELEMENT
 There are some built-in simple schema types like
 string
 anyURI
 boolean
 date
 dateTime
 The <xsd:sequence> element specifies the order in which
the child elements must appear
NUMBER OF ELEMENTS
 The person element has a minOccurs attribute to
specify that it will occur at least once
 To make an element optional, minOccurs should be 0
 To make it appear from 0 to 10 times, we can
use minOccurs="0" and maxOccurs="10"
 To specify an unlimited number of occurrences, set
maxOccurs="unbounded"
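The occurrence rules can be expressed as a tiny check. The sketch below is only an illustration of the rule, not part of any schema library; the function name within_occurs is hypothetical. It takes the attribute values as strings, the way they appear in a schema, and uses 1 for both when they are omitted, which is the XML Schema default.

```python
def within_occurs(count, min_occurs="1", max_occurs="1"):
    """True when count satisfies the minOccurs/maxOccurs pair."""
    if count < int(min_occurs):
        return False
    # "unbounded" means there is no upper limit.
    return max_occurs == "unbounded" or count <= int(max_occurs)


print(within_occurs(0, "0", "10"))         # optional element left out
print(within_occurs(11, "0", "10"))        # one more than allowed
print(within_occurs(1000, "1", "unbounded"))
```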
VALUES OF ELEMENT
 An element can be given a default value through
<xsd:element name="term" type="xsd:integer"
default="10"/>
 An element can be given a fixed value through
<xsd:element name="term" type="xsd:integer"
fixed="200"/>
ATTRIBUTES
 An attribute of an element can be specified in the
following manner
<xsd:attribute name="correct" type="xsd:string"
use="optional"/>
 use="optional" specifies that the attribute is optional
 The values allowed for the use attribute are
 optional
 prohibited
 required
 A default or fixed value is given through the separate
default and fixed attributes
NAMESPACE
 Namespaces are useful when reusing XML tags
 One XML document can reuse part of another well
defined XML vocabulary
 The new XML document may contain elements that
have the same name as elements of the vocabulary
being reused
 The name clashes can be avoided using a
namespace
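A name clash and its resolution can be sketched with Python's xml.etree.ElementTree. In this example the second namespace URI (http://example.com/library) and the prefix lib are made up; only http://xmlpowercorp comes from the slides.

```python
import xml.etree.ElementTree as ET

DOC = """<game xmlns="http://xmlpowercorp" xmlns:lib="http://example.com/library">
  <questions>2</questions>
  <lib:questions>reused from another vocabulary</lib:questions>
</game>"""

# Prefixes used in the search paths; they need not match the document's.
NS = {"xmp": "http://xmlpowercorp", "lib": "http://example.com/library"}

game = ET.fromstring(DOC)

# Two elements share the local name "questions" but stay distinct,
# because each is identified by namespace plus name.
own = game.find("xmp:questions", NS).text
other = game.find("lib:questions", NS).text

print(game.tag)
print(own, "|", other)
```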
NAMESPACE
 The namespace for an XML document can be defined
using the targetNamespace attribute of the schema
element
<xsd:schema targetNamespace="http://xmlpowercorp"
xmlns="http://xmlpowercorp"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
attributeFormDefault="qualified" elementFormDefault
NAMESPACE
 The value qualified specifies that locally declared
elements and attributes must be qualified with the
namespace in documents that use the schema
 To avoid this we can set the value to unqualified
UNQUALIFIED NAMESPACE
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://xmlpowercorp"
elementFormDefault="unqualified"
attributeFormDefault="unqualified">
<xsd:element name="game" type="gameType"/>
<xsd:complexType name="gameType">
<xsd:sequence>
<xsd:element name="person" minOccurs="1"/>
<xsd:element name="questions" type="questionType"
minOccurs="1"/>
</xsd:sequence>
<xsd:attribute name="documentDate" type="xsd:date"/>
</xsd:complexType>
In this case, no prefix is bound to the target namespace, so the parser will look for
unprefixed type names in the default namespace, which creates a problem here for questionType
UNQUALIFIED NAMESPACE
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://xmlpowercorp"
xmlns:xmp="http://xmlpowercorp"
elementFormDefault="unqualified"
attributeFormDefault="unqualified">
<xsd:element name="game" type="xmp:gameType"/>
<xsd:complexType name="gameType">
<xsd:sequence>
<xsd:element name="person" minOccurs="1"/>
<xsd:element name="questions" type="xmp:questionType"
minOccurs="1"/>
</xsd:sequence>
<xsd:attribute name="documentDate" type="xsd:date"/>
</xsd:complexType>
In this case, the parser will resolve the xmp prefix on the types gameType and
questionType and know that they belong to the target namespace

Basics of Xml

  • 1. XML
  • 2. JERRY KURIAN. OVER 20 YEARS EXPERIENCE. TECHNOLOGY INNOVATOR & ENTREPRENEUR Started coding with an Intel 486 machine more than 25 years back and enjoying it ever since. Developed using VB, Pascal, C++, Java Enterprise and OSS, Scala, Node JS and the saga continues. Started using Spring, hibernate before it became hip. Started using Scala when it was in its infancy. After spending 8 years working in various software companies like Huawei Tech, Quidnunc across UK, US, China and India, the entrepreneurship bug bit in 2006 (before it was hip!!). Built one of the pioneers in SMS social network called CellZapp, I developed the product on my own and sold it to marquee customers like ESPN and Hungama Digital. Recently launched a product in field informatics www.isense-tech.co.in. Successfully launched across 3 pilot customers and on track to sign up more. A family man with two kids, I am a passionate weekend cricketer and an involved dad. I urge my two sons to follow their dreams, which they do by staying out of the conventional schooling system and exploring their passion at a democratic free school called BeMe. Check it out at http://beme.org.in
  • 3. ORIGIN OF XML  XML (Extensible Markup Language) is a derivative of SGML (Standard Generalized Markup Language), the earliest attempt at a markup language  XML is not a programming language, but a set of rules that structure data in a representational manner  XML rules are standard and allow easy eXtensibility as per business needs
  • 4. WHY XML  Most application have domain specific data that needs to be shared across components  With the service orientation of new-age applications, data has to be shared across different applications too  Application can share data in a format that can be parsed and understood by a program
  • 5. WHY XML  Data has traditionally been shared by defining protocols and arranging data as per the protocol  Every protocol needs development of a parser for understanding the protocol and extracting data out of it  Development of parser is not an easy undertaking and in fact adds no value to the overall application in terms of its actual business goals
  • 6. WHY XML  XML provides an easy substitution to the need of creating proprietary protocols  By following XML rules, new domain specific language (Protocol) can be generated without the need for creating its custom parser  Any XML document can be parsed by using a valid XML parser
  • 7. WHY XML  XML allows application developers to define a business specific protocol which is easy to read for humans as well as easy to parse for applications  Numerous parsers are available in all programming language to parse any XML document
  • 8. ADVANTAGES  XML allows definition of data in a format understandable to both humans and computers  Standard rules of XML allow a standard parser to be used for parsing any XML document  XML enables representation of data in simple texts, allowing easy transfer over any type of communication medium
  • 9. XML DOCUMENT  An XML document is made up of a set of tags in the form of ‘<‘ ‘some text’ ‘>’ that denotes start of a ‘node’  The node area ends with ‘<‘ ‘/’ ‘some text’ ‘>’  The XML nodes are made up of  Element  Attribute  Entity  Comment
  • 10. XML USAGE PROBLEM DEFINITION  Consider a multi user gaming platform where each user plays a game on his own machine and makes a move  Data about each move is sent to the other user in the form of XML  The game requires each player to send a challenge question to another player with choice of at least 3 answers, one of which can be right
  • 11. XML USAGE PROBLEM DEFINITION  Whenever a move is sent by player 1 to player 2, the details of player 1 along with current points should also be sent
  • 12. XML DEFINITION  In the problem definition, the various elements are  Player  Player Name  Player Address  Player Points  Questions  Question  Answer
  • 13. XML DEFINITION  The various elements identified in the previous slide can provide almost all the information about a move made by a player  These elements will be arranged in an XML document in the following manner
  • 14. <game> <person> <name>Jerry</name> <address>Bangalore</address> <points>10</points> </person> <questions> <question> <query>What is XML</query> <answers> <answer>A tree</answer> <answer>An automobile</answer> <answer>A markup language</answer> </answers> </question> <question> <query>Where is Bangalore located</query> <answers> <answer>Maharashtra</answer> <answer>Karnataka</answer> <answer>UP</answer> </answers> </question> <questions> </game>
  • 15. How will you create an XML representing answers from the player who Is being questioned?
  • 16. <game> <person> <name>Hari</name> <address>Bangalore</address> <points>50</points> </person> <questions> <question> <query>What is XML</query> <answer>A markup language</answer> </question> <question> <query>Where is Bangalore located</query> <answer>Karnataka</answer> </question> <questions> </game> XML Representing answer from the other player can be represented as
  • 17. ELEMENT  Element is the basic building block of XML document  Every aspect of the domain is described through the Element  In our example, the nodes like <person>, <question> etc are elements  As seen above, one element can contain one or more elements as its child element
  • 18. ATTRIBUTE  If an element has some additional characteristics, which is not an element in itself, then it can be denoted using an attribute  The attribute is placed within the element node and contains a name=value pair  In our example, the list of answers should contain one correct answer. The correctness of an answer can be denoted using an attribute
  • 19. <game> <person> <name>Jerry</name> <address>Bangalore</address> <points>10</points> </person> <questions> <question> <query>What is XML</query> <answers> <answer>A tree</answer> <answer>An automobile</answer> <answer correct=“true”>A markup language</answer> </answers> </question> <question> <query>Where is Bangalore located</query> <answers> <answer>Maharashtra</answer> <answer correct=“true”>Karnataka</answer> <answer>UP</answer> </answers> </question> <questions> </game>
  • 20. ROOT ELEMENT  The XML elements can be represented in the form of a tree  The top most element of the XML document is the Root element and each of its child is a root to its own children  In our case, the <game> element is the root element of the document.
  • 21. EMPTY ELEMENTS  There could be elements that do not have any child elements under it  These elements could just have the attributes in it  Such elements are called Empty elements  Empty elements are usually denoted as <element_name/>. This is same as <element_name></element_name> with no content between
  • 22. COMMENTS  Comments can be added into an XML document to give more information about tags  Comments will be ignored by the parser  Comments can be provided between the tags <!- - and - - > <!- - Your comment here - - >
  • 23. ENTITY  Entities can be used to substitute a value for a data item  Entities behave like macros where they are place holders for something else  Entities start with & and end with ;  Predefined entity like &quot; will be replaced by a ‘ when parsed
  • 24. CDATA  As seen in the example, most of the element contain text between then, which is the value for the element  The XML parser returns the value of element by getting the content between the nodes  If the content contains some special characters like ‘<‘, ‘>’ its, then it may lead to error in parsing
  • 25. CDATA  Such characters can be escaped by using entities as explained earlier  But if you want to avoid entities, then CDATA section can be used  When CDATA section is encountered, the parser will leave it alone and pass the text unchanged  CDATA can be defined in the following format <![CDATA[ content ]]>.
  • 27. DTD
  • 28. DOCUMENT TYPES  There are two types of XML documents  Well formed  Well formed and valid  A well-formed document is any XML document that follows the general XML rules  The XML documents above are examples of well-formed XML documents
  • 29. DOCUMENT TYPES  Well formed and valid XML documents not only follow the general XML rules but also conform to a domain-specific grammar  The domain-specific grammar can be expressed using a DTD (Document Type Definition)  DTDs define rules for a domain-specific XML document
  • 30. DTD  A DTD is made of declarations that define the nodes allowed in an XML document  A DTD can define aspects of an XML document such as  Elements  Attributes  Entities
  • 31. DTD  A document can refer to a DTD using the <!DOCTYPE> declaration; for the document to be valid, the name after <!DOCTYPE must match the root element, which is game here <!DOCTYPE game [ <!-- DTD goes here --> ]> <game> <person> <name>Jerry</name> <address>Bangalore</address> <points>10</points> </person>
  • 32. DTD  An XML document can also refer to an external DTD file instead of defining the DTD inside the XML document itself <!DOCTYPE document SYSTEM "game.dtd">  The SYSTEM keyword marks this as a private DTD
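To see what a parser records from such a declaration, this sketch reads the document type with Python's xml.dom.minidom. It assumes a root element named game so that the doctype name matches the root; minidom only records the reference and does not fetch or apply game.dtd.

```python
from xml.dom import minidom

# The doctype name is "game" here so that it matches the root element (an assumption).
DOC = '<!DOCTYPE game SYSTEM "game.dtd"><game/>'

document = minidom.parseString(DOC)
doctype = document.doctype

print(doctype.name)      # name given after <!DOCTYPE
print(doctype.systemId)  # location given after SYSTEM
```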
  • 33. PUBLIC DTDS  DTDs can be created by a public body and used by any XML document  <!DOCTYPE document PUBLIC "FPI" "dtd_location">  The DTD is identified by a formal public identifier (FPI), followed by the location of the DTD file  FPI example:  -//W3C//DTD XHTML 1.0 Transitional//EN
  • 34. FPI RULES  The first field indicates whether the DTD is for a formal standard. For DTDs you create on your own, this field should be -. If a non-official standards body has created the DTD, you use +. For formal standards bodies, this field is a reference to the standard itself (such as ISO/IEC 19775:2003).  The second field holds the name of the group or person responsible for the DTD. You should use a name that is unique (for example, W3C just uses W3C).  The third field specifies the type of the document the DTD is for and should be followed by a unique version number of some kind (such as Version 1.0).  The fourth field specifies the language in which the DTD is written (for example, EN for English).
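The four fields can be pulled apart with a few lines of code. This is a minimal sketch that assumes the four-field layout described above, with fields separated by //; the class and field names are hypothetical and chosen only for readability.

```python
from typing import NamedTuple


class FormalPublicIdentifier(NamedTuple):
    registration: str  # "-", "+", or a reference to a formal standard
    owner: str         # group or person responsible for the DTD
    description: str   # document type, usually with a version
    language: str      # language the DTD is written in


def parse_fpi(fpi):
    """Split an FPI into its four fields, assuming '//' separates them."""
    fields = fpi.split("//")
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, found {len(fields)}: {fpi!r}")
    return FormalPublicIdentifier(*fields)


xhtml = parse_fpi("-//W3C//DTD XHTML 1.0 Transitional//EN")
print(xhtml)
```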
  • 35. DECLARING ELEMENT  The XML elements are declared in a DTD using the following syntax <!ELEMENT name content_model>  The name is the name of the element  The content_model specifies the content that the element is allowed to have, such as its child elements  An element that must have no content is declared with the content model EMPTY, for example <!ELEMENT name EMPTY>
  • 36. DECLARING ELEMENT  In our example, the game element can be declared in the following way <!ELEMENT game (person,questions)>  The above declaration specifies that the game element has a person element followed by a questions element as its children  If an element is declared with the content model ANY, it can contain any type of children, which effectively tells the parser not to validate the content of that element <!ELEMENT name ANY>
  • 37. CHILD ELEMENTS  The DTD can specify the number of children allowed for each element  <!ELEMENT game (person)> specifies that the game element must have exactly one person child element  <!ELEMENT questions (question)*> specifies that the questions element can have zero or more question elements as children
  • 38. CHILD ELEMENTS  Notation and its description  * : there can be zero or more occurrences of the element  + : there can be one or more occurrences of the element  ? : there can be zero or one occurrence of the element  x , y : element x should be followed by element y  x | y : element x or element y can be present, but not both
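The notation maps closely onto regular expressions, which gives a way to experiment with it. The following is a toy sketch, not a DTD validator: it translates a content model made only of element names and the symbols above into a Python regular expression and matches it against a list of child element names. The function names are hypothetical, and keywords such as EMPTY, ANY and #PCDATA are deliberately not handled.

```python
import re


def content_model_to_regex(model):
    """Translate a simple DTD content model into a compiled regular expression."""
    model = re.sub(r"\s+", "", model)
    # Each element name becomes a group matching that name plus a ';' terminator.
    pattern = re.sub(
        r"[A-Za-z_][\w.-]*",
        lambda match: "(?:" + re.escape(match.group(0)) + ";)",
        model,
    )
    # ',' means "followed by", which is plain concatenation in a regex.
    # '|', '?', '+', '*' and parentheses already mean the same thing.
    return re.compile(pattern.replace(",", ""))


def children_match(model, child_names):
    """Check whether a sequence of child element names satisfies the model."""
    sequence = "".join(name + ";" for name in child_names)
    return content_model_to_regex(model).fullmatch(sequence) is not None


print(children_match("(person,questions)", ["person", "questions"]))
print(children_match("(person,questions)", ["questions", "person"]))
print(children_match("(question)*", []))
print(children_match("(x|y)", ["x", "y"]))
```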
  • 39. ATTRIBUTE  Attributes provide additional details for an element  Attributes can be defined in a DTD using the following notation <!ATTLIST element_name attribute_name type default_value>
  • 40. ATTRIBUTE DEFINITION  In our example, the element answer has an attribute correct  <!ATTLIST answer correct CDATA #IMPLIED>  The default_value part can take the following forms  value : specifies a default value for the attribute  #REQUIRED : makes the attribute mandatory  #FIXED value : fixes the attribute's value to value  #IMPLIED : makes the attribute optional
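Attribute defaults declared in an internal DTD are applied even by non-validating parsers. The sketch below relies on the expat parser behind xml.etree.ElementTree doing so. The default value "false", the hint attribute and the answer texts are assumptions added for illustration and are not part of the slides' example.

```python
import xml.etree.ElementTree as ET

DOC = """<!DOCTYPE answers [
  <!ATTLIST answer correct CDATA "false">
  <!ATTLIST answer hint CDATA #IMPLIED>
]>
<answers>
  <answer correct="true">Bangalore</answer>
  <answer>Mumbai</answer>
</answers>"""

answers = ET.fromstring(DOC).findall("answer")

# The second <answer> omits 'correct', so the declared default is supplied.
# 'hint' is #IMPLIED (optional, no default), so it stays absent.
for answer in answers:
    print(answer.text, dict(answer.attrib))
```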
  • 41. ATTRIBUTE TYPES  Attributes can have the following types  CDATA  Enumerated type  NMTOKEN  NMTOKENS  ID
  • 42. ATTRIBUTE TYPES  CDATA allows character data, which must not contain unescaped special characters  An enumerated type provides a list of allowed values separated by |  <!ATTLIST answer correct (true | false) #REQUIRED>  An NMTOKEN is any name token that conforms to the XML standard  NMTOKENS is a set of NMTOKEN values separated by white space
  • 43. ENTITY  An entity in XML is just a data item  Entities are usually pieces of text that are used often across the document  Entities can also be binary data  Entities are declared as <!ENTITY name definition>
  • 44. ENTITY  Example <!ENTITY copyright "(c) XML Power Corp. 2005">  An entity can then be used in the document as follows <copy>&copyright;</copy>
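The copyright example can be tried out directly. In this sketch the entity is declared in an internal DTD and xml.etree.ElementTree, through its underlying expat parser, substitutes the declared text when it meets the reference.

```python
import xml.etree.ElementTree as ET

DOC = """<!DOCTYPE copy [
  <!ENTITY copyright "(c) XML Power Corp. 2005">
]>
<copy>&copyright;</copy>"""

# The reference &copyright; is replaced by the text from the declaration.
copy_text = ET.fromstring(DOC).text
print(copy_text)
```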
  • 46. XML SCHEMAS  XML schemas are an alternative way of defining the structure of an XML document  Schemas are a much more comprehensive and detailed way of specifying the XML syntax  Schemas also specify the elements and attributes of an XML document
  • 47. SCHEMA EXAMPLE  The game XML document can be defined as
      <?xml version="1.0" encoding="UTF-8"?>
      <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
        <xsd:element name="game" type="gameType"/>
        <xsd:complexType name="gameType">
          <xsd:sequence>
            <xsd:element name="person" minOccurs="1"/>
            <xsd:element name="questions" type="questionType" minOccurs="1"/>
          </xsd:sequence>
        </xsd:complexType>
        <xsd:complexType name="questionType">
          <xsd:sequence>
            <xsd:element ref="query"/>
            <xsd:element name="answers" type="answersType"/>
          </xsd:sequence>
        </xsd:complexType>
        <xsd:element name="query" type="xsd:string"/>
        <xsd:complexType name="answersType">
          <xsd:sequence>
            <xsd:element name="answer" type="answerType"/>
          </xsd:sequence>
        </xsd:complexType>
        <xsd:complexType name="answerType">
          <xsd:attribute name="correct" type="xsd:string" use="optional"/>
        </xsd:complexType>
      </xsd:schema>
  • 48. SCHEMA ELEMENT  A schema element can be defined in the following manner <xsd:element name="query" type="xsd:string"/>  Any element that contains child elements or attributes needs to be defined as a complexType  Elements that enclose only simple data such as numbers, strings or dates are simpleTypes
  • 49. SCHEMA ELEMENT  There are some built-in simple schema types such as  string  anyURI  boolean  date  dateTime  The <xsd:sequence> element specifies the order in which the child elements must appear
  • 50. NUMBER OF ELEMENTS  The person element has a minOccurs attribute to specify that it will occur at least once  To make an element optional, set minOccurs to 0  To make it appear from 0 to 10 times, use minOccurs="0" and maxOccurs="10"  To allow an unlimited number of occurrences, set maxOccurs="unbounded"
  • 51. VALUES OF ELEMENT  A default value can be specified for an element with <xsd:element name="term" type="xsd:integer" default="10"/>  A fixed value can be specified for an element with <xsd:element name="term" type="xsd:integer" fixed="200"/>
  • 52. ATTRIBUTES  An attribute can be declared in the following manner <xsd:attribute name="correct" type="xsd:string" use="optional"/>  use="optional" specifies that the attribute is optional  The use attribute accepts the values  optional  prohibited  required  Default and fixed values for an attribute are set with the separate default and fixed attributes
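Because a schema is itself an XML document, the occurrence limits can be read with an ordinary parser. The sketch below does this with xml.etree.ElementTree; it does not validate anything. The element names nickname, hint and question and the helper occurrence_bounds are made up for the example. Both minOccurs and maxOccurs default to 1 when they are omitted.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical fragment showing the combinations discussed above.
SCHEMA_FRAGMENT = f"""
<xsd:sequence xmlns:xsd="{XSD_NS}">
  <xsd:element name="person" minOccurs="1"/>
  <xsd:element name="nickname" minOccurs="0"/>
  <xsd:element name="hint" minOccurs="0" maxOccurs="10"/>
  <xsd:element name="question" maxOccurs="unbounded"/>
</xsd:sequence>
"""


def occurrence_bounds(element_decl):
    """Return (minimum, maximum) for an xsd:element; maximum is None if unbounded."""
    minimum = int(element_decl.get("minOccurs", "1"))
    raw_maximum = element_decl.get("maxOccurs", "1")
    maximum = None if raw_maximum == "unbounded" else int(raw_maximum)
    return minimum, maximum


sequence = ET.fromstring(SCHEMA_FRAGMENT)
bounds = {
    decl.get("name"): occurrence_bounds(decl)
    for decl in sequence.findall(f"{{{XSD_NS}}}element")
}
print(bounds)
```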
  • 53. NAMESPACE  Namespaces are useful for reusing XML tags  One XML document can reuse part of another well-defined XML document  The new XML document may contain elements that have the same names as elements of the XML document being referred to  Such name clashes can be avoided using a namespace
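A small sketch of how namespaces keep two elements with the same name apart. The namespace URIs, prefixes and element values are hypothetical. xml.etree.ElementTree reports each tag as {namespace}name, so the two name elements are distinct.

```python
import xml.etree.ElementTree as ET

# Two vocabularies both define <name>; the URIs below are made up for the example.
DOC = """
<game xmlns:p="http://example.com/person" xmlns:t="http://example.com/team">
  <p:name>Jerry</p:name>
  <t:name>Weekend Cricketers</t:name>
</game>
"""

root = ET.fromstring(DOC)
tags = [child.tag for child in root]

# A prefix map lets us ask for exactly one of the two <name> elements.
person_name = root.find("p:name", {"p": "http://example.com/person"}).text

print(tags)
print(person_name)
```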
  • 54. NAMESPACE  The namespace for an XML document can be defined using the targetNamespace attribute of the schema element <xsd:schema targetNamespace="http://xmlpowercorp" xmlns="http://xmlpowercorp" xmlns:xsd="http://www.w3.org/2001/XMLSchema" attributeFormDefault="qualified" elementFormDefault
  • 55. NAMESPACE  The value qualified specifies that the namespace prefix must be written before every element in that namespace  To avoid this, we can set the value to unqualified
  • 56. UNQUALIFIED NAMESPACE
      <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                  targetNamespace="http://xmlpowercorp"
                  elementFormDefault="unqualified"
                  attributeFormDefault="unqualified">
        <xsd:element name="game" type="gameType"/>
        <xsd:complexType name="gameType">
          <xsd:sequence>
            <xsd:element name="person" minOccurs="1"/>
            <xsd:element name="questions" type="questionType" minOccurs="1"/>
          </xsd:sequence>
          <xsd:attribute name="documentDate" type="xsd:date"/>
        </xsd:complexType>
      In this case, the parser looks for the unprefixed type names in the default namespace. No default namespace is declared here, so a reference to a type such as questionType, which lives in the target namespace, cannot be resolved
  • 57. UNQUALIFIED NAMESPACE
      <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                  targetNamespace="http://xmlpowercorp"
                  xmlns:xmp="http://xmlpowercorp"
                  elementFormDefault="unqualified"
                  attributeFormDefault="unqualified">
        <xsd:element name="game" type="xmp:gameType"/>
        <xsd:complexType name="gameType">
          <xsd:sequence>
            <xsd:element name="person" minOccurs="1"/>
            <xsd:element name="questions" type="xmp:questionType" minOccurs="1"/>
          </xsd:sequence>
          <xsd:attribute name="documentDate" type="xsd:date"/>
        </xsd:complexType>
      In this case, the parser uses the xmp prefix on gameType and questionType to find these types in the http://xmlpowercorp namespace rather than in the default namespace
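The difference between the last two slides comes down to how a name such as xmp:gameType is resolved. The following sketch mimics that lookup in plain Python. It is a simplified illustration rather than the algorithm of any particular parser, and the function name resolve_qname is hypothetical.

```python
XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Prefix declarations as they appear on the <xsd:schema> element of slide 57.
PREFIXES = {"xsd": XSD_NS, "xmp": "http://xmlpowercorp"}


def resolve_qname(qname, prefixes, default_namespace=None):
    """Resolve 'prefix:local' or 'local' to a (namespace, local name) pair."""
    prefix, separator, local = qname.partition(":")
    if not separator:
        # No prefix: the name falls into the default namespace, if one is declared.
        return (default_namespace, qname)
    return (prefixes[prefix], local)


print(resolve_qname("xmp:gameType", PREFIXES))  # found in the target namespace
print(resolve_qname("gameType", PREFIXES))      # no prefix and no default namespace
print(resolve_qname("xsd:date", PREFIXES))      # built-in schema type
```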