WEB PROGRAMMING UNIT-III NOTES
UNIT - III
Overview:
This unit focuses on creating XML documents that are designed to carry data. XML is all about describing information: it was designed to transport and store data. We can create web documents from XML by using XSLT to transform our documents into HTML. We send the XML to an XSLT processor on the web server and serve the result to the web browser. This makes our documentation available in whatever format we need.
Contents:
XML describes data and focuses on what the data is, while HTML displays data and focuses on how it looks. HTML is about displaying information; XML is about describing it. The tags used to mark up HTML documents and the structure of HTML documents are predefined: the author of an HTML document can only use tags that are defined in the HTML standard.
In HTML some elements can be improperly nested within each other like this:
<b><i>This text is bold and italic</b></i>
In XML all elements must be properly nested within each other like this:
<b><i>This text is bold and italic</i></b>
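The nesting rule can be checked with any XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name is_well_formed is an assumption made for this illustration, not something from these notes.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts `text` as XML, False if it rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# The closing tags appear in the reverse order of the opening tags.
properly_nested = "<b><i>This text is bold and italic</i></b>"
# </b> closes before </i>, which an XML parser does not tolerate.
improperly_nested = "<b><i>This text is bold and italic</b></i>"

print(is_well_formed(properly_nested))
print(is_well_formed(improperly_nested))
```

An HTML browser would usually render both strings, while the XML parser accepts only the first.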
An XML document is composed of:
1. Declarations (prolog, DTD reference)
2. Elements
3. Comments
4. Entities (predefined, custom-defined, character entities)
The XML declaration: always the first line in the XML document.
<?xml version="1.0" encoding="ISO-8859-1"?>
The XML declaration should always be included. It defines the XML version and the character encoding used in the document. In this case the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.
Root Element: The next line defines the first element of the document. It is called the root element.
<E-mail>
Child Elements: The next four lines describe the four child elements of the root (To, From, Subject and Body).
And finally, the last line defines the end of the root element.
</E-mail>
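Putting the pieces together, a complete E-mail document could look like the one embedded in the sketch below. The text inside the four child elements is placeholder content chosen for illustration; only the element names come from these notes. The document is parsed with Python's standard xml.etree.ElementTree module to show the root element and its children.

```python
import xml.etree.ElementTree as ET

# Hypothetical E-mail document; the element text is placeholder content.
# It is passed as bytes so the parser honours the declared encoding.
document = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<E-mail>
  <To>Recipient</To>
  <From>Sender</From>
  <Subject>Subject line</Subject>
  <Body>Message text</Body>
</E-mail>
"""

root = ET.fromstring(document)

# Iterating over an element yields its direct children in document order.
child_names = [child.tag for child in root]

print(root.tag)
print(child_names)
```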
XML Syntax Rules
· All XML elements must have a closing tag
· XML tags are case sensitive
· XML elements must be properly nested
· XML documents must have a root element
· XML attribute values must always be quoted
· With XML, white space is preserved
· Comments in XML are written as <!-- This is a comment -->
· XML elements have relationships
o Elements in an XML document are related as parents and children.
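Several of the rules above can be observed by feeding deliberately broken fragments to a parser. This is a small sketch with Python's standard xml.etree.ElementTree module; the Message element, its priority attribute, and the helper name parse_error are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

def parse_error(text):
    """Return None if `text` is well-formed XML, else the parser's error message."""
    try:
        ET.fromstring(text)
        return None
    except ET.ParseError as err:
        return str(err)

# Each sample breaks exactly one rule, except the first, which breaks none.
samples = {
    "well-formed": "<Message priority='high'>Hello</Message>",
    "missing closing tag": "<Message><To>Hello</Message>",
    "case mismatch": "<Message>Hello</message>",
    "unquoted attribute": "<Message priority=high>Hello</Message>",
    "two root elements": "<Message/><Message/>",
}

results = {label: parse_error(text) for label, text in samples.items()}
for label, error in results.items():
    print(label, "->", "accepted" if error is None else "rejected: " + error)

# White space inside element content reaches the application unchanged.
spaced_text = ET.fromstring("<Message>Hello   World</Message>").text
```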
XML element names must follow these naming conventions:
Names must not start with a number or punctuation character, but they can contain letters, numbers, and other characters. Names cannot contain spaces. Names must not start with the letters xml (or XML, or Xml, etc.).
XML Attributes
· XML elements can have attributes in the start tag, just like HTML.
· Attributes are used to provide additional information about elements.
· Attribute values must always be enclosed in quotes. Either single or double quotes can be used, e.g. <item color="red"> or <item color='red'>
· If the attribute value itself contains double quotes, enclose it in single quotes (or escape the inner quotes as &quot;), as in this example: <site name='Rose "India" Net'>
· If the attribute value itself contains single quotes, enclose it in double quotes (or escape the inner quotes as &apos;), as in this example: <site name="Rose 'India' Net">
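The quoting rules can be tried out directly. The sketch below uses Python's standard xml.etree.ElementTree module; the element name site is a hypothetical name chosen only to carry the name attribute.

```python
import xml.etree.ElementTree as ET

# Double quotes inside the value, so the value is delimited by single quotes.
double_inside = ET.fromstring("""<site name='Rose "India" Net'/>""")

# Single quotes inside the value, so the value is delimited by double quotes.
single_inside = ET.fromstring("""<site name="Rose 'India' Net"/>""")

# Alternative: keep double-quote delimiters and escape the inner quotes.
escaped = ET.fromstring('<site name="Rose &quot;India&quot; Net"/>')

print(double_inside.get("name"))
print(single_inside.get("name"))
print(escaped.get("name"))
```

In all three cases the application sees the attribute value without the delimiting quotes.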
DTD (Document Type Definition)
A Document Type Definition (DTD) defines the legal building blocks of an XML document. It defines the document structure with a list of legal elements and attributes. A DTD can be declared inside an XML document, or as an external reference.
Internal DTD
If the DTD is defined inside the XML document, it should be wrapped in a DOCTYPE definition with the following syntax:
<!DOCTYPE root-element [element-declarations]>
External DTD
If the DTD is defined in an external file, it should be referenced from a DOCTYPE definition with the following syntax:
<!DOCTYPE root-element SYSTEM "filename">
Importance of a DTD
· With a DTD, an XML file carries a description of its own format.
· With a DTD, independent groups of people can agree to use a standard DTD for interchanging data.
· An application can use a standard DTD to verify that the data it receives from the outside world is valid.
· Users can also use a DTD to verify their own data.
Building blocks of XML documents, as seen by a DTD:
· Elements
· Attributes
· Entities
· PCDATA
· CDATA
PCDATA:
· PCDATA means parsed character data. It can be thought of as the character data (text) found between the start tag and the end tag of an XML element.
· PCDATA is text that will be parsed by a parser. The text is checked by the parser for entities and markup.
· Tags inside the text will be treated as markup and entities will be expanded. However, parsed character data should not contain any &, <, or > characters. These should be represented by the &amp;, &lt;, and &gt; entities, respectively.
CDATA:
· CDATA is character data that will NOT be parsed by a parser. Tags inside the text will NOT be treated as markup and entities will not be expanded.
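The difference can be seen by parsing both kinds of text. In this sketch (Python's standard xml.etree.ElementTree module, with a hypothetical Body element and placeholder text) the unparsed text is written as a CDATA section, which is one place where XML leaves character data unparsed.

```python
import xml.etree.ElementTree as ET

# Parsed character data: the entity references are expanded by the parser.
parsed = ET.fromstring("<Body>Fish &amp; chips &lt; 5 &gt; 3</Body>")

# CDATA section: tags and entity references inside it are kept as literal text.
cdata = ET.fromstring("<Body><![CDATA[<b>Fish & chips</b> &lt;]]></Body>")

print(parsed.text)
print(cdata.text)
print(len(cdata))  # number of child elements found inside the CDATA text
```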
DTD-Elements:
Elements are the main constituent components of XML documents. Elements can contain text, other elements, or be empty.
Syntax:
<!ELEMENT element-name category>
or
<!ELEMENT element-name (element-content)>
EX: Elements with Parsed Character Data
<!ELEMENT To (#PCDATA)>
<!ELEMENT From (#PCDATA)>
Elements with Children (sequences)
Elements with one or more children are declared with the names of the child elements inside the parentheses:
<!ELEMENT element-name (child1)>
or
<!ELEMENT element-name (child1,child2,...)>
EX:
<!ELEMENT E-mail (To,From,Subject,Body)>
When children are declared in a sequence separated by commas, the children must appear in the same sequence in the document.
In a full declaration, the children must also be declared. Children can have children.
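A full declaration for the E-mail example, written as an internal DTD, might look like the document embedded in the sketch below. The Subject and Body declarations and all element text are assumptions filled in for illustration. Python's standard xml.parsers.expat module reads the declarations but does not validate against them, so the sketch only lists what was declared and what appeared.

```python
from xml.parsers import expat

# Hypothetical document with a full internal DTD; element text is placeholder content.
document = """<?xml version="1.0"?>
<!DOCTYPE E-mail [
  <!ELEMENT E-mail (To,From,Subject,Body)>
  <!ELEMENT To (#PCDATA)>
  <!ELEMENT From (#PCDATA)>
  <!ELEMENT Subject (#PCDATA)>
  <!ELEMENT Body (#PCDATA)>
]>
<E-mail>
  <To>Recipient</To>
  <From>Sender</From>
  <Subject>Subject line</Subject>
  <Body>Message text</Body>
</E-mail>
"""

declared = []  # element names from <!ELEMENT ...> declarations
seen = []      # element names from start tags in the document body

parser = expat.ParserCreate()
# Called once per <!ELEMENT ...> declaration; the content model is ignored here.
parser.ElementDeclHandler = lambda name, model: declared.append(name)
# Called once per start tag.
parser.StartElementHandler = lambda name, attrs: seen.append(name)
parser.Parse(document, True)

print(declared)
print(seen)
```

Checking that the document actually obeys the declared sequence requires a validating parser, which the Python standard library does not provide.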
Tag qualifiers
* Indicates zero or more occurrences.
<!ELEMENT color (Fill-Red*)>
? Indicates zero or one occurrence.
<!ELEMENT color (Fill-Red?)>
+ Indicates one or more occurrences.
<!ELEMENT color (Fill-Red+)>
( ) Indicates a group of expressions to be matched together.
EX: <!ELEMENT E-mail (#PCDATA|To|From|Subject|Body)*>
| Indicates a choice between alternatives.
<!ELEMENT E-mail (To,From,Subject,(Message|Body))>
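The qualifiers behave much like regular-expression operators applied to the list of child element names. The sketch below makes that analogy concrete in Python. It is a hand-written illustration, not DTD validation: the regular expressions in MODELS were translated by hand from the content models above, and the helper names are assumptions.

```python
import re
import xml.etree.ElementTree as ET

def child_sequence(xml_text):
    """Return the root's child element names, each followed by one space."""
    root = ET.fromstring(xml_text)
    return "".join(child.tag + " " for child in root)

# Hand translation of DTD content models into regular expressions
# (assumption: every child name is followed by exactly one space, as built above).
MODELS = {
    "(Fill-Red*)": r"(Fill-Red )*",
    "(Fill-Red?)": r"(Fill-Red )?",
    "(Fill-Red+)": r"(Fill-Red )+",
    "(To,From,Subject,(Message|Body))": r"To From Subject (Message |Body )",
}

def matches(model, xml_text):
    """Return True if the root's children satisfy the given content model."""
    return re.fullmatch(MODELS[model], child_sequence(xml_text)) is not None

no_fill = "<color/>"
two_fills = "<color><Fill-Red/><Fill-Red/></color>"
with_body = "<E-mail><To/><From/><Subject/><Body/></E-mail>"
with_both = "<E-mail><To/><From/><Subject/><Message/><Body/></E-mail>"

print(matches("(Fill-Red*)", no_fill), matches("(Fill-Red*)", two_fills))
print(matches("(Fill-Red?)", no_fill), matches("(Fill-Red?)", two_fills))
print(matches("(Fill-Red+)", no_fill), matches("(Fill-Red+)", two_fills))
print(matches("(To,From,Subject,(Message|Body))", with_body))
print(matches("(To,From,Subject,(Message|Body))", with_both))
```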
Special tag values in DTD
• A tag definition can have one of the following instead of sub-tags:
• ANY
Ø Indicates that the tag can contain any other defined element or PCDATA.
Ø Usually used for the root element.
Ø Elements can occur in any order in such a document.
Ø Not recommended.
• EMPTY
Ø Indicates that the element has no content (and consequently needs no corresponding end-tag).
Ø Ex: to allow a tag flag used as <flag/> in an XML file, the DTD entry should be
<!ELEMENT flag EMPTY>
DTD-Attributes:
Attributes provide extra information about elements.
In a DTD, attributes are declared with an ATTLIST declaration.
Declaring Attributes
The ATTLIST declaration names the element that carries the attribute and gives the attribute name, attribute type, and default value. An attribute declaration has the following syntax:
<!ATTLIST element-name attribute-name attribute-type default-value>
DTD example:
<!ATTLIST receipt type CDATA "check">
XML example:
<receipt type="check" />
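The effect of a default value can be observed with a parser that reads the internal DTD. This is a sketch using Python's standard xml.parsers.expat module; the receipts wrapper element and the "cash" value are assumptions added so that one receipt states its type and one relies on the default.

```python
from xml.parsers import expat

# Hypothetical document: the first receipt sets type explicitly, the second omits it.
document = """<!DOCTYPE receipts [
  <!ATTLIST receipt type CDATA "check">
]>
<receipts>
  <receipt type="cash"/>
  <receipt/>
</receipts>
"""

declarations = []  # what the ATTLIST declaration says
seen_types = []    # the type attribute as reported for each receipt element

def on_attlist(elname, attname, attr_type, default, required):
    # Called once per attribute declared in an <!ATTLIST ...> declaration.
    declarations.append((elname, attname, attr_type, default))

def on_start(name, attrs):
    # attrs is a dict of attribute names to values for this start tag.
    if name == "receipt":
        seen_types.append(attrs.get("type"))

parser = expat.ParserCreate()
parser.AttlistDeclHandler = on_attlist
parser.StartElementHandler = on_start
parser.Parse(document, True)

print(declarations)
print(seen_types)
```

The second receipt carries no type attribute in the text, yet the parser reports the declared default for it.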
DTD-Entities
Entities are variables used to define shortcuts to standard text or special characters. Entity references are references to entities, written as &entity-name; in the document. Entities can be declared internally or externally.
Internal Entity Declaration
Syntax:
<!ENTITY entity-name "entity-value">
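An internal entity is expanded wherever it is referenced. The sketch below uses Python's standard xml.etree.ElementTree module; the entity name signature and its value are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document declaring one internal entity and referencing it once.
document = """<!DOCTYPE E-mail [
  <!ENTITY signature "Sent from the XML notes example">
]>
<E-mail><Body>Hello. &signature;</Body></E-mail>
"""

root = ET.fromstring(document)

# The parser replaces &signature; with the declared entity value.
body_text = root.find("Body").text
print(body_text)
```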
XML Schema
XML Schemas are more powerful than DTDs. XML Schema is a W3C standard. It is an XML-based alternative to DTDs and describes the structure of an XML document.
The XML Schema language is also referred to as XML Schema Definition (XSD).
We think that very soon XML Schemas will be used in most Web applications as a replacement for DTDs. Here are some reasons:
· XML Schemas are extensible to future additions
· XML Schemas are richer and more powerful than DTDs
· XML Schemas are written in XML and support data types and namespaces
What is an XML Schema?
· An XML Schema is used to define the legal building blocks of an XML document, just like a DTD.
· An XML Schema defines the user-defined components, such as elements, sub-elements and attributes, needed in an XML document.
· It defines the data types for elements and attributes, along with their order of occurrence.
· It defines whether an element is empty or can include text.
· It also defines default and fixed values for elements and attributes.
Features of XML Schemas:
XML Schemas Support Data Types
One of the greatest strengths of XML Schemas is their support for data types. With support for data types:
· It is easier to describe allowable document content
· It is easier to validate the correctness of data
· It is easier to work with data from a database
· It is easier to define data facets (restrictions on data)
· It is easier to define data patterns (data formats)
· It is easier to convert data between different data types
XML Schemas Use XML Syntax
Another great strength of XML Schemas is that they are written in XML. Ordinary XML editors can be used to edit Schema files, and the same XML parsers can be used to parse them.
XML Schemas are Extensible
XML Schemas are extensible because they are written in XML. A user can reuse a Schema in other Schemas and can refer to multiple Schemas in the same document. Users can also create their own data types derived from the standard types.
XML Schemas Secure Reliable Data Communication
When sending data from a sender to a receiver, it is essential that both parties have the same "expectations" about the content. With XML Schemas, the sender can describe the data in a way that the receiver will understand. A date like "03-11-2004" will, in some countries, be interpreted as 3 November and in other countries as 11 March. However, an XML element with a data type like this: <date type="date">2004-03-11</date> ensures a mutual understanding of the content, because the XML data type "date" requires the format "YYYY-MM-DD".
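The ambiguity is easy to reproduce. The sketch below is not schema validation (the Python standard library has no XSD validator); it only uses the datetime module to show that "03-11-2004" yields two different dates depending on the convention assumed, while the YYYY-MM-DD form has a single reading.

```python
from datetime import date, datetime

ambiguous = "03-11-2004"

# The same string read under two national conventions.
day_first = datetime.strptime(ambiguous, "%d-%m-%Y").date()    # day-month-year
month_first = datetime.strptime(ambiguous, "%m-%d-%Y").date()  # month-day-year

# The YYYY-MM-DD form used by the schema "date" type has only one reading.
unambiguous = date.fromisoformat("2004-03-11")

print(day_first, month_first, unambiguous)
```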
XSLT
XSLT (Extensible Stylesheet Language Transformations) is a declarative, XML-based language used for the transformation of XML documents. The original document is not changed; rather, a new document is created based on the content of an existing one.[2] The new document may be serialized (output) by the processor in standard XML syntax or in another format, such as HTML or plain text.[3] XSLT is most often used to convert data between different XML schemas or to convert XML data into web pages or PDF documents.
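The idea of producing a new document while leaving the source untouched can be imitated in a few lines. Note that this is not XSLT: the Python standard library has no XSLT processor, so the sketch below hand-codes one transformation with xml.etree.ElementTree. The function name to_html, the HTML layout, and the E-mail text are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document; element text is placeholder content.
source = ET.fromstring(
    "<E-mail><To>Recipient</To><From>Sender</From>"
    "<Subject>Subject line</Subject><Body>Message text</Body></E-mail>"
)
snapshot = ET.tostring(source)  # serialized copy, used to show the source is unchanged

def to_html(email):
    """Build a new HTML tree from an E-mail element without modifying it."""
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    ET.SubElement(body, "h1").text = email.findtext("Subject")
    ET.SubElement(body, "p").text = (
        f"From {email.findtext('From')} to {email.findtext('To')}"
    )
    ET.SubElement(body, "p").text = email.findtext("Body")
    return html

# Serialize the new tree as HTML text; the source tree is only read.
html_text = ET.tostring(to_html(source), encoding="unicode", method="html")
print(html_text)
```

A real XSLT stylesheet would express the same mapping declaratively, as templates matched against the source tree.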
Simple API for XML (SAX)
SAX is a lexical, event-driven interface in which a document is read serially and its contents are reported as callbacks to various methods on a handler object of the user's design. SAX is fast and efficient to implement, but difficult to use for extracting information at random from the XML, since it tends to burden the application author with keeping track of what part of the document is being processed. It is better suited to situations in which certain types of information are always handled the same way, no matter where they occur in the document.
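The callback style described above can be sketched with Python's standard xml.sax module. The handler class name EventLogger and the sample document are assumptions made for this illustration; the handler simply records each start and end event in the order the parser reports it.

```python
import xml.sax

class EventLogger(xml.sax.ContentHandler):
    """Handler object that records start and end events as they are reported."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # Called by the parser when it reads a start tag.
        self.events.append(("start", name))

    def endElement(self, name):
        # Called by the parser when it reads an end tag.
        self.events.append(("end", name))

# Hypothetical document; element text is placeholder content.
document = b"<E-mail><To>Recipient</To><Body>Message text</Body></E-mail>"

handler = EventLogger()
xml.sax.parseString(document, handler)

for event in handler.events:
    print(event)
```

The handler never sees the whole document at once, which is why any knowledge of the current position has to be kept by the application itself.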
Document Object Model (DOM)
DOM (Document Object Model) is an interface-oriented Application Programming Interface that allows for navigation of the entire document as if it were a tree of "Node" objects representing the document's contents. A DOM document can be created by a parser, or can be generated manually by users (with limitations). Data types in DOM Nodes are abstract; implementations provide their own programming language-specific bindings. DOM implementations tend to be memory intensive, as they generally require the entire document to be loaded into memory and constructed as a tree of objects before access is allowed.
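Tree navigation can be sketched with Python's standard xml.dom.minidom module, one language-specific binding of the DOM. The sample document is hypothetical and its element text is placeholder content.

```python
from xml.dom import minidom

# Hypothetical document, written without white space between the elements
# so that every child node of the root is an element node.
document = (
    "<E-mail><To>Recipient</To><From>Sender</From>"
    "<Subject>Subject line</Subject><Body>Message text</Body></E-mail>"
)

dom = minidom.parseString(document)  # the whole tree is built in memory first
root = dom.documentElement

# Random access: jump straight to an element anywhere in the tree.
subject = dom.getElementsByTagName("Subject")[0]
subject_text = subject.firstChild.data

# Navigation from any node to its neighbours and children.
previous_name = subject.previousSibling.nodeName
child_names = [node.nodeName for node in root.childNodes]

print(subject_text, previous_name, child_names)
```

Unlike the SAX style, the application can visit nodes in any order, at the cost of holding the full tree in memory.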