WEB PROGRAMMING UNIT-III NOTES


UNIT - III

Overview :

This unit focuses on creating XML documents that are designed to carry data. XML is all about describing information: it was designed to transport and store data. We can create web documents from XML by using XSLT to transform our documents into HTML. We can then send our XML to an XSLT processor on the web server and serve the result to the web browser. This makes our documentation available in whatever format we need.

Contents :
 Introduction to XML, DTD, XML Schema, XSLT, DOM, SAX

                  Introduction to XML

XML describes and focuses on the data while HTML only displays and focuses on how data looks. HTML is all about displaying information but XML is all about describing information. The tags used to mark up HTML documents and the structure of HTML documents are predefined. The author of HTML documents can only use tags that are defined in the HTML standard.
In HTML some elements can be improperly nested within each other like this:
<b><i>This text is bold and italic</b></i>
In XML all elements must be properly nested within each other like this:
<b><i>This text is bold and italic</i></b>

An XML document is composed of
1.       Declarations (prolog, dtd reference)
2.       Elements
3.       Comments
4.       Entities (predefined, custom defined, character entities)


The XML declaration: when present, it must be the first line in the XML document.
The XML declaration should always be included. It defines the XML version and the character encoding used in the document. In the declaration below, the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.
<?xml version="1.0" encoding="ISO-8859-1"?>
Root Element: The next line defines the first element of the document. It is called the root element.
<E-mail>
Child Elements: The next 4 lines describe the four child elements of the root (To, From, Subject and Body).

And finally the last line defines the end of the root element:
</E-mail>
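
Putting the pieces above together, a complete E-mail document can be parsed as in the sketch below. Only the element names (E-mail, To, From, Subject, Body) come from these notes; the text inside each element is hypothetical sample data, and Python's standard xml.etree.ElementTree module is used purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample values; only the element names come from the notes.
EMAIL_XML = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<E-mail>
  <To>Asha</To>
  <From>Ravi</From>
  <Subject>Reminder</Subject>
  <Body>Lab record is due on Friday.</Body>
</E-mail>
"""

# Passing bytes lets the parser apply the encoding named in the declaration.
root = ET.fromstring(EMAIL_XML)
child_names = [child.tag for child in root]

print(root.tag)      # E-mail
print(child_names)   # ['To', 'From', 'Subject', 'Body']
```

The root element is the single outermost element, and the four children are reported in document order.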
XML Syntax Rules

·         All XML elements must have a closing tag
·         XML tags are case sensitive
·         XML Elements Must be Properly Nested
·         XML Documents Must Have a Root Element
·         Always Quote the XML Attribute Values
·         With XML, White Space is Preserved
·         Comments in XML are written as <!-- This is a comment -->
·         XML Elements have Relationships
o Elements in an XML document are related as parents and children.
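
Several of the rules above can be checked mechanically, because a parser rejects any document that breaks them. The following is a minimal sketch using Python's standard ElementTree parser; the helper name is_well_formed and the sample snippets are made up for this illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# One snippet per syntax rule; only the first one obeys all the rules.
checks = {
    "properly nested": "<b><i>text</i></b>",
    "improperly nested": "<b><i>text</b></i>",
    "case mismatch": "<Message>hi</message>",
    "missing closing tag": "<note><to>x</note>",
    "unquoted attribute": "<note type=memo>x</note>",
    "two root elements": "<a/><b/>",
}
results = {label: is_well_formed(snippet) for label, snippet in checks.items()}

for label, ok in results.items():
    print(label, "->", "well-formed" if ok else "rejected")
```

Only the "properly nested" snippet is accepted; every other snippet violates exactly one of the listed rules.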

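XML elements must follow these naming conventions:
Names must start with a letter or an underscore; they must not start with a digit or with a punctuation character such as a hyphen or a period. After the first character, names can contain letters, digits, hyphens, underscores and periods, but no spaces. Names must not start with the letters xml (or XML, or Xml, etc.), because such names are reserved.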

XML Attributes

·         XML elements can have attributes in the start tag, just like HTML.
·         Attributes are used to provide additional information about elements.
·         Attribute values must always be enclosed in quotes. Use either single or double quotes, e.g. <shirt color="red"> or <shirt color='red'>
·         If the attribute value itself contains double quotes it is necessary to use single quotes, like in this example: <site name='Rose "India" Net'>
·         If the attribute value itself contains single quotes it is necessary to use double quotes, like in this example: <site name="Rose 'India' Net">
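
The quoting rules for attribute values can be tried out with a short sketch. The element name site is hypothetical, and the third variant (escaping the inner quotes as &quot;) is an alternative that is not covered in the notes above.

```python
import xml.etree.ElementTree as ET

# Single quotes outside, double quotes inside the value.
single_outer = ET.fromstring("""<site name='Rose "India" Net'/>""")
# Double quotes outside, single quotes inside the value.
double_outer = ET.fromstring('''<site name="Rose 'India' Net"/>''')
# Alternative: keep double quotes outside and escape the inner ones.
escaped = ET.fromstring('<site name="Rose &quot;India&quot; Net"/>')

print(single_outer.get("name"))
print(double_outer.get("name"))
print(escaped.get("name"))
```

In every case the parser hands back the value without the outer quotes.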

DTD(Document Type Definition)
A Document Type Definition (DTD) defines the legal building blocks of an XML document. It defines the document structure with a list of legal elements and attributes. A DTD can be declared inside an XML document, or it can be declared as an external reference.

Internal DTD

If the DTD is defined inside the XML document, it should be wrapped in a DOCTYPE definition with the following syntax:
<!DOCTYPE root-element [element-declarations]>
External DTD
If the DTD is defined in an external file, it should be wrapped in a DOCTYPE definition with the following syntax:
<!DOCTYPE root-element SYSTEM "filename">
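
As a rough illustration of the two DOCTYPE forms, the sketch below parses one document with an internal DTD and one that refers to an external file. The file name email.dtd and the sample text are hypothetical. Python's xml.dom.minidom is a non-validating parser, so it records the DOCTYPE but does not check the document against the DTD or fetch the external file.

```python
from xml.dom import minidom

# Internal DTD: the declarations sit inside the square brackets.
INTERNAL = """<?xml version="1.0"?>
<!DOCTYPE E-mail [
  <!ELEMENT E-mail (To,From)>
  <!ELEMENT To (#PCDATA)>
  <!ELEMENT From (#PCDATA)>
]>
<E-mail><To>Asha</To><From>Ravi</From></E-mail>"""

# External DTD: the declarations would live in the named file (hypothetical).
EXTERNAL = """<?xml version="1.0"?>
<!DOCTYPE E-mail SYSTEM "email.dtd">
<E-mail><To>Asha</To><From>Ravi</From></E-mail>"""

internal_doc = minidom.parseString(INTERNAL)
external_doc = minidom.parseString(EXTERNAL)

print(internal_doc.doctype.name)      # E-mail
print(external_doc.doctype.systemId)  # email.dtd
```

In both forms the name after <!DOCTYPE matches the root element of the document.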
Importance of a DTD
·         With a DTD, an XML file carries a description of its own format.
·         With a DTD, independent groups of people can agree to use a standard DTD for interchanging data.
·         An application can use a standard DTD to verify that the data it receives from the outside world is valid.
·         A user can also use a DTD to verify his or her own data.

Building blocks of XML DTD Documents:

·         Elements
·         Attributes
·         Entities
·         PCDATA
·         CDATA
PCDATA:
·         PCDATA means parsed character data. It can be thought of as the character data (text) found between the start tag and the end tag of an XML element.
·         PCDATA is text that will be parsed by a parser. The text is checked by the parser for entities and markup.
·         Tags inside the text will be treated as markup and entities will be expanded. However, parsed character data should not contain any &, <, or > characters. These should be represented by the &amp;, &lt;, and &gt; entities, respectively.
CDATA:
·         CDATA is character data that will NOT be parsed by a parser. Tags inside the text will NOT be treated as markup and entities will not be expanded.
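
The difference between parsed and unparsed character data can be seen in a small sketch. It uses a CDATA section, which is the way unparsed character data is written inside element content; the element name note is hypothetical.

```python
import xml.etree.ElementTree as ET

# Parsed character data: the special characters are written as entities.
parsed = ET.fromstring("<note>1 &lt; 2 &amp; 3 &gt; 2</note>")

# Unparsed character data: inside a CDATA section nothing is treated as markup.
unparsed = ET.fromstring("<note><![CDATA[<b>1 < 2 & 3 > 2</b>]]></note>")

print(parsed.text)    # 1 < 2 & 3 > 2
print(unparsed.text)  # <b>1 < 2 & 3 > 2</b>
print(len(unparsed))  # 0, because <b> did not become a child element
```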


DTD-Elements: Elements are the main building blocks of both XML and HTML documents. Elements can contain text, other elements, or be empty.
Syntax:
<!ELEMENT element-name category>

or
<!ELEMENT element-name (element-content)>
EX: Elements with Parsed Character Data
<!ELEMENT To (#PCDATA)>
<!ELEMENT From (#PCDATA)>

Elements with Children (sequences)
Elements with one or more children are declared with the names of the child elements inside the parentheses:
<!ELEMENT element-name (child1)>
or
<!ELEMENT element-name (child1,child2,...)>
EX:
<!ELEMENT E-mail (To,From,Subject,Body)>

When children are declared in a sequence separated by commas, the children must appear in the same sequence in the document.
In a full declaration, the children must also be declared. Children can have children.

Tag qualifiers
* Indicates zero or more occurrence.
<!ELEMENT color (Fill-Red*)>
? Indicates Zero or one time occurrence.
<!ELEMENT color (Fill-Red?)>
+ Indicates one or more occurrence
<!ELEMENT color (Fill-Red+)>
( ) Indicates a group of expressions to be matched together.
EX: <!ELEMENT E-mail (#PCDATA|To|From|Subject|Body)*>
| Indicates an option.
<!ELEMENT E-mail (To,From,Subject,(Message|Body))>
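
The qualifiers above behave much like the same symbols in a regular expression. The sketch below is a teaching aid and not a real DTD validator: it assumes a content model made only of element names, commas, |, *, ?, + and parentheses (no #PCDATA, ANY or EMPTY). The helper names model_to_regex and children_match are made up for this example.

```python
import re

def model_to_regex(model):
    """Translate a DTD children content model into a regex over child names."""
    compact = re.sub(r"\s+", "", model)
    # Each element name becomes a group that consumes the name plus one space.
    named = re.sub(
        r"[A-Za-z_][\w.-]*",
        lambda m: "(?:" + re.escape(m.group()) + " )",
        compact,
    )
    # A comma means "followed by", which is plain concatenation in a regex.
    return re.compile(named.replace(",", ""))

def children_match(model, children):
    """Check a list of child element names against a content model."""
    sequence = "".join(name + " " for name in children)
    return model_to_regex(model).fullmatch(sequence) is not None

model = "(To,From,Subject,(Message|Body))"
print(children_match(model, ["To", "From", "Subject", "Body"]))     # True
print(children_match(model, ["To", "Subject", "From", "Body"]))     # False
print(children_match("(Fill-Red*)", []))                            # True
print(children_match("(Fill-Red+)", []))                            # False
print(children_match("(Fill-Red?)", ["Fill-Red", "Fill-Red"]))      # False
```

The second call fails because a comma-separated sequence fixes the order of the children.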

Special tag Values in DTD

          Tag definition can have following instead of sub-tags:
          ANY
Ø  Indicates that the tag can contain any other defined element or PCDATA.
Ø  Usually used for the root element.
Ø  Elements can occur in any order in such a document.
Ø  Not recommended to be used.
          EMPTY
Ø  It says that the element has no content, so it is normally written as a single empty-element tag with no separate end-tag.
Ø  Ex: to allow a tag flag used as <flag/> in an XML file, the DTD entry should be
<!ELEMENT flag EMPTY>


DTD-Attributes:
Attributes provide extra information about elements.
In a DTD, attributes are declared with an ATTLIST declaration.
Declaring Attributes
The ATTLIST declaration names the element that carries the attribute and gives the attribute name, the attribute type, and the default value. An attribute declaration has the following syntax:
<!ATTLIST element-name attribute-name attribute-type default-value>
DTD example:
<!ATTLIST receipt type CDATA "check">
XML example:
<receipt type="check" />
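
The effect of a default value can be observed with Python's low-level expat binding, which by default reports attributes that come from an ATTLIST declaration as well as attributes written in the document. This is a sketch under assumptions: the wrapper element receipts and the value cash are hypothetical, and the element name is spelled receipt.

```python
from xml.parsers import expat

DOC = """<?xml version="1.0"?>
<!DOCTYPE receipts [
  <!ATTLIST receipt type CDATA "check">
]>
<receipts><receipt/><receipt type="cash"/></receipts>"""

receipt_types = []

def on_start(name, attrs):
    # attrs maps attribute names to values for the element being opened.
    if name == "receipt":
        receipt_types.append(attrs.get("type"))

parser = expat.ParserCreate()
parser.StartElementHandler = on_start
parser.Parse(DOC, True)

print(receipt_types)  # ['check', 'cash']
```

The first receipt element has no type attribute in the document, so the parser supplies the declared default.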

DTD-Entities
Entities are variables used to define shortcuts to standard text or special characters. Entity references are references to entities. Entities can be declared internally or externally.

Internal Entity Declaration

Syntax:
<!ENTITY entity-name "entity-value">
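
A minimal sketch of an internal entity in use is shown below. The entity name college, its value and the element name note are all hypothetical. The parser replaces the entity reference &college; with the declared text.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ENTITY college "Hypothetical Institute of Technology">
]>
<note>Issued by &college;.</note>"""

note = ET.fromstring(DOC)
print(note.text)  # Issued by Hypothetical Institute of Technology.
```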

XML Schema

XML Schemas are more powerful than DTDs.
XML Schema is a W3C Standard. It is an XML-based alternative to DTDs. It describes the structure of an XML document.
The XML Schema language is also referred to as XML Schema Definition (XSD).
XML Schemas are used in many Web applications as a replacement for DTDs. Here are some reasons:

·         XML Schemas are extensible to future additions
·         XML Schemas are richer and more powerful than DTDs
·         XML Schemas are written in XML and support data types and namespaces.

What is an XML Schema?
·         XML Schema is used to define the legal building blocks of an XML document, just like a DTD.

·         An XML Schema defines the elements, sub-elements and attributes that can appear in an XML document.
·         It defines the data types for elements and attributes, along with the order and number of child elements.
·         It defines whether an element is empty or can include text.
·         It also defines default and fixed values for elements and attributes

Features of XML Schemas :
XML Schemas Support Data Types
One of the greatest strengths of XML Schemas is their support for data types. With support for data types:

·         It is easier to describe allowable document content
·         It is easier to validate the correctness of data
·         It is easier to work with data from a database
·         It is easier to define data facets (restrictions on data)
·         It is easier to define data patterns (data formats)
·         It is easier to convert data between different data types


XML Schemas use XML Syntax
Another great strength of XML Schemas is that they are written in XML. Simple XML editors can be used to edit the Schema files. Even the same XML parsers can be used to parse the Schema files.
XML Schemas are Extensible
XML Schemas are extensible, because they are written in XML. So a user can reuse a Schema in other Schemas and can also refer to multiple schemas in the same document. A user can also create new data types derived from the standard types.
XML Schemas Secure Reliable Data Communication
When sending data from a sender to a receiver, it is essential that both parties have the same "expectations" about the content. With XML Schemas, the sender can describe the data in a way that the receiver will understand. A date like "03-11-2004" will, in some countries, be interpreted as 3 November and in other countries as 11 March. However, an XML element with a data type like this: <date type="date">2004-03-11</date> ensures a mutual understanding of the content, because the XML data type "date" requires the format "YYYY-MM-DD".
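
The date ambiguity described above can be reproduced in a few lines. This sketch uses Python's own date parsing as a stand-in; it is not XML Schema validation, and it only illustrates why a fixed YYYY-MM-DD format removes the guesswork.

```python
from datetime import date, datetime

ambiguous = "03-11-2004"
# The same string gives two different dates depending on the assumed order.
day_first = datetime.strptime(ambiguous, "%d-%m-%Y").date()    # 3 November 2004
month_first = datetime.strptime(ambiguous, "%m-%d-%Y").date()  # 11 March 2004

# The YYYY-MM-DD form has only one reading.
iso = date.fromisoformat("2004-03-11")

print(day_first, month_first, iso)
```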


XSLT

XSLT (Extensible Stylesheet Language Transformations) is a declarative, XML-based language used for the transformation of XML documents. The original document is not changed; rather, a new document is created based on the content of an existing one.[2] The new document may be serialized (output) by the processor in standard XML syntax or in another format, such as HTML or plain text.[3] XSLT is most often used to convert data between different XML schemas or to convert XML data into web pages or PDF documents.
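
Python's standard library has no XSLT processor, so the sketch below is not XSLT. It is a hand-written transformation with ElementTree that stands in for one, to show the key idea: the source document is left unchanged and a new document (here HTML) is built from its content. The sample data and the function name to_html are hypothetical.

```python
import xml.etree.ElementTree as ET

source = ET.fromstring("<E-mail><To>Asha</To><Subject>Reminder</Subject></E-mail>")

def to_html(email):
    """Build a new HTML tree from the source tree without modifying it."""
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    ET.SubElement(body, "h1").text = email.findtext("Subject")
    ET.SubElement(body, "p").text = "To: " + email.findtext("To")
    return html

before = ET.tostring(source, encoding="unicode")
page = ET.tostring(to_html(source), encoding="unicode", method="html")
after = ET.tostring(source, encoding="unicode")

print(page)             # <html><body><h1>Reminder</h1><p>To: Asha</p></body></html>
print(before == after)  # True, the source document was not changed
```

A real XSLT stylesheet would express the same mapping declaratively as templates, and an XSLT processor would apply them.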

Simple API for XML (SAX)

SAX is a lexical, event-driven interface in which a document is read serially and its contents are reported as callbacks to various methods on a handler object of the user's design. SAX is fast and efficient to implement, but difficult to use for extracting information at random from the XML, since it tends to burden the application author with keeping track of what part of the document is being processed. It is better suited to situations in which certain types of information are always handled the same way, no matter where they occur in the document.
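
A minimal SAX sketch using Python's xml.sax module is given below. The handler class SubjectCollector and the sample inbox document are hypothetical. Note how the handler has to keep its own stack of open elements, which is the bookkeeping burden mentioned above.

```python
import xml.sax
from xml.sax.handler import ContentHandler

class SubjectCollector(ContentHandler):
    """Collect the text of every Subject element as the document streams by."""

    def __init__(self):
        super().__init__()
        self.path = []       # stack of currently open element names
        self.subjects = []   # text collected so far

    def startElement(self, name, attrs):
        self.path.append(name)
        if name == "Subject":
            self.subjects.append("")

    def characters(self, content):
        # Text may arrive in several chunks, so append to the current entry.
        if self.path[-1:] == ["Subject"]:
            self.subjects[-1] += content

    def endElement(self, name):
        self.path.pop()

INBOX = (b"<inbox>"
         b"<E-mail><Subject>Reminder</Subject></E-mail>"
         b"<E-mail><Subject>Results</Subject></E-mail>"
         b"</inbox>")

handler = SubjectCollector()
xml.sax.parseString(INBOX, handler)
print(handler.subjects)  # ['Reminder', 'Results']
```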

Document Object Model (DOM)

DOM (Document Object Model) is an interface-oriented Application Programming Interface that allows for navigation of the entire document as if it were a tree of "Node" objects representing the document's contents. A DOM document can be created by a parser, or can be generated manually by users (with limitations). Data types in DOM Nodes are abstract; implementations provide their own programming language-specific bindings. DOM implementations tend to be memory intensive, as they generally require the entire document to be loaded into memory and constructed as a tree of objects before access is allowed.
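
A minimal DOM sketch using Python's xml.dom.minidom is given below. The sample document and the added Cc element are hypothetical. The whole document is loaded into memory as a tree of nodes, which can then be navigated and modified in any order.

```python
from xml.dom import minidom

doc = minidom.parseString("<E-mail><To>Asha</To><From>Ravi</From></E-mail>")
root = doc.documentElement

# Navigate: list the element children of the root node.
names = [node.tagName for node in root.childNodes
         if node.nodeType == node.ELEMENT_NODE]

# Random access: jump straight to the To element and read its text node.
to_text = root.getElementsByTagName("To")[0].firstChild.data

# Modify the tree: create a new element and insert it before From.
cc = doc.createElement("Cc")
cc.appendChild(doc.createTextNode("Meena"))
root.insertBefore(cc, root.getElementsByTagName("From")[0])
updated = root.toxml()

print(names)    # ['To', 'From']
print(to_text)  # Asha
print(updated)  # <E-mail><To>Asha</To><Cc>Meena</Cc><From>Ravi</From></E-mail>
```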



Mr. V Vinay Kumar