Sunday, August 5, 2007

HOW XML WORKS???

[http://xml.silmaril.ie/]

What is XML?

[http://xml.silmaril.ie/basics/whatisxml/]

XML is the Extensible Markup Language. It improves the functionality of the Web by letting you identify your information in a more accurate, flexible, and adaptable way.

It is extensible because it is not a fixed format like HTML (which is a single, predefined markup language). Instead, XML is actually a metalanguage—a language for describing other languages—which lets you design your own markup languages for limitless different types of documents. XML can do this because it's written in SGML, the international standard metalanguage for text document markup (ISO 8879).


How to control formatting and appearance?

[http://xml.silmaril.ie/authors/style/]

In HTML, default styling was built into the browsers because the tagset of HTML was predefined and hardwired into browsers. In XML, where you can define your own tagset, browsers cannot possibly be expected to guess or know in advance what names you are going to use and what they will mean, so you need a stylesheet if you want to display formatted text.

Browsers which read XML will accept and use a CSS stylesheet at a minimum, but you can also use the more powerful XSLT stylesheet language to transform your XML into HTML—which browsers, of course, already know how to display (and that HTML can still use a CSS stylesheet). This way you get all the document management benefits of using XML, but you don't have to worry about your readers needing XML smarts in their browsers.


What's a Document Type Definition (DTD)?

[http://xml.silmaril.ie/authors/dtds/]

A DTD is a description in XML Declaration Syntax of a particular type or class of document. It sets out what names are to be used for the different types of element, where they may occur, and how they all fit together. (A question C.16, Schema does the same thing in XML Document Syntax, and allows more extensive data-checking.)

For example, if you want a document type to be able to describe Lists which contain Items, the relevant part of your DTD might contain something like this:


<!ELEMENT List (Item)+>



<!ELEMENT Item (#PCDATA)>


This defines a list as an element type containing one or more items (that's the plus sign); and it defines items as element types containing just plain text (Parsed Character Data or PCDATA). Validators read the DTD before they read your document so that they can identify where every element type ought to come and how each relates to the other, so that applications which need to know this in advance (most editors, search engines, navigators, and databases) can set themselves up correctly. The example above lets you create lists like:


<List>

<Item>Chocolate</Item>

<Item>Music</Item>

<Item>Surfing</Item>

</List>

(The indentation in the example is just for legibility while editing: it is not required by XML.)

A DTD provides applications with advance notice of what names and structures can be used in a particular document type. Using a DTD and a validating editor means you can be certain that all documents of that particular type will be constructed and named in a consistent and conformant manner.

DTDs are not required for processing the tip in question Bwell-formed documents, but they are needed if you want to take advantage of XML's special attribute types like the built-in ID/IDREF cross-reference mechanism; or the use of default attribute values; or references to external non-XML files (‘Notations’); or if you simply want a check on document validity before processing.

There are thousands of DTDs already in existence in all kinds of areas (see the SGML/XML Web pages for pointers). Many of them can be downloaded and used freely; or you can write your own (see the question on creating your own DTD. Old SGML DTDs need to be converted to XML for use with XML systems: read the question on converting SGML DTDs to XML, but most popular SGML DTDs are already available in XML form.

New Some XML editors use a binary compiled format of DTD produced by their own management routines to allow a single person in an organisation to be in charge of modifications, and to distribute only an unmodifiable (binary compiled) version to users.

The alternatives to a DTD are various forms of question C.16, Schema. These provide more extensive validation features than DTDs, including character data content validation.





No comments: