MAP | SASAX-RSS Documents > Implementation note |
This document explains implementation details of SASAX-RSS.
It will help you write your own XML document parser on the SASAX framework.
ATTENTION: The source code shown in this document gives higher priority to simplicity of explanation than to exact agreement with the distributed source.
Please do not confuse the code in this document with the code in the distribution.
In this document, abbreviated class names are used. Complete names are shown below.
Notation | Full name |
---|---|
Attributes | org.xml.sax.Attributes |
SAXException | org.xml.sax.SAXException |
Please refer to the JavaDoc in the document distribution or on the web for these classes.
Notation | Full name |
---|---|
AbstractElement | jp.ne.dti.lares.foozy.sasax.AbstractElement |
CompositeElement | jp.ne.dti.lares.foozy.sasax.CompositeElement |
LooseDateTimeElement | jp.ne.dti.lares.foozy.sasax.LooseDateTimeElement |
Element | jp.ne.dti.lares.foozy.sasax.Element |
ElementDrivenHandler | jp.ne.dti.lares.foozy.sasax.ElementDrivenHandler |
Notification | jp.ne.dti.lares.foozy.sasax.Notification |
ParseContext | jp.ne.dti.lares.foozy.sasax.ParseContext |
Notation | Full name |
---|---|
ChannelElement | jp.ne.dti.lares.foozy.sasax.rss.ChannelElement |
ItemElement | jp.ne.dti.lares.foozy.sasax.rss.ItemElement |
RSSRootElement | jp.ne.dti.lares.foozy.sasax.rss.RSSRootElement |
Notation | Full name |
---|---|
Channel | jp.ne.dti.lares.foozy.sasax.rss.Channel |
ChannelNotification | jp.ne.dti.lares.foozy.sasax.rss.ChannelNotification |
Item | jp.ne.dti.lares.foozy.sasax.rss.Item |
ItemNotification | jp.ne.dti.lares.foozy.sasax.rss.ItemNotification |
The parser part of SASAX-RSS is simple enough that you can read and understand its implementation, provided you have already read the SASAX tutorial and understand the RSS specification.
One of the few things not described in the SASAX tutorial is the "ignore"/"ordered" parameters of the CompositeElement constructor.
That constructor was introduced in SASAX 1.2. Of these parameters, "ordered" specifies whether sub elements must appear in the order of registration. According to the RSS specification, element order is not restricted.
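The difference between an ordered and an unordered check can be sketched in a few lines. This is only an illustration and does not use SASAX: `OrderCheckSketch` and `accepts` are hypothetical names, and the sketch assumes that every registered sub element is required and appears exactly once.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; this is not the CompositeElement implementation.
final class OrderCheckSketch {
    /**
     * Returns true if the sub element names found in a document are acceptable.
     *
     * @param registered names in order of registration
     * @param actual     names in order of appearance in the document
     * @param ordered    true if appearance order must equal registration order
     */
    static boolean accepts(List<String> registered, List<String> actual, boolean ordered) {
        if (ordered) {
            // Same names in the same positions.
            return registered.equals(actual);
        }
        // Same names, any order: compare sorted copies.
        List<String> expected = new ArrayList<>(registered);
        List<String> found = new ArrayList<>(actual);
        Collections.sort(expected);
        Collections.sort(found);
        return expected.equals(found);
    }
}
```

With "title", "link", "description" registered in that order, a document listing "link" first is accepted only by the unordered check, which is the behavior the RSS specification calls for.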
The SASAX-RSS parser only examines whether the specified XML document is valid as an RSS document.
To get information out of the RSS document (e.g. details about "channel", the list of "item" and so on), you have to do something more, and in almost all cases that "something" is setting up an information container.
This is a kind of deserialization of an (information container) object from an XML document, and it is specific to the container definition. So the deserialization implementation, which is in the GUI part of SASAX-RSS, is separated from the SASAX-RSS parser.
There are some ways to get the parsing result on SASAX:
- notifyDetermined() method: this is a method of AbstractElement, but using it decreases the re-usability of both the Element implementation class and the result handling logic.
- Notification: Notification can isolate the XML document parsing logic on SASAX from the deserialization logic that is specific to your application.

So SASAX-RSS chooses the last approach.
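The isolation that a notification-style callback gives can be sketched without SASAX at all. Everything in the block below is a hypothetical stand-in (`Listener`, `ParsingElement`, `Container` are not SASAX classes); it only shows that the parsing side never has to know about the application's container.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these types belong to SASAX.
final class NotificationSketch {
    /** Stand-in for a notification: receives the values of a finished element. */
    interface Listener {
        void ended(String elementName, Map<String, String> subValues);
    }

    /** Stand-in for a parsing element; it knows nothing about containers. */
    static final class ParsingElement {
        private final String name;
        private final List<Listener> listeners = new ArrayList<>();

        ParsingElement(String name) {
            this.name = name;
        }

        void addListener(Listener listener) {
            listeners.add(listener);
        }

        /** Would be called by the (omitted) parser once the element is complete. */
        void fireEnded(Map<String, String> subValues) {
            for (Listener listener : listeners) {
                listener.ended(name, subValues);
            }
        }
    }

    /** Application-specific information container. */
    static final class Container {
        final Map<String, String> values = new HashMap<>();
    }

    /** Wires a container to a parsing element and simulates the end of parsing. */
    static Container demo() {
        ParsingElement channel = new ParsingElement("channel");
        Container container = new Container();

        // The application-specific part lives only in the listener.
        channel.addListener((elementName, sub) ->
            container.values.put("title", sub.get("title")));

        Map<String, String> parsed = new HashMap<>();
        parsed.put("title", "Example");
        channel.fireEnded(parsed);
        return container;
    }
}
```

Because the container logic is attached from outside, the same parsing element could serve a different application just by registering a different listener.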
For example, the deserialization code for the "channel" RSS element is shown below. It is from ChannelNotification, which is an implementation of Notification.
In the example below, "channel_" is a member field of type Channel. Channel is the container for RSS channel information.
    public void elementStarted(Element element,
                               ParseContext context,
                               Attributes attributes)
        throws SAXException
    {
        // Get "rdf:about" attribute
        // (the "rdf:" prefix is bound to the RDF namespace, not the RSS 1.0 one)
        String about =
            attributes.getValue("http://www.w3.org/1999/02/22-rdf-syntax-ns#",
                                "about");
        channel_.setValue("rdf:about", // name of value
                          about);
    }
In both methods, the values passed to channel_.setValue() carry the information from the XML document.
    public void elementEnded(Element element,
                             ParseContext context)
        throws SAXException
    {
        ChannelElement channel = (ChannelElement)element;

        // Get "rss:title" value
        channel_.setValue("title",
                          channel.title.getString(true));

        // Get "rss:link" value
        channel_.setValue("link",
                          channel.link.getString(true));

        // Get "rss:description" value
        channel_.setValue("description",
                          channel.description.getString(true));
    }
Deserialization procedures that require attribute values of the element are placed in elementStarted(), and the others, which require the sub elements to be determined, are placed in elementEnded().
It is not difficult, is it?
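The same timing rule holds in plain SAX, which may make the split easier to see: attributes are delivered with the start event, while the text of sub elements is complete only once the end events arrive. The sketch below uses only the JDK's SAX classes, not SASAX. `StartEndTimingSketch` and `parseChannel` are hypothetical names, and a plain map stands in for the Channel container.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration with plain SAX; it does not use SASAX classes.
final class StartEndTimingSketch {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String RSS_NS = "http://purl.org/rss/1.0/";

    /** Collects "rdf:about" and the sub element texts of "channel" into a map. */
    static Map<String, String> parseChannel(String xml) {
        final Map<String, String> values = new LinkedHashMap<>();
        DefaultHandler handler = new DefaultHandler() {
            private boolean inChannel = false;
            private final Map<String, String> pending = new LinkedHashMap<>();
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                text.setLength(0);
                if (RSS_NS.equals(uri) && "channel".equals(localName)) {
                    inChannel = true;
                    // Attribute values are available at start time.
                    values.put("rdf:about", attributes.getValue(RDF_NS, "about"));
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (RSS_NS.equals(uri) && "channel".equals(localName)) {
                    // Sub element values are all determined only at end time.
                    values.putAll(pending);
                    inChannel = false;
                } else if (inChannel && RSS_NS.equals(uri)) {
                    pending.put(localName, text.toString().trim());
                }
                text.setLength(0);
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // needed for getValue(uri, localName)
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
        return values;
    }
}
```

The sketch copies sub element values into the result only when "channel" ends, mirroring the rule that work depending on sub elements belongs in the end callback.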
At parsing time, you should add the Notification for deserialization to the corresponding element, as shown below.
    // Create information container
    Channel channel = new Channel();

    // Create RSS document parsing Element
    RSSRootElement root = new RSSRootElement(null);

    // Add Notification to "channel" element
    root.channel.addNotification(new ChannelNotification(channel));
        :
    ElementDrivenHandler handler = new ElementDrivenHandler(root);
    handler.parse(reader); // parse RSS XML document

    // here, Channel is deserialized from XML document
This section explains how to add extension elements and get values from them, by showing the implementation code for "date" of "Dublin Core" under "item" of RSS.
In the SASAX-RSS GUI source files, the implementation parts specific to "date" of "Dublin Core" are surrounded by ">>>> DC:DATE" and "<<<< DC:DATE".
There are only 4 such parts. One of them is the definition of a shared string symbol, and another is for displaying. The other two parts are the real deserialization code: "registration of element" and "deserialization from element".
LooseDateTimeElement can be used for "date" of "Dublin Core", and the element registration code looks like the following.
    LooseDateTimeElement dcDate =
        new LooseDateTimeElement(root.item,
                                 "http://purl.org/dc/elements/1.1/",
                                 "date",
                                 null // meaning GMT time zone
                                 );

    // register dcDate element under "item" as optional
    root.item.addOptionalItem("dc:date", dcDate);
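The role of a default time zone can be illustrated with the JDK's java.time classes. This is not how LooseDateTimeElement works internally (this document does not describe the formats it accepts). It is a minimal sketch, assuming ISO 8601 style input in which the time and the offset may be omitted, with `LooseDateSketch` and `parseLoose` as hypothetical names.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeParseException;

// Hypothetical illustration only; this is not LooseDateTimeElement.
final class LooseDateSketch {
    /**
     * Parses a date with optional time and optional offset.
     *
     * @param text        e.g. "2004-05-01", "2004-05-01T12:30:00"
     *                    or "2004-05-01T12:30:00+09:00"
     * @param defaultZone zone used when the text has no offset;
     *                    null is treated as GMT (an assumption of this sketch)
     * @throws DateTimeParseException if no supported form matches
     */
    static Instant parseLoose(String text, ZoneId defaultZone) {
        ZoneId zone = (defaultZone != null) ? defaultZone : ZoneOffset.UTC;
        String s = text.trim();
        try {
            // Full form: the offset in the text wins over the default zone.
            return OffsetDateTime.parse(s).toInstant();
        } catch (DateTimeParseException ignored) {
            // fall through to the looser forms
        }
        try {
            // Date and time without offset: interpret in the default zone.
            return LocalDateTime.parse(s).atZone(zone).toInstant();
        } catch (DateTimeParseException ignored) {
            // fall through to the date-only form
        }
        // Date only: start of that day in the default zone.
        return LocalDate.parse(s).atStartOfDay(zone).toInstant();
    }
}
```

For example, `parseLoose("2004-05-01", null)` yields midnight GMT of that day, while a text with an explicit offset ignores the default zone.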
The deserialization code in ItemNotification for the "item" element looks like the following.
    public void elementEnded(Element element,
                             ParseContext context)
        throws SAXException
    {
        ItemElement item = (ItemElement)element;
            :
        LooseDateTimeElement dcDate =
            (LooseDateTimeElement)(item.getComponent("dc:date"));
        item_.setValue("dc:date", dcDate.getDate(false));
    }
In the above example, "item_" is a member field of type Item. Item is the container for RSS item information.