LotusScript.doc - XmlNodeReader

Script Library XmlNodeReader
XmlNodeReader

LotusScript.doc-enabled documentation. See http://www.lsdoc.org

The XmlNodeReader class is meant to be an easy interface for getting data out of XML. For example, starting with the following XML in a file:

<bookshelf>
    <book type="paperback">
        <title>The Cathedral and the Bazaar</title>
        <author.name>Eric S. Raymond</author.name>
    </book>
    <book type="hardback">
        <title>Hackers and Painters</title>
        <author.name>Paul Graham</author.name>
    </book>
</bookshelf>

If you want to get the title of the first book in that list, you can do this:

    Dim reader As New XmlNodeReader
    Call reader.ReadFile( "c:\booklist.xml" )
    Print reader.get( "bookshelf.book.title" )

For an attribute, you can do:

    Print reader.get( "bookshelf.book.@type" )

If a node or attribute name contains a period, replace that period with a double period:

    Print reader.get( "bookshelf.book.author..name" )

I refer to these strings as "node paths": they identify the node or attribute whose value you want. The format is simply "nodeName.nodeName.nodeName" or "nodeName.nodeName.@attributeName". It's sort of a poor man's XPath. It doesn't provide nearly the power or flexibility of XPath, but it's good enough for most basic XML parsing.
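To make the format concrete, here is a rough sketch of how a node path could be split into node names while honoring the double-period escape. SplitNodePath is a hypothetical helper written for illustration; it is not the library's NodePathParser code, and it assumes node names never contain Chr$(1), which is used as a temporary marker.

```lotusscript
'** Hypothetical helper, for illustration only (not part of the library).
'** Splits "a.b.c..d" into the names "a", "b" and "c.d".
Function SplitNodePath( nodePath As String ) As Variant
    Dim marker As String
    Dim parts As Variant
    Dim i As Integer

    '** assumption: Chr$(1) never appears in a real node name
    marker = Chr$(1)

    '** hide the escaped periods, then split on the remaining ones
    parts = Split( Replace( nodePath, "..", marker ), "." )

    '** turn each marker back into a literal period
    For i = Lbound( parts ) To Ubound( parts )
        parts(i) = Replace( parts(i), marker, "." )
    Next

    SplitNodePath = parts
End Function

Sub ShowNodeNames
    Dim names As Variant
    names = SplitNodePath( "bookshelf.book.author..name" )
    Forall n In names
        Print n
    End Forall
End Sub
```

A final segment that starts with "@" would then be treated as an attribute name rather than a node name.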

If you specify a node/attribute that doesn't exist, you just get an empty string as a result -- no errors are thrown, so you don't have to get bogged down in error handling. If you DO want to see whether there was an error after trying to get a node, check the result of getLastError().
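A minimal sketch of that check, using the bookshelf file from above. The "publisher" node is deliberately one that isn't in the sample XML, and the sketch assumes getLastError() returns an empty string when nothing went wrong (that convention is an assumption, so verify it against the class before relying on it):

```lotusscript
Dim reader As New XmlNodeReader
Dim publisher As String

Call reader.ReadFile( "c:\booklist.xml" )

'** "publisher" is not in the sample file, so the result is an empty string
publisher = reader.get( "bookshelf.book.publisher" )

If publisher = "" Then
    '** an empty result can mean a missing node OR a node with no text,
    '** so ask the reader whether it recorded an error
    '** (assumption: getLastError() is "" when there was no error)
    If reader.getLastError() <> "" Then
        Print "Lookup failed: " & reader.getLastError()
    End If
End If
```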

You can read XML into an XmlNodeReader using ReadText(), ReadFile(), ReadStream(), or ReadNode(). You can get the result of a "node path" expression as a string, a NotesDomNode, or another XmlNodeReader. Here are some more usage examples, this time parsing an Atom feed:

 Dim reader As New XmlNodeReader
 Dim v As Variant
 
 '** read some XML text
 Call reader.ReadText( GetXmlText() )
 v = reader.get( "feed.entry.id" )                  '** text of the "id" node under the first "entry" node
 v = reader.getAll( "feed.entry.category" )         '** text array of all "category" nodes under the first "entry" node
 v = reader.get( "feed.collection.atom:title" )     '** text of a namespaced node
 v = reader.getAll( "feed.entry.link.@href" )       '** text array of "href" attribute for all "link" nodes under the first "entry" node
 v = reader.getAttributeNames( "feed.generator" )   '** text array of all attribute names of the first "generator" node
 v = reader.getSubNodeNames( "feed.collection" )    '** text array of all child node names of the first "collection" node
 
 '** get the title of all the "entry" child nodes
 Dim arr As Variant
 arr = reader.getNodeReaders("feed.entry")
 Forall nr In arr
     v = nr.get("title")
 End Forall
 
 '** read one of the child nodes from above
 Dim reader2 As New XmlNodeReader
 '** you could also use Set reader2 = reader.getNodeReader( "feed.entry" ) here
 Call reader2.ReadNode( reader.getNode("feed.entry") ) '** read a NotesDomNode
 v = reader2.get( "id" )     '** text of the first "id" node beneath the base node
 v = reader2.get( "@xml:lang" )  '** attribute of the base node itself
 v = reader2.get( "" )       '** text of the base node (may be a lot of whitespace)
 v = reader2.get( "author.email" )   '** text of a child node
 
 '** inline function chaining examples
 v = reader.getNodeReader( "feed.entry" ).get( "author.email" )
 v = reader.getNodeReaders( "feed.entry" )(1).get( "summary" )
 v = reader2.readText( GetXmlText2() ).get( "feed.generator" )

 '** check for parse errors
 If reader2.isEmpty Then
     Messagebox reader2.getLastError()
 End If

 '** read a 5 MB DXL file (takes a few seconds to read in)
 Call reader2.ReadFile( "C:\dbexport.xml" )
 v = reader2.get( "database.@title" )

As with any DOM parsing venture, be careful with large files. In my tests, a 5 MB XML file took a few seconds to read in, but it worked.

Also, if you use ReadStream() to parse an XML NotesStream that you just created, be aware of this caveat from the Domino Designer Help:

"You cannot explicitly read or write a NotesStream object associated with a file prior to using it for XML input or output. For example, if you write to a file then use it for XML input, you must close and reopen the NotesStream object."

XmlNodeReader uses NotesDomParser to parse XML internally, so if you've just written some XML to a file be sure to close and reopen it before passing it to ReadStream() -- or just close it and use ReadFile().
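Here is a sketch of the close-and-reopen pattern. It assumes the code runs inside a Sub, that ReadStream() accepts a NotesStream directly, and that the file path (a made-up one) is writable; adjust all three to your situation.

```lotusscript
Dim session As New NotesSession
Dim stream As NotesStream
Dim reader As New XmlNodeReader
Dim filePath As String

filePath = "c:\temp\books.xml"   '** hypothetical path

'** write some XML to the file
Set stream = session.CreateStream
If Not stream.Open( filePath, "UTF-8" ) Then
    Print "Could not open " & filePath
    Exit Sub
End If
Call stream.Truncate
Call stream.WriteText( "<bookshelf><book><title>Hackers and Painters</title></book></bookshelf>" )

'** close and reopen the stream before handing it to the parser
Call stream.Close
If Not stream.Open( filePath, "UTF-8" ) Then
    Print "Could not reopen " & filePath
    Exit Sub
End If

'** assumption: ReadStream() takes the NotesStream as its only argument
Call reader.ReadStream( stream )
Print reader.get( "bookshelf.book.title" )
Call stream.Close
```

If the stream is only needed for writing, closing it and calling ReadFile() with the same path avoids the second Open.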

version 1.0
Oct. 25, 2008
Copyright (c) 2008 Julian Robichaux, http://www.nsftools.com

This code is licensed under the terms of the MIT License, available at http://www.opensource.org/licenses/mit-license.php

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Class Summary
NodePathParser
    NodePathParser is a helper class used by XmlNodeReader that converts a "nodeName.nodeName.nodeName" type string to an array of node names, and pulls out the attribute name and last node name for reference.

XmlNodeReader
    The XmlNodeReader class is meant to be an easy interface for getting data out of XML.
