
This exercise uses the DOM parser to parse an XML document and print out the document in a JSP.

Level of Difficulty: Difficult (new/difficult concepts)
Estimated time: 1 hour
Pre-requisites:


Copying this lab's files

As in the previous labs, copy this lab's files into your public_html/dca directory from:

  /pub/dca/lab04



Running the sample file

There are two sample files related to the DOM parsing example for this lab exercise: dom.jsp and cd.xml.

The dom.jsp file is the code that we will be working on during the exercise. The cd.xml file is a sample data file that we will read in.

First, you need to tell the JSP to read the XML data file from YOUR home directory. Open up the JSP file in a text editor:

  gedit dom.jsp &

Look for the line that contains the path name to cd.xml, e.g. the path will look like:

    /home/USERNAME/public_html/dca/lab04/cd.xml

Change this path so that instead of USERNAME, it says your own Faculty login name.

Then, open the JSP file in your browser, making sure that the URL contains charlie.it.uts.edu.au or sally.it.uts.edu.au, e.g.

  http://charlie.it.uts.edu.au/~username/dca/lab04/dom.jsp

When this file executes, it prints out the contents of the cd.xml file, as seen by the DOM parser.

So that you can see what it is doing, take a look at cd.xml in a text editor.



Understanding the code

In the section below, we will walk through the code provided and give an explanation of what is happening.

  <%@ page import="javax.xml.parsers.*" %>
  <%@ page import="org.w3c.dom.*" %>
  <%@ page import="java.io.*" %>

These three lines appear at the top of the file. Here we are importing the Java class libraries that will be needed by the JSP code.

Note that they are JSP directives because they are enclosed by the delimiters "<%@" and "%>".


  <%
    // Create the file object we will read from
    File file = new File("/home/USERNAME/public_html/dca/lab04/cd.xml");

    // Create an instance of the DOM parser and parse the document
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    DocumentBuilder db = dbf.newDocumentBuilder();
    Document doc = db.parse(file);

    // Begin traversing the document
    traverseTree(doc, out);
  %>

This section of code is where we set up the DOM parser to parse the document. Note that the code is a JSP scriptlet, as it is contained within the delimiters "<%" and "%>". The steps involved are:

  1. Create a File object that refers to the particular XML file we want to open.

  2. Get a reference to a DocumentBuilderFactory and a DocumentBuilder object. This step is needed because there are potentially many different implementations of DOM parsers available. For example, the implementation that we will be using is called Xerces, and is part of the Apache project. Another implementation of a DOM parser comes from IBM.

    From an application programmer's perspective, you aren't usually interested in which implementation of the DOM parser is being used. You just want to get access to whichever DOM parser happens to be installed on the system you are using. The DocumentBuilderFactory class provides a generic way of locating the "default" DOM parser implementation that is installed on any system. When you call DocumentBuilderFactory.newInstance(), it returns a factory for that implementation, and calling newDocumentBuilder() on the factory returns a reference to some implementation of a DOM-compliant parser. The DocumentBuilder object refers to the actual DOM parser itself.

  3. Parse the XML file by calling the parse() method on the DocumentBuilder object. With DOM, whenever you call the parse() method, you get back a reference to a Document object that is the starting point for the parsed DOM tree.

    If there is a syntax error during parsing and the DOM tree cannot be built, then a Java exception is thrown and an error message appears in the browser. This error message will look like the message generated when testing well-formedness of XML documents in an earlier exercise.

  4. Finally, as a result of parsing we have a Document object which represents a DOM tree that we can traverse. In this exercise, there is a specific Java method for performing the traversal, called traverseTree(). We call the traverseTree() method and pass to it a reference to the Document, and also to the pre-defined JSP object called out, which is used for printing data into the HTML code that is sent back to the user's web browser.
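
The parsing steps above can also be tried outside a JSP. The following is a minimal sketch, not the lab's code: the class name ParseSketch, its two helper methods and the sample XML strings are made up for illustration, and the XML is read from a string rather than from cd.xml so that the example needs no file.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical helper class, used only for this illustration
final class ParseSketch {

    // Steps 2 and 3: locate the installed parser, then parse the input
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        return db.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Returns false when the parser rejects the text as not well-formed
    static boolean isWellFormed(String xml) throws Exception {
        try {
            parse(xml);
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample data, not the contents of cd.xml
        Document doc = parse("<cd><title>Sample</title></cd>");
        System.out.println(doc.getDocumentElement().getNodeName());

        // The closing </title> tag is missing, so parsing fails
        System.out.println(isWellFormed("<cd><title>Sample</cd>"));
    }
}
```

In the JSP the exception is not caught, which is why the error message ends up in the browser. The sketch catches it only to show that a parse failure surfaces as a SAXException.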


<%!  private void traverseTree(Node currnode, JspWriter out) throws Exception {

Here we declare a Java method that will be used to perform the traversal. We will call this method to handle each node in the DOM tree that has been built in memory by the parser.

Some points to note:


The overall structure of the traverseTree() method is shown below:

  int type = currnode.getNodeType();
  switch (type) {
    case Node.DOCUMENT_NODE:
    {
      // handle a document node
      break;
    }
    case Node.ELEMENT_NODE:
    {
      // handle an element node
      break;
    }
    case Node.ATTRIBUTE_NODE:
    {
      // handle an attribute node
      break;
    }
    case Node.TEXT_NODE:
    {
      // handle a text node
      break;
    }
  }

This shows an outline of the traverseTree() method without all of the details filled in yet. Notice that for the current node we are processing, we first find out the node type, and then use a switch statement to branch to a block of code to handle that particular type of node.
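
To see the node-type dispatch in isolation, here is a small sketch that maps a node type to a label. The class NodeTypeSketch and the method describe() are hypothetical names, and the default branch is an addition that the lab's outline does not have. It makes visible that other node types, such as comments, exist and are silently skipped by a switch with only the four cases above.

```java
import org.w3c.dom.Node;

// Hypothetical helper class, used only for this illustration
final class NodeTypeSketch {

    // Maps the value returned by Node.getNodeType() to a readable label
    static String describe(short type) {
        switch (type) {
            case Node.DOCUMENT_NODE:
                return "document";
            case Node.ELEMENT_NODE:
                return "element";
            case Node.ATTRIBUTE_NODE:
                return "attribute";
            case Node.TEXT_NODE:
                return "text";
            default:
                // Comments, CDATA sections, processing instructions, etc.
                return "other (" + type + ")";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(Node.ELEMENT_NODE));
        System.out.println(describe(Node.COMMENT_NODE));
    }
}
```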

Now we will examine each of the different handlers in turn.


    case Node.DOCUMENT_NODE:
    {
      out.println("<p>DOCUMENT</p>");
      traverseTree(((Document)currnode).getDocumentElement(), out);
      break;
    }

There is only one "document" node for each XML document. In this case, first we just print a message to indicate that we have encountered a document node. Secondly, we call the getDocumentElement() method to retrieve the root element of the document. With that root element, we then call the traverseTree() method to handle it. Note that from within the traverseTree() method, we are calling the same method again. This is an example of recursion in programming.


    case Node.ELEMENT_NODE:
    {
      String elementName = currnode.getNodeName();
      out.println("<p>ELEMENT: [" + elementName + "]</p>");
      if (currnode.hasAttributes()) {
        NamedNodeMap attributes = currnode.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
          Node currattr = attributes.item(i);
          traverseTree(currattr, out);
        }
      }
      NodeList childNodes = currnode.getChildNodes();
      if (childNodes != null) {
        for (int i = 0; i < childNodes.getLength(); i++) {
          traverseTree(childNodes.item(i), out);
        }
      }
      break;
    }

This is the most complex of the handlers. There are three main parts to it:

  1. Find out the name of this element (elementName) and print it out.

  2. Check to see if this element has any attributes associated with it. If it does, then we retrieve them (attributes) and then loop through them one by one using a for loop. In DOM, every attribute is treated as a Node as well. So in this example, for each attribute, we simply call the traverseTree() method to handle it.

  3. The final step in this example is to process any child nodes of this element. We retrieve a list of all the child nodes, and use a for loop to process each one in turn, using the traverseTree() method to do the processing. Note that children of element nodes are typically either text nodes (if the element contains text) or further element nodes (if the element contains other XML elements nested inside it).

Note that this is where we decide the traversal algorithm to use. In this case, we are using a preorder traversal, which is the most common kind of traversal for processing documents with DOM.
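
To make the visiting order concrete, the sketch below runs a preorder traversal over a tiny made-up document and records the element names in the order they are visited. It is an illustration under assumptions, not the lab's traverseTree(): the names PreorderSketch, collect() and elementOrder() are hypothetical, and results go into a list instead of a JspWriter so the code runs without a servlet container.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, used only for this illustration
final class PreorderSketch {

    // Preorder: record the node itself first, then visit its children in order
    static void collect(Node node, List<String> names) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            names.add(node.getNodeName());
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            collect(children.item(i), names);
        }
    }

    // Parses XML from a string and returns element names in visiting order
    static List<String> elementOrder(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        List<String> names = new ArrayList<>();
        collect(doc, names);
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Made-up document: a contains b and d, and b contains c
        System.out.println(elementOrder("<a><b><c/></b><d/></a>")); // prints [a, b, c, d]
    }
}
```

Each parent is listed before its children, and the whole subtree under b is finished before d is reached.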


    case Node.ATTRIBUTE_NODE:
    {
      String attributeName = currnode.getNodeName();
      String attributeValue = currnode.getNodeValue();
      out.println("<p>ATTRIBUTE: name=[" + attributeName +
                  "], value=[" + attributeValue + "]</p>");
      break;
    }

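In the case of attribute nodes, we just retrieve the attribute name and value, and print them out.

In this traversal, attribute nodes are treated as leaf nodes. The getNodeValue() method already returns the complete attribute value, so there are no children that we need to process.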
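
As a standalone illustration of reading attribute names and values, the sketch below collects the attributes of a document's root element into a map. The class AttributeSketch, the method attributesOf() and the sample element with its num and rating attributes are all made up for this example and are not taken from cd.xml. A sorted map is used because a NamedNodeMap does not promise any particular order.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, used only for this illustration
final class AttributeSketch {

    // Returns the root element's attributes as name -> value, sorted by name
    static Map<String, String> attributesOf(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NamedNodeMap attributes = doc.getDocumentElement().getAttributes();
        Map<String, String> result = new TreeMap<>();
        for (int i = 0; i < attributes.getLength(); i++) {
            Node attr = attributes.item(i);
            result.put(attr.getNodeName(), attr.getNodeValue());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Made-up element and attribute names
        System.out.println(attributesOf("<track rating=\"2\" num=\"1\"/>")); // prints {num=1, rating=2}
    }
}
```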


    case Node.TEXT_NODE:
    {
      String text = currnode.getNodeValue().trim();
      if (text.length() > 0) {
        out.println("<p>TEXT: [" + text + "]</p>");
      }
      break;
    }

In the case of text nodes, we retrieve the value, and "trim" it. Trimming it means that we remove whitespace from either end of the string.

If the resulting string has any characters left after trimming, then we print it out. This avoids printing text nodes that consist entirely of whitespace.

Text nodes are leaf nodes in the DOM tree. They have no children to process.
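
The whitespace-only text nodes come from the line breaks and indentation between elements in the XML file. The sketch below lists the children of a root element to show where they appear. The class WhitespaceSketch, the method describeChildren() and the sample XML are hypothetical, and the sketch assumes a default, non-validating parser as created in this lab.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, used only for this illustration
final class WhitespaceSketch {

    // Describes each direct child of the root element
    static List<String> describeChildren(String xml) throws Exception {
        Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        List<String> result = new ArrayList<>();
        NodeList children = root.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE) {
                // Same test as the text handler: trim, then check the length
                String text = child.getNodeValue().trim();
                result.add(text.length() > 0 ? "text: " + text : "blank text");
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                result.add("element: " + child.getNodeName());
            } else {
                result.add("other");
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Made-up data, laid out over several lines like a typical XML file
        String xml = "<cd>\n  <title>Sample</title>\n  <artist>Someone</artist>\n</cd>";
        System.out.println(describeChildren(xml));
        // prints [blank text, element: title, blank text, element: artist, blank text]
    }
}
```

Without the length check, each of those blank text nodes would produce an empty TEXT line in the output.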



Adding indenting to show nesting level

First, copy your dom.jsp file to a new file named dom1.jsp. Make the following changes to dom1.jsp.

At the moment, the sample JSP prints all nodes at the same level of indenting (against the left-hand margin). The first goal of this exercise is to modify the code so that each time the traversal algorithm enters a new level of "depth" in the DOM tree, we indent the output one level further, and each time the traversal algorithm goes up one level in the DOM tree, we remove the indenting.

The easiest way to achieve indenting is to use the HTML <blockquote> tag. When you want to increase the indenting by one level, print out the following line of HTML:

  <blockquote>

When you want to decrease the indenting by one level, print out the corresponding closing tag:

  </blockquote>

Think about how the code works. Each time you process a node, the traverseTree() method is called. Another way to think of it is that the start of the traverseTree() method is the time at which you "enter" (i.e. start processing) a node, and the end of the traverseTree() method is when you "exit" (i.e. finish processing) the node.

The solution is quite short (it can be done by adding only two lines of code), but it does require you to think about and understand how the code works, particularly the traverseTree() method.
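
The "enter" and "exit" idea can be observed with a small trace. The sketch below is only an illustration of when a recursive method starts and finishes handling each element; it is not the lab's traverseTree() and does not print any HTML. The names EnterExitSketch, walk() and trace() and the sample document are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, used only for this illustration
final class EnterExitSketch {

    // Logs a line when an element is entered and another when it is exited
    static void walk(Node node, List<String> log) {
        boolean isElement = node.getNodeType() == Node.ELEMENT_NODE;
        if (isElement) {
            log.add("enter " + node.getNodeName());
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            walk(children.item(i), log);
        }
        if (isElement) {
            log.add("exit " + node.getNodeName());
        }
    }

    // Parses XML from a string and returns the enter/exit log
    static List<String> trace(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        List<String> log = new ArrayList<>();
        walk(doc, log);
        return log;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(trace("<a><b/><c/></a>"));
        // prints [enter a, enter b, exit b, enter c, exit c, exit a]
    }
}
```

The exit of a comes last because everything nested inside a is handled between its enter and its exit.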



Printing a subset of the data

The next exercise is to selectively print data from the DOM tree. Copy the original dom.jsp file to become dom2.jsp, and make your changes to dom2.jsp.

Suppose that using the cd.xml file, we only want to print out a list of track titles, and none of the other information.

Modify the code so that only the <title> element values are printed. It's not as easy as it sounds. Remember that the actual value isn't stored in the DOM "element" node; it is stored in a "text" node that is a child of the element.
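
The point about where the value is stored can be checked directly. In the sketch below, which uses a single made-up <title> element rather than cd.xml, calling getNodeValue() on the element returns null, while the text is found on the element's first child. The class ElementTextSketch and its two methods are hypothetical names, and the sketch assumes the element contains only plain text.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper class, used only for this illustration
final class ElementTextSketch {

    // Parses XML from a string and returns its root element
    static Element rootOf(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
    }

    // The node value of an element node itself (null for elements)
    static String valueOfElement(String xml) throws Exception {
        return rootOf(xml).getNodeValue();
    }

    // The node value of the element's first child, assumed to be a text node
    static String valueOfFirstChild(String xml) throws Exception {
        return rootOf(xml).getFirstChild().getNodeValue();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<title>Little L</title>";
        System.out.println(valueOfElement(xml));    // prints null
        System.out.println(valueOfFirstChild(xml)); // prints Little L
    }
}
```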

What about the fact that the same element name (<title>) is used to represent both the CD title and the track title, depending upon where it appears in the XML document? Don't worry about this in your first attempt at a solution, but see if you can find a way to solve it.



Formatting the data in a table

The final exercise with the DOM parser is to print out the data from the cd.xml file in an HTML table. Your resulting output should look something like the following:

  A Funk Odyssey    Jamiroquai

  Track Num   Title                    Time   Rating
  1           Feels So Good            4:38   2
  2           Little L                 4:10   5
  3           You Give Me Something    5:02   3
  4           Corner of the Earth      3:57   1