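Study Notes on the Java Implementation of SAX (Part 1)
This article assumes the reader has some familiarity with XML.
To begin, here is a fairly basic program that processes an XML file. There is no need to read it closely; skip over it for now and come back to it when needed.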
Echo01.java
import java.io.*;
import org.xml.sax.*;
import org.xml.sax.helpers.DefaultHandler;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
public class Echo01 extends DefaultHandler
{
    StringBuffer textBuffer;
    public static void main(String argv[])
    {
        if (argv.length != 1) {
            System.err.println("Usage: cmd filename");
            System.exit(1);
        }
        // Use an instance of ourselves as the SAX event handler
        DefaultHandler handler = new Echo01();
        // Use the default (non-validating) parser
        SAXParserFactory factory = SAXParserFactory.newInstance();
        try {
            // Set up output stream
            out = new OutputStreamWriter(System.out, "UTF-8");
            // Parse the input
            SAXParser saxParser = factory.newSAXParser();
            saxParser.parse( new File(argv[0]), handler);
        } catch (Throwable t) {
            t.printStackTrace();
        }
        System.exit(0);
    }
    static private Writer  out;
    //===========================================================
    // SAX ContentHandler methods (DocumentHandler is the older SAX1 name)
    //===========================================================
    public void startDocument()
    throws SAXException
    {
        emit("<?xml version='1.0' encoding='UTF-8'?>");
        nl();
    }
    public void endDocument()
    throws SAXException
    {
        try {
            nl();
            out.flush();
        } catch (IOException e) {
            throw new SAXException("I/O error", e);
        }
    }
    public void startElement(String namespaceURI,
                             String sName, // simple name
                             String qName, // qualified name
                             Attributes attrs)
    throws SAXException
    {
        echoText();
        String eName = sName; // element name
        if ("".equals(eName)) eName = qName; // not namespaceAware
        emit("<"+eName);
        if (attrs != null) {
            for (int i = 0; i < attrs.getLength(); i++) {
                String aName = attrs.getLocalName(i); // Attr name
                if ("".equals(aName)) aName = attrs.getQName(i);
                emit(" ");
                emit(aName+"=\""+attrs.getValue(i)+"\"");
            }
        }
        emit(">");
    }
    public void endElement(String namespaceURI,
                           String sName, // simple name
                           String qName  // qualified name
                          )
    throws SAXException
    {
        echoText();
        String eName = sName; // element name
        if ("".equals(eName)) eName = qName; // not namespaceAware
        emit("</"+eName+">");
    }
    public void characters(char buf[], int offset, int len)
    throws SAXException
    {
        String s = new String(buf, offset, len);
        if (textBuffer == null) {
           textBuffer = new StringBuffer(s);
        } else {
           textBuffer.append(s);
        }
    }
    //===========================================================
    // Utility Methods ...
    //===========================================================
    // Display text accumulated in the character buffer
    private void echoText()
    throws SAXException
    {
        if (textBuffer == null) return;
        String s = textBuffer.toString();
        emit(s);
        textBuffer = null;
    }
    // Wrap I/O exceptions in SAX exceptions, to
    // suit handler signature requirements
    private void emit(String s)
    throws SAXException
    {
        try {
            out.write(s);
            out.flush();
        } catch (IOException e) {
            throw new SAXException("I/O error", e);
        }
    }
    // Start a new line
    private void nl()
    throws SAXException
    {
        String lineEnd = System.getProperty("line.separator");
        try {
            out.write(lineEnd);
        } catch (IOException e) {
            throw new SAXException("I/O error", e);
        }
    }
}
As the program shows, the core statements for parsing an XML file are the following:
        // Use an instance of ourselves as the SAX event handler
        DefaultHandler handler = new Echo01();
        // Use the default (non-validating) parser
        SAXParserFactory factory = SAXParserFactory.newInstance();
        try {
            // Set up output stream
            out = new OutputStreamWriter(System.out, "UTF-8");
            // Parse the input
            SAXParser saxParser = factory.newSAXParser();
            saxParser.parse( new File(argv[0]), handler);
        } catch (Throwable t) {
            t.printStackTrace();
        }
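First, an instance of the SAXParserFactory factory class is created. Its newSAXParser() method then creates a SAXParser (SAXParser saxParser = factory.newSAXParser();). The XML file (new File(argv[0])) and a SAX event handler (handler) are passed to the parser's parse() method, which does the parsing. In this program the handler is an instance of the Echo01 class itself: Echo01 extends org.xml.sax.helpers.DefaultHandler and was instantiated earlier with DefaultHandler handler = new Echo01();.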
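The same factory, parser, handler sequence can be sketched in a smaller form. This is only an illustration under stated assumptions: the class name ElementCounter and the method countElements are hypothetical and not part of Echo01, and the XML comes from an in-memory string rather than a file.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: counts elements to show the
// factory -> parser -> handler flow described above.
class ElementCounter extends DefaultHandler {
    private int count;

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes attrs) {
        count++; // the parser calls this once per opening tag
    }

    // Parses an XML string and returns the number of elements seen.
    static int countElements(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance(); // 1. factory
            SAXParser parser = factory.newSAXParser();                 // 2. parser
            ElementCounter handler = new ElementCounter();             // 3. handler
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            parser.parse(new ByteArrayInputStream(bytes), handler);    // 4. parse
            return handler.count;
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        // One root element plus two item elements: prints 3
        System.out.println(countElements("<root><item/><item/></root>"));
    }
}
```

Only startElement is overridden here; every other callback falls back to the DefaultHandler implementation.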
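All processing done while the XML file is being parsed is implemented in the handler. SAXParser.parse() accepts either a DefaultHandler (SAX2) or a HandlerBase (its deprecated SAX1 counterpart). The class in this example extends DefaultHandler.
[Figure: class diagram showing Echo01 extending DefaultHandler]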

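DefaultHandler is a class (concrete, not abstract) that implements four interfaces, EntityResolver, DTDHandler, ContentHandler, and ErrorHandler, and supplies a do-nothing default for most of their methods. The interfaces define the following methods:
    EntityResolver: resolveEntity
    DTDHandler: notationDecl, unparsedEntityDecl
    ContentHandler: setDocumentLocator, startDocument, endDocument, startPrefixMapping, endPrefixMapping, startElement, endElement, characters, ignorableWhitespace, processingInstruction, skippedEntity
    ErrorHandler: warning, error, fatalError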
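Because DefaultHandler also implements ErrorHandler, a subclass can take part in error reporting as well as content handling. The following is a minimal sketch of that idea, not something from the original program: the class name ErrorAwareHandler and the method reportsFatalError are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: overrides one ErrorHandler callback.
class ErrorAwareHandler extends DefaultHandler {
    private boolean sawFatalError;

    // ErrorHandler callback, invoked when the document is not well-formed.
    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        sawFatalError = true;
        throw e; // rethrow, as the DefaultHandler version does, to stop parsing
    }

    // Returns true if the parser reported a fatal error for this input.
    static boolean reportsFatalError(String xml) {
        ErrorAwareHandler handler = new ErrorAwareHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXException e) {
            // Expected for malformed input; the flag records what the handler saw.
        } catch (Exception e) {
            throw new IllegalStateException("unexpected failure", e);
        }
        return handler.sawFatalError;
    }

    public static void main(String[] args) {
        System.out.println(reportsFatalError("<a><b></a>"));  // mismatched tags
        System.out.println(reportsFatalError("<a><b/></a>")); // well-formed
    }
}
```

The first call should report a fatal error and the second should not.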

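The parser calls different methods at different points in the document; this is what "event-based" means.
Detailed description: (omitted for now)
[Figure: UML diagram of DefaultHandler]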
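To make the order of these callbacks visible, a handler can simply record each event as it arrives. This is a sketch under assumptions: the class name EventLog and the method record are hypothetical, and characters() is deliberately left out because a parser may deliver one run of text in several calls.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: logs SAX events in the order the parser fires them.
class EventLog extends DefaultHandler {
    private final List<String> events = new ArrayList<>();

    @Override
    public void startDocument() {
        events.add("startDocument");
    }

    @Override
    public void endDocument() {
        events.add("endDocument");
    }

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes attrs) {
        events.add("start:" + qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("end:" + qName);
    }

    // Parses the XML string and returns the recorded event sequence.
    static List<String> record(String xml) {
        EventLog handler = new EventLog();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return handler.events;
    }

    public static void main(String[] args) {
        // Prints: [startDocument, start:a, start:b, end:b, end:a, endDocument]
        System.out.println(record("<a><b>hi</b></a>"));
    }
}
```

The events follow document order: the handler never sees the whole tree, only one callback at a time.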
 
Having looked at the handler, turn back to the parser. The code uses SAXParser (SAXParser saxParser).
Look closely at its source code:


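You will find that it does not do the parsing itself. Instead it wraps two other types, XMLReader (SAX2) and Parser (SAX1), which do the actual work. SAXParser merely acts as an adapter.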
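The wrapped reader can be reached through SAXParser.getXMLReader() and driven directly. The sketch below assumes hypothetical names (ReaderDemo, elementNames) and is meant only to show the relationship, not how SAXParser is implemented internally.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: uses the XMLReader behind a SAXParser directly.
class ReaderDemo {
    // Returns the element names in document order.
    static List<String> elementNames(String xml) {
        try {
            SAXParser saxParser = SAXParserFactory.newInstance().newSAXParser();
            XMLReader reader = saxParser.getXMLReader(); // the SAX2 reader being wrapped
            List<String> names = new ArrayList<>();
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes attrs) {
                    names.add(qName);
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(elementNames("<a><b/><c/></a>")); // prints [a, b, c]
    }
}
```

With XMLReader the handlers are registered one interface at a time (setContentHandler, setErrorHandler, and so on), whereas SAXParser.parse() takes a single DefaultHandler and registers it for all four roles.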
UML:

 





