org.cyberneko.html
Class HTMLScanner.ContentScanner

java.lang.Object
  extended by org.cyberneko.html.HTMLScanner.ContentScanner
All Implemented Interfaces:
HTMLScanner.Scanner
Enclosing class:
HTMLScanner

public class HTMLScanner.ContentScanner
extends java.lang.Object
implements HTMLScanner.Scanner

The primary HTML document scanner.

Author:
Andy Clark

Constructor Summary
HTMLScanner.ContentScanner()
           
 
Method Summary
protected  void addLocationItem(org.apache.xerces.xni.XMLAttributes attributes, int index)
          Adds location augmentations to the specified attribute.
 boolean scan(boolean complete)
          Scans the document content.
protected  boolean scanAttribute(org.apache.xerces.util.XMLAttributesImpl attributes, boolean[] empty)
          Scans a real attribute.
protected  boolean scanAttribute(org.apache.xerces.util.XMLAttributesImpl attributes, boolean[] empty, char endc)
          Scans an attribute, pseudo or real.
protected  void scanCDATA()
          Scans a CDATA section.
protected  void scanCharacters()
          Scans characters.
protected  void scanComment()
          Scans a comment.
protected  void scanEndElement()
          Scans an end element.
protected  boolean scanMarkupContent(org.apache.xerces.util.XMLStringBuffer buffer, char cend)
          Scans markup content.
protected  void scanPI()
          Scans a processing instruction.
protected  boolean scanPseudoAttribute(org.apache.xerces.util.XMLAttributesImpl attributes)
          Scans a pseudo attribute.
protected  java.lang.String scanStartElement(boolean[] empty)
          Scans a start element.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

HTMLScanner.ContentScanner

public HTMLScanner.ContentScanner()

Method Detail

scan

public boolean scan(boolean complete)
             throws java.io.IOException
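Scans the document content. Depending on the complete flag, the call either runs until scanning is finished or may return earlier, in which case the return value reports whether more scanning is required.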

Specified by:
scan in interface HTMLScanner.Scanner
Parameters:
complete - True if the scanner should not return until scanning is complete.
Returns:
True if additional scanning is required.
Throws:
java.io.IOException - Thrown if an I/O error occurs.
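
The contract described by the complete parameter and the return value can be illustrated with a small stand-alone sketch. Nothing below is NekoHTML code: the Scanner interface is a hypothetical stand-in that only mirrors the documented signature, and CountingScanner and drive are made-up names for illustration. It assumes that a call with complete set to false processes one unit of input and then returns.

```java
import java.io.IOException;

public class ScanLoopSketch {

    /** Hypothetical stand-in that mirrors the documented scan(boolean) signature. */
    interface Scanner {
        boolean scan(boolean complete) throws IOException;
    }

    /** Toy scanner (an assumption for illustration) that consumes a fixed number of chunks. */
    static class CountingScanner implements Scanner {
        private int remaining;
        int calls = 0;

        CountingScanner(int chunks) {
            this.remaining = chunks;
        }

        @Override
        public boolean scan(boolean complete) {
            calls++;
            do {
                if (remaining > 0) {
                    remaining--; // "scan" one chunk
                }
            } while (complete && remaining > 0); // keep going only if asked to finish
            return remaining > 0; // true: additional scanning is required
        }
    }

    /** Pull-style driver: calls scan(false) until no more scanning is required. */
    static int drive(Scanner scanner) throws IOException {
        int steps = 1;
        while (scanner.scan(false)) {
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) throws IOException {
        // Incremental use: one chunk per call.
        System.out.println("incremental steps: " + drive(new CountingScanner(3)));

        // One-shot use: a single call that does not return until done.
        CountingScanner oneShot = new CountingScanner(3);
        System.out.println("more needed after scan(true): " + oneShot.scan(true));
    }
}
```

With complete set to true the caller needs only one call; with false the caller keeps control between calls and can stop or interleave other work.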

scanCharacters

protected void scanCharacters()
                       throws java.io.IOException
Scans characters.

Throws:
java.io.IOException - Thrown if an I/O error occurs.

scanCDATA

protected void scanCDATA()
                  throws java.io.IOException
Scans a CDATA section.

Throws:
java.io.IOException - Thrown if an I/O error occurs.

scanComment

protected void scanComment()
                    throws java.io.IOException
Scans a comment.

Throws:
java.io.IOException - Thrown if an I/O error occurs.

scanMarkupContent

protected boolean scanMarkupContent(org.apache.xerces.util.XMLStringBuffer buffer,
                                    char cend)
                             throws java.io.IOException
Scans markup content.

Throws:
java.io.IOException - Thrown if an I/O error occurs.

scanPI

protected void scanPI()
               throws java.io.IOException
Scans a processing instruction.

Throws:
java.io.IOException - Thrown if an I/O error occurs.

scanStartElement

protected java.lang.String scanStartElement(boolean[] empty)
                                     throws java.io.IOException
Scans a start element.

Parameters:
empty - Serves as a second return value, indicating whether the start element tag is empty (e.g. ends with "/>").
Throws:
java.io.IOException - Thrown if an I/O error occurs.
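
The boolean[] parameter is a common Java idiom for returning a second value from a method. The sketch below shows the idiom in isolation. It is not the scanner's implementation: parseStartTag is a hypothetical helper, and writing the flag to index 0 is an assumption, since this page does not state which array element is used.

```java
public class EmptyFlagSketch {

    /**
     * Hypothetical helper, not NekoHTML code: returns the element name of a
     * start tag and reports through empty[0] whether the tag ends with "/>".
     * Assumes the input begins with '<'.
     */
    static String parseStartTag(String tag, boolean[] empty) {
        empty[0] = tag.endsWith("/>"); // second "return value"
        int end = 1;
        while (end < tag.length() && Character.isLetterOrDigit(tag.charAt(end))) {
            end++;
        }
        return tag.substring(1, end); // first return value: the element name
    }

    public static void main(String[] args) {
        boolean[] empty = { false }; // single-element array acts as an out-parameter

        String name = parseStartTag("<br/>", empty);
        System.out.println(name + " empty=" + empty[0]);

        name = parseStartTag("<p class=\"x\">", empty);
        System.out.println(name + " empty=" + empty[0]);
    }
}
```

The caller allocates the array once and can reuse it across calls, which avoids creating a result object for every element scanned.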

scanAttribute

protected boolean scanAttribute(org.apache.xerces.util.XMLAttributesImpl attributes,
                                boolean[] empty)
                         throws java.io.IOException
Scans a real attribute.

Parameters:
attributes - The list of attributes.
empty - Serves as a second return value, indicating whether the start element tag is empty (e.g. ends with "/>").
Throws:
java.io.IOException - Thrown if an I/O error occurs.

scanPseudoAttribute

protected boolean scanPseudoAttribute(org.apache.xerces.util.XMLAttributesImpl attributes)
                               throws java.io.IOException
Scans a pseudo attribute.

Parameters:
attributes - The list of attributes.
Throws:
java.io.IOException - Thrown if an I/O error occurs.

scanAttribute

protected boolean scanAttribute(org.apache.xerces.util.XMLAttributesImpl attributes,
                                boolean[] empty,
                                char endc)
                         throws java.io.IOException
Scans an attribute, pseudo or real.

Parameters:
attributes - The list of attributes.
empty - Serves as a second return value, indicating whether the start element tag is empty (e.g. ends with "/>").
endc - The end character that appears before the closing angle bracket ('>').
Throws:
java.io.IOException - Thrown if an I/O error occurs.
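
To make the role of endc concrete, the following sketch tests whether a given end character sits directly before the closing '>'. It is an illustration only: closesTag is a hypothetical helper, and the choice of '/' for real attributes (as in "<br/>") and '?' for pseudo attributes (as in an XML declaration) is an assumption that this page does not confirm.

```java
public class EndCharSketch {

    /**
     * Hypothetical helper, not NekoHTML code: returns true if the character
     * at pos is endc and is immediately followed by '>'.
     */
    static boolean closesTag(String text, int pos, char endc) {
        return pos >= 0
                && pos + 1 < text.length()
                && text.charAt(pos) == endc
                && text.charAt(pos + 1) == '>';
    }

    public static void main(String[] args) {
        String element = "<br/>";
        String declaration = "<?xml version=\"1.0\"?>";

        // Assumed end character for a real attribute list: '/'
        System.out.println(closesTag(element, element.length() - 2, '/'));

        // Assumed end character for a pseudo attribute list: '?'
        System.out.println(closesTag(declaration, declaration.length() - 2, '?'));
    }
}
```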

addLocationItem

protected void addLocationItem(org.apache.xerces.xni.XMLAttributes attributes,
                               int index)
Adds location augmentations to the specified attribute.

Parameters:
attributes - The list of attributes.
index - The index of the attribute to augment.


scanEndElement

protected void scanEndElement()
                       throws java.io.IOException
Scans an end element.

Throws:
java.io.IOException - Thrown if an I/O error occurs.


(C) Copyright 2002-2008, Andy Clark. All rights reserved.