com.ctc.wstx.dtd
Class FullDTDReader

java.lang.Object
  extended by com.ctc.wstx.io.WstxInputData
      extended by com.ctc.wstx.sr.StreamScanner
          extended by com.ctc.wstx.dtd.MinimalDTDReader
              extended by com.ctc.wstx.dtd.FullDTDReader
All Implemented Interfaces:
InputConfigFlags, ParsingErrorMsgs, InputProblemReporter

public class FullDTDReader
extends MinimalDTDReader

Reader that reads in DTD information from internal or external subset.

There are two main modes for DTDReader, depending on whether it is parsing the internal or the external subset. Parsing the internal subset is somewhat simpler, since no dependency checking is needed. For the external subset, handling of parameter entities (PEs) is a bit more complicated: care has to be taken to distinguish between uses of PEs defined in the internal subset and uses of PEs defined in the external subset itself. This determines the cacheability of external subsets.
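The distinction between the two kinds of PE can be illustrated with a small sketch. It is not taken from the Woodstox sources: the class PeLookupSketch and all of its members are hypothetical, and the cacheability rule it applies (an external subset that used any PE from the internal subset is not reusable on its own) is only one plausible reading of the paragraph above. The one rule it borrows from the XML specification is that the first declaration of an entity is binding, and that the internal subset is read before the external one.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: tracks which parameter entities an external subset used. */
final class PeLookupSketch {
    // PEs declared in the internal subset (read first, so they take precedence)
    private final Map<String, String> internalPEs;
    // PEs declared in the external subset itself
    private final Map<String, String> externalPEs = new HashMap<>();
    // Names of internal-subset PEs referenced while reading the external subset
    private final Set<String> usedInternalPEs = new HashSet<>();

    PeLookupSketch(Map<String, String> internalPEs) {
        this.internalPEs = new HashMap<>(internalPEs);
    }

    /** Records a PE declared in the external subset; the first declaration is binding. */
    void declareExternal(String name, String value) {
        externalPEs.putIfAbsent(name, value);
    }

    /** Resolves a PE reference, noting whether it came from the internal subset. */
    String resolve(String name) {
        String value = internalPEs.get(name);
        if (value != null) {
            usedInternalPEs.add(name);
            return value;
        }
        return externalPEs.get(name); // null if undeclared
    }

    /** Assumed rule: reusable as-is only if no internal-subset PE was used. */
    boolean isCacheableIndependently() {
        return usedInternalPEs.isEmpty();
    }
}
```

In this sketch, resolving a PE that only the external subset declares leaves the subset cacheable, while resolving one that the document's internal subset overrides marks it as document-dependent.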

The reader also implements simple stand-alone functionality for flattening DTD files (expanding all references to their eventual textual form); this is sometimes useful when converting modularized DTDs (which are more maintainable) into single monolithic DTDs (which in general can be faster to process).


Field Summary
 
Fields inherited from class com.ctc.wstx.sr.StreamScanner
CHAR_CR_LF_OR_NULL, CHAR_FIRST_PURE_TEXT, CHAR_LOWEST_LEGAL_LOCALNAME_CHAR, INT_CR_LF_OR_NULL, mCfgNormalizeLFs, mCfgNsEnabled, mCfgReplaceEntities, mConfig, mCurrDepth, mCurrName, mDocXmlVersion, mInput, mInputTopDepth, mNameBuffer, mRootInput, mTokenInputCol, mTokenInputRow, mTokenInputTotal, SAX_COMPAT_MODE
 
Fields inherited from class com.ctc.wstx.io.WstxInputData
CHAR_NULL, CHAR_SPACE, INT_NULL, INT_SPACE, MAX_UNICODE_CHAR, mCurrInputProcessed, mCurrInputRow, mCurrInputRowStart, mInputBuffer, mInputLen, mInputPtr, mXml11
 
Fields inherited from interface com.ctc.wstx.cfg.InputConfigFlags
CFG_AUTO_CLOSE_INPUT, CFG_CACHE_DTDS, CFG_CACHE_DTDS_BY_PUBLIC_ID, CFG_COALESCE_TEXT, CFG_INTERN_NS_URIS, CFG_LAZY_PARSING, CFG_NAMESPACE_AWARE, CFG_NORMALIZE_ATTR_VALUES, CFG_NORMALIZE_LFS, CFG_PRESERVE_LOCATION, CFG_REPLACE_ENTITY_REFS, CFG_REPORT_CDATA, CFG_REPORT_PROLOG_WS, CFG_SUPPORT_DTD, CFG_SUPPORT_DTDPP, CFG_SUPPORT_EXTERNAL_ENTITIES, CFG_VALIDATE_AGAINST_DTD, CFG_VALIDATE_TEXT_CHARS, CFG_XMLID_TYPING, CFG_XMLID_UNIQ_CHECKS
 
Fields inherited from interface com.ctc.wstx.cfg.ParsingErrorMsgs
SUFFIX_EOF_EXP_NAME, SUFFIX_IN_ATTR_VALUE, SUFFIX_IN_CDATA, SUFFIX_IN_CLOSE_ELEMENT, SUFFIX_IN_COMMENT, SUFFIX_IN_DEF_ATTR_VALUE, SUFFIX_IN_DOC, SUFFIX_IN_DTD, SUFFIX_IN_DTD_EXTERNAL, SUFFIX_IN_DTD_INTERNAL, SUFFIX_IN_ELEMENT, SUFFIX_IN_ENTITY_REF, SUFFIX_IN_EPILOG, SUFFIX_IN_NAME, SUFFIX_IN_PROC_INSTR, SUFFIX_IN_PROLOG, SUFFIX_IN_TEXT, SUFFIX_IN_XML_DECL
 
Method Summary
protected  java.lang.String checkDTDKeyword(java.lang.String exp)
          Method called to verify whether the input contains the specified keyword; if it does, returns null and leaves the input pointer at the character after the keyword; if not, returns the keyword that was actually read, for error reporting purposes.
protected  void checkXmlIdAttr(int type)
           
protected  void checkXmlSpaceAttr(int type, WordResolver enumValues)
           
protected  boolean ensureInput(int minAmount)
          Method called to make sure the current main-level input buffer has at least the specified number of characters available consecutively, without having to call StreamScanner.loadMore().
 EntityDecl findEntity(java.lang.String entName)
          Method that may need to be called by attribute default value validation code during parsing.
protected  EntityDecl findEntity(java.lang.String id, java.lang.Object arg)
          Abstract method for sub-classes to implement, for finding a declared general or parsed entity.
static DTDSubset flattenExternalSubset(WstxInputSource src, java.io.Writer flattenWriter, boolean inclComments, boolean inclConditionals, boolean inclPEs)
          Method that will parse, process and output contents of an external DTD subset.
protected  char handleExpandedSurrogate(char first, char second)
          In most cases, a surrogate pair can be expanded in-situ (as is done with the regular XML reader), but there are cases where this cannot be done.
protected  void handleGreedyEntityProblem(WstxInputSource input)
           
protected  void handleIncompleteEntityProblem(WstxInputSource closing)
          Handling of PE matching problems is actually intricate: one type is a WFC ("PE Between Declarations", which refers to PEs that start from outside declarations), and another is just a VC ("Proper Declaration/PE Nesting", when the PE is contained within a declaration).
protected  void handleUndeclaredEntity(java.lang.String id)
          An undeclared parameter entity is a validity constraint (VC) violation, not a well-formedness constraint (WFC) violation.
protected  void initInputSource(WstxInputSource newInput, boolean isExt)
          Method called when an entity has been expanded (new input source has been created).
protected  boolean loadMore()
          This method needs to be overridden to check a couple of things: first, that nested input sources are balanced when expanding parameter entities inside entity value definitions (as per the XML specification), and second, to handle (optional) flattening output.
protected  boolean loadMoreFromCurrent()
           
protected  void parseDirective()
           
protected  void parseDirectiveFlattened()
          Method similar to parseDirective(), but one that takes care to properly output DTD contents via DTDWriter as necessary.
protected  DTDSubset parseDTD()
           
protected  void readComment(DTDEventListener l)
          Method similar to MinimalDTDReader.skipComment(), but one that collects the comment contents so that they can be reported to a SAX handler.
protected  java.lang.String readDTDKeyword(java.lang.String prefix)
          Method usually called to report an error condition; reads the rest of the specified keyword (including any characters that can be part of XML identifiers), appends it to the passed prefix (which is optional), and returns the resulting String.
static DTDSubset readExternalSubset(WstxInputSource src, ReaderConfig cfg, DTDSubset intSubset, boolean constructFully, int xmlVersion)
          Method called to read in the external subset definition.
static DTDSubset readInternalSubset(WstxInputData srcData, WstxInputSource input, ReaderConfig cfg, boolean constructFully, int xmlVersion)
          Method called to read in the internal subset definition.
protected  void readPI()
          Method similar to MinimalDTDReader.skipPI(), but one that does basic well-formedness checks.
 void setFlattenWriter(java.io.Writer w, boolean inclComments, boolean inclConditionals, boolean inclPEs)
          Method that sets the specified Writer as the 'flattening writer', that is, the writer used to output a flattened version of the DTD being read.
 
Methods inherited from class com.ctc.wstx.dtd.MinimalDTDReader
dtdNextChar, dtdNextFromCurr, getErrorMsg, getLocation, getNextSkippingPEs, skipComment, skipCommentContent, skipInternalSubset, skipInternalSubset, skipPI
 
Methods inherited from class com.ctc.wstx.sr.StreamScanner
closeAllInput, constructFromIoe, constructNullCharException, constructWfcException, doReportProblem, expandBy50Pct, expandEntity, fullyResolveEntity, getCurrentInput, getCurrentLocation, getLastCharLocation, getNameBuffer, getNext, getNextAfterWS, getNextChar, getNextCharAfterWS, getNextCharFromCurrent, getNextInCurrAfterWS, getNextInCurrAfterWS, getSource, getStartLocation, getSystemId, inputInBuffer, loadMore, loadMoreFromCurrent, markLF, markLF, parseEntityName, parseFNameForError, parseFullName, parseFullName, parseFullName2, parseLocalName, parseLocalName2, parsePublicId, parseSystemId, parseUntil, peekNext, pushback, reportProblem, reportProblem, reportProblem, reportProblem, reportValidationProblem, reportValidationProblem, reportValidationProblem, reportValidationProblem, reportValidationProblem, reportValidationProblem, reportValidationProblem, resolveCharOnlyEntity, resolveNonCharEntity, resolveSimpleEntity, skipCRLF, skipFullName, throwFromIOE, throwFromStrE, throwIllegalCall, throwInvalidSpace, throwInvalidSpace, throwLazyError, throwNullChar, throwNullParent, throwParseError, throwParseError, throwParseError, throwUnexpectedChar, throwUnexpectedEOB, throwUnexpectedEOF, throwWfcException, tokenTypeDesc
 
Methods inherited from class com.ctc.wstx.io.WstxInputData
copyBufferStateFrom, findIllegalNameChar, findIllegalNmtokenChar, getCharDesc, isNameChar, isNameChar, isNameStartChar, isNameStartChar, isSpaceChar
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

readInternalSubset

public static DTDSubset readInternalSubset(WstxInputData srcData,
                                           WstxInputSource input,
                                           ReaderConfig cfg,
                                           boolean constructFully,
                                           int xmlVersion)
                                    throws java.io.IOException,
                                           javax.xml.stream.XMLStreamException
Method called to read in the internal subset definition.

Throws:
java.io.IOException
javax.xml.stream.XMLStreamException

readExternalSubset

public static DTDSubset readExternalSubset(WstxInputSource src,
                                           ReaderConfig cfg,
                                           DTDSubset intSubset,
                                           boolean constructFully,
                                           int xmlVersion)
                                    throws java.io.IOException,
                                           javax.xml.stream.XMLStreamException
Method called to read in the external subset definition.

Throws:
java.io.IOException
javax.xml.stream.XMLStreamException

flattenExternalSubset

public static DTDSubset flattenExternalSubset(WstxInputSource src,
                                              java.io.Writer flattenWriter,
                                              boolean inclComments,
                                              boolean inclConditionals,
                                              boolean inclPEs)
                                       throws java.io.IOException,
                                              javax.xml.stream.XMLStreamException
Method that will parse, process and output contents of an external DTD subset. It will do processing similar to readExternalSubset(com.ctc.wstx.io.WstxInputSource, com.ctc.wstx.api.ReaderConfig, com.ctc.wstx.dtd.DTDSubset, boolean, int), but additionally will copy its processed ("flattened") input to specified writer.

Parameters:
src - Input source used to read the main external subset
flattenWriter - Writer to output processed DTD content to
inclComments - If true, will pass comments to the writer; if false, will strip comments out
inclConditionals - If true, will include conditional block markers, as well as intervening content; if false, will strip out both markers and ignorable sections.
inclPEs - If true, will output parameter entity declarations; if false, will parse and use them, but not output them.
Throws:
java.io.IOException
javax.xml.stream.XMLStreamException

setFlattenWriter

public void setFlattenWriter(java.io.Writer w,
                             boolean inclComments,
                             boolean inclConditionals,
                             boolean inclPEs)
Method that sets the specified Writer as the 'flattening writer', that is, the writer used to output a flattened version of the DTD being read. This is similar to running the C preprocessor on C sources, except that setting the writer does not prevent normal parsing of the DTD itself.
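To make the preprocessor analogy concrete, the following is a toy sketch of what "flattening" means at the text level. It does not use or reproduce the Woodstox implementation: the class DtdFlattenSketch, its regular expressions, and its behavior are all assumptions made for illustration. It handles only internal parameter entity declarations with double-quoted literal values, always drops the declarations from the output (roughly what inclPEs = false describes), and ignores external entities, comments, and conditional sections. It also skips the single-space padding that the XML specification requires around PE replacement text expanded outside entity values.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: expands parameter entity references in DTD text. */
final class DtdFlattenSketch {
    // Matches declarations of the form <!ENTITY % name "value">
    private static final Pattern PE_DECL =
        Pattern.compile("<!ENTITY\\s+%\\s+([A-Za-z_][\\w.-]*)\\s+\"([^\"]*)\"\\s*>");
    // Matches references of the form %name;
    private static final Pattern PE_REF =
        Pattern.compile("%([A-Za-z_][\\w.-]*);");

    /** Returns the DTD text with PE declarations removed and references expanded. */
    static String flatten(String dtd) {
        Map<String, String> entities = new LinkedHashMap<>();
        StringBuilder out = new StringBuilder();
        int pos = 0;
        Matcher decl = PE_DECL.matcher(dtd);
        while (decl.find()) {
            // Expand references in the text that precedes this declaration
            out.append(expand(dtd.substring(pos, decl.start()), entities));
            // The first declaration of a name is binding; later ones are ignored
            entities.putIfAbsent(decl.group(1), expand(decl.group(2), entities));
            pos = decl.end();
        }
        out.append(expand(dtd.substring(pos), entities));
        return out.toString();
    }

    /** Replaces each known %name; reference; unknown references are left as-is. */
    private static String expand(String text, Map<String, String> entities) {
        Matcher ref = PE_REF.matcher(text);
        StringBuilder sb = new StringBuilder();
        int last = 0;
        while (ref.find()) {
            sb.append(text, last, ref.start());
            String value = entities.get(ref.group(1));
            sb.append(value != null ? value : ref.group());
            last = ref.end();
        }
        sb.append(text, last, text.length());
        return sb.toString();
    }
}
```

Given a declaration of a PE named inline with the value "b | i", followed by an element declaration whose content model refers to %inline;, the sketch emits only the element declaration, with the reference replaced by b | i.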


findEntity

public EntityDecl findEntity(java.lang.String entName)
Method that may need to be called by attribute default value validation code during parsing.

Note: see base class for some additional remarks about this method.

Overrides:
findEntity in class MinimalDTDReader

parseDTD

protected DTDSubset parseDTD()
                      throws java.io.IOException,
                             javax.xml.stream.XMLStreamException
Throws:
java.io.IOException
javax.xml.stream.XMLStreamException

parseDirective

protected void parseDirective()
                       throws java.io.IOException,
                              javax.xml.stream.XMLStreamException
Throws:
java.io.IOException
javax.xml.stream.XMLStreamException

parseDirectiveFlattened

protected void parseDirectiveFlattened()
                                throws java.io.IOException,
                                       javax.xml.stream.XMLStreamException
Method similar to parseDirective(), but one that takes care to properly output DTD contents via DTDWriter as necessary. The two are kept separate to simplify both methods; otherwise the code would end up as 'if (... flatten...) ... else ...' spaghetti.

Throws:
java.io.IOException
javax.xml.stream.XMLStreamException

initInputSource

protected void initInputSource(WstxInputSource newInput,
                               boolean isExt)
                        throws java.io.IOException,
                               javax.xml.stream.XMLStreamException
Description copied from class: StreamScanner
Method called when an entity has been expanded (new input source has been created). Needs to initialize location information and change active input source.

Overrides:
initInputSource in class StreamScanner
Throws:
java.io.IOException
javax.xml.stream.XMLStreamException

loadMore

protected boolean loadMore()
                    throws java.io.IOException,
                           javax.xml.stream.XMLStreamException
This method needs to be overridden to check a couple of things: first, that nested input sources are balanced when expanding parameter entities inside entity value definitions (as per the XML specification), and second, to handle (optional) flattening output.

Overrides:
loadMore in class StreamScanner
Returns:
true if reading succeeded (or may succeed), false if we reached EOF.
Throws:
java.io.IOException
javax.xml.stream.XMLStreamException

loadMoreFromCurrent

protected boolean loadMoreFromCurrent()
                               throws java.io.IOException,
                                      javax.xml.stream.XMLStreamException
Overrides:
loadMoreFromCurrent in class StreamScanner
Throws:
java.io.IOException
javax.xml.stream.XMLStreamException

ensureInput

protected boolean ensureInput(int minAmount)
                       throws java.io.IOException
Description copied from class: StreamScanner
Method called to make sure the current main-level input buffer has at least the specified number of characters available consecutively, without having to call StreamScanner.loadMore(). It can only be called when input comes from the main-level buffer; further, the call can shift content in the input buffer, so the caller has to flush any data still pending. In short, the caller has to know exactly what it's doing. :-)

Note: method does not check for any other input sources than the current one -- if current source can not fulfill the request, a failure is indicated.
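The warning about shifted buffer content is easier to see in a small sketch. The code below is not the StreamScanner implementation; the class BufferEnsureSketch and its fields are hypothetical, and I/O errors are rethrown unchecked only to keep the example short. It shows the general compact-then-refill technique: unread characters are moved to the start of the buffer, so any index the caller kept into the buffer becomes stale, and a request fails when the current source runs out.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical sketch of a compact-and-refill input buffer. */
final class BufferEnsureSketch {
    private final Reader in;
    private final char[] buf;
    private int ptr; // index of the next unread character
    private int len; // index one past the last valid character

    BufferEnsureSketch(Reader in, int bufferSize) {
        this.in = in;
        this.buf = new char[bufferSize];
    }

    /** Returns true if at least minAmount characters are now available contiguously. */
    boolean ensureInput(int minAmount) {
        int available = len - ptr;
        if (available >= minAmount) {
            return true;
        }
        if (minAmount > buf.length) {
            return false; // this sketch never grows its buffer
        }
        // Shift unread content to the start: indexes held by the caller are now invalid
        System.arraycopy(buf, ptr, buf, 0, available);
        ptr = 0;
        len = available;
        try {
            while (len < minAmount) {
                int count = in.read(buf, len, buf.length - len);
                if (count < 0) {
                    return false; // the current source is exhausted (EOF)
                }
                len += count;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return true;
    }

    /** Consumes and returns the next character; call ensureInput(1) first. */
    char next() {
        return buf[ptr++];
    }
}
```

A caller would typically invoke ensureInput with the length of the token it is about to compare against, then read directly from the buffer without further bounds checks.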

Overrides:
ensureInput in class StreamScanner
Returns:
true if there's now enough data; false if not (EOF)
Throws:
java.io.IOException

checkDTDKeyword

protected java.lang.String checkDTDKeyword(java.lang.String exp)
                                    throws java.io.IOException,
                                           javax.xml.stream.XMLStreamException
Method called to verify whether the input contains the specified keyword; if it does, returns null and leaves the input pointer at the character after the keyword; if not, returns the keyword that was actually read, for error reporting purposes.
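The return convention (null on success, the offending text on failure) can be sketched as follows. This is an illustration only, not the Woodstox code: the class KeywordCheckSketch, its String-based input, and its simplified isNameChar test are assumptions. On a mismatch it keeps reading name characters so that an error message can quote the whole word that was found.

```java
/** Hypothetical sketch of keyword checking with a null-on-match convention. */
final class KeywordCheckSketch {
    private final String input;
    private int ptr; // current read position

    KeywordCheckSketch(String input) {
        this.input = input;
    }

    /**
     * Returns null and advances past the keyword if it matches as a whole word;
     * otherwise returns the word actually found at the current position.
     */
    String checkKeyword(String exp) {
        int start = ptr;
        int i = 0;
        while (i < exp.length() && ptr < input.length()
                && input.charAt(ptr) == exp.charAt(i)) {
            ptr++;
            i++;
        }
        boolean matchedAll = (i == exp.length());
        boolean atWordEnd = ptr >= input.length() || !isNameChar(input.charAt(ptr));
        if (matchedAll && atWordEnd) {
            return null;
        }
        // Mismatch: read the rest of the word so it can be reported in full
        while (ptr < input.length() && isNameChar(input.charAt(ptr))) {
            ptr++;
        }
        return input.substring(start, ptr);
    }

    /** Simplified approximation of the XML name character rules. */
    static boolean isNameChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_' || c == '-' || c == '.' || c == ':';
    }

    int position() {
        return ptr;
    }
}
```

A caller can then write `String found = checkKeyword("ELEMENT"); if (found != null) { /* report unexpected keyword 'found' */ }`.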

Throws:
java.io.IOException
javax.xml.stream.XMLStreamException

readDTDKeyword

protected java.lang.String readDTDKeyword(java.lang.String prefix)
                                   throws java.io.IOException,
                                          javax.xml.stream.XMLStreamException
Method usually called to report an error condition; reads the rest of the specified keyword (including any characters that can be part of XML identifiers), appends it to the passed prefix (which is optional), and returns the resulting String.

Parameters:
prefix - Part of keyword already read in.
Throws:
java.io.IOException
javax.xml.stream.XMLStreamException

readPI

protected void readPI()
               throws java.io.IOException,
                      javax.xml.stream.XMLStreamException
Method similar to MinimalDTDReader.skipPI(), but one that does basic well-formedness checks.

Throws:
java.io.IOException
javax.xml.stream.XMLStreamException

readComment

protected void readComment(DTDEventListener l)
                    throws java.io.IOException,
                           javax.xml.stream.XMLStreamException
Method similar to MinimalDTDReader.skipComment(), but one that collects the comment contents so that they can be reported to a SAX handler.

Throws:
java.io.IOException
javax.xml.stream.XMLStreamException

findEntity

protected EntityDecl findEntity(java.lang.String id,
                                java.lang.Object arg)
Description copied from class: StreamScanner
Abstract method for sub-classes to implement, for finding a declared general or parsed entity.

Overrides:
findEntity in class MinimalDTDReader
Parameters:
id - Identifier of the entity to find
arg - If Boolean.TRUE, we are expanding a general entity

handleUndeclaredEntity

protected void handleUndeclaredEntity(java.lang.String id)
                               throws javax.xml.stream.XMLStreamException
An undeclared parameter entity is a validity constraint (VC) violation, not a well-formedness constraint (WFC) violation.

Overrides:
handleUndeclaredEntity in class MinimalDTDReader
Throws:
javax.xml.stream.XMLStreamException

handleIncompleteEntityProblem

protected void handleIncompleteEntityProblem(WstxInputSource closing)
                                      throws javax.xml.stream.XMLStreamException
Handling of PE matching problems is actually intricate: one type is a WFC ("PE Between Declarations", which refers to PEs that start from outside declarations), and another is just a VC ("Proper Declaration/PE Nesting", when the PE is contained within a declaration).

Overrides:
handleIncompleteEntityProblem in class MinimalDTDReader
Throws:
javax.xml.stream.XMLStreamException

handleGreedyEntityProblem

protected void handleGreedyEntityProblem(WstxInputSource input)
                                  throws javax.xml.stream.XMLStreamException
Throws:
javax.xml.stream.XMLStreamException

handleExpandedSurrogate

protected char handleExpandedSurrogate(char first,
                                       char second)
In most cases, a surrogate pair can be expanded in-situ (as is done with the regular XML reader), but there are cases where this cannot be done. Specifically, when expanding internal entities from the internal subset (or when flattening DTDs), this would lead to problems.
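For readers unfamiliar with the term, the sketch below shows what a surrogate pair is in Java's UTF-16 character model: a code point above U+FFFF, such as one produced by a character reference like &#x1F600;, needs two char values, which is why this method receives a first and a second char. The class SurrogateSketch is hypothetical and says nothing about how this method decides where the second character goes; it only uses standard java.lang.Character calls.

```java
/** Hypothetical sketch: splitting and recombining a UTF-16 surrogate pair. */
final class SurrogateSketch {
    /** Returns the one or two UTF-16 chars that encode the given code point. */
    static char[] toUtf16(int codePoint) {
        return Character.toChars(codePoint);
    }

    /** Recombines a high and a low surrogate into the original code point. */
    static int combine(char first, char second) {
        if (!Character.isSurrogatePair(first, second)) {
            throw new IllegalArgumentException("Not a valid surrogate pair");
        }
        return Character.toCodePoint(first, second);
    }
}
```

For example, toUtf16(0x1F600) yields the two chars 0xD83D and 0xDE00, and combine(0xD83D, 0xDE00) gives back 0x1F600.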

Overrides:
handleExpandedSurrogate in class MinimalDTDReader

checkXmlSpaceAttr

protected void checkXmlSpaceAttr(int type,
                                 WordResolver enumValues)
                          throws javax.xml.stream.XMLStreamException
Throws:
javax.xml.stream.XMLStreamException

checkXmlIdAttr

protected void checkXmlIdAttr(int type)
                       throws javax.xml.stream.XMLStreamException
Throws:
javax.xml.stream.XMLStreamException