Module sgmllib :: Class SGMLParser

Class SGMLParser

ParserBase --+
             |
            SGMLParser

Known Subclasses:
ExtractUrls, InterProParser, LocalParser, MyParser, NdbParser, _FormParser

Method Summary
  __init__(self, verbose)
Initialize and reset this instance.
  close(self)
Handle the remaining data.
  convert_charref(self, name)
Convert character reference, may be overridden.
  convert_codepoint(self, codepoint)
  convert_entityref(self, name)
Convert entity references.
  error(self, message)
  feed(self, data)
Feed some data to the parser.
  finish_endtag(self, tag)
  finish_shorttag(self, tag, data)
  finish_starttag(self, tag, attrs)
  get_starttag_text(self)
  goahead(self, end)
  handle_charref(self, name)
Handle character reference, no need to override.
  handle_comment(self, data)
  handle_data(self, data)
  handle_decl(self, decl)
  handle_endtag(self, tag, method)
  handle_entityref(self, name)
Handle entity references, no need to override.
  handle_pi(self, data)
  handle_starttag(self, tag, method, attrs)
  parse_endtag(self, i)
  parse_pi(self, i)
  parse_starttag(self, i)
  report_unbalanced(self, tag)
  reset(self)
Reset this instance.
  setliteral(self, *args)
Enter literal mode (CDATA).
  setnomoretags(self)
Enter literal mode (CDATA) till EOF.
  unknown_charref(self, ref)
  unknown_endtag(self, tag)
  unknown_entityref(self, ref)
  unknown_starttag(self, tag, attrs)
  _convert_ref(self, match)
    Inherited from ParserBase
  getpos(self)
Return current line number and offset.
  parse_comment(self, i, report)
  parse_declaration(self, i)
  parse_marked_section(self, i, report)
  unknown_decl(self, data)
  updatepos(self, i, j)
  _parse_doctype_attlist(self, i, declstartpos)
  _parse_doctype_element(self, i, declstartpos)
  _parse_doctype_entity(self, i, declstartpos)
  _parse_doctype_notation(self, i, declstartpos)
  _parse_doctype_subset(self, i, declstartpos)
  _scan_name(self, i, declstartpos)

Class Variable Summary
SRE_Pattern entity_or_charref = &(?:([a-zA-Z][-\.a-zA-Z0-9]*)|#([0-9...
dict entitydefs = {'amp': '&', 'lt': '<', 'gt': '>', 'apos': ...
str _decl_otherchars = '='

Method Details

__init__(self, verbose=0)
(Constructor)

Initialize and reset this instance.
Overrides:
markupbase.ParserBase.__init__

close(self)

Handle the remaining data.

convert_charref(self, name)

Convert character reference, may be overridden.

convert_entityref(self, name)

Convert entity references.

As an alternative to overriding this method, one can tailor the results by setting up the self.entitydefs mapping appropriately.
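The mapping-based approach can be illustrated without the parser itself. The following is a minimal sketch, not the library's implementation: `lookup_entity` and `custom_defs` are hypothetical names, and the only assumption taken from this page is the default content of `entitydefs` listed under Class Variable Details.

```python
# Default table as listed for SGMLParser.entitydefs on this page.
entitydefs = {'amp': '&', 'lt': '<', 'gt': '>', 'apos': "'", 'quot': '"'}


def lookup_entity(name, table):
    """Hypothetical stand-in for an entity lookup.

    Returns the replacement text for `name`, or None when the
    entity is not present in `table`.
    """
    return table.get(name)


# Tailor the results by extending a copy of the mapping
# instead of overriding a method.
custom_defs = dict(entitydefs)
custom_defs['nbsp'] = ' '

known = lookup_entity('amp', entitydefs)
unknown = lookup_entity('nbsp', entitydefs)
tailored = lookup_entity('nbsp', custom_defs)
```

In a subclass, the same effect would presumably be reached by assigning an extended dictionary to `entitydefs` on the class or instance; copying first avoids mutating the table shared by other parsers.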

feed(self, data)

Feed some data to the parser.

        Call this as often as you want, with as little or as much text
        as you want (may include '\n').  (This just saves the text,
        all the processing is done by goahead().)
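The incremental feeding pattern described above can be sketched as follows. Because sgmllib exists only in Python 2, this example swaps in `html.parser.HTMLParser` from the Python 3 standard library as a stand-in: it also exposes `feed()` and `close()`, but its handler signatures differ from those of SGMLParser, so treat this as an illustration of the calling pattern only. `TagCollector` is a hypothetical class name.

```python
from html.parser import HTMLParser  # stand-in; sgmllib is Python 2 only


class TagCollector(HTMLParser):
    """Hypothetical subclass that records start tags and text."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        self.text.append(data)


collector = TagCollector()
# Chunks may split the input anywhere, even in the middle of a tag.
for chunk in ("<p>Hel", "lo <b", ">world</b></p>"):
    collector.feed(chunk)
collector.close()  # handle whatever data is still buffered

joined_text = "".join(collector.text)
```

Text may reach `handle_data` in several pieces, so the sketch joins the pieces at the end instead of assuming one call per text run.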

handle_charref(self, name)

Handle character reference, no need to override.

handle_entityref(self, name)

Handle entity references, no need to override.

reset(self)

Reset this instance. Loses all unprocessed data.
Overrides:
markupbase.ParserBase.reset

setliteral(self, *args)

Enter literal mode (CDATA).

Intended for derived classes only.

setnomoretags(self)

Enter literal mode (CDATA) till EOF.

Intended for derived classes only.

Class Variable Details

entity_or_charref

Type:
SRE_Pattern
Value:
&(?:([a-zA-Z][-\.a-zA-Z0-9]*)|#([0-9]+))(;?)                           
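To see what this pattern captures, it can be compiled directly with the `re` module. This is a small sketch under the assumption that the value shown above is the complete pattern; `classify_refs` is a hypothetical helper, not part of sgmllib.

```python
import re

# Pattern copied from the value shown above.
entity_or_charref = re.compile(
    r'&(?:([a-zA-Z][-\.a-zA-Z0-9]*)|#([0-9]+))(;?)')


def classify_refs(text):
    """Hypothetical helper: list (kind, value, terminated) per reference.

    Group 1 holds an entity name, group 2 a decimal character
    reference, and group 3 the optional trailing semicolon.
    """
    found = []
    for match in entity_or_charref.finditer(text):
        name, number, semicolon = match.groups()
        if name is not None:
            found.append(('entity', name, semicolon == ';'))
        else:
            found.append(('charref', number, semicolon == ';'))
    return found


refs = classify_refs("a &lt; b &#38; c &gt d")
```

Note that the pattern accepts references without a closing semicolon, and that it matches decimal character references only.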

entitydefs

Type:
dict
Value:
{'amp': '&', 'lt': '<', 'gt': '>', 'apos': "'", 'quot': '"'}           

_decl_otherchars

Type:
str
Value:
'='                                                                    

Generated by Epydoc 2.1 on Mon Aug 27 16:43:45 2007 http://epydoc.sf.net