AbstractXMLParser class

Implements a simple XML parser base class that can be overridden to create an XML parser.

Class Overview

Extend this class and add methods of the form start_foo($attr) and end_foo(), where foo is the name of the tag you want to match.

You can also add methods of the form handle_foo_data($data) to handle character data that appears in the tag named foo. This can be used to easily extract data from simple XML documents.

If the parser encounters a tag that has no corresponding start_foo or end_foo method, unknown_starttag or unknown_endtag is called instead.
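The naming convention described above can be sketched as follows. This is a minimal illustration, not code taken from the library: it assumes the file defining AbstractXMLParser has already been loaded (the include path is not given here), and the class name BookTitleParser, the tag names, and the sample document are all hypothetical.

```php
<?php
// Assumption: AbstractXMLParser is already loaded; its file name is not documented here.

class BookTitleParser extends AbstractXMLParser  // hypothetical subclass
{
    public $books_seen = 0;
    public $titles = array();

    // Called for every <book> start tag; $attr holds the tag's attributes.
    public function start_book($attr)
    {
        $this->books_seen++;
    }

    // Called for every </book> end tag.
    public function end_book()
    {
    }

    // Called with the character data found inside <title> elements.
    public function handle_title_data($data)
    {
        $this->titles[] = $data;
    }
}

$parser = new BookTitleParser();
$parser->feed('<library><book id="1"><title>Dune</title></book></library>');
$parser->close();

print_r($parser->titles);
```

The library tag has no start_library or end_library method here, so it would go through unknown_starttag and unknown_endtag.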

Todo

Character/entity references (maybe support all HTML entities by default?)

Public Methods

__construct($encoding=null)

Initializes the parser.

The constructor method instantiates and configures the underlying XML parser and its handlers.

Parameters

$encoding

The character encoding to use for this parser. This parameter is optional (defaults to null) and can be safely omitted. See the PHP manual for xml_parser_create for more information about encodings.

feed($data)

Passes some data to the parser.

The parser will process it and invoke the appropriate handlers and callback methods.

Parameters

$data

A string containing a chunk of XML data
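Since close() exists to signal the end of input, feed() can presumably be called several times with successive chunks. The sketch below assumes that behaviour; MyParser is a hypothetical concrete subclass of AbstractXMLParser and data.xml is a placeholder path.

```php
<?php
// Assumptions: AbstractXMLParser is loaded, MyParser is a hypothetical subclass,
// and 'data.xml' is a placeholder file name.

$parser = new MyParser();

$handle = fopen('data.xml', 'rb');
if ($handle === false) {
    die('Could not open data.xml');
}

// Feed the document in 4 KiB chunks instead of loading it all at once.
while (!feof($handle)) {
    $chunk = fread($handle, 4096);
    if ($chunk === false || $chunk === '') {
        break;
    }
    $parser->feed($chunk);
}
fclose($handle);

// Tell the parser that no more input will follow.
$parser->close();
```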

feed_file($filename)

Passes the contents of a complete file to the parser.

The parser will process it and invoke the appropriate handlers and callback methods.

Parameters

$filename

The path to an XML file

close()

Tells the parser that there's no more input data.

If you override this method, be sure to call the original as well, using parent::close().
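An override of close() might look like the sketch below. The subclass PriceTotalParser and the price tag are hypothetical, and AbstractXMLParser is assumed to be loaded already.

```php
<?php
// Assumption: AbstractXMLParser is already loaded.

class PriceTotalParser extends AbstractXMLParser  // hypothetical subclass
{
    public $prices = array();
    public $total = null;

    // Collect the text of every <price> element as a number.
    public function handle_price_data($data)
    {
        $this->prices[] = (float) $data;
    }

    public function close()
    {
        // Let the base class finish parsing first.
        parent::close();

        // Only now is the collected data known to be complete.
        $this->total = array_sum($this->prices);
    }
}
```

Calling parent::close() before the summary step means any data still pending in the base class is processed before the total is computed.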

unknown_starttag($name, $attr)

Called when an unhandled start tag is found.

Override this method if you want your parser to have a default start tag handler.

Parameters

$name

The name of the tag

$attr

The attributes for this tag

unknown_endtag($name)

Called when an unhandled end tag is found.

Override this method if you want your parser to have a default end tag handler.

Parameters

$name

The name of the tag
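As an illustration of default tag handlers, the following sketch records an indented outline of every tag that has no dedicated start_foo or end_foo method. OutlineParser is a hypothetical name, and AbstractXMLParser is assumed to be loaded.

```php
<?php
// Assumption: AbstractXMLParser is already loaded.

class OutlineParser extends AbstractXMLParser  // hypothetical subclass
{
    private $depth = 0;
    public $lines = array();

    // Default start tag handler: record the tag name at the current depth.
    public function unknown_starttag($name, $attr)
    {
        $this->lines[] = str_repeat('  ', $this->depth) . $name;
        $this->depth++;
    }

    // Default end tag handler: step back up one level.
    public function unknown_endtag($name)
    {
        $this->depth--;
    }
}
```

Because this subclass defines no start_foo or end_foo methods at all, every tag in the document should reach these two handlers.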

Protected Methods

handle_data($data) [protected]

Handle character data.

This method can be overridden to do something sensible with character data from the input data.

Parameters

$data

The preprocessed data (with whitespace collapsed)

unknown_pi($name, $data) [protected]

Called when an unhandled processing instruction is found.

Override this method if you want your parser to have a default processing instruction handler.

Parameters

$name

The name of the processing instruction

$data

A string containing the data of the processing instruction

handle_comment($data) [protected]

Called when comments are found in the input data.

Parameters

$data

The comment data (comment markers are already trimmed)

default_handler($data) [protected]

Default handler.

Override this method if you want to specify a default handler.

Parameters

$data

A string containing text data

handle_error($errno, $errstr, $line, $col, $byte) [protected]

Default error handler.

By default, this simply raises a fatal error. Override this method if you want less intrusive error handling.

Parameters

$errno

Error number

$errstr

Error string

$line

Line where the error occurred

$col

Column where the error occurred

$byte

Byte offset where the error occurred
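A less intrusive error handler could collect errors instead of aborting, as in this sketch. LenientParser is a hypothetical subclass, AbstractXMLParser is assumed to be loaded, and whether parsing continues usefully after an error is not documented here.

```php
<?php
// Assumption: AbstractXMLParser is already loaded.

class LenientParser extends AbstractXMLParser  // hypothetical subclass
{
    public $errors = array();

    // Record the error details instead of raising a fatal error.
    protected function handle_error($errno, $errstr, $line, $col, $byte)
    {
        $this->errors[] = sprintf(
            '%s (error %d) at line %d, column %d, byte %d',
            $errstr,
            $errno,
            $line,
            $col,
            $byte
        );
    }
}
```

After close() has been called, the caller can inspect the public errors array to decide how to proceed.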

parent_tag($number_of_levels_up=1) [protected]

Returns the name of the parent of the current tag.

The optional parameter selects an ancestor further up the tree.

Parameters

$number_of_levels_up

Controls how many levels up the tree (default is 1)

Return value

Tag name or null

current_tag() [protected]

Return the current tag name.

This can be useful when handling character data.

Return value

Tag name or null
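The two lookup methods can be combined with handle_data() to label character data by its position in the tree. The sketch below rests on two assumptions that this documentation does not confirm: that handle_data() is called for tags without a handle_foo_data method, and that current_tag() still names the enclosing tag when the data is delivered. PathParser is a hypothetical name.

```php
<?php
// Assumption: AbstractXMLParser is already loaded.

class PathParser extends AbstractXMLParser  // hypothetical subclass
{
    public $values = array();

    // Store each piece of character data under a "parent/tag" key.
    protected function handle_data($data)
    {
        $tag = $this->current_tag();
        if ($tag === null) {
            return; // no enclosing tag is known
        }

        $parent = $this->parent_tag();
        $key = ($parent === null) ? $tag : $parent . '/' . $tag;
        $this->values[$key] = $data;
    }
}
```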

Private Methods

start_element_handler($parser, $name, $attr) [private]

Callback method for the XML parser.

Parameters

$parser

The parser instance

$name

The name of the tag

$attr

The attributes for this tag

end_element_handler($parser, $name) [private]

Callback method for the XML parser.

Parameters

$parser

The parser instance

$name

The name of the tag

_handle_data($parser, $data) [private]

Called when text data from the XML document is encountered.

This method stores the data in an internal buffer, which is flushed when a tag ends.

Parameters

$parser

The parser instance

$data

A string containing text data

flush_data() [private]

Tells the parser to flush all buffered data.

This method is called when a tag ends to ensure data is processed in the correct order.
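The buffering idea can be illustrated with a stand-alone sketch built directly on PHP's XML extension. This is not the class's actual code: the $flush helper, the event strings, and the whitespace handling are assumptions made for the example. It shows why buffering helps, since the underlying parser may deliver the text of one element in several pieces.

```php
<?php
// Stand-alone sketch using only PHP's XML extension; not the library's implementation.

$buffer = '';
$events = array();

// Hypothetical helper: emit the buffered text as one event, then clear the buffer.
$flush = function () use (&$buffer, &$events) {
    $text = trim(preg_replace('/\s+/', ' ', $buffer));
    if ($text !== '') {
        $events[] = 'data: ' . $text;
    }
    $buffer = '';
};

$xml = xml_parser_create();
xml_parser_set_option($xml, XML_OPTION_CASE_FOLDING, 0);

xml_set_element_handler(
    $xml,
    function ($p, $name, $attr) use (&$events) {
        $events[] = 'start: ' . $name;
    },
    function ($p, $name) use ($flush, &$events) {
        // Flush pending text before reporting the end tag, keeping events in order.
        $flush();
        $events[] = 'end: ' . $name;
    }
);

// Character data may arrive in pieces, so it is only appended to the buffer here.
xml_set_character_data_handler($xml, function ($p, $data) use (&$buffer) {
    $buffer .= $data;
});

// Feed the document in two chunks that split the text node.
xml_parse($xml, '<a>Hello, ', false);
xml_parse($xml, 'world</a>', true);

print_r($events);
```

Although the text is split across two chunks, it should be reported as a single data event between the start and end events for the a element.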

handle_pi($parser, $name, $data) [private]

Callback method for the XML parser.

Parameters

$parser

The parser instance

$name

The name of the processing instruction

$data

A string containing text data

_default_handler($parser, $data) [private]

Internal default handler preprocessor that looks for XML comments.

Parameters

$parser

The parser instance

$data

A string containing text data

_handle_error() [private]

Internal error handler.

Collects error info and dispatches to the real error handler method.

Private Attributes

$parser [private]

The xml_parser instance.

$finalized [private]

Boolean indicating whether the parser has been finalized.

$cdata_buffer [private]

Character data buffer.

$tags [private]

Stack used to track tag nesting.