Difference between validating and non-validating parsers
This parser is part of a package developed with the assistance of Microsoft, providing a Java implementation of much of the XML manipulation functionality in Internet Explorer 5. While it is freely available, support (such as bug fixes) is not free.
(Were there not the example of the XML spec itself, and feedback from the XML editors on this issue, it would seem that this processor was in compliance.)
Character references that would expand to Unicode surrogate pairs are inappropriately rejected. Nobody has any real reason to use such pairs yet, so in practice this isn't a problem.
(Good diagnostics would have cost space, which this processor chose not to spend that way.) There seems to be a pattern where the processor expects a quoted string of some kind and is surprised by what it finds instead. There are cases where it's clear why those documents were rejected.
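To make the surrogate-pair point concrete, here is a minimal sketch of how one might check a parser's handling of such a character reference. It uses the standard JAXP API with whatever parser the JDK supplies by default, which is an assumption and not the processor reviewed here; the class name `SurrogateCheck` and the helper `rootText` are hypothetical names chosen for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class SurrogateCheck {
    // Parses the XML text with the JDK's default (non-validating) parser
    // and returns the text content of the root element.
    static String rootText(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // U+10000 lies outside the Basic Multilingual Plane, so a Java
        // String stores it as a surrogate pair (two char values).
        String text = rootText("<doc>&#x10000;</doc>");
        System.out.println("chars: " + text.length());
        System.out.println("code point: U+"
                + Integer.toHexString(text.codePointAt(0)).toUpperCase());
    }
}
```

A processor with the defect described above would be expected to fail inside `parse` with an exception instead of returning the character.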
There appears to be a declaration-ordering constraint imposed by the processor, as well as difficulties handling conditional sections.
For example, syntax that looks like a parameter entity reference but is found inside of a comment should be ignored, but isn't.
The XML spec itself uses such constructs in its DTD, but its errata haven't yet been updated to address the issue of exactly where parameter entities get expanded and where they don't.
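The comment case can be reproduced with a very small document. The sketch below assumes the JDK's default JAXP parser, not the processor under review; the class name `CommentInDtdCheck`, the helper `accepts`, and the entity name `looks` are hypothetical and exist only for this illustration. The internal DTD subset contains a comment with text shaped like a parameter entity reference, and no such entity is declared.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class CommentInDtdCheck {
    // The "%looks;" text sits inside a comment, so a parser should treat it
    // as ordinary comment content and not try to expand it.
    static final String XML =
          "<!DOCTYPE doc [\n"
        + "  <!-- this %looks; like a parameter entity reference -->\n"
        + "  <!ELEMENT doc (#PCDATA)>\n"
        + "]>\n"
        + "<doc>ok</doc>";

    // Returns true if the parser accepts the text, false if it reports
    // a fatal (well-formedness) error.
    static boolean accepts(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors and ignores warnings,
        // which avoids the parser's default console output.
        builder.setErrorHandler(new DefaultHandler());
        try {
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("accepted: " + accepts(XML));
    }
}
```

A processor that wrongly expands the comment text would be expected to reject this document, so `accepts` doubles as a quick regression check.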
However, the processor is quite good at rejecting malformed documents for the correct reasons.
Many of the documents that this processor rejects have XML declarations that aren't quite what it expects, in some cases seemingly because they include standalone declarations.
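For reference, a standalone declaration is the optional `standalone="yes"` or `standalone="no"` pseudo-attribute in the XML declaration. The following sketch, again assuming the JDK's default JAXP parser and using the hypothetical names `StandaloneCheck` and `parse`, feeds two such declarations to a parser and reads the flag back through the DOM Level 3 method `Document.getXmlStandalone()`.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class StandaloneCheck {
    // Parses the XML text with the JDK's default parser and returns the DOM.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // Declaration with only a version and a standalone flag.
        Document yes = parse("<?xml version=\"1.0\" standalone=\"yes\"?><doc/>");
        System.out.println("standalone=yes parsed, flag: " + yes.getXmlStandalone());

        // Declaration with version, encoding, and standalone, in the
        // order the XML grammar requires.
        Document no = parse(
                "<?xml version='1.0' encoding='UTF-8' standalone='no'?><doc/>");
        System.out.println("standalone=no parsed, flag: " + no.getXmlStandalone());
    }
}
```

Both declarations are legal XML 1.0, so a processor that throws on either of them is showing the kind of rejection described above.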