The Incredible C++ XML Parser and JSON Parser
cross-platform, scalable and fast
C++ XML Parser
In 2003, I started working on XML technology and produced my
library. This old library is now used in thousands of applications all around the
world (and also in space! 😲 ). The main objective of the old XMLParser library was to allow me to easily manipulate
input/output configuration files and XML data files. The old library was limited to relatively
small data files (typically, smaller than 10MB) because it's a pure DOM-style parser 😒 .
During the next 10 years, I received many emails from coders using the old XMLParser library to parse
larger and larger files (some people use it to parse 300MB XML files!). Although the old library managed to parse these larger files, it consumed a
very large amount of RAM (sometimes up to 10GB) and of CPU resources. Furthermore, I am now
manipulating (inside Anatella) terabyte-size XML files. In May 2013, I decided that it was
time for an "upgrade"! 😉 ...and the Incredible XML Parser was born! 😊
The Incredible XML Parser is composed of only 2 files: a .cpp file and a .h file.
The total size is 220 KB.
The Incredible XML DOM Parser, the Incredible XML Pull Parser and the Incredible JSON Pull Parser can all process terabyte-size
XML/JSON files in a few hours on commodity hardware with very low memory consumption
(i.e. less than a few megabytes).
The Incredible XML Parser library includes these three parsers:
- An ultra fast XML Pull Parser (that is named
"IXMLPullParser") that requires very little memory to run. The Pull
Parser is ultra fast but it does not offer the flexibility and the user-friendliness
of a full-fledged DOM parser.
- A very fast XML DOM parser (that is named "IXMLDomParser") that provides more comfort when manipulating XML
elements. The DOM parser is built "on top" of the Pull Parser: it works by using recursion and building a node tree
that breaks down the elements of an XML document.
- An ultra fast JSON Pull Parser (that is named
"IJSONPullParser") that requires very little memory to run. The JSON Pull
Parser is ultra fast and is compatible with the Incredible XML DOM Parser
so that you can build (with the DOM Parser) an in-memory node tree that lets you easily and quickly explore your JSON file (for
example, using advanced XPATH queries: see example12!).
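The relationship between a pull parser and a DOM parser built on top of it can be sketched in a few lines. The code below is only a toy illustration, not the library's implementation: `TinyPullParser`, `Event`, `Node`, `parseDocument` and `countChildren` are hypothetical names invented for this sketch, and the toy parser only understands `<name>`, `</name>` and plain text (no attributes, comments or entities).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical token kinds (NOT the IXMLPullParser token names).
enum class Event { StartTag, EndTag, Text, EndOfInput };

// Toy pull parser: the caller "pulls" one token at a time with next().
class TinyPullParser {
public:
    explicit TinyPullParser(std::string input) : in_(std::move(input)) {}

    // Advance to the next token; value() then holds the tag name or the text.
    Event next() {
        value_.clear();
        if (pos_ >= in_.size()) return Event::EndOfInput;
        if (in_[pos_] == '<') {
            std::size_t close = in_.find('>', pos_);
            if (close == std::string::npos) {        // unterminated tag: stop
                pos_ = in_.size();
                return Event::EndOfInput;
            }
            bool isEnd = (pos_ + 1 < close && in_[pos_ + 1] == '/');
            std::size_t nameStart = pos_ + (isEnd ? 2 : 1);
            value_ = in_.substr(nameStart, close - nameStart);
            pos_ = close + 1;
            return isEnd ? Event::EndTag : Event::StartTag;
        }
        std::size_t lt = in_.find('<', pos_);        // text runs up to the next tag
        if (lt == std::string::npos) lt = in_.size();
        value_ = in_.substr(pos_, lt - pos_);
        pos_ = lt;
        return Event::Text;
    }
    const std::string& value() const { return value_; }

private:
    std::string in_;
    std::size_t pos_ = 0;
    std::string value_;
};

// Toy DOM node: the whole tree lives in memory, which is what makes it
// comfortable to explore and also what makes it cost more RAM.
struct Node {
    std::string name;
    std::string text;
    std::vector<Node> children;
};

// Recursively build the subtree of the element whose start tag was just read.
Node buildElement(TinyPullParser& p, std::string name) {
    Node node;
    node.name = std::move(name);
    for (Event e = p.next(); e != Event::EndOfInput; e = p.next()) {
        if (e == Event::StartTag)
            node.children.push_back(buildElement(p, p.value()));
        else if (e == Event::Text)
            node.text += p.value();
        else
            break;                                   // EndTag closes this element
    }
    return node;
}

// Parse a document that starts directly with its root tag.
Node parseDocument(const std::string& xml) {
    TinyPullParser p(xml);
    Event first = p.next();
    assert(first == Event::StartTag && "document must start with a tag");
    (void)first;
    return buildElement(p, p.value());
}

// Count the direct children of `node` that have the given tag name.
int countChildren(const Node& node, const std::string& name) {
    int n = 0;
    for (const Node& child : node.children)
        if (child.name == name) ++n;
    return n;
}
```

With the pull interface alone, the caller only ever holds the current token; the DOM layer trades that small footprint for a tree that can be queried afterwards, for example `countChildren(parseDocument(xml), "b")`.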
The Incredible XML/JSON Parser provides even more speed & scalability, and its other objectives are the same as those of the old XMLParser library:
- User-friendliness (i.e. it should be easy to use).
- Small footprint & no dependencies (i.e. this must remain a small library, easy to include & compile everywhere, on any platform).
To the best of my knowledge, there exists no other "non-validating C++ XML
parser" that is as simple and as powerful. 😄 This is especially true if you need to
parse large XML documents: in such a case, no parser comes even close to the
Incredible XML Parser presented here.
For the Incredible XML Parser, I kept all the
nice functionalities from the old XML Parser that made it so popular and I added the following:
- The Incredible XML Pull Parser has one of the lowest memory
consumption amongst all XML Pull parsers.
- The Incredible JSON Pull Parser has one of the lowest memory
consumption amongst all JSON Pull parsers.
- The Incredible XML DOM Parser has the lowest memory consumption amongst all XML DOM parsers.
- The Incredible XML Pull Parser is one of the fastest XML Pull parsers.
- The Incredible JSON Pull Parser is one of the fastest JSON Pull parsers.
- The Incredible XML DOM Parser is the fastest XML DOM parser.
- The Incredible XML DOM Parser is the only DOM parser able to work on UNLIMITED file size.
- The 2 Incredible XML Parsers are able to handle nearly any character encoding.
- The 3 Incredible Parsers fully support "char*" mode and "wchar_t*" mode.
- The 3 Incredible Parsers are able to handle stream-lined data. This has several advantages:
  - you are no longer limited by your RAM size.
  - very reduced and (more or less) constant memory consumption.
  - you can very easily process stream-lined data (such as data coming from an
    HTTP connection or from the decompression of a ZIP file).
- The 3 Incredible Parsers are 100% thread-safe (more precisely: they are reentrant).
- The Incredible XML&JSON Pull Parsers are "in-place" parsers (they do not internally copy any strings, so they are as fast as possible).
- The Incredible XML&JSON Pull Parsers are among the easiest-to-use XML Pull parsers (because they always return zero-terminated char* or wchar_t*, unlike other "in-place" parsers).
- The Incredible XML Dom Parser supports "hot starts" and is able to parse a sub-section of the
original XML file without doing any memory allocation at all. The "hot start" functionality is unique
and is very important because it allows us to use a very flexible DOM-style Parser on UNLIMITED XML&JSON file
size (see example 7 inside the documentation) using very little RAM memory.
- The Incredible XML Dom Parser provides ultra fast XPATH support. With XPATH, you can very easily find the information that you need inside any XML&JSON file.
- The Incredible XML Parser has an extensive (doxygen) documentation
- The Incredible XML Dom Parser is a good replacement for the
old XMLParser library (The IXMLNode class from the Incredible XML Dom Parser is a direct replacement to the XMLNode class from the
old XMLParser library).
- The Incredible XML Parser is easy to customize: the code is concise, commented and written in a plain and
simple way. Thus, if you really need to change something (but I doubt it), it's easy.
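To make the "in-place" and "zero-terminated" points from the list above more concrete, here is a minimal sketch of one common way such parsers work: instead of copying the text into a new string, the parser overwrites the character that follows the text with `'\0'` and returns a pointer into the caller's own buffer. This is an assumption-laden illustration, not the library's code; `inPlaceElementText` is a hypothetical helper invented for this example.

```cpp
#include <cassert>
#include <cstring>

// Returns a zero-terminated pointer to the text content of the first
// element found in `buf`, or nullptr if no complete element is found.
// No memory is allocated and nothing is copied: the buffer itself is
// modified (the '<' that starts the closing tag becomes '\0').
char* inPlaceElementText(char* buf) {
    assert(buf != nullptr);
    char* open = std::strchr(buf, '>');      // end of the opening tag
    if (open == nullptr) return nullptr;
    char* text = open + 1;                   // the text starts right after '>'
    char* close = std::strchr(text, '<');    // start of the closing tag
    if (close == nullptr) return nullptr;
    *close = '\0';                           // terminate the text in place
    return text;
}
```

The returned pointer can be handed directly to functions such as `printf` or `strcmp`, which is the convenience that a zero-terminated result gives you. The price is that the input buffer must be writable and is no longer a valid document after parsing.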
I originally selected the name "Ultimate" for the XML Parser because I cannot see how it would be possible to improve on
the XML Parser Library presented here 😜. Of course, you can always add features such as "XML Validation", etc. but it
will only produce a slower, more "bloated" library. It's really the "Incredible XML parser" 😏 and if you are a
professional developer serious about your work, you should use the "Incredible XML parser" and no other parser 🙏.
The Incredible XML Parser is distributed under the Aladdin Free Public License (AFPL).
The old XMLParser library is completely free and will remain free forever.
The Incredible XML parser is also completely free in these situations:
- You only need the Aladdin Free Public License (AFPL).
- You need another license (e.g. a BSD license or an MIT license) but you'll use the Incredible XML Parser inside:
  - a computer video game (or anything related to video games).
  - software for a charity organization.
If you are not in one of the situations described above, you can still buy a BSD license (or MIT license) to use
the XML Parser inside all your projects: simply contact me to request your license.
If you like this library, you can create a link to this page from your
website (use this URL: http://www.applied-mathematics.net/tools/IXMLParser.html).
If you want to help other people produce better software using XML technology,
you can increase the visibility of this library by adding a link to this
page (so that its Google ranking increases!).
If you like this library, please add a message in the guestbook.
To obtain the library, simply contact me, and I will send you the Incredible XML Parser directly, the same day. You will receive a zip file by e-mail.
Inside the zip file, you will find 5 examples:
- ansi (char*) unix/solaris project example (makefile based)
- ansi (char*) windows project example (for Visual Studio .NET)
- ansi (char*) windows .dll project with a small test project to check the generated .dll
- wide char (wchar_t*) unix/solaris project example (makefile based)
- wide char (wchar_t*) windows project example (for Visual Studio .NET)
- v3.01: May 19, 2013: initial version.
- v3.02: May 24, 2013: Various bug fixes & improvements.
- v3.03: May 24, 2013: Performed extensive testing on large documents and fixed some remaining small bugs.
- v3.04: May 28, 2013: changed the name from "Ultimate" to "Incredible" XMLParser.
- v3.05: May 30, 2013: 2 additions
- v3.06: June 12, 2013: 1 bug fix, 1 addition
- FIX: compilation with new gcc.
- added support for UTF32
A small tutorial
Let's assume that you want to parse the XML file "PMMLModel.xml":
<?xml version="1.0" encoding="ISO-8859-1"?>
<PMML>
  <Header copyright="Frank Vanden Berghen">
    Hello world!
    <Application name="&lt;Condor>" version="1.99beta" />
  </Header>
  <Extension name="keys"> <Key name="urn"> </Key> </Extension>
  <DataField name="persfam" optype="continuous" dataType="double">
    <Value value="9.900000e+001" property="missing" />
  </DataField>
  <DataField name="prov" optype="continuous" dataType="double" />
  <DataField name="urb" optype="continuous" dataType="double" />
  <DataField name="ses" optype="continuous" dataType="double" />
  <RegressionModel functionName="regression" modelType="linearRegression">
    <RegressionTable>
      <NumericPredictor name="persfam" coefficient="-0.00275951" />
      <NumericPredictor name="prov" coefficient="0.000319433" />
      <NumericPredictor name="ses" coefficient="-0.000454307" />
      <NONNumericPredictor name="testXmlExample" />
    </RegressionTable>
  </RegressionModel>
</PMML>
Let's analyse line by line the following small example program:
#include <stdio.h> // to get the "printf" function
int main(int argc, char **argv)
// This creates a new Incredible XML DOM parser:
// This opens and parses the XML file:
// This prints "<Condor>":
printf("Application Name is: '%s'\n", xNode.getChildNode("Application").getAttribute("name"));
// This prints "Hello world!":
printf("Text inside Header tag is :'%s'\n", xNode.getText());
// This gets the number of "NumericPredictor" tags:
// This prints the "coefficient" value for all the "NumericPredictor" tags:
for (int i=0; i<n; i++)
// This creates an IXMLRenderer object and uses this object to print a formatted XML string based on
// the content of the first "Extension" tag of the XML file (more details below):
To easily manipulate the data contained inside the XML file, the first operation is
to create an IXMLDomParser object (in the above example, it's named "iDom") and use it to get an instance of the class ITCXMLNode that represents the XML file in
memory. You can use:
or, if you use the UNICODE windows version of the library:
or, if the XML document is already in a memory buffer pointed by the variable "char
This will create an object called xMainNode
that represents the first tag named PMML
found inside the XML document. This object is the top of the tree structure representing
the XML file in memory. The following command creates a new object called xNode
that represents the "Header"
tag inside the "PMML" tag:
The following command prints on the screen "<Condor>"
(note that the "&lt;"
character entity has been replaced by "<"):
printf("Application Name is: '%s'\n", xNode.getChildNode("Application").getAttribute("name"));
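The entity replacement mentioned above (the parser returns "<" where the file contains "&lt;") can be illustrated with a small standalone function. This is only a sketch of the general technique, limited to the five predefined XML entities; `decodeEntities` is a hypothetical name and does not claim to mirror how the library does it internally.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Replace the five predefined XML entities by the characters they stand for.
// Anything else (including unknown entities) is copied through unchanged.
std::string decodeEntities(const std::string& in) {
    static const struct { const char* entity; char ch; } table[] = {
        {"&lt;", '<'}, {"&gt;", '>'}, {"&amp;", '&'},
        {"&quot;", '"'}, {"&apos;", '\''}
    };
    std::string out;
    out.reserve(in.size());
    std::size_t i = 0;
    while (i < in.size()) {
        bool matched = false;
        if (in[i] == '&') {
            for (const auto& e : table) {
                std::size_t len = std::strlen(e.entity);
                if (in.compare(i, len, e.entity) == 0) {
                    out += e.ch;     // emit the decoded character
                    i += len;        // skip the whole entity
                    matched = true;
                    break;
                }
            }
        }
        if (!matched) out += in[i++];
    }
    // Decoding can only shrink the text, never grow it.
    assert(out.size() <= in.size());
    return out;
}
```

Because each entity is consumed in a single left-to-right pass, a doubly escaped input such as "&amp;lt;" decodes to "&lt;" and not to "<".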
The following command prints on the screen "Hello world!":
printf("Text inside Header tag is :'%s'\n", xNode.getText());
Let's assume you want to "go to" the tag named "RegressionTable":
Note that the previous value of the object named xNode
has been "garbage collected" so that no memory leak occurs. If you
want to know how many tags named "NumericPredictor"
are contained inside the tag named "RegressionTable":
The variable n now
contains the value 3. If you want to print the value of the coefficient
attribute for all the NumericPredictor tags:
for (int i=0; i<n; i++)
Or equivalently, but faster at runtime:
for (int i=0; i<n; i++)
If you want to generate and print on the screen the following XML-formatted string:
<Key name="urn" />
You can use:
Note that you must NOT free the memory buffer containing the returned XML string yourself (you must NOT write
any "free(t);"): the memory buffer
containing the XML string is owned by the iRenderer
object and it will be freed when the iRenderer object
is destroyed (i.e. when it falls "out-of-scope").
The parameter true passed to
the function getString() means that we want formatted output.
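The ownership rule described above (the returned string belongs to the renderer object, so the caller never frees it) is a common C++ pattern. The sketch below shows the idea with a toy class; `TinyRenderer` and its `render` method are hypothetical names made up for this illustration and are not the IXMLRenderer API.

```cpp
#include <cassert>
#include <string>

// Toy renderer that owns the buffer holding its last output.
class TinyRenderer {
public:
    // Builds a one-line, self-closing tag with a single attribute and
    // returns a pointer into a buffer owned by this object. The pointer
    // stays valid until the next call to render() or until the object
    // is destroyed; the caller must NOT free it.
    const char* render(const std::string& tag,
                       const std::string& attrName,
                       const std::string& attrValue) {
        assert(!tag.empty());
        buffer_ = "<" + tag + " " + attrName + "=\"" + attrValue + "\" />";
        return buffer_.c_str();
    }

private:
    std::string buffer_;   // released automatically by the destructor
};
```

A caller would typically keep the renderer alive for as long as the string is needed, for example `TinyRenderer r; printf("%s\n", r.render("Key", "name", "urn"));`, and let the object fall out of scope afterwards.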
That's all folks! With this basic knowledge, you should be able to easily retrieve
any data from any XML file!
The Incredible XML Parser library contains many other small useful methods that are
not described here (the zip file contains some additional examples to explain
other functionalities and complete Doxygen documentation about the IXMLParser). These methods allow you to: