/** @mainpage

<h1> TinyXML </h1>

TinyXML is a simple, small, C++ XML parser that can be easily
integrated into other programs.

<h2> What it does. </h2>

In brief, TinyXML parses an XML document, and builds from that a
Document Object Model (DOM) that can be read, modified, and saved.

XML stands for "eXtensible Markup Language." It allows you to create
your own document markups. Where HTML does a very good job of marking
documents for browsers, XML allows you to define any kind of document
markup, for example a document that describes a "to do" list for an
organizer application. XML is a very structured and convenient format.
All those random file formats created to store application data can
all be replaced with XML. One parser for everything.

The best place for the complete, correct, and quite frankly hard to
read spec is at <a href="http://www.w3.org/TR/2004/REC-xml-20040204/">
http://www.w3.org/TR/2004/REC-xml-20040204/</a>. An intro to XML
(that I really like) can be found at
<a href="http://skew.org/xml/tutorial/">http://skew.org/xml/tutorial</a>.

There are different ways to access and interact with XML data.
TinyXML uses a Document Object Model (DOM), meaning the XML data is parsed
into C++ objects that can be browsed and manipulated, and then
written to disk or another output stream. You can also construct an XML document
from scratch with C++ objects and write this to disk or another output
stream.

TinyXML is designed to be easy and fast to learn. It is two headers
and four cpp files. Simply add these to your project and off you go.
There is an example file - xmltest.cpp - to get you started.

TinyXML is released under the ZLib license,
so you can use it in open source or commercial code. The details
of the license are at the top of every source file.

TinyXML attempts to be a flexible parser, but with truly correct and
compliant XML output. TinyXML should compile on any reasonably C++
compliant system. It does not rely on exceptions or RTTI. It can be
compiled with or without STL support. TinyXML fully supports
the UTF-8 encoding, and the first 64k character entities.


<h2> What it doesn't do. </h2>

TinyXML doesn't parse or use DTDs (Document Type Definitions) or XSL
(eXtensible Stylesheet Language). There are other parsers out there
(check out www.sourceforge.net, search for XML) that are much more fully
featured. But they are also much bigger, take longer to set up in
your project, have a steeper learning curve, and often have a more
restrictive license. If you are working with browsers or have more
complete XML needs, TinyXML is not the parser for you.

The following DTD syntax will not parse at this time in TinyXML:

@verbatim
	<!DOCTYPE Archiv [
	 <!ELEMENT Comment (#PCDATA)>
	]>
@endverbatim

because TinyXML sees this as a !DOCTYPE node with an illegally
embedded !ELEMENT node. This may be addressed in the future.

<h2> Tutorials. </h2>

For the impatient, here is a tutorial to get you going. It is a great way to get started,
but it is still worth your time to read this (very short) manual completely.

- @subpage tutorial0

<h2> Code Status.  </h2>

TinyXML is mature, tested code. It is very stable. If you find
bugs, please file a bug report on the sourceforge web site
(www.sourceforge.net/projects/tinyxml). We'll get them straightened
out as soon as possible.

There are some areas for improvement; please check sourceforge if you are
interested in working on TinyXML.

<h2> Related Projects </h2>

TinyXML projects you may find useful! (Descriptions provided by the projects.)

<ul>
<li> <b>TinyXPath</b> (http://tinyxpath.sourceforge.net). TinyXPath is a small footprint
     XPath syntax decoder, written in C++.</li>
<li> <b>TinyXML++</b> (http://code.google.com/p/ticpp/). TinyXML++ is a completely new
     interface to TinyXML that uses MANY of the C++ strengths. Templates,
	 exceptions, and much better error handling.</li>
</ul>

<h2> Features </h2>

<h3> Using STL </h3>

TinyXML can be compiled to use or not use STL. When using STL, TinyXML
uses the std::string class, and fully supports std::istream, std::ostream,
operator<<, and operator>>. Many API methods have both 'const char*' and
'const std::string&' forms.

When STL support is compiled out, no STL files are included whatsoever. All
the string classes are implemented by TinyXML itself. API methods
all use the 'const char*' form for input.

Use the compile time #define:

	TIXML_USE_STL

to compile one version or the other. This can be passed by the compiler,
or set as the first line of "tinyxml.h".

Note: If compiling the test code in Linux, setting the environment
variable TINYXML_USE_STL=YES/NO will control STL compilation. In the
Windows project file, STL and non-STL targets are provided. In your project,
it's probably easiest to add the line "#define TIXML_USE_STL" as the first
line of tinyxml.h.
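
As a rough illustration of the two API forms, the sketch below shows how a
compile time switch can add a 'const std::string&' overload next to the
'const char*' form. The SKETCH_USE_STL macro and the ValueLength functions are
hypothetical names invented for this example; they are not part of TinyXML.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical switch standing in for TIXML_USE_STL. Remove this line (or
// control it with -DSKETCH_USE_STL on the compiler command line instead).
#define SKETCH_USE_STL

#ifdef SKETCH_USE_STL
#include <string>
#endif

// The 'const char*' form is always available.
inline std::size_t ValueLength( const char* value )
{
	return value ? std::strlen( value ) : 0;
}

#ifdef SKETCH_USE_STL
// The 'const std::string&' form only exists when STL support is compiled in.
// It simply forwards to the 'const char*' form.
inline std::size_t ValueLength( const std::string& value )
{
	return ValueLength( value.c_str() );
}
#endif
```

With the switch off, no STL header is pulled in and only the 'const char*'
form can be called.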

<h3> UTF-8 </h3>

TinyXML supports UTF-8, allowing you to manipulate XML files in any language. TinyXML
also supports "legacy mode" - the encoding used before UTF-8 support and
probably best described as "extended ascii".

Normally, TinyXML will try to detect the correct encoding and use it. However,
by setting the value of TIXML_DEFAULT_ENCODING in the header file, TinyXML
can be forced to always use one encoding.

TinyXML will assume Legacy Mode until one of the following occurs:
<ol>
	<li> If the non-standard but common "UTF-8 lead bytes" (0xef 0xbb 0xbf)
		 begin the file or data stream, TinyXML will read it as UTF-8. </li>
	<li> If the declaration tag is read, and it has an encoding="UTF-8", then
		 TinyXML will read it as UTF-8. </li>
	<li> If the declaration tag is read, and it has no encoding specified, then TinyXML will
		 read it as UTF-8. </li>
	<li> If the declaration tag is read, and it has an encoding="something else", then TinyXML
		 will read it as Legacy Mode. In legacy mode, TinyXML will work as it did before. It's
		 not clear what that mode does exactly, but old content should keep working.</li>
	<li> Until one of the above criteria is met, TinyXML runs in Legacy Mode.</li>
</ol>
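
The rules above can be sketched as a small stand-alone function. This is only
an illustration of the decision order, not TinyXML's actual detection code:
the SketchEncoding enum and the DetectEncoding function are hypothetical, and
the sketch assumes the declaration tag, if present, is terminated by "?>".

```cpp
#include <string>

// Hypothetical names for this sketch; they are not TinyXML identifiers.
enum SketchEncoding { SKETCH_LEGACY, SKETCH_UTF8 };

// Lower-case ASCII letters only, so "utf-8" and "UTF-8" compare equal.
static std::string ToLowerAscii( std::string s )
{
	for ( std::string::size_type i = 0; i < s.size(); ++i )
		if ( s[i] >= 'A' && s[i] <= 'Z' )
			s[i] = static_cast<char>( s[i] - 'A' + 'a' );
	return s;
}

SketchEncoding DetectEncoding( const std::string& data )
{
	// Rule 1: the UTF-8 lead bytes 0xef 0xbb 0xbf begin the data.
	if ( data.compare( 0, 3, "\xEF\xBB\xBF" ) == 0 )
		return SKETCH_UTF8;

	// Rule 5: no declaration tag has been seen, so stay in legacy mode.
	std::string::size_type start = data.find( "<?xml" );
	if ( start == std::string::npos )
		return SKETCH_LEGACY;
	std::string::size_type end = data.find( "?>", start );
	if ( end == std::string::npos )
		return SKETCH_LEGACY;	// declaration not complete yet

	std::string decl = ToLowerAscii( data.substr( start, end - start ) );

	// Rule 3: a declaration without an encoding is read as UTF-8.
	std::string::size_type enc = decl.find( "encoding" );
	if ( enc == std::string::npos )
		return SKETCH_UTF8;

	// Rule 2: encoding="UTF-8" (either quote style) selects UTF-8.
	if ( decl.find( "\"utf-8\"", enc ) != std::string::npos ||
	     decl.find( "'utf-8'", enc ) != std::string::npos )
		return SKETCH_UTF8;

	// Rule 4: any other encoding falls back to legacy mode.
	return SKETCH_LEGACY;
}
```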

What happens if the encoding is incorrectly set or detected? TinyXML will try
to read and pass through text seen as improperly encoded. You may get some strange results or
mangled characters. You may want to force TinyXML to the correct mode.

You may force TinyXML to Legacy Mode by using LoadFile( TIXML_ENCODING_LEGACY ) or
LoadFile( filename, TIXML_ENCODING_LEGACY ). You may force it to use legacy mode all
the time by setting TIXML_DEFAULT_ENCODING = TIXML_ENCODING_LEGACY. Likewise, you may
force it to TIXML_ENCODING_UTF8 with the same technique.

For English users, using English XML, UTF-8 is the same as low-ASCII. You
don't need to be aware of UTF-8 or change your code in any way. You can think
of UTF-8 as a "superset" of ASCII.

UTF-8 is not a double byte format - but it is a standard encoding of Unicode!
TinyXML does not use or directly support wchar, TCHAR, or Microsoft's _UNICODE at this time.
It is common to see the term "Unicode" improperly refer to UTF-16, a wide byte encoding
of Unicode. This is a source of confusion.

For "high-ascii" languages - everything not English, pretty much - TinyXML can
handle all languages, at the same time, as long as the XML is encoded
in UTF-8. That can be a little tricky: older programs and operating systems
tend to use the "default" or "traditional" code page. Many apps (and almost all
modern ones) can output UTF-8, but older or stubborn (or just broken) ones
still output text in the default code page.

For example, Japanese systems traditionally use SHIFT-JIS encoding.
Text encoded as SHIFT-JIS cannot be read by TinyXML.
A good text editor can import SHIFT-JIS and then save as UTF-8.

The <a href="http://skew.org/xml/tutorial/">Skew.org link</a> does a great
job covering the encoding issue.

The test file "utf8test.xml" is an XML file containing English, Spanish, Russian,
and Simplified Chinese. (Hopefully they are translated correctly). The file
"utf8test.gif" is a screen capture of the XML file, rendered in IE. Note that
if you don't have the correct fonts (Simplified Chinese or Russian) on your
system, you won't see output that matches the GIF file even if you can parse
it correctly. Also note that (at least on my Windows machine) console output
is in a Western code page, so that Print() or printf() cannot correctly display
the file. This is not a bug in TinyXML - just an OS issue. No data is lost or
destroyed by TinyXML. The console just doesn't render UTF-8.


<h3> Entities </h3>
TinyXML recognizes the pre-defined "character entities", meaning special
characters. Namely:

@verbatim
	&amp;	&
	&lt;	<
	&gt;	>
	&quot;	"
	&apos;	'
@endverbatim

These are recognized when the XML document is read, and translated to their
UTF-8 equivalents. For instance, text with the XML of:

@verbatim
	Far &amp; Away
@endverbatim

will have the Value() of "Far & Away" when queried from the TiXmlText object,
and will be written back to the XML stream/file as an ampersand. Older versions
of TinyXML "preserved" character entities, but the newer versions will translate
them into characters.

Additionally, any character can be specified by its Unicode code point:
the syntaxes "&#xA0;" and "&#160;" both refer to the non-breaking space character.
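
To make the translation concrete, here is a minimal stand-alone sketch of
decoding the five pre-defined entities and numeric character references into
UTF-8. It is not TinyXML's implementation; AppendUtf8 and DecodeEntities are
hypothetical helpers, and anything the sketch does not recognize is passed
through unchanged.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

// Append one Unicode code point to 'out' as UTF-8 (1 to 4 bytes).
static void AppendUtf8( unsigned long cp, std::string& out )
{
	if ( cp < 0x80 ) {
		out += static_cast<char>( cp );
	} else if ( cp < 0x800 ) {
		out += static_cast<char>( 0xC0 | ( cp >> 6 ) );
		out += static_cast<char>( 0x80 | ( cp & 0x3F ) );
	} else if ( cp < 0x10000 ) {
		out += static_cast<char>( 0xE0 | ( cp >> 12 ) );
		out += static_cast<char>( 0x80 | ( ( cp >> 6 ) & 0x3F ) );
		out += static_cast<char>( 0x80 | ( cp & 0x3F ) );
	} else {
		out += static_cast<char>( 0xF0 | ( cp >> 18 ) );
		out += static_cast<char>( 0x80 | ( ( cp >> 12 ) & 0x3F ) );
		out += static_cast<char>( 0x80 | ( ( cp >> 6 ) & 0x3F ) );
		out += static_cast<char>( 0x80 | ( cp & 0x3F ) );
	}
}

// Hypothetical helper: decode entities in 'in' and return the result.
std::string DecodeEntities( const std::string& in )
{
	static const struct { const char* name; char value; } table[] = {
		{ "&amp;", '&' }, { "&lt;", '<' }, { "&gt;", '>' },
		{ "&quot;", '"' }, { "&apos;", '\'' }
	};
	std::string out;
	std::string::size_type i = 0;
	while ( i < in.size() ) {
		if ( in[i] == '&' ) {
			std::string::size_type semi = in.find( ';', i );
			if ( semi != std::string::npos && in.compare( i, 2, "&#" ) == 0 ) {
				// Numeric reference: "&#x..;" is hexadecimal, "&#..;" is decimal.
				const bool hex = ( i + 2 < semi && in[i + 2] == 'x' );
				const std::string::size_type skip = hex ? 3 : 2;
				const std::string digits = in.substr( i + skip, semi - i - skip );
				char* stop = 0;
				const unsigned long cp = std::strtoul( digits.c_str(), &stop, hex ? 16 : 10 );
				if ( !digits.empty() && *stop == '\0' && cp <= 0x10FFFF ) {
					AppendUtf8( cp, out );
					i = semi + 1;
					continue;
				}
			}
			// Named entity: try each of the five pre-defined names.
			bool matched = false;
			for ( std::size_t k = 0; k < sizeof( table ) / sizeof( table[0] ); ++k ) {
				const std::size_t len = std::strlen( table[k].name );
				if ( in.compare( i, len, table[k].name ) == 0 ) {
					out += table[k].value;
					i += len;
					matched = true;
					break;
				}
			}
			if ( matched )
				continue;
		}
		out += in[i++];	// ordinary character, or an unrecognized '&'
	}
	return out;
}
```

The non-breaking space (code point 0xA0) comes out as the two UTF-8 bytes
0xC2 0xA0, whichever of the two syntaxes is used.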

<h3> Printing </h3>
TinyXML can print output in several different ways that all have strengths and limitations.

- Print( FILE* ). Output to a std-C stream, which includes all C files as well as stdout.
	- "Pretty prints", but you don't have control over printing options.
	- The output is streamed directly to the FILE object, so there is no memory overhead
	  in the TinyXML code.
	- Used by Print() and SaveFile().

- operator<<. Output to a C++ stream.
	- Integrates with standard C++ iostreams.
	- Outputs in "network printing" mode without line breaks. Good for network transmission
	  and moving XML between C++ objects, but hard for a human to read.

- TiXmlPrinter. Output to a std::string or memory buffer.
	- API is less concise.
	- Future printing options will be put here.
	- Printing may change slightly in future versions as it is refined and expanded.

<h3> Streams </h3>
With TIXML_USE_STL on, TinyXML supports C++ streams (operator<< and operator>>) as well
as C (FILE*) streams. There are some differences that you may need to be aware of.

C style output:
	- based on FILE*
	- the Print() and SaveFile() methods

	Generates formatted output, with plenty of white space, intended to be as
	human-readable as possible. They are very fast, and tolerant of ill-formed
	XML documents. For example, an XML document that contains 2 root elements
	and 2 declarations will still print.

C style input:
	- based on FILE*
	- the Parse() and LoadFile() methods

	A fast, tolerant read. Use whenever you don't need the C++ streams.

C++ style output:
	- based on std::ostream
	- operator<<

	Generates condensed output, intended for network transmission rather than
	readability. Depending on your system's implementation of the ostream class,
	this may be somewhat slower. (Or may not.) Not tolerant of ill-formed XML:
	a document should contain exactly one root element. Additional root level
	elements will not be streamed out.

C++ style input:
	- based on std::istream
	- operator>>

	Reads XML from a stream, making it useful for network transmission. The tricky
	part is knowing when the XML document is complete, since there will almost
	certainly be other data in the stream. TinyXML will assume the XML data is
	complete after it reads the root element. Put another way, documents that
	are ill-constructed with more than one root element will not read correctly.
	Also note that operator>> is somewhat slower than Parse, due to both
	implementation of the STL and limitations of TinyXML.

<h3> White space </h3>
The world simply does not agree on whether white space should be kept, or condensed.
For example, pretend the '_' is a space, and look at "Hello____world". HTML, and
at least some XML parsers, will interpret this as "Hello_world". They condense white
space. Some XML parsers do not, and will leave it as "Hello____world". (Remember
to keep pretending the _ is a space.) Others suggest that __Hello___world__ should become
Hello___world.

It's an issue that hasn't been resolved to my satisfaction. TinyXML supports the
first 2 approaches. Call TiXmlBase::SetCondenseWhiteSpace( bool ) to set the desired behavior.
The default is to condense white space.

If you change the default, you should call TiXmlBase::SetCondenseWhiteSpace( bool )
before making any calls to parse XML data, and I don't recommend changing it after
it has been set.
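
For readers who want to see the "condense" approach spelled out, the sketch
below collapses every run of white space into a single space. It is a
stand-alone illustration with a hypothetical function name, not TinyXML's
code, and it deliberately leaves aside how leading and trailing white space
should be treated.

```cpp
#include <string>

// Hypothetical helper: replace each run of spaces, tabs and line breaks
// with exactly one space. All other characters are copied unchanged.
std::string CondenseWhiteSpace( const std::string& in )
{
	std::string out;
	bool inRun = false;	// true while we are inside a run of white space
	for ( std::string::size_type i = 0; i < in.size(); ++i ) {
		const char c = in[i];
		if ( c == ' ' || c == '\t' || c == '\n' || c == '\r' ) {
			if ( !inRun )
				out += ' ';	// emit one space for the whole run
			inRun = true;
		} else {
			out += c;
			inRun = false;
		}
	}
	return out;
}
```

Keeping white space, the second approach, is simply the absence of this step:
the text is stored exactly as it was read.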


<h3> Handles </h3>

When browsing an XML document in a robust way, it is important to check
for null returns from method calls. An error-safe implementation can
generate a lot of code like:

@verbatim
TiXmlElement* root = document.FirstChildElement( "Document" );
if ( root )
{
	TiXmlElement* element = root->FirstChildElement( "Element" );
	if ( element )
	{
		TiXmlElement* child = element->FirstChildElement( "Child" );
		if ( child )
		{
			TiXmlElement* child2 = child->NextSiblingElement( "Child" );
			if ( child2 )
			{
				// Finally do something useful.
			}
		}
	}
}
@endverbatim

Handles have been introduced to clean this up. Using the TiXmlHandle class,
the previous code reduces to:

@verbatim
TiXmlHandle docHandle( &document );
TiXmlElement* child2 = docHandle.FirstChild( "Document" ).FirstChild( "Element" ).Child( "Child", 1 ).ToElement();
if ( child2 )
{
	// do something useful
}
@endverbatim

Which is much easier to deal with. See TiXmlHandle for more information.
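
The idea behind a handle can be shown without TinyXML at all. The sketch
below wraps a possibly-null pointer so that every lookup returns another
handle, which means a failed step yields a null handle rather than a crash.
SketchNode and SketchHandle are hypothetical types made up for this
illustration; see TiXmlHandle for the real interface.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical miniature node type, for illustration only.
struct SketchNode
{
	std::string name;
	std::vector<SketchNode*> children;	// not owned by the node
};

// A handle wraps a node pointer that may be null. Only the final result
// needs to be checked, no matter how many lookups are chained.
class SketchHandle
{
public:
	explicit SketchHandle( SketchNode* node ) : node_( node ) {}

	// First child with the given name, or a null handle.
	SketchHandle FirstChild( const std::string& name ) const
	{
		return Child( name, 0 );
	}

	// The index-th child (zero-based) with the given name, or a null handle.
	SketchHandle Child( const std::string& name, std::size_t index ) const
	{
		if ( node_ ) {
			for ( std::size_t i = 0; i < node_->children.size(); ++i ) {
				SketchNode* child = node_->children[i];
				if ( child->name == name ) {
					if ( index == 0 )
						return SketchHandle( child );
					--index;
				}
			}
		}
		return SketchHandle( 0 );	// null in, or nothing found: null out
	}

	// The wrapped pointer; this is the only place a null check is needed.
	SketchNode* ToNode() const { return node_; }

private:
	SketchNode* node_;
};
```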


<h3> Row and Column tracking </h3>
Being able to track nodes and attributes back to their origin location
in source files can be very important for some applications. Additionally,
knowing where parsing errors occurred in the original source can be a
great time saver.

TinyXML can track the row and column origin of all nodes and attributes
in a text file. The TiXmlBase::Row() and TiXmlBase::Column() methods return
the origin of the node in the source text. The tab size can be
configured with TiXmlDocument::SetTabSize().
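
To show why the tab size matters, here is a minimal stand-alone sketch of
turning a byte offset into a row and column. It is not TinyXML's code:
SketchCursor and Locate are hypothetical names, the 1-based numbering is an
assumption of the sketch, and every byte counts as one column, so multi-byte
UTF-8 characters are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Row and column of a position in the text, both 1-based in this sketch.
struct SketchCursor { int row; int column; };

// Hypothetical helper: find the row and column of text[offset]. A tab jumps
// to the next tab stop; "\r\n", "\r" and "\n" each count as one line ending.
SketchCursor Locate( const std::string& text, std::size_t offset, int tabSize )
{
	assert( tabSize > 0 );
	int row = 0, col = 0;	// zero-based while scanning
	for ( std::size_t i = 0; i < offset && i < text.size(); ++i ) {
		const char c = text[i];
		if ( c == '\n' || c == '\r' ) {
			// Swallow the '\n' of a "\r\n" pair so it is counted only once.
			if ( c == '\r' && i + 1 < offset && i + 1 < text.size() && text[i + 1] == '\n' )
				++i;
			++row;
			col = 0;
		} else if ( c == '\t' ) {
			col = ( col / tabSize + 1 ) * tabSize;	// advance to the next tab stop
		} else {
			++col;
		}
	}
	SketchCursor result = { row + 1, col + 1 };
	return result;
}
```

With a tab size of 4, a character that follows a single leading tab is
reported in column 5; with a tab size of 8 the same character is in column 9.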


<h2> Using and Installing </h2>

To Compile and Run xmltest:

A Linux Makefile and a Windows Visual C++ .dsw file are provided.
Simply compile and run. It will write the file demotest.xml to your
disk and generate output on the screen. It also tests walking the
DOM by printing out the number of nodes found using different
techniques.

The Linux makefile is very generic and runs on many systems - it
is currently tested on mingw and
MacOSX. You do not need to run 'make depend'. The dependencies have been
hard coded.

<h3>Windows project file for VC6</h3>
<ul>
<li>tinyxml:		tinyxml library, non-STL </li>
<li>tinyxmlSTL:		tinyxml library, STL </li>
<li>tinyXmlTest:	test app, non-STL </li>
<li>tinyXmlTestSTL: test app, STL </li>
</ul>

<h3>Makefile</h3>
At the top of the makefile you can set:

PROFILE, DEBUG, and TINYXML_USE_STL. Details (such as they are) are in
the makefile.

In the tinyxml directory, type "make clean" then "make". The executable
file 'xmltest' will be created.



<h3>To Use in an Application:</h3>

Add tinyxml.cpp, tinyxml.h, tinyxmlerror.cpp, tinyxmlparser.cpp, tinystr.cpp, and tinystr.h to your
project or make file. That's it! It should compile on any reasonably
compliant C++ system. You do not need to enable exceptions or
RTTI for TinyXML.


<h2> How TinyXML works.  </h2>

An example is probably the best way to go. Take:
@verbatim
	<?xml version="1.0" standalone="no" ?>
	<!-- Our to do list data -->
	<ToDo>
		<Item priority="1"> Go to the <bold>Toy store!</bold></Item>
		<Item priority="2"> Do bills</Item>
	</ToDo>
@endverbatim

It's not much of a To Do list, but it will do. To read this file
(say "demo.xml") you would create a document, and parse it in:
@verbatim
	TiXmlDocument doc( "demo.xml" );
	doc.LoadFile();
@endverbatim

And it's ready to go. Now let's look at some lines and how they
relate to the DOM.

@verbatim
<?xml version="1.0" standalone="no" ?>
@endverbatim

	The first line is a declaration, and gets turned into the
	TiXmlDeclaration class. It will be the first child of the
	document node.

	This is the only directive/special tag parsed by TinyXML.
	Generally directive tags are stored in TiXmlUnknown so the
	commands won't be lost when it is saved back to disk.

@verbatim
<!-- Our to do list data -->
@endverbatim

	A comment. Will become a TiXmlComment object.

@verbatim
<ToDo>
@endverbatim

	The "ToDo" tag defines a TiXmlElement object. This one does not have
	any attributes, but does contain 2 other elements.

@verbatim
<Item priority="1">
@endverbatim

	Creates another TiXmlElement which is a child of the "ToDo" element.
	This element has 1 attribute, with the name "priority" and the value
	"1".

@verbatim
Go to the
@endverbatim

	A TiXmlText. This is a leaf node and cannot contain other nodes.
	It is a child of the "Item" TiXmlElement.

@verbatim
<bold>
@endverbatim


	Another TiXmlElement, this one a child of the "Item" element.

Etc.

Looking at the entire object tree, you end up with:
@verbatim
TiXmlDocument					"demo.xml"
	TiXmlDeclaration			"version='1.0'" "standalone='no'"
	TiXmlComment				" Our to do list data"
	TiXmlElement				"ToDo"
		TiXmlElement			"Item" Attributes: priority=1
			TiXmlText			"Go to the "
			TiXmlElement		"bold"
				TiXmlText		"Toy store!"
		TiXmlElement			"Item" Attributes: priority=2
			TiXmlText			"Do bills"
@endverbatim
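
The shape of that tree can be mimicked with a few lines of plain C++. The
sketch below is only a model of the parent/child structure: SketchDomNode,
SketchKind, BuildToDoDom and CountNodes are hypothetical names, attributes
are left out, and none of this is TinyXML's own class hierarchy.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical node kinds, loosely mirroring the classes named above.
enum SketchKind { KIND_DOCUMENT, KIND_DECLARATION, KIND_COMMENT, KIND_ELEMENT, KIND_TEXT };

struct SketchDomNode
{
	SketchKind kind;
	std::string value;
	std::vector<SketchDomNode> children;	// children are stored by value

	SketchDomNode( SketchKind k, const std::string& v ) : kind( k ), value( v ) {}

	// Append a child and return a reference to the stored copy. The reference
	// is only safe to use until the next child is added to the same parent.
	SketchDomNode& Add( SketchKind k, const std::string& v )
	{
		children.push_back( SketchDomNode( k, v ) );
		return children.back();
	}
};

// Build the to do list tree bottom-up, so no reference outlives its use.
SketchDomNode BuildToDoDom()
{
	SketchDomNode item1( KIND_ELEMENT, "Item" );
	item1.Add( KIND_TEXT, "Go to the " );
	item1.Add( KIND_ELEMENT, "bold" ).Add( KIND_TEXT, "Toy store!" );

	SketchDomNode item2( KIND_ELEMENT, "Item" );
	item2.Add( KIND_TEXT, "Do bills" );

	SketchDomNode todo( KIND_ELEMENT, "ToDo" );
	todo.children.push_back( item1 );
	todo.children.push_back( item2 );

	SketchDomNode doc( KIND_DOCUMENT, "demo.xml" );
	doc.Add( KIND_DECLARATION, "version='1.0' standalone='no'" );
	doc.Add( KIND_COMMENT, " Our to do list data" );
	doc.children.push_back( todo );
	return doc;
}

// Count a node and everything below it by walking the tree recursively.
std::size_t CountNodes( const SketchDomNode& node )
{
	std::size_t total = 1;
	for ( std::size_t i = 0; i < node.children.size(); ++i )
		total += CountNodes( node.children[i] );
	return total;
}
```

Walking the model this way visits the document node plus the nine nodes
listed beneath it in the tree.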

<h2> Documentation </h2>

The documentation is built with Doxygen, using the 'dox'
configuration file.

<h2> License </h2>

TinyXML is released under the zlib license:

This software is provided 'as-is', without any express or implied
warranty. In no event will the authors be held liable for any
damages arising from the use of this software.

Permission is granted to anyone to use this software for any
purpose, including commercial applications, and to alter it and
redistribute it freely, subject to the following restrictions:

1. The origin of this software must not be misrepresented; you must
not claim that you wrote the original software. If you use this
software in a product, an acknowledgment in the product documentation
would be appreciated but is not required.

2. Altered source versions must be plainly marked as such, and
must not be misrepresented as being the original software.

3. This notice may not be removed or altered from any source
distribution.

<h2> References  </h2>

The World Wide Web Consortium is the definitive standard body for
XML, and their web pages contain huge amounts of information.

The definitive spec: <a href="http://www.w3.org/TR/2004/REC-xml-20040204/">
http://www.w3.org/TR/2004/REC-xml-20040204/</a>

I also recommend "XML Pocket Reference" by Robert Eckstein and published by
O'Reilly...the book that got the whole thing started.

<h2> Contributors, Contacts, and a Brief History </h2>

Thanks very much to everyone who sends suggestions, bugs, ideas, and
encouragement. It all helps, and makes this project fun. A special thanks
to the contributors on the web pages that keep it lively.

So many people have sent in bugs and ideas that, rather than list them here,
we try to give credit where it is due in the "changes.txt" file.

TinyXML was originally written by Lee Thomason. (He is often the "I" still
found in the documentation.) Lee reviews changes and releases new versions,
with the help of Yves Berquin, Andrew Ellerton, and the TinyXML community.

We appreciate your suggestions, and would love to know if you
use TinyXML. Hopefully you will enjoy it and find it useful.
Please post questions, comments, file bugs, or contact us at:

www.sourceforge.net/projects/tinyxml

Lee Thomason, Yves Berquin, Andrew Ellerton
*/