/StormLib/stormlib/bzip2/manual.html

  1. <html>
  2. <head>
  3. <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
  4. <title>bzip2 and libbzip2, version 1.0.5</title>
  5. <meta name="generator" content="DocBook XSL Stylesheets V1.69.1">
  6. <style type="text/css" media="screen">/* Colours:
  7. #74240f dark brown h1, h2, h3, h4
  8. #336699 medium blue links
  9. #339999 turquoise link hover colour
  10. #202020 almost black general text
  11. #761596 purple md5sum text
  12. #626262 dark gray pre border
  13. #eeeeee very light gray pre background
  14. #f2f2f9 very light blue nav table background
  15. #3366cc medium blue nav table border
  16. */
  17. a, a:link, a:visited, a:active { color: #336699; }
  18. a:hover { color: #339999; }
  19. body { font: 80%/126% sans-serif; }
  20. h1, h2, h3, h4 { color: #74240f; }
  21. dt { color: #336699; font-weight: bold }
  22. dd {
  23. margin-left: 1.5em;
  24. padding-bottom: 0.8em;
  25. }
  26. /* -- ruler -- */
  27. div.hr_blue {
  28. height: 3px;
  29. background:#ffffff url("/images/hr_blue.png") repeat-x; }
  30. div.hr_blue hr { display:none; }
  31. /* release styles */
  32. #release p { margin-top: 0.4em; }
  33. #release .md5sum { color: #761596; }
  34. /* ------ styles for docs|manuals|howto ------ */
  35. /* -- lists -- */
  36. ul {
  37. margin: 0px 4px 16px 16px;
  38. padding: 0px;
  39. list-style: url("/images/li-blue.png");
  40. }
  41. ul li {
  42. margin-bottom: 10px;
  43. }
  44. ul ul {
  45. list-style-type: none;
  46. list-style-image: none;
  47. margin-left: 0px;
  48. }
  49. /* header / footer nav tables */
  50. table.nav {
  51. border: solid 1px #3366cc;
  52. background: #f2f2f9;
  53. background-color: #f2f2f9;
  54. margin-bottom: 0.5em;
  55. }
  56. /* don't have underlined links in chunked nav menus */
  57. table.nav a { text-decoration: none; }
  58. table.nav a:hover { text-decoration: underline; }
  59. table.nav td { font-size: 85%; }
  60. code, tt, pre { font-size: 120%; }
  61. code, tt { color: #761596; }
  62. div.literallayout, pre.programlisting, pre.screen {
  63. color: #000000;
  64. padding: 0.5em;
  65. background: #eeeeee;
  66. border: 1px solid #626262;
  67. background-color: #eeeeee;
  68. margin: 4px 0px 4px 0px;
  69. }
  70. </style>
  71. </head>
  72. <body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="book" lang="en">
  73. <div class="titlepage">
  74. <div>
  75. <div><h1 class="title">
  76. <a name="userman"></a>bzip2 and libbzip2, version 1.0.5</h1></div>
  77. <div><h2 class="subtitle">A program and library for data compression</h2></div>
  78. <div><div class="authorgroup"><div class="author">
  79. <h3 class="author">
  80. <span class="firstname">Julian</span> <span class="surname">Seward</span>
  81. </h3>
  82. <div class="affiliation"><span class="orgname">http://www.bzip.org<br></span></div>
  83. </div></div></div>
  84. <div><p class="releaseinfo">Version 1.0.5 of 10 December 2007</p></div>
  85. <div><p class="copyright">Copyright &copy; 1996-2007 Julian Seward</p></div>
  86. <div><div class="legalnotice">
  87. <a name="id2499833"></a><p>This program, <code class="computeroutput">bzip2</code>, the
  88. associated library <code class="computeroutput">libbzip2</code>, and
  89. all documentation, are copyright &copy; 1996-2007 Julian Seward.
  90. All rights reserved.</p>
  91. <p>Redistribution and use in source and binary forms, with
  92. or without modification, are permitted provided that the
  93. following conditions are met:</p>
  94. <div class="itemizedlist"><ul type="bullet">
  95. <li style="list-style-type: disc"><p>Redistributions of source code must retain the
  96. above copyright notice, this list of conditions and the
  97. following disclaimer.</p></li>
  98. <li style="list-style-type: disc"><p>The origin of this software must not be
  99. misrepresented; you must not claim that you wrote the original
  100. software. If you use this software in a product, an
  101. acknowledgment in the product documentation would be
  102. appreciated but is not required.</p></li>
  103. <li style="list-style-type: disc"><p>Altered source versions must be plainly marked
  104. as such, and must not be misrepresented as being the original
  105. software.</p></li>
  106. <li style="list-style-type: disc"><p>The name of the author may not be used to
  107. endorse or promote products derived from this software without
  108. specific prior written permission.</p></li>
  109. </ul></div>
  110. <p>THIS SOFTWARE IS PROVIDED BY THE AUTHOR "AS IS" AND ANY
  111. EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
  112. THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
  113. PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
  114. AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
  115. EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
  116. TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
  117. DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
  118. ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  119. LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
  120. IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
  121. THE POSSIBILITY OF SUCH DAMAGE.</p>
  122. <p>PATENTS: To the best of my knowledge,
  123. <code class="computeroutput">bzip2</code> and
  124. <code class="computeroutput">libbzip2</code> do not use any patented
  125. algorithms. However, I do not have the resources to carry
  126. out a patent search. Therefore I cannot give any guarantee of
  127. the above statement.
  128. </p>
  129. </div></div>
  130. </div>
  131. <hr>
  132. </div>
  133. <div class="toc">
  134. <p><b>Table of Contents</b></p>
  135. <dl>
  136. <dt><span class="chapter"><a href="#intro">1. Introduction</a></span></dt>
  137. <dt><span class="chapter"><a href="#using">2. How to use bzip2</a></span></dt>
  138. <dd><dl>
  139. <dt><span class="sect1"><a href="#name">2.1. NAME</a></span></dt>
  140. <dt><span class="sect1"><a href="#synopsis">2.2. SYNOPSIS</a></span></dt>
  141. <dt><span class="sect1"><a href="#description">2.3. DESCRIPTION</a></span></dt>
  142. <dt><span class="sect1"><a href="#options">2.4. OPTIONS</a></span></dt>
  143. <dt><span class="sect1"><a href="#memory-management">2.5. MEMORY MANAGEMENT</a></span></dt>
  144. <dt><span class="sect1"><a href="#recovering">2.6. RECOVERING DATA FROM DAMAGED FILES</a></span></dt>
  145. <dt><span class="sect1"><a href="#performance">2.7. PERFORMANCE NOTES</a></span></dt>
  146. <dt><span class="sect1"><a href="#caveats">2.8. CAVEATS</a></span></dt>
  147. <dt><span class="sect1"><a href="#author">2.9. AUTHOR</a></span></dt>
  148. </dl></dd>
  149. <dt><span class="chapter"><a href="#libprog">3.
  150. Programming with <code class="computeroutput">libbzip2</code>
  151. </a></span></dt>
  152. <dd><dl>
  153. <dt><span class="sect1"><a href="#top-level">3.1. Top-level structure</a></span></dt>
  154. <dd><dl>
  155. <dt><span class="sect2"><a href="#ll-summary">3.1.1. Low-level summary</a></span></dt>
  156. <dt><span class="sect2"><a href="#hl-summary">3.1.2. High-level summary</a></span></dt>
  157. <dt><span class="sect2"><a href="#util-fns-summary">3.1.3. Utility functions summary</a></span></dt>
  158. </dl></dd>
  159. <dt><span class="sect1"><a href="#err-handling">3.2. Error handling</a></span></dt>
  160. <dt><span class="sect1"><a href="#low-level">3.3. Low-level interface</a></span></dt>
  161. <dd><dl>
  162. <dt><span class="sect2"><a href="#bzcompress-init">3.3.1. <code class="computeroutput">BZ2_bzCompressInit</code></a></span></dt>
  163. <dt><span class="sect2"><a href="#bzCompress">3.3.2. <code class="computeroutput">BZ2_bzCompress</code></a></span></dt>
  164. <dt><span class="sect2"><a href="#bzCompress-end">3.3.3. <code class="computeroutput">BZ2_bzCompressEnd</code></a></span></dt>
  165. <dt><span class="sect2"><a href="#bzDecompress-init">3.3.4. <code class="computeroutput">BZ2_bzDecompressInit</code></a></span></dt>
  166. <dt><span class="sect2"><a href="#bzDecompress">3.3.5. <code class="computeroutput">BZ2_bzDecompress</code></a></span></dt>
  167. <dt><span class="sect2"><a href="#bzDecompress-end">3.3.6. <code class="computeroutput">BZ2_bzDecompressEnd</code></a></span></dt>
  168. </dl></dd>
  169. <dt><span class="sect1"><a href="#hl-interface">3.4. High-level interface</a></span></dt>
  170. <dd><dl>
  171. <dt><span class="sect2"><a href="#bzreadopen">3.4.1. <code class="computeroutput">BZ2_bzReadOpen</code></a></span></dt>
  172. <dt><span class="sect2"><a href="#bzread">3.4.2. <code class="computeroutput">BZ2_bzRead</code></a></span></dt>
  173. <dt><span class="sect2"><a href="#bzreadgetunused">3.4.3. <code class="computeroutput">BZ2_bzReadGetUnused</code></a></span></dt>
  174. <dt><span class="sect2"><a href="#bzreadclose">3.4.4. <code class="computeroutput">BZ2_bzReadClose</code></a></span></dt>
  175. <dt><span class="sect2"><a href="#bzwriteopen">3.4.5. <code class="computeroutput">BZ2_bzWriteOpen</code></a></span></dt>
  176. <dt><span class="sect2"><a href="#bzwrite">3.4.6. <code class="computeroutput">BZ2_bzWrite</code></a></span></dt>
  177. <dt><span class="sect2"><a href="#bzwriteclose">3.4.7. <code class="computeroutput">BZ2_bzWriteClose</code></a></span></dt>
  178. <dt><span class="sect2"><a href="#embed">3.4.8. Handling embedded compressed data streams</a></span></dt>
  179. <dt><span class="sect2"><a href="#std-rdwr">3.4.9. Standard file-reading/writing code</a></span></dt>
  180. </dl></dd>
  181. <dt><span class="sect1"><a href="#util-fns">3.5. Utility functions</a></span></dt>
  182. <dd><dl>
  183. <dt><span class="sect2"><a href="#bzbufftobuffcompress">3.5.1. <code class="computeroutput">BZ2_bzBuffToBuffCompress</code></a></span></dt>
  184. <dt><span class="sect2"><a href="#bzbufftobuffdecompress">3.5.2. <code class="computeroutput">BZ2_bzBuffToBuffDecompress</code></a></span></dt>
  185. </dl></dd>
  186. <dt><span class="sect1"><a href="#zlib-compat">3.6. <code class="computeroutput">zlib</code> compatibility functions</a></span></dt>
  187. <dt><span class="sect1"><a href="#stdio-free">3.7. Using the library in a <code class="computeroutput">stdio</code>-free environment</a></span></dt>
  188. <dd><dl>
  189. <dt><span class="sect2"><a href="#stdio-bye">3.7.1. Getting rid of <code class="computeroutput">stdio</code></a></span></dt>
  190. <dt><span class="sect2"><a href="#critical-error">3.7.2. Critical error handling</a></span></dt>
  191. </dl></dd>
  192. <dt><span class="sect1"><a href="#win-dll">3.8. Making a Windows DLL</a></span></dt>
  193. </dl></dd>
  194. <dt><span class="chapter"><a href="#misc">4. Miscellanea</a></span></dt>
  195. <dd><dl>
  196. <dt><span class="sect1"><a href="#limits">4.1. Limitations of the compressed file format</a></span></dt>
  197. <dt><span class="sect1"><a href="#port-issues">4.2. Portability issues</a></span></dt>
  198. <dt><span class="sect1"><a href="#bugs">4.3. Reporting bugs</a></span></dt>
  199. <dt><span class="sect1"><a href="#package">4.4. Did you get the right package?</a></span></dt>
  200. <dt><span class="sect1"><a href="#reading">4.5. Further Reading</a></span></dt>
  201. </dl></dd>
  202. </dl>
  203. </div>
  204. <div class="chapter" lang="en">
  205. <div class="titlepage"><div><div><h2 class="title">
  206. <a name="intro"></a>1. Introduction</h2></div></div></div>
  207. <p><code class="computeroutput">bzip2</code> compresses files
  208. using the Burrows-Wheeler block-sorting text compression
  209. algorithm, and Huffman coding. Compression is generally
  210. considerably better than that achieved by more conventional
  211. LZ77/LZ78-based compressors, and approaches the performance of
  212. the PPM family of statistical compressors.</p>
  213. <p><code class="computeroutput">bzip2</code> is built on top of
  214. <code class="computeroutput">libbzip2</code>, a flexible library for
  215. handling compressed data in the
  216. <code class="computeroutput">bzip2</code> format. This manual
  217. describes both how to use the program and how to work with the
  218. library interface. Most of the manual is devoted to this
  219. library, not the program, which is good news if your interest is
  220. only in the program.</p>
  221. <div class="itemizedlist"><ul type="bullet">
  222. <li style="list-style-type: disc"><p><a href="#using">How to use bzip2</a> describes how to use
  223. <code class="computeroutput">bzip2</code>; this is the only part
  224. you need to read if you just want to know how to operate the
  225. program.</p></li>
  226. <li style="list-style-type: disc"><p><a href="#libprog">Programming with libbzip2</a> describes the
  227. programming interfaces in detail, and</p></li>
  228. <li style="list-style-type: disc"><p><a href="#misc">Miscellanea</a> records some
  229. miscellaneous notes which I thought ought to be recorded
  230. somewhere.</p></li>
  231. </ul></div>
  232. </div>
  233. <div class="chapter" lang="en">
  234. <div class="titlepage"><div><div><h2 class="title">
  235. <a name="using"></a>2. How to use bzip2</h2></div></div></div>
  236. <div class="toc">
  237. <p><b>Table of Contents</b></p>
  238. <dl>
  239. <dt><span class="sect1"><a href="#name">2.1. NAME</a></span></dt>
  240. <dt><span class="sect1"><a href="#synopsis">2.2. SYNOPSIS</a></span></dt>
  241. <dt><span class="sect1"><a href="#description">2.3. DESCRIPTION</a></span></dt>
  242. <dt><span class="sect1"><a href="#options">2.4. OPTIONS</a></span></dt>
  243. <dt><span class="sect1"><a href="#memory-management">2.5. MEMORY MANAGEMENT</a></span></dt>
  244. <dt><span class="sect1"><a href="#recovering">2.6. RECOVERING DATA FROM DAMAGED FILES</a></span></dt>
  245. <dt><span class="sect1"><a href="#performance">2.7. PERFORMANCE NOTES</a></span></dt>
  246. <dt><span class="sect1"><a href="#caveats">2.8. CAVEATS</a></span></dt>
  247. <dt><span class="sect1"><a href="#author">2.9. AUTHOR</a></span></dt>
  248. </dl>
  249. </div>
  250. <p>This chapter contains a copy of the
  251. <code class="computeroutput">bzip2</code> man page, and nothing
  252. else.</p>
  253. <div class="sect1" lang="en">
  254. <div class="titlepage"><div><div><h2 class="title" style="clear: both">
  255. <a name="name"></a>2.1. NAME</h2></div></div></div>
  256. <div class="itemizedlist"><ul type="bullet">
  257. <li style="list-style-type: disc"><p><code class="computeroutput">bzip2</code>,
  258. <code class="computeroutput">bunzip2</code> - a block-sorting file
  259. compressor, v1.0.4</p></li>
  260. <li style="list-style-type: disc"><p><code class="computeroutput">bzcat</code> -
  261. decompresses files to stdout</p></li>
  262. <li style="list-style-type: disc"><p><code class="computeroutput">bzip2recover</code> -
  263. recovers data from damaged bzip2 files</p></li>
  264. </ul></div>
  265. </div>
  266. <div class="sect1" lang="en">
  267. <div class="titlepage"><div><div><h2 class="title" style="clear: both">
  268. <a name="synopsis"></a>2.2. SYNOPSIS</h2></div></div></div>
  269. <div class="itemizedlist"><ul type="bullet">
  270. <li style="list-style-type: disc"><p><code class="computeroutput">bzip2</code> [
  271. -cdfkqstvzVL123456789 ] [ filenames ... ]</p></li>
  272. <li style="list-style-type: disc"><p><code class="computeroutput">bunzip2</code> [
  273. -fkvsVL ] [ filenames ... ]</p></li>
  274. <li style="list-style-type: disc"><p><code class="computeroutput">bzcat</code> [ -s ] [
  275. filenames ... ]</p></li>
  276. <li style="list-style-type: disc"><p><code class="computeroutput">bzip2recover</code>
  277. filename</p></li>
  278. </ul></div>
  279. </div>
  280. <div class="sect1" lang="en">
  281. <div class="titlepage"><div><div><h2 class="title" style="clear: both">
  282. <a name="description"></a>2.3. DESCRIPTION</h2></div></div></div>
  283. <p><code class="computeroutput">bzip2</code> compresses files
  284. using the Burrows-Wheeler block sorting text compression
  285. algorithm, and Huffman coding. Compression is generally
  286. considerably better than that achieved by more conventional
  287. LZ77/LZ78-based compressors, and approaches the performance of
  288. the PPM family of statistical compressors.</p>
  289. <p>The command-line options are deliberately very similar to
  290. those of GNU <code class="computeroutput">gzip</code>, but they are
  291. not identical.</p>
  292. <p><code class="computeroutput">bzip2</code> expects a list of
  293. file names to accompany the command-line flags. Each file is
  294. replaced by a compressed version of itself, with the name
  295. <code class="computeroutput">original_name.bz2</code>. Each
  296. compressed file has the same modification date, permissions, and,
  297. when possible, ownership as the corresponding original, so that
  298. these properties can be correctly restored at decompression time.
  299. File name handling is naive in the sense that there is no
  300. mechanism for preserving original file names, permissions,
  301. ownerships or dates in filesystems which lack these concepts, or
  302. have serious file name length restrictions, such as
  303. MS-DOS.</p>
  304. <p><code class="computeroutput">bzip2</code> and
  305. <code class="computeroutput">bunzip2</code> will by default not
  306. overwrite existing files. If you want this to happen, specify
  307. the <code class="computeroutput">-f</code> flag.</p>
  308. <p>If no file names are specified,
  309. <code class="computeroutput">bzip2</code> compresses from standard
  310. input to standard output. In this case,
  311. <code class="computeroutput">bzip2</code> will decline to write
  312. compressed output to a terminal, as this would be entirely
  313. incomprehensible and therefore pointless.</p>
  314. <p><code class="computeroutput">bunzip2</code> (or
  315. <code class="computeroutput">bzip2 -d</code>) decompresses all
  316. specified files. Files which were not created by
  317. <code class="computeroutput">bzip2</code> will be detected and
  318. ignored, and a warning issued.
  319. <code class="computeroutput">bzip2</code> attempts to guess the
  320. filename for the decompressed file from that of the compressed
  321. file as follows:</p>
  322. <div class="itemizedlist"><ul type="bullet">
  323. <li style="list-style-type: disc"><p><code class="computeroutput">filename.bz2 </code>
  324. becomes
  325. <code class="computeroutput">filename</code></p></li>
  326. <li style="list-style-type: disc"><p><code class="computeroutput">filename.bz </code>
  327. becomes
  328. <code class="computeroutput">filename</code></p></li>
  329. <li style="list-style-type: disc"><p><code class="computeroutput">filename.tbz2</code>
  330. becomes
  331. <code class="computeroutput">filename.tar</code></p></li>
  332. <li style="list-style-type: disc"><p><code class="computeroutput">filename.tbz </code>
  333. becomes
  334. <code class="computeroutput">filename.tar</code></p></li>
  335. <li style="list-style-type: disc"><p><code class="computeroutput">anyothername </code>
  336. becomes
  337. <code class="computeroutput">anyothername.out</code></p></li>
  338. </ul></div>
  339. <p>If the file does not end in one of the recognised endings,
  340. <code class="computeroutput">.bz2</code>,
  341. <code class="computeroutput">.bz</code>,
  342. <code class="computeroutput">.tbz2</code> or
  343. <code class="computeroutput">.tbz</code>,
  344. <code class="computeroutput">bzip2</code> complains that it cannot
  345. guess the name of the original file, and uses the original name
  346. with <code class="computeroutput">.out</code> appended.</p>
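<p>The renaming rules above can be sketched in a few lines of Python. This is
only an illustration of the table, not the code
<code class="computeroutput">bzip2</code> itself uses; the names
<code class="computeroutput">SUFFIX_RULES</code> and
<code class="computeroutput">guess_output_name</code> are invented for the
example.</p>

```python
# Suffix rules as listed in the manual, tried in this order.
SUFFIX_RULES = [
    (".bz2", ""),
    (".bz", ""),
    (".tbz2", ".tar"),
    (".tbz", ".tar"),
]


def guess_output_name(name):
    """Return the decompressed file name implied by the rules above."""
    for suffix, replacement in SUFFIX_RULES:
        if name.endswith(suffix):
            return name[: -len(suffix)] + replacement
    # Unrecognised ending: keep the whole name and append ".out".
    return name + ".out"
```

<p>For instance, <code class="computeroutput">guess_output_name("backup.tbz2")</code>
gives <code class="computeroutput">backup.tar</code>, while a name with no
recognised ending simply gains <code class="computeroutput">.out</code>.</p>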
  347. <p>As with compression, supplying no filenames causes
  348. decompression from standard input to standard output.</p>
  349. <p><code class="computeroutput">bunzip2</code> will correctly
  350. decompress a file which is the concatenation of two or more
  351. compressed files. The result is the concatenation of the
  352. corresponding uncompressed files. Integrity testing
  353. (<code class="computeroutput">-t</code>) of concatenated compressed
  354. files is also supported.</p>
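<p>The same property can be observed from Python, whose standard
<code class="computeroutput">bz2</code> module accepts multi-stream input
(Python 3.3 and later). The sketch below joins two independently compressed
streams, much as <code class="computeroutput">cat a.bz2 b.bz2</code> would, and
decompresses the result; the sample contents are made up.</p>

```python
import bz2

first = b"first file contents\n"
second = b"second file contents\n"

# Two independently compressed streams, joined byte for byte.
joined = bz2.compress(first) + bz2.compress(second)

# Decompressing the concatenation yields the concatenated originals.
restored = bz2.decompress(joined)
```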
  355. <p>You can also compress or decompress files to the standard
  356. output by giving the <code class="computeroutput">-c</code> flag.
  357. Multiple files may be compressed and decompressed like this. The
  358. resulting outputs are fed sequentially to stdout. Compression of
  359. multiple files in this manner generates a stream containing
  360. multiple compressed file representations. Such a stream can be
  361. decompressed correctly only by
  362. <code class="computeroutput">bzip2</code> version 0.9.0 or later.
  363. Earlier versions of <code class="computeroutput">bzip2</code> will
  364. stop after decompressing the first file in the stream.</p>
  365. <p><code class="computeroutput">bzcat</code> (or
  366. <code class="computeroutput">bzip2 -dc</code>) decompresses all
  367. specified files to the standard output.</p>
  368. <p><code class="computeroutput">bzip2</code> will read arguments
  369. from the environment variables
  370. <code class="computeroutput">BZIP2</code> and
  371. <code class="computeroutput">BZIP</code>, in that order, and will
  372. process them before any arguments read from the command line.
  373. This gives a convenient way to supply default arguments.</p>
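<p>As a rough model of that ordering, the following Python sketch builds the
argument list in the sequence described: <code class="computeroutput">BZIP2</code>,
then <code class="computeroutput">BZIP</code>, then the command line. The
function name <code class="computeroutput">effective_args</code> is
hypothetical, and splitting the variables on whitespace is an assumption made
for the example.</p>

```python
def effective_args(argv, environ):
    """Model the argument order: BZIP2, then BZIP, then the command line."""
    args = []
    for var in ("BZIP2", "BZIP"):
        # Assumption: the variable holds whitespace-separated arguments.
        args.extend(environ.get(var, "").split())
    args.extend(argv)
    return args
```

<p>With <code class="computeroutput">BZIP2="-9 -v"</code> in the environment,
a command line of <code class="computeroutput">-k file</code> is therefore
treated like <code class="computeroutput">-9 -v -k file</code>.</p>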
  374. <p>Compression is always performed, even if the compressed
  375. file is slightly larger than the original. Files of less than
  376. about one hundred bytes tend to get larger, since the compression
  377. mechanism has a constant overhead in the region of 50 bytes.
  378. Random data (including the output of most file compressors) is
  379. coded at about 8.05 bits per byte, giving an expansion of around
  380. 0.5%.</p>
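<p>These effects are easy to observe with Python's standard
<code class="computeroutput">bz2</code> module, which wraps the same library.
The sketch below compares a tiny input, pseudo-random bytes and repetitive
text; the sample data and the helper name
<code class="computeroutput">size_ratio</code> are invented for the example,
and exact figures will vary with the input.</p>

```python
import bz2
import random


def size_ratio(data):
    """Compressed size divided by original size (above 1.0 means expansion)."""
    return len(bz2.compress(data)) / len(data)


tiny = b"hello, world"  # 12 bytes, well under one hundred
rng = random.Random(0)  # fixed seed, only so that runs are repeatable
noise = bytes(rng.getrandbits(8) for _ in range(200_000))
text = b"the quick brown fox jumps over the lazy dog\n" * 2000

for label, sample in (("tiny", tiny), ("random", noise), ("text", text)):
    print(f"{label:>6}: {len(sample)} bytes, ratio {size_ratio(sample):.3f}")
```

<p>The tiny and random samples should come out slightly larger than they went
in, while the repetitive text shrinks to a small fraction of its size.</p>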
  381. <p>As a self-check for your protection,
  382. <code class="computeroutput">bzip2</code> uses 32-bit CRCs to make
  383. sure that the decompressed version of a file is identical to the
  384. original. This guards against corruption of the compressed data,
  385. and against undetected bugs in
  386. <code class="computeroutput">bzip2</code> (hopefully very unlikely).
  387. The chance of data corruption going undetected is microscopic,
  388. about one chance in four billion for each file processed. Be
  389. aware, though, that the check occurs upon decompression, so it
  390. can only tell you that something is wrong. It can't help you
  391. recover the original uncompressed data. You can use
  392. <code class="computeroutput">bzip2recover</code> to try to recover
  393. data from damaged files.</p>
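<p>To see the self-check at work, the following Python sketch flips a single
bit in the middle of a compressed stream and tries to decompress it. It uses
the standard <code class="computeroutput">bz2</code> module; the helper
<code class="computeroutput">decompresses_cleanly</code> and the sample data
are made up for the example. The error reported may come from the CRC
comparison or from the decoder losing its place earlier on; either way the
damage does not pass silently.</p>

```python
import bz2
import random


def decompresses_cleanly(blob):
    """Return True if blob decompresses without an error being reported."""
    try:
        bz2.decompress(blob)
    except (OSError, ValueError, EOFError):
        # Python's bz2 module signals bad or truncated data with these types.
        return False
    return True


rng = random.Random(1)  # fixed seed, only so that runs are repeatable
original = bytes(rng.choice(b"abcdefghijklmnop") for _ in range(20_000))
intact = bz2.compress(original)

damaged = bytearray(intact)
damaged[len(damaged) // 2] ^= 0x01  # flip one bit in the middle of the stream
damaged = bytes(damaged)
```

<p>Here <code class="computeroutput">decompresses_cleanly(intact)</code> is
true and <code class="computeroutput">decompresses_cleanly(damaged)</code> is
false. Note that the check only reports the problem; it does nothing to repair
the data.</p>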
  394. <p>Return values: 0 for a normal exit, 1 for environmental
  395. problems (file not found, invalid flags, I/O errors, etc.), 2
  396. to indicate a corrupt compressed file, 3 for an internal
  397. consistency error (e.g., a bug) which caused
  398. <code class="computeroutput">bzip2</code> to panic.</p>
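<p>A script that runs <code class="computeroutput">bzip2</code> may want to
turn these statuses into messages. The table below is a minimal sketch in
Python; <code class="computeroutput">EXIT_MEANINGS</code> and
<code class="computeroutput">describe_exit</code> are names invented for the
example, and the wording of the messages simply follows the paragraph
above.</p>

```python
# Exit statuses and meanings, as listed in the paragraph above.
EXIT_MEANINGS = {
    0: "normal exit",
    1: "environmental problem (file not found, invalid flags, I/O error, etc.)",
    2: "corrupt compressed file",
    3: "internal consistency error (bzip2 panicked)",
}


def describe_exit(code):
    """Translate a bzip2 exit status into the meaning given in the manual."""
    return EXIT_MEANINGS.get(code, f"undocumented exit status {code}")
```

<p>The status itself would typically come from something like
<code class="computeroutput">subprocess.run(["bzip2", "-t", name]).returncode</code>.</p>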
  399. </div>
  400. <div class="sect1" lang="en">
  401. <div class="titlepage"><div><div><h2 class="title" style="clear: both">
  402. <a name="options"></a>2.4. OPTIONS</h2></div></div></div>
  403. <div class="variablelist"><dl>
  404. <dt><span class="term"><code class="computeroutput">-c --stdout</code></span></dt>
  405. <dd><p>Compress or decompress to standard
  406. output.</p></dd>
  407. <dt><span class="term"><code class="computeroutput">-d --decompress</code></span></dt>
  408. <dd><p>Force decompression.
  409. <code class="computeroutput">bzip2</code>,
  410. <code class="computeroutput">bunzip2</code> and
  411. <code class="computeroutput">bzcat</code> are really the same
  412. program, and the decision about what actions to take is made on
  413. the basis of which name is used. This flag overrides that
  414. mechanism, and forces bzip2 to decompress.</p></dd>
  415. <dt><span class="term"><code class="computeroutput">-z --compress</code></span></dt>
  416. <dd><p>The complement to
  417. <code class="computeroutput">-d</code>: forces compression,
  418. regardless of the invocation name.</p></dd>
  419. <dt><span class="term"><code class="computeroutput">-t --test</code></span></dt>
  420. <dd><p>Check integrity of the specified file(s), but
  421. don't decompress them. This really performs a trial
  422. decompression and throws away the result.</p></dd>
  423. <dt><span class="term"><code class="computeroutput">-f --force</code></span></dt>
  424. <dd>
  425. <p>Force overwrite of output files. Normally,
  426. <code class="computeroutput">bzip2</code> will not overwrite
  427. existing output files. Also forces
  428. <code class="computeroutput">bzip2</code> to break hard links to
  429. files, which it otherwise wouldn't do.</p>
  430. <p><code class="computeroutput">bzip2</code> normally declines
  431. to decompress files which don't have the correct magic header
  432. bytes. If forced (<code class="computeroutput">-f</code>),
  433. however, it will pass such files through unmodified. This is
  434. how GNU <code class="computeroutput">gzip</code> behaves.</p>
  435. </dd>
  436. <dt><span class="term"><code class="computeroutput">-k --keep</code></span></dt>
  437. <dd><p>Keep (don't delete) input files during
  438. compression or decompression.</p></dd>
  439. <dt><span class="term"><code class="computeroutput">-s --small</code></span></dt>
  440. <dd>
  441. <p>Reduce memory usage, for compression,
  442. decompression and testing. Files are decompressed and tested
  443. using a modified algorithm which only requires 2.5 bytes per
  444. block byte. This means any file can be decompressed in 2300k
  445. of memory, albeit at about half the normal speed.</p>
  446. <p>During compression, <code class="computeroutput">-s</code>
  447. selects a block size of 200k, which limits memory use to around
  448. the same figure, at the expense of your compression ratio. In
  449. short, if your machine is low on memory (8 megabytes or less),
  450. use <code class="computeroutput">-s</code> for everything. See
  451. <a href="#memory-management">MEMORY MANAGEMENT</a> below.</p>
  452. </dd>
  453. <dt><span class="term"><code class="computeroutput">-q --quiet</code></span></dt>
  454. <dd><p>Suppress non-essential warning messages.
  455. Messages pertaining to I/O errors and other critical events
  456. will not be suppressed.</p></dd>
  457. <dt><span class="term"><code class="computeroutput">-v --verbose</code></span></dt>
  458. <dd><p>Verbose mode -- show the compression ratio for
  459. each file processed. Further
  460. <code class="computeroutput">-v</code>'s increase the verbosity
  461. level, spewing out lots of information which is primarily of
  462. interest for diagnostic purposes.</p></dd>
  463. <dt><span class="term"><code class="computeroutput">-L --license -V --version</code></span></dt>
  464. <dd><p>Display the software version, license terms and
  465. conditions.</p></dd>
  466. <dt><span class="term"><code class="computeroutput">-1</code> (or
  467. <code class="computeroutput">--fast</code>) to
  468. <code class="computeroutput">-9</code> (or
  469. <code class="computeroutput">--best</code>)</span></dt>
  470. <dd><p>Set the block size to 100 k, 200 k ... 900 k
  471. when compressing. Has no effect when decompressing. See <a href="#memory-management">MEMORY MANAGEMENT</a> below. The
  472. <code class="computeroutput">--fast</code> and
  473. <code class="computeroutput">--best</code> aliases are primarily
  474. for GNU <code class="computeroutput">gzip</code> compatibility.
  475. In particular, <code class="computeroutput">--fast</code> doesn't
  476. make things significantly faster. And
  477. <code class="computeroutput">--best</code> merely selects the
  478. default behaviour.</p></dd>
  479. <dt><span class="term"><code class="computeroutput">--</code></span></dt>
  480. <dd><p>Treats all subsequent arguments as file names,
  481. even if they start with a dash. This is so you can handle
  482. files with names beginning with a dash, for example:
  483. <code class="computeroutput">bzip2 --
  484. -myfilename</code>.</p></dd>
  485. <dt>
  486. <span class="term"><code class="computeroutput">--repetitive-fast</code>, </span><span class="term"><code class="computeroutput">--repetitive-best</code></span>
  487. </dt>
  488. <dd><p>These flags are redundant in versions 0.9.5 and
  489. above. They provided some coarse control over the behaviour of
  490. the sorting algorithm in earlier versions, which was sometimes
  491. useful. 0.9.5 and above have an improved algorithm which
  492. renders these flags irrelevant.</p></dd>
  493. </dl></div>
  494. </div>
  495. <div class="sect1" lang="en">
  496. <div class="titlepage"><div><div><h2 class="title" style="clear: both">
  497. <a name="memory-management"></a>2.5. MEMORY MANAGEMENT</h2></div></div></div>
  498. <p><code class="computeroutput">bzip2</code> compresses large
  499. files in blocks. The block size affects both the compression
  500. ratio achieved, and the amount of memory needed for compression
  501. and decompression. The flags <code class="computeroutput">-1</code>
  502. through <code class="computeroutput">-9</code> specify the block
  503. size to be 100,000 bytes through 900,000 bytes (the default)
  504. respectively. At decompression time, the block size used for
  505. compression is read from the header of the compressed file, and
  506. <code class="computeroutput">bunzip2</code> then allocates itself
  507. just enough memory to decompress the file. Since block sizes are
  508. stored in compressed files, it follows that the flags
  509. <code class="computeroutput">-1</code> to
  510. <code class="computeroutput">-9</code> are irrelevant to and so
  511. ignored during decompression.</p>
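<p>The recorded block size is easy to inspect. A
<code class="computeroutput">.bz2</code> stream begins with the bytes
<code class="computeroutput">BZh</code> followed by an ASCII digit from 1 to 9,
which is the digit of the flag used when compressing. The Python sketch below
reads it back; the function name
<code class="computeroutput">block_size_from_header</code> is invented for the
example.</p>

```python
import bz2


def block_size_from_header(blob):
    """Return the block size in bytes recorded in a .bz2 stream header."""
    if blob[:3] != b"BZh" or not b"1" <= blob[3:4] <= b"9":
        raise ValueError("not a bzip2 stream header")
    # The fourth byte is the ASCII digit matching the flag -1 .. -9.
    return (blob[3] - ord("0")) * 100_000


sample = bz2.compress(b"any data at all", compresslevel=3)
print(block_size_from_header(sample))
```

<p>Because the digit travels with the file, a decompressor never needs to be
told which of <code class="computeroutput">-1</code> to
<code class="computeroutput">-9</code> was used.</p>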
  512. <p>Compression and decompression requirements, in bytes, can be
  513. estimated as:</p>
  514. <pre class="programlisting">Compression:   400k + ( 8 x block size )
  515. Decompression: 100k + ( 4 x block size ), or
  516.                100k + ( 2.5 x block size )</pre>
  517. <p>Larger block sizes give rapidly diminishing marginal
  518. returns. Most of the compression comes from the first two or
  519. three hundred k of block size, a fact worth bearing in mind when
  520. using <code class="computeroutput">bzip2</code> on small machines.
  521. It is also important to appreciate that the decompression memory
  522. requirement is set at compression time by the choice of block
  523. size.</p>
  524. <p>For files compressed with the default 900k block size,
  525. <code class="computeroutput">bunzip2</code> will require about 3700
  526. kbytes to decompress. To support decompression of any file on a
  527. 4 megabyte machine, <code class="computeroutput">bunzip2</code> has
  528. an option to decompress using approximately half this amount of
  529. memory, about 2300 kbytes. Decompression speed is also halved,
  530. so you should use this option only where necessary. The relevant
  531. flag is <code class="computeroutput">-s</code>.</p>
  532. <p>In general, try to use the largest block size memory
  533. constraints allow, since that maximises the compression achieved.
  534. Compression and decompression speed are virtually unaffected by
  535. block size.</p>
  536. <p>Another significant point applies to files which fit in a
  537. single block -- that means most files you'd encounter using a
  538. large block size. The amount of real memory touched is
  539. proportional to the size of the file, since the file is smaller
  540. than a block. For example, compressing a file 20,000 bytes long
  541. with the flag <code class="computeroutput">-9</code> will cause the
  542. compressor to allocate around 7600k of memory, but only touch
  543. 400k + 20000 * 8 = 560 kbytes of it. Similarly, the decompressor
  544. will allocate 3700k but only touch 100k + 20000 * 4 = 180
  545. kbytes.</p>
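<p>The arithmetic in the example above can be reproduced with a few
lines of C. This is only an illustrative sketch: the helper names
(<code class="computeroutput">block_bytes</code>,
<code class="computeroutput">bytes_in_block</code>,
<code class="computeroutput">compress_touched</code>,
<code class="computeroutput">decompress_touched</code>) are hypothetical
and not part of <code class="computeroutput">bzip2</code>. It assumes
that "k" means 1000 bytes and that each flag level selects 100,000 bytes
of block size, which is consistent with the figures quoted in this
section.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Assumed block size in bytes for flags -1 .. -9: 100,000 bytes per level. */
size_t block_bytes(int level)
{
    assert(level >= 1 && level <= 9);
    return (size_t)level * 100000u;
}

/* Bytes of data held in one block: a file shorter than the block size
   only fills part of the block. */
size_t bytes_in_block(size_t file_bytes, int level)
{
    size_t blk = block_bytes(level);
    return file_bytes < blk ? file_bytes : blk;
}

/* Estimated memory touched while compressing: 400k + 8 x bytes in block. */
size_t compress_touched(size_t file_bytes, int level)
{
    return 400000u + 8u * bytes_in_block(file_bytes, level);
}

/* Estimated memory touched while decompressing: 100k + 4 x bytes in block. */
size_t decompress_touched(size_t file_bytes, int level)
{
    return 100000u + 4u * bytes_in_block(file_bytes, level);
}
```

<p>For a 20,000 byte file and flag
<code class="computeroutput">-9</code>, these helpers give 560,000 and
180,000 bytes respectively. For any file of at least 900,000 bytes they
give the full 7600k and 3700k.</p>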
  546. <p>Here is a table which summarises the maximum memory usage
  547. for different block sizes. Also recorded is the total compressed
  548. size for 14 files of the Calgary Text Compression Corpus
  549. totalling 3,141,622 bytes. This column gives some feel for how
  550. compression varies with block size. These figures tend to
  551. understate the advantage of larger block sizes for larger files,
  552. since the Corpus is dominated by smaller files.</p>
<pre class="programlisting">        Compress   Decompress   Decompress   Corpus
 Flag     usage      usage       -s usage     Size

  -1      1200k       500k         350k      914704
  -2      2000k       900k         600k      877703
  -3      2800k      1300k         850k      860338
  -4      3600k      1700k        1100k      846899
  -5      4400k      2100k        1350k      845160
  -6      5200k      2500k        1600k      838626
  -7      6100k      2900k        1850k      834096
  -8      6800k      3300k        2100k      828642
  -9      7600k      3700k        2350k      828642</pre>
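<p>The memory columns of the table follow from the estimates given
earlier in this section. The sketch below recomputes them in kbytes. The
type <code class="computeroutput">struct mem_row</code> and the function
<code class="computeroutput">mem_estimate</code> are hypothetical names
used only for illustration, and the block size is assumed to be the flag
level times 100k. Note that for
<code class="computeroutput">-7</code> the formula yields 6000k for
compression, whereas the table lists 6100k.</p>

```c
#include <assert.h>

struct mem_row {
    int compress_k;      /* 400k + 8   x block size */
    int decompress_k;    /* 100k + 4   x block size */
    int decompress_s_k;  /* 100k + 2.5 x block size (the -s mode) */
};

/* Estimated maximum memory use, in kbytes, for flag -level,
   assuming a block size of level x 100k. */
struct mem_row mem_estimate(int level)
{
    struct mem_row r;
    assert(level >= 1 && level <= 9);
    r.compress_k     = 400 + 8 * 100 * level;
    r.decompress_k   = 100 + 4 * 100 * level;
    r.decompress_s_k = 100 + 250 * level;   /* 2.5 x 100k = 250k per level */
    return r;
}
```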
  564. </div>
  565. <div class="sect1" lang="en">
  566. <div class="titlepage"><div><div><h2 class="title" style="clear: both">
  567. <a name="recovering"></a>2.6. RECOVERING DATA FROM DAMAGED FILES</h2></div></div></div>
  568. <p><code class="computeroutput">bzip2</code> compresses files in
blocks, usually 900 kbytes long. Each block is handled
  570. independently. If a media or transmission error causes a
  571. multi-block <code class="computeroutput">.bz2</code> file to become
  572. damaged, it may be possible to recover data from the undamaged
  573. blocks in the file.</p>
  574. <p>The compressed representation of each block is delimited by
  575. a 48-bit pattern, which makes it possible to find the block
  576. boundaries with reasonable certainty. Each block also carries
  577. its own 32-bit CRC, so damaged blocks can be distinguished from
  578. undamaged ones.</p>
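<p>To illustrate how a 48-bit delimiter can be located, here is a
minimal sketch of a bit-by-bit scan. It is not the code of
<code class="computeroutput">bzip2recover</code>. The function name
<code class="computeroutput">find_block_starts</code> is hypothetical,
and the pattern value <code class="computeroutput">0x314159265359</code>
and the most-significant-bit-first ordering are assumptions about the
file format that are not stated in this section. Because blocks need not
start on a byte boundary, the scan keeps a 48-bit sliding window and
checks it after every bit.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed block-start pattern and a mask for the low 48 bits. */
#define BLOCK_MAGIC UINT64_C(0x314159265359)
#define MASK48      UINT64_C(0xFFFFFFFFFFFF)

/* Scan buf bit by bit (most significant bit of each byte first).
   For each occurrence of the pattern, store the bit offset at which it
   begins in starts[] (at most max_starts entries are stored).
   Returns the total number of occurrences found. */
size_t find_block_starts(const unsigned char *buf, size_t len,
                         uint64_t *starts, size_t max_starts)
{
    uint64_t window = 0;   /* the last 48 bits seen */
    uint64_t bitpos = 0;   /* number of bits consumed so far */
    size_t found = 0;
    size_t i;
    int b;

    assert(buf != NULL || len == 0);

    for (i = 0; i < len; i++) {
        for (b = 7; b >= 0; b--) {
            window = ((window << 1) | ((buf[i] >> b) & 1u)) & MASK48;
            bitpos++;
            if (bitpos >= 48 && window == BLOCK_MAGIC) {
                if (found < max_starts)
                    starts[found] = bitpos - 48;
                found++;
            }
        }
    }
    return found;
}
```

<p>Bit positions are held in 64-bit integers so that the offsets do not
overflow on large files. A real recovery tool would go on to check each
candidate block's CRC, since a 48-bit match can in principle occur by
chance inside compressed data.</p>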
  579. <p><code class="computeroutput">bzip2recover</code> is a simple
  580. program whose purpose is to search for blocks in
  581. <code class="computeroutput">.bz2</code> files, and write each block
  582. out into its own <code class="computeroutput">.bz2</code> file. You
  583. can then use <code class="computeroutput">bzip2 -t</code> to test
  584. the integrity of the resulting files, and decompress those which
  585. are undamaged.</p>
  586. <p><code class="computeroutput">bzip2recover</code> takes a
  587. single argument, the name of the damaged file, and writes a
  588. number of files <code class="computeroutput">rec0001file.bz2</code>,
  589. <code class="computeroutput">rec0002file.bz2</code>, etc, containing
  590. the extracted blocks. The output filenames are designed so that
  591. the use of wildcards in subsequent processing -- for example,
  592. <code class="computeroutput">bzip2 -dc rec*file.bz2 &gt;
  593. recovered_data</code> -- lists the files in the correct
  594. order.</p>
  595. <p><code class="computeroutput">bzip2recover</code> should be of
  596. most use dealing with large <code class="computeroutput">.bz2</code>
  597. files, as these will contain many blocks. It is clearly futile
  598. to use it on damaged single-block files, since a damaged block
  599. cannot be recovered. If you wish to minimise any potential data
  600. loss through media or transmission errors, you might consider
  601. compressing with a smaller block size.</p>
  602. </div>
  603. <div class="sect1" lang="en">
  604. <div class="titlepage"><div><div><h2 class="title" style="clear: both">
  605. <a name="performance"></a>2.7. PERFORMANCE NOTES</h2></div></div></div>
  606. <p>The sorting phase of compression gathers together similar
  607. strings in the file. Because of this, files containing very long
  608. runs of repeated symbols, like "aabaabaabaab ..." (repeated
  609. several hundred times) may compress more slowly than normal.
  610. Versions 0.9.5 and above fare much better than previous versions
  611. in this respect. The ratio between worst-case and average-case
  612. compression time is in the region of 10:1. For previous
  613. versions, this figure was more like 100:1. You can use the
  614. <code class="computeroutput">-vvvv</code> option to monitor progress
  615. in great detail, if you want.</p>
  616. <p>Decompression speed is unaffected by these
  617. phenomena.</p>
  618. <p><code class="computeroutput">bzip2</code> usually allocates
  619. several megabytes of memory to operate in, and then charges all
  620. over it in a fairly random fashion. This means that performance,
  621. both for compressing and decompressing, is largely determined by
  622. the speed at which your machine can service cache misses.
  623. Because of this, small changes to the code to reduce the miss
  624. rate have been observed to give disproportionately large
  625. performance improvements. I imagine
  626. <code class="computeroutput">bzip2</code> will perform best on
  627. machines with very large caches.</p>
  628. </div>
  629. <div class="sect1" lang="en">
  630. <div class="titlepage"><div><div><h2 class="title" style="clear: both">
  631. <a name="caveats"></a>2.8. CAVEATS</h2></div></div></div>
  632. <p>I/O error messages are not as helpful as they could be.
  633. <code class="computeroutput">bzip2</code> tries hard to detect I/O
  634. errors and exit cleanly, but the details of what the problem is
  635. sometimes seem rather misleading.</p>
  636. <p>This manual page pertains to version 1.0.5 of
  637. <code class="computeroutput">bzip2</code>. Compressed data created by
  638. this version is entirely forwards and backwards compatible with the
previous public releases, versions 0.1pl2, 0.9.0, 0.9.5, 1.0.0,
  640. 1.0.1, 1.0.2 and 1.0.3, but with the following exception: 0.9.0 and
  641. above can correctly decompress multiple concatenated compressed files.
  642. 0.1pl2 cannot do this; it will stop after decompressing just the first
  643. file in the stream.</p>
<p><code class="computeroutput">bzip2recover</code> versions
prior to 1.0.2 used 32-bit integers to represent bit positions in
compressed files, so they could not handle compressed files more
than 512 megabytes long (2^32 bits is 512 megabytes). Versions 1.0.2
and above use 64-bit ints
on some platforms which support them (GNU supported targets, and
Windows). To establish whether or not
  650. <code class="computeroutput">bzip2recover</code> was built with such
  651. a limitation, run it without arguments. In any event you can
  652. build yourself an unlimited version if you can recompile it with
  653. <code class="computeroutput">MaybeUInt64</code> set to be an
  654. unsigned 64-bit integer.</p>
  655. </div>
  656. <div class="sect1" lang="en">
  657. <div class="titlepage"><div><div><h2 class="title" style="clear: both">
  658. <a name="author"></a>2.9. AUTHOR</h2></div></div></div>
  659. <p>Julian Seward,
  660. <code class="computeroutput">jseward@bzip.org</code></p>
  661. <p>The ideas embodied in
  662. <code class="computeroutput">bzip2</code> are due to (at least) the
  663. following people: Michael Burrows and David Wheeler (for the
  664. block sorting transformation), David Wheeler (again, for the
  665. Huffman coder), Peter Fenwick (for the structured coding model in
  666. the original <code class="computeroutput">bzip</code>, and many
  667. refinements), and Alistair Moffat, Radford Neal and Ian Witten
  668. (for the arithmetic coder in the original
  669. <code class="computeroutput">bzip</code>). I am much indebted for
  670. their help, support and advice. See the manual in the source
  671. distribution for pointers to sources of documentation. Christian
  672. von Roques encouraged me to look for faster sorting algorithms,
  673. so as to speed up compression. Bela Lubkin encouraged me to
  674. improve the worst-case compression performance.
  675. Donna Robinson XMLised the documentation.
  676. Many people sent
  677. patches, helped with portability problems, lent machines, gave
  678. advice and were generally helpful.</p>
  679. </div>
  680. </div>
  681. <div class="chapter" lang="en">
  682. <div class="titlepage"><div><div><h2 class="title">
  683. <a name="libprog"></a>3. 
  684. Programming with <code class="computeroutput">libbzip2</code>
  685. </h2></div></div></div>
  686. <div class="toc">
  687. <p><b>Table of Contents</b></p>
  688. <dl>
  689. <dt><span class="sect1"><a href="#top-level">3.1. Top-level structure</a></span></dt>
  690. <dd><dl>
  691. <dt><span class="sect2"><a href="#ll-summary">3.1.1. Low-level summary</a></span></dt>
  692. <dt><span class="sect2"><a href="#hl-summary">3.1.2. High-level summary</a></span></dt>
  693. <dt><span class="sect2"><a href="#util-fns-summary">3.1.3. Utility functions summary</a></span></dt>
  694. </dl></dd>
  695. <dt><span class="sect1"><a href="#err-handling">3.2. Error handling</a></span></dt>
  696. <dt><span class="sect1"><a href="#low-level">3.3. Low-level interface</a></span></dt>
  697. <dd><dl>
  698. <dt><span class="sect2"><a href="#bzcompress-init">3.3.1. <code class="computeroutput">BZ2_bzCompressInit</code></a></span></dt>
  699. <dt><span class="sect2"><a href="#bzCompress">3.3.2. <code class="computeroutput">BZ2_bzCompress</code></a></span></dt>
  700. <dt><span class="sect2"><a href="#bzCompress-end">3.3.3. <code class="computeroutput">BZ2_bzCompressEnd</code></a></span></dt>
  701. <dt><span class="sect2"><a href="#bzDecompress-init">3.3.4. <code class="computeroutput">BZ2_bzDecompressInit</code></a></span></dt>
  702. <dt><span class="sect2"><a href="#bzDecompress">3.3.5. <code class="computeroutput">BZ2_bzDecompress</code></a></span></dt>
  703. <dt><span class="sect2"><a href="#bzDecompress-end">3.3.6. <code class="computeroutput">BZ2_bzDecompressEnd</code></a></span></dt>
  704. </dl></dd>
  705. <dt><span class="sect1"><a href="#hl-interface">3.4. High-level interface</a></span></dt>
  706. <dd><dl>
  707. <dt><span class="sect2"><a href="#bzreadopen">3.4.1. <code class="computeroutput">BZ2_bzReadOpen</code></a></span></dt>
  708. <dt><span class="sect2"><a href="#bzread">3.4.2. <code class="computeroutput">BZ2_bzRead</code></a></span></dt>
  709. <dt><span class="sect2"><a href="#bzreadgetunused">3.4.3. <code class="computeroutput">BZ2_bzReadGetUnused</code></a></span></dt>
  710. <dt><span class="sect2"><a href="#bzreadclose">3.4.4. <code class="computeroutput">BZ2_bzReadClose</code></a></span></dt>
  711. <dt><span class="sect2"><a href="#bzwriteopen">3.4.5. <code class="computeroutput">BZ2_bzWriteOpen</code></a></span></dt>
  712. <dt><span class="sect2"><a href="#bzwrite">3.4.6. <code class="computeroutput">BZ2_bzWrite</code></a></span></dt>
  713. <dt><span class="sect2"><a href="#bzwriteclose">3.4.7. <code class="computeroutput">BZ2_bzWriteClose</code></a></span></dt>
  714. <dt><span class="sect2"><a href="#embed">3.4.8. Handling embedded compressed data streams</a></span></dt>
  715. <dt><span class="sect2"><a href="#std-rdwr">3.4.9. Standard file-reading/writing code</a></span></dt>
  716. </dl></dd>
  717. <dt><span class="sect1"><a href="#util-fns">3.5. Utility functions</a></span></dt>
  718. <dd><dl>
  719. <dt><span class="sect2"><a href="#bzbufftobuffcompress">3.5.1. <code class="computeroutput">BZ2_bzBuffToBuffCompress</code></a></span></dt>
  720. <dt><span class="sect2"><a href="#bzbufftobuffdecompress">3.5.2. <code class="computeroutput">BZ2_bzBuffToBuffDecompress</code></a></span></dt>
  721. </dl></dd>
  722. <dt><span class="sect1"><a href="#zlib-compat">3.6. <code class="computeroutput">zlib</code> compatibility functions</a></span></dt>
  723. <dt><span class="sect1"><a href="#stdio-free">3.7. Using the library in a <code class="computeroutput">stdio</code>-free environment</a></span></dt>
  724. <dd><dl>
  725. <dt><span class="sect2"><a href="#stdio-bye">3.7.1. Getting rid of <code class="computeroutput">stdio</code></a></span></dt>
  726. <dt><span class="sect2"><a href="#critical-error">3.7.2. Critical error handling</a></span></dt>
  727. </dl></dd>
  728. <dt><span class="sect1"><a href="#win-dll">3.8. Making a Windows DLL</a></span></dt>
  729. </dl>
  730. </div>
  731. <p>This chapter describes the programming interface to
  732. <code class="computeroutput">libbzip2</code>.</p>
  733. <p>For general background information, particularly about
  734. memory use and performance aspects, you'd be well advised to read
  735. <a href="#using">How to use bzip2</a> as well.</p>
  736. <div class="sect1" lang="en">
  737. <div class="titlepage"><div><div><h2 class="title" style="clear: both">
  738. <a name="top-level"></a>3.1. Top-level structure</h2></div></div></div>
  739. <p><code class="computeroutput">libbzip2</code> is a flexible
  740. library for compressing and decompressing data in the
  741. <code class="computeroutput">bzip2</code> data format. Although
  742. packaged as a single entity, it helps to regard the library as
three separate parts: the low-level interface, the high-level
interface, and some utility functions.</p>
  745. <p>The structure of
  746. <code class="computeroutput">libbzip2</code>'s interfaces is similar
  747. to that of Jean-loup Gailly's and Mark Adler's excellent
  748. <code class="computeroutput">zlib</code> library.</p>
  749. <p>All externally visible symbols have names beginning
  750. <code class="computeroutput">BZ2_</code>. This is new in version
  751. 1.0. The intention is to minimise pollution of the namespaces of
  752. library clients.</p>
  753. <p>To use any part of the library, you need to
  754. <code class="computeroutput">#include &lt;bzlib.h&gt;</code>
  755. into your sources.</p>
  756. <div class="sect2" lang="en">
  757. <div class="titlepage"><div><div><h3 class="title">
  758. <a name="ll-summary"></a>3.1.1. Low-level summary</h3></div></div></div>
  759. <p>This interface provides services for compressing and
  760. decompressing data in memory. There's no provision for dealing
  761. with files, streams or any other I/O mechanisms, just straight
  762. memory-to-memory work. In fact, this part of the library can be
  763. compiled without inclusion of
  764. <code class="computeroutput">stdio.h</code>, which may be helpful
  765. for embedded applications.</p>
  766. <p>The low-level part of the library has no global variables
  767. and is therefore thread-safe.</p>
  768. <p>Six routines make up the low level interface:
  769. <code class="computeroutput">BZ2_bzCompressInit</code>,
  770. <code class="computeroutput">BZ2_bzCompress</code>, and
  771. <code class="computeroutput">BZ2_bzCompressEnd</code> for
  772. compression, and a corresponding trio
  773. <code class="computeroutput">BZ2_bzDecompressInit</code>,
  774. <code class="computeroutput">BZ2_bzDecompress</code> and
  775. <code class="computeroutput">BZ2_bzDecompressEnd</code> for
  776. decompression. The <code class="computeroutput">*Init</code>
  777. functions allocate memory for compression/decompression and do
  778. other initialisations, whilst the
  779. <code class="computeroutput">*End</code> functions close down
  780. operations and release memory.</p>
  781. <p>The real work is done by
  782. <code class="computeroutput">BZ2_bzCompress</code> and
  783. <code class="computeroutput">BZ2_bzDecompress</code>. These
  784. compress and decompress data from a user-supplied input buffer to
  785. a user-supplied output buffer. These buffers can be any size;
  786. arbitrary quantities of data are handled by making repeated calls
  787. to these functions. This is a flexible mechanism allowing a
  788. consumer-pull style of activity, or producer-push, or a mixture
  789. of both.</p>
  790. </div>
  791. <div class="sect2" lang="en">
  792. <div class="titlepage"><div><div><h3 class="title">
  793. <a name="hl-summary"></a>3.1.2. High-level summary</h3></div></div></div>
  794. <p>This interface provides some handy wrappers around the
  795. low-level interface to facilitate reading and writing
  796. <code class="computeroutput">bzip2</code> format files
  797. (<code class="computeroutput">.bz2</code> files). The routines
  798. provide hooks to facilitate reading files in which the
  799. <code class="computeroutput">bzip2</code> data stream is embedded
  800. within some larger-scale file structure, or where there are
  801. multiple <code class="computeroutput">bzip2</code> data streams
  802. concatenated end-to-end.</p>
  803. <p>For reading files,
  804. <code class="computeroutput">BZ2_bzReadOpen</code>,
  805. <code class="computeroutput">BZ2_bzRead</code>,
  806. <code class="computeroutput">BZ2_bzReadClose</code> and
  807. <code class="computeroutput">BZ2_bzReadGetUnused</code> are
  808. supplied. For writing files,
  809. <code class="computeroutput">BZ2_bzWriteOpen</code>,
  810. <code class="computeroutput">BZ2_bzWrite</code> and
<code class="computeroutput">BZ2_bzWriteClose</code> are
  812. available.</p>
  813. <p>As with the low-level library, no global variables are used
  814. so the library is per se thread-safe. However, if I/O errors
  815. occur whilst reading or writing the underlying compressed files,
  816. you may have to consult <code class="computeroutput">errno</code> to
  817. determine the cause of the error. In that case, you'd need a C
  818. library which correctly supports
  819. <code class="computeroutput">errno</code> in a multithreaded
  820. environment.</p>
  821. <p>To make the library a little simpler and more portable,
  822. <code class="computeroutput">BZ2_bzReadOpen</code> and
  823. <code class="computeroutput">BZ2_bzWriteOpen</code> require you to
  824. pass them file handles (<code class="computeroutput">FILE*</code>s)
  825. which have previously been opened for reading or writing
  826. respectively. That avoids portability problems associated with
  827. file operations and file attributes, whilst not being much of an
  828. imposition on the programmer.</p>
  829. </div>
  830. <div class="sect2" lang="en">
  831. <div class="titlepage"><div><div><h3 class="title">
  832. <a name="util-fns-summary"></a>3.1.3. Utility functions summary</h3></div></div></div>
  833. <p>For very simple needs,
  834. <code class="computeroutput">BZ2_bzBuffToBuffCompress</code> and
  835. <code class="computeroutput">BZ2_bzBuffToBuffDecompress</code> are
provided. These compress or decompress data in memory from one
buffer to another in a single function call. You should assess
  838. whether these functions fulfill your memory-to-memory
  839. compression/decompression requirements before investing effort in
  840. understanding the more general but more complex low-level
  841. interface.</p>
  842. <p>Yoshioka Tsuneo
  843. (<code class="computeroutput">tsuneo@rr.iij4u.or.jp</code>) has
  844. contributed some functions to give better
  845. <code class="computeroutput">zlib</code> compatibility. These
  846. functions are <code class="computeroutput">BZ2_bzopen</code>,
  847. <code class="computeroutput">BZ2_bzread</code>,
  848. <code class="computeroutput">BZ2_bzwrite</code>,
  849. <code class="computeroutput">BZ2_bzflush</code>,
  850. <code class="computeroutput">BZ2_bzclose</code>,
  851. <code class="computeroutput">BZ2_bzerror</code> and
  852. <code class="computeroutput">BZ2_bzlibVersion</code>. You may find
  853. these functions more convenient for simple file reading and
  854. writing, than those in the high-level interface. These functions
  855. are not (yet) officially part of the library, and are minimally
  856. documented here. If they break, you get to keep all the pieces.
  857. I hope to document them properly when time permits.</p>
  858. <p>Yoshioka also contributed modifications to allow the
  859. library to be built as a Windows DLL.</p>
  860. </div>
  861. </div>
  862. <div class="sect1" lang="en">
  863. <div class="titlepage"><div><div><h2 class="title" style="clear: both">
  864. <a name="err-handling"></a>3.2. Error handling</h2></div></div></div>
  865. <p>The library is designed to recover cleanly in all
  866. situations, including the worst-case situation of decompressing
  867. random data. I'm not 100% sure that it can always do this, so
  868. you might want to add a signal handler to catch segmentation
  869. violations during decompression if you are feeling especially
  870. paranoid. I would be interested in hearing more about the
  871. robustness of the library to corrupted compressed data.</p>
<p>Version 1.0.3 is more robust in this respect than any
  873. previous version. Investigations with Valgrind (a tool for detecting
  874. problems with memory management) indicate
  875. that, at least for the few files I tested, all single-bit errors
in the compressed data are caught properly, with no
  877. segmentation faults, no uses of uninitialised data, no out of
  878. range reads or writes, and no infinite looping in the decompressor.
  879. So it's certainly pretty robust, although
  880. I wouldn't claim it to be totally bombproof.</p>
  881. <p>The file <code class="computeroutput">bzlib.h</code> contains
  882. all definitions needed to use the library. In particular, you
  883. should definitely not include
  884. <code class="computeroutput">bzlib_private.h</code>.</p>
  885. <p>In <code class="computeroutput">bzlib.h</code>, the various
  886. return values are defined. The following list is not intended as
  887. an exhaustive description of the circumstances in which a given
  888. value may be returned -- those descriptions are given later.
  889. Rather, it is intended to convey the rough meaning of each return
value. The first five values are normal and are not intended to
denote an error situation.</p>
  892. <div class="variablelist"><dl>
  893. <dt><span class="term"><code class="computeroutput">BZ_OK</code></span></dt>
  894. <dd><p>The requested action was completed
  895. successfully.</p></dd>
  896. <dt><span class="term"><code class="computeroutput">BZ_RUN_OK, BZ_FLUSH_OK,
  897. BZ_FINISH_OK</code></span></dt>
  898. <dd><p>In
  899. <code class="computeroutput">BZ2_bzCompress</code>, the requested
  900. flush/finish/nothing-special action was completed
  901. successfully.</p></dd>
  902. <dt><span class="term"><code class="computeroutput">BZ_STREAM_END</code></span></dt>
  903. <dd><p>Compression of data was completed, or the
  904. logical stream end was detected during
  905. decompression.</p></dd>
  906. </dl></div>
  907. <p>The following return values indicate an error of some
  908. kind.</p>
  909. <div class="variablelist"><dl>
  910. <dt><span class="term"><code class="computeroutput">BZ_CONFIG_ERROR</code></span></dt>
  911. <dd><p>Indicates that the library has been improperly
  912. compiled on your platform -- a major configuration error.
  913. Specifically, it means that
  914. <code class="computeroutput">sizeof(char)</code>,
  915. <code class="computeroutput">sizeof(short)</code> and
  916. <code class="computeroutput">sizeof(int)</code> are not 1, 2 and
  917. 4 respectively, as they should be. Note that the library
  918. should still work properly on 64-bit platforms which follow
  919. the LP64 programming model -- that is, where
  920. <code class="computeroutput">sizeof(long)</code> and
  921. <code class="computeroutput">sizeof(void*)</code> are 8. Under
  922. LP64, <code class="computeroutput">sizeof(int)</code> is still 4,
  923. so <code class="computeroutput…