/usr.bin/compress/doc/README

@(#)README 8.1 (Berkeley) 6/9/93
$FreeBSD$

Compress version 4.0 improvements over 3.0:
    o compress() speedup (10-50%) by changing division hash to xor
    o decompress() speedup (5-10%)
    o Memory requirements reduced (3-30%)
    o Stack requirements reduced to less than 4kb
    o Removed 'Big+Fast' compress code (FBITS) because of compress speedup
    o Portability mods for Z8000 and PC/XT (but not zeus 3.2)
    o Default to 'quiet' mode
    o Unification of 'force' flags
    o Manual page overhaul
    o Portability enhancement for M_XENIX
    o Removed text on #else and #endif
    o Added "-V" switch to print version and options
    o Added #defines for SIGNED_COMPARE_SLOW
    o Added Makefile and "usermem" program
    o Removed all floating point computations
    o New programs: [deleted]
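
As a rough illustration of the first item in the list above, the two hashing
styles can be contrasted as follows.  This is only a sketch, not the code from
compress.c: the function names, the 69001-entry table size and the exact form
of the division hash are assumptions made for the example.

```c
/* Sketch only: names and table size are assumptions, not taken from compress.c. */
#define HSIZE 69001L	/* assumed hash table size for 16-bit codes */

/* Division-style hash: combine character and prefix code, then reduce
 * modulo the table size.  Costs one division per lookup. */
long hash_div(int c, long ent, int maxbits)
{
	return (((long)c << maxbits) + ent) % HSIZE;
}

/* Shift count chosen so that the xor hash of an 8-bit character and a
 * prefix code stays inside a table of hsize entries (assuming hsize is
 * larger than the number of codes). */
int hash_shift(long hsize)
{
	int shift = 0;
	long n;

	for (n = hsize; n < 65536L; n *= 2L)
		shift++;
	return 8 - shift;
}

/* Xor-style hash: one shift and one xor, no division. */
long hash_xor(int c, long ent, int shift)
{
	return ((long)c << shift) ^ ent;
}
```

With a 69001-entry table the shift works out to 8, so hashing character 0x41
with prefix code 0x102 yields index 0x4002 without any division.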

The "usermem" script attempts to determine the maximum process size.  Some
editing of the script may be necessary (see the comments).  [It should work
fine on 4.3 BSD.]  If you can't get it to work at all, just create file
"USERMEM" containing the maximum process size in decimal.

The following preprocessor symbols control the compilation of "compress.c":

    o USERMEM               Maximum process memory on the system
    o SACREDMEM             Amount to reserve for other processes
    o SIGNED_COMPARE_SLOW   Unsigned compare instructions are faster
    o NO_UCHAR              Don't use "unsigned char" types
    o BITS                  Overrules default set by USERMEM-SACREDMEM
    o vax                   Generate inline assembler
    o interdata             Defines SIGNED_COMPARE_SLOW
    o M_XENIX               Makes arrays < 65536 bytes each
    o pdp11                 BITS=12, NO_UCHAR
    o z8000                 BITS=12
    o pcxt                  BITS=12
    o BSD4_2                Allow long filenames ( > 14 characters) &
                            Call setlinebuf(stderr)

The difference "usermem-sacredmem" determines the maximum BITS that can be
specified with the "-b" flag.

    memory: at least        BITS
    ------  -- -----        ----
           433,484           16
           229,600           15
           127,536           14
            73,464           13
                 0           12

The default is BITS=16.

The maximum bits can be overruled by specifying "-DBITS=bits" at
compilation time.
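
The 4.0 memory table can be read as a simple threshold lookup.  The
sketch below expresses it as a run-time function purely for illustration; the
function name is made up, and compress.c itself presumably makes this choice
with the preprocessor at compile time rather than at run time.

```c
/* Sketch: map available memory (usermem - sacredmem, in bytes) to the
 * largest BITS value allowed by the 4.0 table in this README.
 * The function name is hypothetical. */
int max_bits_for_memory(long usermem, long sacredmem)
{
	long avail = usermem - sacredmem;

	if (avail >= 433484L)
		return 16;
	if (avail >= 229600L)
		return 15;
	if (avail >= 127536L)
		return 14;
	if (avail >= 73464L)
		return 13;
	return 12;
}
```

For example, 500,000 bytes of user memory with 100,000 bytes held back leaves
400,000 bytes, which falls in the 15-bit row.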

WARNING: files compressed on a large machine with more bits than allowed by
a version of compress on a smaller machine cannot be decompressed!  Use the
"-b12" flag to generate a file on a large machine that can be uncompressed
on a 16-bit machine.

The output of compress 4.0 is fully compatible with that of compress 3.0.
In other words, the output of compress 4.0 may be fed into uncompress 3.0 or
the output of compress 3.0 may be fed into uncompress 4.0.

The output of compress 4.0 is not compatible with that of
compress 2.0.  However, compress 4.0 still accepts the output of
compress 2.0.  To generate output that is compatible with compress
2.0, use the undocumented "-C" flag.
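
Whether a given file can be decompressed on a small machine can be checked
from its header.  The 2.0 notes further down mention the magic number
(\037\235) and that the bits are stored in the file.  The sketch below
assumes the commonly documented layout of the third header byte (low five
bits hold the maximum code size, the top bit marks block compression); that
layout is not spelled out in this README, and all names are hypothetical.

```c
#include <stddef.h>

/* Sketch: fields recovered from the first three bytes of a .Z file. */
struct z_header {
	int max_bits;	/* maximum code size used by the compressor */
	int block_mode;	/* nonzero if block compression is flagged */
};

/* Returns 0 on success, -1 if the buffer is too short or the magic
 * number (octal 037 235) does not match. */
int parse_z_header(const unsigned char *buf, size_t len, struct z_header *out)
{
	if (len < 3 || buf[0] != 0x1f || buf[1] != 0x9d)
		return -1;
	out->max_bits = buf[2] & 0x1f;		/* assumed: low 5 bits */
	out->block_mode = (buf[2] & 0x80) != 0;	/* assumed: top bit */
	return 0;
}

/* A file is usable locally only if it needs no more bits than the
 * local build of compress supports. */
int can_decompress(const struct z_header *h, int local_max_bits)
{
	return h->max_bits <= local_max_bits;
}
```

Under these assumptions, a file written with "-b12" reports max_bits of 12 and
passes the check on a machine limited to BITS=12.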

    -from mod.sources, submitted by vax135!petsd!joe (Joe Orost), 8/1/85
--------------------------------

Enclosed is compress version 3.0 with the following changes:

1.  "Block" compression is performed.  After the BITS run out, the
    compression ratio is checked every so often.  If it is decreasing,
    the table is cleared and a new set of substrings is generated.

    This makes the output of compress 3.0 not compatible with that of
    compress 2.0.  However, compress 3.0 still accepts the output of
    compress 2.0.  To generate output that is compatible with compress
    2.0, use the undocumented "-C" flag.

2.  A quiet "-q" flag has been added for use by the news system.

3.  The character chaining has been deleted and the program now uses
    hashing.  This improves the speed of the program, especially
    during decompression.  Other speed improvements have been made,
    such as using putc() instead of fwrite().

4.  A large table is used on large machines when a relatively small
    number of bits is specified.  This saves much time when compressing
    for a 16-bit machine on a 32-bit virtual machine.  Note that the
    speed improvement only occurs when the input file is > 30000
    characters, and the -b BITS is less than or equal to the cutoff
    described below.
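
The ratio check behind "block" compression (item 1 above) can be sketched
roughly as follows.  This is an illustration under stated assumptions, not the
routine from compress.c: the 10000-byte checkpoint interval, the fixed-point
scaling by 256 and all names are chosen for the example.

```c
/* Sketch only: interval, scaling and names are assumptions. */
#define CHECK_GAP 10000L	/* assumed checkpoint interval, in input bytes */

struct ratio_monitor {
	long best_ratio;	/* best ratio seen since the last clear (x256) */
	long checkpoint;	/* input count at which to check again */
};

void ratio_monitor_init(struct ratio_monitor *m)
{
	m->best_ratio = 0;
	m->checkpoint = CHECK_GAP;
}

/* Call once the code table is full.  Returns 1 when the caller should
 * clear the table and start a new set of substrings, 0 otherwise. */
int ratio_monitor_check(struct ratio_monitor *m, long in_count, long bytes_out)
{
	long ratio;

	if (in_count < m->checkpoint || bytes_out <= 0)
		return 0;		/* not time to check yet */
	m->checkpoint = in_count + CHECK_GAP;

	/* input/output ratio in fixed point; no floating point needed */
	ratio = (long)(((long long)in_count * 256) / bytes_out);
	if (ratio > m->best_ratio) {
		m->best_ratio = ratio;	/* still improving: keep the table */
		return 0;
	}
	m->best_ratio = 0;		/* no longer improving: ask for a clear */
	return 1;
}
```

A caller would emit a clear code and reset its table whenever the check
returns 1; a decompressor that does not understand that code is why the
"-C" flag exists for 2.0-compatible output.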

Most of these changes were made by James A. Woods (ames!jaw).  Thank you
James!

To compile compress:

    cc -O -DUSERMEM=usermem -o compress compress.c

Where "usermem" is the amount of physical user memory available (in bytes).
If any physical memory is to be reserved for other processes, put in
"-DSACREDMEM=sacredmem", where "sacredmem" is the amount to be reserved.

The difference "usermem-sacredmem" determines the maximum BITS that can be
specified, and the cutoff bits where the large+fast table is used.

    memory: at least        BITS            cutoff
    ------  -- -----        ----            ------
         4,718,592           16               13
         2,621,440           16               12
         1,572,864           16               11
         1,048,576           16               10
           631,808           16               --
           329,728           15               --
           178,176           14               --
            99,328           13               --
                 0           12               --

The default memory size is 750,000 which gives a maximum BITS=16 and no
large+fast table.

The maximum bits can be overruled by specifying "-DBITS=bits" at
compilation time.

If your machine doesn't support unsigned characters, define "NO_UCHAR"
when compiling.

If your machine has "int" as 16-bits, define "SHORT_INT" when compiling.

After compilation, move "compress" to a standard executable location, such
as /usr/local.  Then:
    cd /usr/local
    ln compress uncompress
    ln compress zcat
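
The links work because a single binary can pick its behaviour from the name
it was invoked under.  A minimal sketch of that idea follows; the enum and
function names are hypothetical and the real option handling in compress.c is
more involved.

```c
#include <string.h>

/* Sketch: operating modes selected by program name (names are made up). */
enum z_mode { MODE_COMPRESS, MODE_UNCOMPRESS, MODE_ZCAT };

/* Strip any leading directory from argv[0] and compare the basename
 * against the two link names created above. */
enum z_mode mode_from_argv0(const char *argv0)
{
	const char *name = strrchr(argv0, '/');

	name = (name != NULL) ? name + 1 : argv0;
	if (strcmp(name, "uncompress") == 0)
		return MODE_UNCOMPRESS;
	if (strcmp(name, "zcat") == 0)
		return MODE_ZCAT;
	return MODE_COMPRESS;
}
```

Calling mode_from_argv0(argv[0]) at the top of main() would then be enough to
make "uncompress" and "zcat" behave differently from "compress".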

On machines that have a fixed stack size (such as Perkin-Elmer), set the
stack to at least 12kb.  ("setstack compress 12" on Perkin-Elmer).

Next, install the manual (compress.l).
    cp compress.l /usr/man/manl
    cd /usr/man/manl
    ln compress.l uncompress.l
    ln compress.l zcat.l

        - or -

    cp compress.l /usr/man/man1/compress.1
    cd /usr/man/man1
    ln compress.1 uncompress.1
    ln compress.1 zcat.1

                    regards,
                    petsd!joe

Here is a note from the net:

>From hplabs!pesnta!amd!turtlevax!ken Sat Jan 5 03:35:20 1985
Path: ames!hplabs!pesnta!amd!turtlevax!ken
From: ken@turtlevax.UUCP (Ken Turkowski)
Newsgroups: net.sources
Subject: Re: Compress release 3.0 : sample Makefile
Organization: CADLINC, Inc. @ Menlo Park, CA

In the compress 3.0 source recently posted to mod.sources, there is a
#define variable which can be set for optimum performance on a machine
with a large amount of memory.  A program (usermem) to calculate the
usable amount of physical user memory is enclosed, as well as a sample
4.2BSD Vax Makefile for compress.

Here is the README file from the previous version of compress (2.0):

>Enclosed is compress.c version 2.0 with the following bugs fixed:
>
>1.  The packed files produced by compress are different on different
>    machines and dependent on the vax sysgen option.
>        The bug was in the different byte/bit ordering on the
>        various machines.  This has been fixed.
>
>        This version is NOT compatible with the original vax posting
>        unless the '-DCOMPATIBLE' option is specified to the C
>        compiler.  The original posting has a bug which I fixed,
>        causing incompatible files.  I recommend you NOT to use this
>        option unless you already have a lot of packed files from
>        the original posting by Thomas.
>2.  The exit status is not well defined (on some machines) causing the
>    scripts to fail.
>        The exit status is now 0,1 or 2 and is documented in
>        compress.l.
>3.  The function getopt() is not available in all C libraries.
>        The function getopt() is no longer referenced by the
>        program.
>4.  Error status is not being checked on the fwrite() and fflush() calls.
>        Fixed.
>
>The following enhancements have been made:
>
>1.  Added facilities of "compact" into the compress program.  "Pack",
>    "Unpack", and "Pcat" are no longer required (no longer supplied).
>2.  Installed work around for C compiler bug with "-O".
>3.  Added a magic number header (\037\235).  Put the bits specified
>    in the file.
>4.  Added "-f" flag to force overwrite of output file.
>5.  Added "-c" flag and "zcat" program.  'ln compress zcat' after you
>    compile.
>6.  The 'uncompress' script has been deleted; simply
>    'ln compress uncompress' after you compile and it will work.
>7.  Removed extra bit masking for machines that support unsigned
>    characters.  If your machine doesn't support unsigned characters,
>    define "NO_UCHAR" when compiling.
>
>Compile "compress.c" with "-O -o compress" flags.  Move "compress" to a
>standard executable location, such as /usr/local.  Then:
>    cd /usr/local
>    ln compress uncompress
>    ln compress zcat
>
>On machines that have a fixed stack size (such as Perkin-Elmer), set the
>stack to at least 12kb.  ("setstack compress 12" on Perkin-Elmer).
>
>Next, install the manual (compress.l).
>    cp compress.l /usr/man/manl             - or -
>    cp compress.l /usr/man/man1/compress.1
>
>Here is the README that I sent with my first posting:
>
>>Enclosed is a modified version of compress.c, along with scripts to make it
>>run identically to pack(1), unpack(1), and pcat(1).  Here is what I
>>(petsd!joe) and a colleague (petsd!peora!srd) did:
>>
>>1. Removed VAX dependencies.
>>2. Changed the struct to separate arrays; saves mucho memory.
>>3. Did comparisons in unsigned, where possible.  (Faster on Perkin-Elmer.)
>>4. Sorted the character next chain and changed the search to stop
>>prematurely.  This saves a lot on the execution time when compressing.
>>
>>This version is totally compatible with the original version.  Even though
>>lint(1) -p has no complaints about compress.c, it won't run on a 16-bit
>>machine, due to the size of the arrays.
>>
>>Here is the README file from the original author:
>>
>>>Well, with all this discussion about file compression (for news batching
>>>in particular) going around, I decided to implement the text compression
>>>algorithm described in the June Computer magazine.  The author claimed
>>>blinding speed and good compression ratios.  It's certainly faster than
>>>compact (but, then, what wouldn't be), but it's also the same speed as
>>>pack, and gets better compression than both of them.  On 350K bytes of
>>>Unix-wizards, compact took about 8 minutes of CPU, pack took about 80
>>>seconds, and compress (herein) also took 80 seconds.  But, compact and
>>>pack got about 30% compression, whereas compress got over 50%.  So, I
>>>decided I had something, and that others might be interested, too.
>>>
>>>As is probably true of compact and pack (although I haven't checked),
>>>the byte order within a word is probably relevant here, but as long as
>>>you stay on a single machine type, you should be ok.  (Can anybody
>>>elucidate on this?)  There are a couple of asm's in the code (extv and
>>>insv instructions), so anyone porting it to another machine will have to
>>>deal with this anyway (and could probably make it compatible with Vax
>>>byte order at the same time).  Anyway, I've linted the code (both with
>>>and without -p), so it should run elsewhere.  Note the longs in the
>>>code, you can take these out if you reduce BITS to <= 15.
>>>
>>>Have fun, and as always, if you make good enhancements, or bug fixes,
>>>I'd like to see them.
>>>
>>>=Spencer (thomas@utah-20, {harpo,hplabs,arizona}!utah-cs!thomas)
>>
>>                    regards,
>>                    joe
>>
>>--
>>Full-Name: Joseph M. Orost
>>UUCP: ..!{decvax,ucbvax,ihnp4}!vax135!petsd!joe
>>US Mail: MS 313; Perkin-Elmer; 106 Apple St; Tinton Falls, NJ 07724
>>Phone: (201) 870-5844