STATIST
Section: User Commands (1)
Updated: local
NAME
statist - calculate Huffman distribution for
freeze(1)
SYNOPSIS
statist [ -gx... ]
DESCRIPTION
The default table is tuned for both C texts and executable files (as in
LHARC). If you intend to freeze other kinds of files (natural language texts,
databases, images, fonts, etc.), you can calculate the matching
positions distribution for them with the
`statist'
program, which calculates and displays that
distribution for the given file. It is useful for large (100K or more)
files.
Though the built-in position table is general-purpose, tuning can improve
the compression rate by up to one additional percent (observed mainly on
text files).
USAGE
statist [-g...] < sample_file
or
gensample | statist [-g...]
where
`gensample'
is a program generating some sample stream of
bytes similar to files to be frozen.
The
-g
and
-x
switches have the same meaning as for
freeze(1)
and may be repeated.
You can also view the intermediate values
and watch them change by pressing the INTR key at any time.
Note: if you use
gensample | statist
, remember that INTR affects BOTH
processes!
The results have the following format:
n1 n2 n3 n4 n5 n6 n7 n8
(uncertainty = x)
Average match length: xx.yy
Percentile 99.9: p999
Percentile 99.5: p995
Percentile 99.0: p990
Percentile 97.0: p970
Percentile 95.0: p950
Percentile 90.0: p900
Percentile 80.0: p800
Percentile 70.0: p700
Percentile 50.0: p500
Sigma: xx.yy
Here
n1 - n8
are the values of the calculated position table elements, and
uncertainty is a number that indicates the validity of the results
(a non-zero uncertainty indicates that the
results may be unusable). The other values (average match length,
percentiles and sigma) are for information only.
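The output format described above can be picked apart mechanically. The following Python sketch extracts the eight table values and the uncertainty and formats them as a line for the defaults file. The function names are hypothetical, the exact whitespace and line layout of real statist output is an assumption (the pattern tolerates the uncertainty being on the same or the next line), and the sample text reuses the `english' row from the defaults-file example with an uncertainty of 0 chosen purely for illustration.

```python
import re

# Assumed shape: eight integers followed by "(uncertainty = x)".
_TABLE_RE = re.compile(
    r"((?:\d+\s+){7}\d+)\s*\(\s*uncertainty\s*=\s*(\d+)\s*\)"
)

def parse_statist_output(text):
    """Return (table, uncertainty) from the last table found in text.

    The last match is used because intermediate values may be
    printed before the final result.
    """
    matches = _TABLE_RE.findall(text)
    if not matches:
        raise ValueError("no position table found")
    numbers, uncertainty = matches[-1]
    return [int(n) for n in numbers.split()], int(uncertainty)

def defaults_line(name, table):
    """Format a table as a 'name=n1 ... n8' defaults-file line."""
    return name + "=" + " ".join(str(n) for n in table)

# Illustrative input only, not captured from a real run.
SAMPLE_OUTPUT = """\
0 0 1 2 7 16 36 0
(uncertainty = 0)
"""

table, uncertainty = parse_statist_output(SAMPLE_OUTPUT)
if uncertainty == 0:
    print(defaults_line("english", table))
else:
    print("results may be unusable, uncertainty =", uncertainty)
```

Refusing to emit a line when the uncertainty is non-zero follows the caution given above; whether such results are still worth trying is a judgment call.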
You may create the
/etc/default/freeze
file (if you don't like the
/etc/default/
directory, choose another; in MS-DOS it is FREEZE.CNF in
the directory of FREEZE.EXE), which has the following format:
name = n1 n2 n3 n4 n5 n6 n7 n8
(name
must start in column 1). For example:
---------- cut here -----------
# This is freeze's defaults file
russian=0 0 1 2 6 20 31 2 # The sample was mailx.lp (Russian)
english=0 0 1 2 7 16 36 0 # The sample was gcc.lp (English)
# End of file
---------- cut here -----------
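To make the file format concrete, here is a minimal Python sketch of a reader for it. It is not how freeze itself parses the file: the function name is hypothetical, and treating `#' as the start of a comment is an assumption inferred from the example above. Lines whose first column is blank are skipped, since a name must start in column 1.

```python
def parse_freeze_defaults(text):
    """Parse 'name=n1 ... n8' lines into {name: [n1, ..., n8]}."""
    tables = {}
    for raw in text.splitlines():
        # Assumption: everything from '#' to end of line is a comment.
        line = raw.split("#", 1)[0].rstrip()
        # Skip blank lines and lines not starting in column 1.
        if not line or line[0].isspace():
            continue
        name, sep, values = line.partition("=")
        if not sep:
            continue
        numbers = values.split()
        if len(numbers) != 8 or not all(n.isdigit() for n in numbers):
            raise ValueError("expected 8 integers for %r" % name.strip())
        tables[name.strip()] = [int(n) for n in numbers]
    return tables

SAMPLE = """\
# This is freeze's defaults file
russian=0 0 1 2 6 20 31 2 # The sample was mailx.lp (Russian)
english=0 0 1 2 7 16 36 0 # The sample was gcc.lp (English)
# End of file
"""

tables = parse_freeze_defaults(SAMPLE)
print(tables["english"])
```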
If you find values that are better than the defaults for both text (C
programs) and binary (executable) files, please send them to me.
Important note: statist.c is NOT part of the freeze package; it is an
additional feature.
SEE ALSO
freeze(1),
melt(1),
fcat(1)
DIAGNOSTICS
Huffman tree has more than 8 levels, reducing...
Self-explanatory, but the reduction sometimes falls into an infinite loop.
xxxK
A progress indicator, written after each 4K of the file is processed.
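The "more than 8 levels" diagnostic and the eight entries of the position table suggest one way to sanity-check a table by hand. The sketch below assumes, without confirmation from this page, that entry nk counts the Huffman codes that are k bits long. Under that reading a complete code must satisfy the Kraft equality, i.e. the sum of nk / 2^k equals 1. Both example rows from the defaults file pass this check and each sums to 62 entries. The function name is hypothetical.

```python
from fractions import Fraction

def kraft_sum(table):
    """Sum of n_k / 2**k for k = 1..8.

    Assumption: n_k is the number of codes of length k bits.
    """
    return sum(Fraction(n, 2 ** k) for k, n in enumerate(table, start=1))

# Rows taken from the defaults-file example in USAGE.
russian = [0, 0, 1, 2, 6, 20, 31, 2]
english = [0, 0, 1, 2, 7, 16, 36, 0]

for name, table in (("russian", russian), ("english", english)):
    # Prints the name, the number of codes, and the Kraft sum.
    print(name, sum(table), kraft_sum(table))
# prints:
# russian 62 1
# english 62 1
```

A hand-edited table whose sum differs from 1 would not describe a complete prefix code under this interpretation, so it is worth checking before putting it into the defaults file.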
BUGS
Sometimes using results with uncertainty = 1 (on one file)
gives a compression rate worse than the default, while using results
with uncertainty = 13 (on another file) works quite well.
Please send descriptions of bugs found, incompatibilities, etc. to
leo@s514.ipmce.su.