D:\_KAZE\LZ_predator>dir
Volume in drive D is S640_Vol5
Volume Serial Number is F85D-148B
Directory of D:\_KAZE\LZ_predator
07/25/2012 04:21 AM <DIR> .
07/25/2012 04:21 AM <DIR> ..
07/25/2012 04:23 AM 401 FOUR_EXEs.bat
07/25/2012 04:23 AM 103,936 Leprechaun_x-leton_32bit_01_01p.exe
07/25/2012 04:23 AM 104,448 Leprechaun_x-leton_32bit_02_01p.exe
07/25/2012 04:23 AM 134 LZpredator_COMPILE_Intel_32bit.bat
07/25/2012 04:23 AM 134 LZpredator_COMPILE_Intel_x64.bat
07/25/2012 04:23 AM 149 LZpredator_COMPILE_Microsoft_32bit.bat
07/25/2012 04:23 AM 149 LZpredator_COMPILE_Microsoft_x64.bat
07/25/2012 04:23 AM 49,703 LZ_predator.cpp
07/25/2012 04:23 AM 97,280 LZ_predator_Intel_O3_32bit.exe
07/25/2012 04:23 AM 108,544 LZ_predator_Intel_O3_64bit.exe
07/25/2012 04:23 AM 82,432 LZ_predator_Microsoft_v16_Ox_32bit.exe
07/25/2012 04:23 AM 92,160 LZ_predator_Microsoft_v16_Ox_64bit.exe
07/25/2012 04:23 AM 1,020 n-gramming.bat
07/25/2012 04:23 AM 54,376,795 OSHO.TXT.LZMM
14 File(s) 55,017,285 bytes
2 Dir(s) 29,885,386,752 bytes free
D:\_KAZE\LZ_predator>n-gramming.bat
Making 1-grams and 2-grams out of OSHO.TXT ...
LZ_predator, revision 3, written by Kaze, in fact a modified LZPRE originally written by Matt Mahoney.
HTSIZE: 268435456slots x 4bytes
Decompressed stream being delivered at 7 MB/s
Compressed stream being decompressed at 1 MB/s
Leprechaun_singleton (Fast-In-Future Greedy n-gram-Ripper), rev. 15FIXFIX, written by Svalqyatchx.
Purpose: Rips all distinct 1-grams (1-word phrases) with length 1..31 chars from incoming texts.
Feature1: All words within x-lets/n-grams are in range 1..31 chars inclusive.
Feature2: In this revision 128MB 1-way hash is used which results in 16,777,216 external B-Trees of order 3.
Feature3: In this revision 1 pass is to be made.
Feature4: If the external memory has latency 99+microseconds then !(look no further), IOPS(seek-time) rules.
Pass #1 of 1:
Size of input file with files for Leprechauning: 29
Allocating HASH memory 134,217,793 bytes ... OK
Allocating memory 381MB ... OK
Size of Input TEXTual file: 206,908,949
-; 04,565,286P/s; Phrase count: 31,957,006 of them 58,893 distinct; Done: 64/64
Bytes per second performance: 29,558,421B/s
Phrases per second performance: 4,565,286P/s
Time for putting phrases into trees: 7 second(s)
Flushing UNsorted phrases: 100%; Shaking trees performance: 00,117,786P/s
Time for shaking phrases from trees: 1 second(s)
Leprechaun: Current pass done.
Total memory needed for one pass: 5,511KB
Total distinct phrases: 58,893
Total time: 7 second(s)
Total performance: 4,565,286P/s i.e. phrases per second
Leprechaun: Done.
Leprechaun_doubleton (Fast-In-Future Greedy n-gram-Ripper), rev. 15FIXFIX, written by Svalqyatchx.
Purpose: Rips all distinct 2-grams (2-word phrases) with length 5..41 chars from incoming texts.
Feature1: All words within x-lets/n-grams are in range 1..31 chars inclusive.
Feature2: In this revision 128MB 1-way hash is used which results in 16,777,216 external B-Trees of order 3.
Feature3: In this revision 1 pass is to be made.
Feature4: If the external memory has latency 99+microseconds then !(look no further), IOPS(seek-time) rules.
Pass #1 of 1:
Size of input file with files for Leprechauning: 29
Allocating HASH memory 134,217,793 bytes ... OK
Allocating memory 381MB ... OK
Size of Input TEXTual file: 206,908,949
\; 01,795,528P/s; Phrase count: 26,932,927 of them 1,558,906 distinct; Done: 64/64
Bytes per second performance: 13,793,929B/s
Phrases per second performance: 1,795,528P/s
Time for putting phrases into trees: 15 second(s)
Flushing UNsorted phrases: 100%; Shaking trees performance: 00,519,635P/s
Time for shaking phrases from trees: 6 second(s)
Leprechaun: Current pass done.
Total memory needed for one pass: 169,082KB
Total distinct phrases: 1,558,906
Total time: 21 second(s)
Total performance: 1,282,520P/s i.e. phrases per second
Leprechaun: Done.
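
Note: the throughput figures in the two Leprechaun runs above can be reproduced from the reported totals and the whole-second timings. The sketch below is a cross-check, not Leprechaun's own code; the helper name per_second and the use of truncating integer division are assumptions, chosen because they match the logged values.

```python
# Figures copied from the two Leprechaun runs above.
INPUT_BYTES = 206_908_949

def per_second(total: int, seconds: int) -> int:
    """Integer rate; truncation is an assumption that matches the log."""
    return total // seconds

# 1-gram run: 31,957,006 phrases put into trees in 7 s.
print(f"{per_second(INPUT_BYTES, 7):,}B/s")   # 29,558,421B/s
print(f"{per_second(31_957_006, 7):,}P/s")    # 4,565,286P/s

# 2-gram run: 26,932,927 phrases, 15 s for insertion, 21 s in total.
print(f"{per_second(INPUT_BYTES, 15):,}B/s")  # 13,793,929B/s
print(f"{per_second(26_932_927, 15):,}P/s")   # 1,795,528P/s
print(f"{per_second(26_932_927, 21):,}P/s")   # 1,282,520P/s
```

The last line shows why the 2-gram "Total performance" is lower than its "Phrases per second" figure: the total divides by 21 s (15 s insertion plus 6 s shaking), not by the insertion time alone.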
...
0,005,084 lao
...
0,002,545 tao
0,000,135 taoism
0,000,339 taoist
0,000,001 taoistic
0,000,121 taoists
...
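
Note: a minimal sketch of what the listing above contains, i.e. distinct n-grams with occurrence counts. It is not how Leprechaun works internally (the log says it uses a hash plus B-trees); here a plain in-memory counter stands in. The tokenization rule (lowercased runs of ASCII letters) and the names count_ngrams and format_row are assumptions; only the 31-char word limit and the count layout come from the log.

```python
import re
from collections import Counter

WORD = re.compile(r"[a-z]+")  # assumption: a word is a run of ASCII letters
MAX_WORD_LEN = 31             # from the log: words are 1..31 chars inclusive

def count_ngrams(text: str, n: int) -> Counter:
    """Count n-word phrases; a phrase holding an over-long word is skipped."""
    words = WORD.findall(text.lower())
    counts = Counter()
    for i in range(len(words) - n + 1):
        gram = words[i:i + n]
        if all(len(w) <= MAX_WORD_LEN for w in gram):
            counts[" ".join(gram)] += 1
    return counts

def format_row(count: int, phrase: str) -> str:
    """Zero-padded, comma-grouped count followed by the phrase."""
    return f"{count:09,} {phrase}"

sample = "Lao Tzu and the Tao; Lao Tzu again"
for phrase, count in sorted(count_ngrams(sample, 1).items()):
    print(format_row(count, phrase))
```

Calling count_ngrams(sample, 2) gives the 2-gram counts in the same way.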
Volume in drive D is S640_Vol5
Volume Serial Number is F85D-148B
Directory of D:\_KAZE\LZ_predator
07/25/2012 04:24 AM 2,968,813 OSHO_contexts_Lao.txt
07/25/2012 04:24 AM 1,409,448 OSHO_contexts_Tao.txt
2 File(s) 4,378,261 bytes
0 Dir(s) 29,634,269,184 bytes free
D:\_KAZE\LZ_predator>type OSHO_contexts_Lao.txt|more
LZ_predator, revision 3, written by Kaze, in fact a modified LZPRE originally written by Matt Mahoney.
HTSIZE: 268435456slots x 4bytes
Allocating 768MB ... OK
Decompressed stream being delivered at 72 MB/s
Compressed stream being decompressed at 19 MB/s
Number Of Lines: 2459508
Longest Line: 162
Original/Compressed ratio: 3.81:1
Performance of memcpy() for block 206908949 bytes in length: 3853 MB/s
Writing the decompressed stream at once ...
Input Pattern (it is case-sensitive; hit only 'Enter' to skip):
Context #0,000,000,001 (480bytes or less long) holding the 'Lao' pattern:
[...lternating
every month between Hindi and English. His discourses offer insights into all the major spiritual
paths, including Yoga, Zen, Taoism, Tantra and Sufism. He also speaks on Gautam Buddha, Jesus,
Lao Tzu, and other mystics. These discourses have been collected into over 300 volumes and
10/28/07 Copyright Osho International Foundation 1994
Osho's books on CD-ROM, publis...] /OSHO.TXT (197MB) discourses/
Context #0,000,000,002 (480bytes or less long) holding the 'Lao' pattern:
[...can jump, just like a cat
jumps on a mouse and catches hold of it.
Truth cannot be delivered, there is no way to deliver it. Once delivered it is dead, it has already
become untrue.
Lao Tzu insisted on not saying anything about the truth his whole life. Whenever someone asked
about truth he would say many things, but he would not say anything about the truth; he would avoid
it. In the end he was...] /OSHO.TXT (197MB) discourses/
Context #0,000,000,003 (480bytes or less long) holding the 'Lao' pattern:
[...
it. In the end he was forced to say something. Disciples, lovers, said he should write because he had
known something which was rarely known, he had become something which was unique -- there
would be no Lao Tzu again. So he wrote a small book, Tao Te Ching, but the first thing he said in it
was, "Tao cannot be said, Truth cannot be uttered. And the moment you utter it, it is already false."
And then he said, "Now I ...] /OSHO.TXT (197MB) discourses/
Context #0,000,000,004 (480bytes or less long) holding the 'Lao' pattern:
[... GIVEN WITH WORDS I HAVE
GIVEN TO YOU; BUT WITH THIS FLOWER, I GIVE TO MAHAKASHYAP THE KEY TO THIS
TEACHING."
To all teachings, not only for a Buddha but for all masters -- Jesus, Mahavira, Lao Tzu -- the key
cannot be given through verbal communication, the key cannot be delivered through the mind.
Nothing can be said about it. The more you say the more difficult it becomes to deliver, because a
...] /OSHO.TXT (197MB) discourses/
Context #0,000,000,005 (480bytes or less long) holding the 'Lao' pattern:
[... incident where he behaved illogically, where he did something which was mysterious. He was not a
mysterious man at all. You cannot find another master who was less mysterious.
Jesus was very mysterious, Lao Tzu was absolutely mysterious. Buddha was plain, transparent;
no mystery surrounds him, no smoke is allowed. His flame burns clear and bright, absolutely
transparent, smokeless. This was the only thing that seeme...] /OSHO.TXT (197MB) discourses/
Context #0,000,000,006 (480bytes or less long) holding the 'Lao' pattern:
[...ained in it. Those are
utterances of tremendous value, but no philosophy is woven around them, no system is created.
Those are atomic utterances. And the substratum of them all is that nothing can be said. Just like Lao
Tzu's TAO TE KING: The Tao that can be uttered is no longer Tao. The truth that is said is no
longer truth. Truth said becomes untrue -- said and it becomes false. Now what to do? How to
understand?
...] /OSHO.TXT (197MB) discourses/
Context #0,000,000,007 (480bytes or less long) holding the 'Lao' pattern:
[...y else --
^C
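
Note: the context dump above can be approximated with a case-sensitive substring search over the decompressed text held in memory. This is a sketch under assumptions, not LZ_predator's actual routine: the window is simply centered on each match, and the names contexts and report are hypothetical. Only the 480-byte limit, the case sensitivity and the header layout are taken from the output.

```python
from typing import Iterator

CONTEXT_BYTES = 480  # from the log: contexts are "480bytes or less long"

def contexts(data: bytes, pattern: bytes,
             width: int = CONTEXT_BYTES) -> Iterator[bytes]:
    """Yield a window of at most `width` bytes around each match
    (or the match itself, if the pattern is longer than `width`)."""
    if not pattern:  # an empty pattern means "skip"
        return
    pad = max(width - len(pattern), 0) // 2
    pos = data.find(pattern)
    while pos != -1:
        start = max(pos - pad, 0)
        yield data[start:start + max(width, len(pattern))]
        pos = data.find(pattern, pos + 1)

def report(data: bytes, pattern: bytes) -> None:
    """Print every context in a layout similar to the dump above."""
    for i, ctx in enumerate(contexts(data, pattern), start=1):
        print(f"Context #{i:013,} ({CONTEXT_BYTES}bytes or less long) "
              f"holding the '{pattern.decode()}' pattern:")
        print(f"[...{ctx.decode(errors='replace')}...]")

report(b"He also speaks on Gautam Buddha, Jesus, Lao Tzu, and other mystics.",
       b"Lao")
```

Each match gets its own window, so neighbouring contexts may overlap, as contexts #2 and #3 do in the dump.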
D:\_KAZE\LZ_predator>LZ_predator_Microsoft_v16_Ox_32bit.exe
LZ_predator, revision 3, written by Kaze, in fact a modified LZPRE originally written by Matt Mahoney.
Usage: LZ_predator c|C|d|D input output
Note1: The appropriate extension is .LZMM.
Note2: Option 'C' compresses a bit worse but with Near-Distance priority.
Note3: Option 'D' allows to search into decompressed in RAM text content (CRLF, LF endings allowed but no ASCII code 000) and to print all contexts holding a specified pattern.
Example:
C:\Program Files (x86)\Monstrous_Jesters\LZpredator>LZ_predator_Microsoft_v16_Ox_64bit.exe D OSHO.TXT.LZMM "OSHO.TXT (197MB) discourses"
LZ_predator, revision 3, written originally by Matt Mahoney, modified by Kaze.
HTSIZE: 268435456slots x 4bytes
Allocating 768MB ... OK
Decompressed stream being delivered at 84 MB/s
Compressed stream being decompressed at 22 MB/s
Number Of Lines: 2459508
Longest Line: 162
Original/Compressed ratio: 3.81:1
Performance of memcpy() for block 206908949 bytes in length: 3798 MB/s
Writing the decompressed stream at once ...
Input Pattern (it is case-sensitive; hit only 'Enter' to skip): provide access
Context #0,000,000,001 (480bytes or less long) holding the 'provide access' pattern:
[...
electronic repository of understanding and knowledge on meditation and its techniques. It is also
much more, a complete, world view of the New Man and a new way of life. The purpose of this
CD-ROM is to provide access to Osho's words, ideas and vision, and to make them available to as
many people as possible.
...] /OSHO.TXT (197MB) discourses/
C:\Program Files (x86)\Monstrous_Jesters\LZpredator>
D:\_KAZE\LZ_predator>
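
Note: the reported "Original/Compressed ratio: 3.81:1" and the "(197MB)" label are consistent with the sizes shown in this session. The check below assumes the compressed size is the 54,376,795 bytes listed for OSHO.TXT.LZMM in the first dir output and the original size is the 206908949-byte block from the memcpy() line; the variable names are illustrative.

```python
ORIGINAL_BYTES = 206_908_949    # block length from the memcpy() line
COMPRESSED_BYTES = 54_376_795   # OSHO.TXT.LZMM size from the dir listing

ratio = ORIGINAL_BYTES / COMPRESSED_BYTES
print(f"Original/Compressed ratio: {ratio:.2f}:1")  # 3.81:1

# Whole mebibytes (1 MB = 1,048,576 bytes here).
print(f"{ORIGINAL_BYTES // 2**20}MB")               # 197MB
```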