Michael L Cole

Experience in Software Development


Experiencewith Specialized Data Compression

Home - About the invention Recursive data compression described in expired* U.S. Patent 5,488,364

Page 2 - Generalized recursive data compression using a recursive structure

Page 3 - Analysis of the Recursive data compression method

Page 4 - More Analytical results and Recursive data compression claims

Page 5 - Multimedia Search

Page 6 - News Headlines and News Today Search

Page 7 - Newsroom Automation Search

Page 8 - Recursive, Statistically Random Data Compression And Creating Algorithmic Incompressible Data

Page 9 - Everything Random Search

Page 10 - Data Search

Appendix A - Patents which reference Recursive Compression Patent 5,488,364

Appendix C - ANALYSIS OF COMPUTABLE REDUCIBILITY MODELS ANALOGOUS TO THAT DESCRIBED IN U.S. PATENT 5,488,364 (Long version)

Appendix C -(Short version)

This website covers some of my private recursive compression research since completing work on my most famous invention "Recursive data compression" which was described in expired* U.S. Patent 5,488,364. The Recursive data compression patent describes an original adaptive model for data compression, it teaches a method of adapting the data to fit the data compressor. Abstractly, that method is to algorithmically configure a predetermined criteria as a boolean operation key and to leverage that selective configuration in a manner indicative of encrypting but instead of encrypting the information, the method reorders the input data by directing the contextual rearrangement (intelligent morphing) of the input data into binary permutation blocks. When the intelligently designed morphed permutation blocks are combined with the boolean operation keys they result in reversibly ordered combinational blocks. As the sole inventor, I transferred fifty-percent interest in the Recursive data compression method in 1994, prior to it being patented. The transfer was an exchange viewed (by me) as a way to procure patent protection for my invention. However, I discovered in 2006 that the patent had expired in 2000, because the assignee of the patent had decided to not pay the fees to maintain the patent, thereby, rendering my investment virtually worthless. So I have worked alone on the ideas expressed here since 2007, in a reinvestment of my individual effort.

A little reverse engineering and computable reducibility models analogous to the method illustrate that it extended ranges of file permutations which can be *-zipped by up to 153,000 percent. In 1994, that method introduced the concept of recursively splitting binary segments (into a sequence of packets) as a transform mechanism that allows the resulting binary subdivisions to compress. The Recursive data compression method demonstrates (and is complemented by the accompanying detailed procedures) that created redundancy is in part the result of the lossless conversion of n symbols to n-1 (minus one or greater) symbols, without compensation for a one byte keyword which is the methods aggregate overhead. One symbol consists of a unique fixed-length string of binary digits.

unique partition structure

Systematical Background: Forming a recursive structure by reversibly setting, splitting and re-configuring composite binary input.

patent quote explains process of bit progression

Illustration A - As in the Recursive data compression patent Series 2 (below) was formed by decomposing the compositional structure of the eight binary digits that make-up the eight-bit symbols of Series 1 (defined below). The new composition was formed by the distribution of the decomposed binary structure of Series 1 by being keyword configured positionally (bits 2^7 through 2^0) and sequentially (bits 1-8 each byte) in order of occurrence (high to Low). The newly formed symbology reduces the number of symbols by approximately 98% from 256 to five while retaining a completely lossless system of numerical notation.

Chart 2 - n256 to n5

Table n256 to n5

Series iteration

n256 to n4

chart n256 to n4

Series 1 Data

modulability mapping

Recursive compression and Intrinsic Order

The lossless "Recursive data compression" method described in U.S. Patent 5,488,364 uses a keyword (8 bits) to reconfigure composite binary information. The recursive configuration is accomplished by separating bits into pairs of blocks (a and b) each part having a zero to eight bit range, equaling 16 bits of bilateral distribution. The binary digit one in the keyword represents a logical operation, a zero represents the inverse or no operation (ignored).

Below is the mathematical logic behind a Keyword formulation (8 bits) as an agency for reversible decomposability (Table 1a-1b)

mathematic definition showing re-configuration keyword 256.

mathematic definition showing re-configuration keyword 256b.

mathematic definition showing re-configuration keyword 256c.

Intricate bit distributions are established based on the keyword configured patterns. "The patterns may be generated and compared in any suitable manner, and the method of counting through all possible numeric values is only one example of a suitable method" according to the patent.

Information which is input is systematically configured according to this keyword by being reduced to variable length clustered packets and then distributed into one or more distinct pairs and tracked by a reversible key mask notation.

block function

Lblock function

According to the patent, the "reconfiguring method comprises the steps of receiving input data; manipulating the input data in a manner that ensures that a majority of bits in the input data have a predetermined binary state; forming one or more keywords; and forming one or more pairs of blocks."

By mapping eight bits of information into nine bits of space (example below) the resulting logical map has a different entropy (by means of a simplified binary composition which is achieved by recursively splitting binary segments as a transform mechanism) than that which existed preceding the use of the re-configuration method. According to the patent, an effort is then made to compress the reordered composition "using any suitable compression method known in the art." To recursively compress data "these steps are repeated until the desired compression ratio is attained or until no further increases are observed in the compression ratio." Also, the re-configuration may be performed "before any compression has been performed" and "need not utilize the same compression algorithm on each iteration."

Table 1a

showing re-configuration block result.

Table 1b - Reversible decomposability of bytes into differentiated coupled sets (variable length strings) relating to a system of numeration having 2 as its base.

decomposability table

1020 clustered bit states.

1020 total A and B clustered bit states (columns 5 and 6 in Table 1a )

1020 clustered bit states.

1020 clustered bits states

510 clustered bit states.

510 total A or B clustered bit states (column 5 or 6 in Table 1a )

510 clustered bit states.

510 clustered bits states

Evaluation table

Table of 510 clustered bit states.

Evaluation table shows 510 sets by extracting column 2 elements from column 1 elements. Demonstrates logical keyword function.

AB clustered bit states.

2304 total AB clustered bit states (column 7 in Table 1a)

AB clustered bit states divided by key variation

Total AB clustered bit states (column 7 total in Table 1a) divided by keyword variation (shown on top of this page and illustrated across in Table 1a) equals nine.

A and B divergence of clustered bits.

Block A and B divergence of clustered bits (columns 1 and 2 in Table 1a) total 72 bits per Recursive data keyword.

Block A and B divergence of 72 clustered bit states.

Illustration 2a binary block segments

This binary variation is derived from the 52-byte string (top line of example). It demonstrates the lossless conversion from 27 symbols (top line) to 12 symbols (binary block and below).

showing 27:12 symbol conversion

showing symbol conversion equ

showing symbol conversion x equals

This single recursive data compression variation (representative, see Table 1a) has many opportunities to succeed in microcosm; on strings of 256 bytes in length, containing 256 different symbols forming random sequences.

The probability of matching at least one bit per eight bits is .(point)996094

for 256 symbols.

A 256 byte string comprised of 256 repeatable symbols each randomly selected at a random probability of 0.(point)996094 has a probability of not matching one bit (per eight bits) zero to three occurrences per string at probability of 0.(point)981252 or algebraically:

256 symbols (point)981252 or algebraically:

Individual probability of the number of times that there is no match per 256 byte string comprised of the (as described) 256 repeatable symbols.

individual

The table below shows the random probabilities of the distribution of bytes from Column A and Column B. No match (as described) occurs each time a byte value from Column A (0-127) has an adjoining invertible match (i.e. 0, 255), this requires a byte value from Column B (128-255).

Probabilities Table for 8 bytes

The input for the next multi-part illustration is the first twelve characters from Illustration 2a which demonstrated that redundancy can be formed as a result of the recursive conversion of n symbols to n-1 (one or greater) symbols.

Illustration 2b

example

In the example the conversion steps through a succession of recursive compression stages from seven eight bit (sub-division to seven 4 bit) symbols to nine eight bit (sub-division to five 4 bit) symbols.

Illustration 2b-2

Example of sub-division to five (4 bit symbols)

example 5 symbols

Same data as used in Illustration 2b

example sub 7 symbols

Illustration 2b-3

The patterns of succession can also be counted as they progress further through recursive sub-divisions and symbol stages.

example 6 symbols

Example of recursive structure conversion step to six (12 bit symbols) and recursive sub-division to seven (6 bit symbols)

Same data as used in Illustration 2b

An analysis of Recursive data compression (using the most primitive 8-bit keyword structure, no Sub-Structure (Illustration 1.g, 1.h) and utilizing only *-zip files) revealed that for every uncompressed file that compresses (at least three bytes using *-zip) there are 255 other file permutations which will compress equally ( 2 bytes).

Despite the fact that *-zipping a file ("a") which is zero bytes in length results in a *-zip file a minimum of twenty-two bytes in length, and despite the fact that I consider that to be a prima facie argument that all *-zip files are in theory compressible, I did not interpose any adaptation for that into this equational analysis.

Notwithstanding, there was an aggregate increase of 25,500 percent more *-zip files when using Recursive data compression. Additionally, for each *-zip file that re-compresses (by at least three bytes using *-zip) there are an additional 255 file permutations which will compress equally ( 2 bytes). i.e., If an arbitrary proportional assignment of one-percent of N length *-zip files can be re-compressed by at least three bytes using *-zip then the number of file permutations that can be compressed by at least 1 byte is (255 * 1%) or 2.55 times the aggregate number of *-zip files which are N bytes in length. That is another increase of 25,500 percent.

patent quote references step 8, reorder, reconfigure

Using the reverse engineering in this example allows the statistics to be traced. i.e., Assume a one megabyte file has a *-zip image which is a quarter-megabyte in length. Then 256 copies of the *-zip image are made, each with a unique 8-bit keyword appended to it, plus one optional ancillary byte. Each new quarter-megabyte file would then represent one (of 256) unique permutation of the original file. The result is a corresponding compression ratio of 4:1 for each of these 256 files.

patent quote explains process of bit progression

Therefore, if *-zip can compress an arbitrary proportional assignment of one-percent (1%) of all uncompressed files N bytes in length (for every N) by at least three bytes then the least number of Recursive *-zip files that can be compressed at the same compression ratio using "Recursive data compression" is (256 * 1%) or 2.56 times (25,600 percent) the aggregate number of all *-zip files. The ratio is not changed when using a more realistic proportional assignment than one-percent.

In addition, if these *-zip files re-compress (referenced above) then there are at least 65534 extra compressible file permutations (not 255) in individual relation to these files, provided that they re-compressed by at least three bytes even if these files will not re-compress again.

unique partition structure

illustration 1.a

ifsp-pq2

Illustration 1.g Sub-Structure IFSP - - Analysis (above) of quasi-optimum compression using IFSP Sub-Structure nets (3*25500) 76,500 percent more Recursively compressible *-zip files.

IFSP Optimal

vusp

Illustration 1.h Sub-Structure VUSP - Analysis (above) of quasi-optimum compression using VUSP Sub-Structure nets (6*25500) 153,000 percent more Recursively compressible *-zip files.

VUSP Optimal

The following definition is an abstract of how a method built on a secondary compressibility model was conceived (as I currently view it) from the computable algorithmic perspective. It illustrates how one model of structure consolidation in order to achieve computable reducibility is analogous in structure to "Recursive data compression."

computable reducibility model of structure consolidation.

illustration 4.g - Equational analysis of quasi-optimum compression for the recursive computable reducibility model

Equational analysis

Partition delineated for 4 bits (below).

method results for n = 4.

illustration 4.a (above)

computable reducibility model of structure consolidation.

Computable reducibility model of recursive structure consolidation delineated for 4 bits by illustration 4.a - 4.b in contrast with "Recursive data compression" secondary compressibility model (refer to illustration 1.a).

computable algorithmic definition n=n.

The computable reducibility model of structure consolidation is implemented by recursively splitting binary segments as a transform mechanism and in its simplest form is analogous to the created redundancy strategy implemented by the recursive data compression secondary compressibility model when the created redundancy is (in part) the result of the lossless conversion of n symbols to n-1 (minus one or greater) symbols.

Conversely, when the created redundancy of the secondary compressibility model is relied upon (in part) as the result of the lossless conversion of n symbols to n 1 (plus one or greater) symbols it becomes clear that the computable reducibility model of structure consolidation is analogous to (and must be implemented at the time of) keyword formulation. Therefore, I suggest that both sides of the analogy should be taken under advisement when devising variations to algorithms for formulating the keyword for Recursive data compression.

computable reducibility model of structure consolidation.

Computable reducibility model of structure consolidation for 8 bits (delineated for 4 bits by illustration 4.a - 4.b) in contrast with "Recursive data compression" secondary compressibility model (illustration 1.a).

computable algorithmic definition n=n.

As described in U.S. Patent 5,488,364, Recursive data compression, recursively splitting binary segments as a transform mechanism can create redundancy and as stated above this created redundancy is in part the result of the lossless conversion of n symbols to n-1 (minus one or greater) symbols. But as described in U.S. Patent 5,488,364 you can also recursively compress a file using *-zip where the file compresses (a program reported) one-hundred percent and the resulting *-zip file will also compress another (program reported) seventy to eighty-plus percent. This is accomplished by escalating the number symbols (n 1 or greater). This is more difficult to rationalize but the principles are the same. Therefore, the first subject will be, recursively compressing binary subdivisions that compress because they consist of less symbols.

One symbol consists of a unique fixed-length string of binary digits. A block of symbols constitutes binary subdivision. In this scenario the resulting binary subdivisions compress because of they are less complex because they consist of less symbols.

In the table below where column "A" holds numeral values greater than one. The relationship that binds the terms in column "B" to the corresponding value in the column "A" is formed by their equivalence (the sum of their parts). There are far fewer values in column "A" than there are (formed by sets) in column "B." This suggests that there are far fewer irreducible values (the sum of the terms) than there are reducible (non-random) values (the composition of the terms that equal the sum).

Table of sums for complexity theory

The structure above is "a partition" and Peter Gustav Lejeune Dirichlet's (1805-1859) "Pigeonhole Principle." (Also known as "Dirichlet's Principle", "Dirichlet's Drawer Principle" and "Dirichlet's Letter Box Principle") communicates that the partition of a set of "n" objects into fewer than "n" (single object) mutually exclusive subsets is not possible.

Arithmetic based partition structures and recursive compression may seem to have no place in modern reducibility theory, especially, since the use of arithmetic partitions based on the sum of their parts will propose an inconsistent presumption as to the number of irreducible values (random incompressible strings). A suitable homogeneous remedy for this would be to provide the same structure using terms that produce a product instead of a sum.

Table of products for complexity theory

The product table produces more irreducible values (random incompressible strings) 50/50 as anticipated. This mirrors the notions of contemporary algorithmic theories of reducibility.

To see exactly how all this works with illustrated examples, please read about reducing the number of symbols in a string using lossless recursive data compression.

When the example uses n=4 the correspondence results in 12 extra members.

(n = 4) 28 members

illustration 4.b (above)

Illustration of recursive compression utilizing a designed recursive structure reduces the number of symbols in a string by ninety-eight percent (98%)

Series 1 Data

Illustration shows how recursive compression utilizing a designed recursive structure reduces the number of symbols in a string by ninety-eight percent (98%)

According to the US Patent Office, expired* U.S. Patent 5,488,364, is referenced by the following patents:

 

1 - 8,364,836 Withdrawn, System and methods for accelerated data storage and retrieval

2 - 8,326,984, Selective compression for network connections

3 - 8,321,445, Generating content snippets using a tokenspace repository

4 - 8,275,909, Adaptive compression

5 - 8,275,897, System and methods for accelerated data storage and retrieval

6 - 8,117,173, Efficient chunking algorithm

7 - 8,112,619, Systems and methods for accelerated loading of operating systems and application programs

8 - 8,112,496, Efficient algorithm for finding candidate objects for remote differential compression

9 - 8,090,936, Systems and methods for accelerated loading of operating systems and application programs

10 - 8,073,926, Virtual machine image server

11 - 8,073,047, Bandwidth sensitive data compression and decompression

12 - 8,054,879, Bandwidth sensitive data compression and decompression

13 - 8,024,483, Selective compression for network connections

14 - 8,010,668, Selective compression for network connections

15 - 7,882,084, Compression of data transmitted over a network

16 - 7,873,065, Selectively enabling network packet concatenation based on metrics

17 - 7,849,462, Image server

18 - 7,783,781, Adaptive compression

19 - 7,777,651, System and method for data feed acceleration and encryption

20 - 7,714,747, Data compression systems and methods

21 - 7,705,753, Methods, systems and computer-readable media for compressing data

22 - 7,613,787, Efficient algorithm for finding candidate objects for remote differential compression

23 - 7,555,531, Efficient algorithm and protocol for remote differential compression

24 - 7,370,120, Method and system for reducing network latency in data communication

25 - 7,321,952, System and method for data phase of memory and power efficient mechanism for fast table lookup

26 - 7,296,114, Control of memory and power efficient mechanism for fast table lookup

27 - 7,296,113, Memory and power efficient mechanism for fast table lookup

28 - 7,292,162, Data coding system and method

29 - 6,301,394, Method and apparatus for compressing data

30 - 6,008,657, Method for inspecting the elements of piping systems by electromagnetic waves

31 - 5,977,889, Optimization of data representations for transmission of storage using differences from reference data

32 - 5,666,560, Storage method and hierarchical padding structure for direct access storage device (DASD) data compression

33 - 5,627,534, Dual stage compression of bit mapped image data using refined run length and LZ compression

Continue

Recursive data compression, Random Data compression and Dirichlet's Pigeonhole Principle

pigeonhole principle equation

Random data compression is like a magician pulling two rabbits out of one hat.

FACT: THE PHRASE RANDOM DATA COMPRESSION NEVER APPEARS IN THE RECURSIVE DATA COMPRESSION PATENT. IT DOES NOT EVEN CONTAIN THE PHRASE RANDOM DATA.

If the pigeonhole principle counting argument equation as applied in many textbooks for recursive compression is not disproved by my following equation, then I guess that I am a dummy. Because the counting argument equation below clearly shows that (as applied in many textbooks for recursive compression) the pigeonhole principle counting argument equation does not work in at least one instance.

 

pigeonhole principle proof equation 2

Reference: Intrinsic order of dispersion in recursive compression, dual combinational elements for KEYWORD: "00000000" with "11111111" in any value dispersion table, illustrating two empty (null) bit states (containing no elements and no members) equivalent to one bit.

* According to USPTO.GOV, Patent 5,488,364, Serial Number 08/203,998, expired January 30, 2000.

nonarithmetic-operations dot com (nonarithmetic-operations.com)(tm) are trademarks of Michael L Cole for computer and software related products and services.

stoneageworkshop dot com (stoneageworkshop.com)(tm) are trademarks of Michael L Cole for computer and software related products and services.

Michael Cole, saw202@nonarithmetic-operations.com

(c) Copyright 2007-2011 - Cole, Michael L., All Rights Reserved.

Country of origin, United States of America