Random linear fountain code versus
pre-computed collection of “good” checksum packets of constant degree
2006-04-06
Abstract:
In order to deliver a file through a lossy network we search for a pre-computed “good” set of checksum packets, each representing a distinct subset of the file’s source packets. We prepare many more checksum packets than there are source packets in the file, and we transmit those checksum packets instead of the source packets themselves. We show that the losses are recoverable as long as the receiver collects slightly more checksum packets than there are packets in the source file.
Introduction:
A file can be chopped into K equally sized packets. We can transmit to the receiver the K source packets of the file, but we can also transmit checksum packets of various subsets of those K source packets. If each transmitted checksum packet is computed from randomly selected source packets, then the receiver will be able to recover the file (by solving a system of linear equations) after collecting an average of only K + 1.6 packets. This method is called the random linear fountain code. With the random linear fountain code the sender can perpetually generate different checksum packets, some of which can be lost in the communication channel. Irrespective of the number of source packets in the file, in order to completely recover the file, the receiver needs an average excess of only 1.6 packets.
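The random linear fountain encoder described above can be sketched as follows. This is a minimal illustration, not the article’s program: the type and function names are ours, and we assume each transmitted packet carries (or can re-derive from a shared seed) the bit mask of the source packets it covers.

```cpp
#include <cstdint>
#include <random>
#include <vector>

// One random-linear-fountain checksum packet: the XOR of a random subset of
// the K source packets, each included independently with probability 1/2.
// The mask records which source packets were XORed in; the receiver needs it
// to set up its system of linear equations.
struct ChecksumPacket {
    std::vector<bool> mask;      // which source packets were XORed in
    std::vector<std::uint8_t> data;  // XOR of the selected source packets
};

// Assumes `source` is non-empty and all source packets have equal length.
ChecksumPacket make_random_checksum(
        const std::vector<std::vector<std::uint8_t>>& source,
        std::mt19937& rng) {
    const std::size_t K = source.size();
    const std::size_t len = source[0].size();
    ChecksumPacket p{std::vector<bool>(K, false),
                     std::vector<std::uint8_t>(len, 0)};
    std::bernoulli_distribution coin(0.5);
    for (std::size_t i = 0; i < K; ++i) {
        if (coin(rng)) {
            p.mask[i] = true;
            for (std::size_t b = 0; b < len; ++b) p.data[b] ^= source[i][b];
        }
    }
    return p;
}
```

Because every source packet is included with probability 1/2, the average degree of such a packet is K/2, which is what makes decoding expensive for large K compared with constant-degree schemes.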
An alternative to the random linear fountain code is to fix the degree of all transmitted checksum packets to a constant. The degree of a checksum packet is the number of source packets involved in the computation of that checksum packet. The average number of excess packets needed for successful decoding depends on the given constant degree and also on the number of source packets in the file. For a file with 30 source packets, the chart below shows the average number of excess packets as a function of the degree of random checksum packets.
A similar chart is given for a file chopped into 20 packets:
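Generating a random checksum packet of constant degree d amounts to XORing d distinct, uniformly chosen source packets. A minimal sketch (the helper names are ours, not from the article’s programs):

```cpp
#include <algorithm>
#include <cstdint>
#include <numeric>
#include <random>
#include <vector>

// Pick d distinct source-packet indices uniformly at random out of K.
std::vector<std::size_t> pick_degree_d(std::size_t K, std::size_t d,
                                       std::mt19937& rng) {
    std::vector<std::size_t> idx(K);
    std::iota(idx.begin(), idx.end(), 0);   // 0, 1, ..., K-1
    std::shuffle(idx.begin(), idx.end(), rng);
    idx.resize(d);                          // keep the first d indices
    return idx;
}

// XOR together the source packets at the given indices.
// Assumes all source packets have equal length.
std::vector<std::uint8_t> xor_packets(
        const std::vector<std::vector<std::uint8_t>>& source,
        const std::vector<std::size_t>& idx) {
    std::vector<std::uint8_t> out(source[0].size(), 0);
    for (std::size_t i : idx)
        for (std::size_t b = 0; b < out.size(); ++b) out[b] ^= source[i][b];
    return out;
}
```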
Instead of generating random checksum packets of a constant degree d, the sender can pre-compute a “good” set of checksum packets all having the desired degree d. For this purpose we cyclically order all source packets. Each checksum packet is associated with one source packet in the cycle. The d source packets for computing a given checksum packet are determined by d integer shifts from the associated source packet. Thus with d shift values we can create K checksum packets, one for each associated source packet. Usually the first shift value is 0, so that the associated source packet is directly involved in the computation of the corresponding checksum packet. With another set of d shift values we can create yet another set of K checksum packets. We provide examples in which the sender prepares three times as many checksum packets as there are source packets in the file. Thus to prepare all required checksum packets we need three distinct sets of d shift values, which are generated randomly. Each set of pre-computed checksum packets is evaluated by the average number of excess packets needed for the successful recovery of the file. For transmitting 20 source packets, the chart below shows several hundred distinct pre-computed sets of checksum packets of degree 5. They are sorted in descending order according to the average number of excess packets needed for successful decoding.
A similar chart for checksum packets of degree 7 for a file with 30 source packets is presented in the figure below:
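The cyclic-shift construction described above can be sketched as follows (a minimal illustration; the function name is ours). One set of d shifts yields K checksum packets; calling it with three random shift sets yields the 3K packets used in the examples.

```cpp
#include <cstdint>
#include <vector>

// Build K checksum packets of degree d from d cyclic shift values.
// Checksum packet i is the XOR of the source packets at offsets
// shifts[0..d-1] from position i, indices taken modulo K. With
// shifts[0] == 0, packet i directly covers its associated source packet i.
// Assumes all source packets have equal length.
std::vector<std::vector<std::uint8_t>> checksum_set(
        const std::vector<std::vector<std::uint8_t>>& source,
        const std::vector<std::size_t>& shifts) {
    const std::size_t K = source.size();
    std::vector<std::vector<std::uint8_t>> set(K);
    for (std::size_t i = 0; i < K; ++i) {
        std::vector<std::uint8_t> acc(source[0].size(), 0);
        for (std::size_t s : shifts)
            for (std::size_t b = 0; b < acc.size(); ++b)
                acc[b] ^= source[(i + s) % K][b];
        set[i] = acc;
    }
    return set;
}
```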
For the best pre-computed set of checksum packets we
build the file recovery probability curve as a function of the number of excess
packets. For comparison we provide the recovery probability curves for the
random linear fountain code and for random checksum packets of a constant
degree. We also provide the recovery probability curve when the transmitted
packets are random copies of the best pre-computed collection of
checksum packets (i.e.
repetitions are possible). The chart below shows the decoding probability
curves for a file chopped into 10 packets and transmitted with checksums of
degree 3:
The pre-computed set of checksum packets performs best of all, while the method that randomly transmits copies from the same pre-computed collection has the weakest decoding performance.
Similarly, the chart below shows the decoding probability curves for a transmission of 20 source packets with checksums of degree 5:
Finally, the chart below shows the decoding probability curves for a transmission of 30 source packets with checksums of degree 7:
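In all of the comparisons above, decoding means solving the system of linear equations formed by the received packets over GF(2), where each packet contributes one equation: its mask of covered source packets and its XOR payload. A minimal Gauss-Jordan sketch (ours, not the article’s program; masks are packed into a uint64_t, so it assumes K ≤ 64):

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Recover the K source packets from received (mask, payload) equations by
// Gauss-Jordan elimination over GF(2). Bit `col` of a mask is set when
// source packet `col` was XORed into that payload. Returns false when the
// collected masks do not yet span GF(2)^K (more packets are needed).
bool gf2_decode(std::vector<std::uint64_t> masks,
                std::vector<std::vector<std::uint8_t>> payloads,
                std::size_t K,
                std::vector<std::vector<std::uint8_t>>& out) {
    std::vector<std::size_t> pivot_row(K);
    std::size_t row = 0;
    for (std::size_t col = 0; col < K; ++col) {
        // Find a not-yet-used equation involving source packet `col`.
        std::size_t p = row;
        while (p < masks.size() && !((masks[p] >> col) & 1)) ++p;
        if (p == masks.size()) return false;  // rank deficient
        std::swap(masks[row], masks[p]);
        std::swap(payloads[row], payloads[p]);
        // Eliminate source packet `col` from every other equation.
        for (std::size_t r = 0; r < masks.size(); ++r) {
            if (r != row && ((masks[r] >> col) & 1)) {
                masks[r] ^= masks[row];
                for (std::size_t b = 0; b < payloads[r].size(); ++b)
                    payloads[r][b] ^= payloads[row][b];
            }
        }
        pivot_row[col] = row++;
    }
    // Each pivot row now holds exactly one source packet.
    out.resize(K);
    for (std::size_t col = 0; col < K; ++col)
        out[col] = payloads[pivot_row[col]];
    return true;
}
```

The average excess-packet counts reported in the charts correspond to how many equations beyond K must typically be collected before this elimination first succeeds.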
Conclusion:
For transmission of a file over a lossy network, we propose a method for pre-computing a “good” set of checksum packets all having a constant degree. This method performs better than the random linear fountain code; however, unlike the fountain code, the number of distinct checksum packets available for transmission is limited (in our examples, to three times the number of source packets).
Related links:
- a C++ program generating various pre-computed sets of checksum packets and computing their average number of excess packets for a successful decoding [ cpp | csv | xls ]
- Decoding with the Luby Transform (LT) method and recording to file are commented out in the previous version; the uncommented version is here [cpp.txt]
- a C++ program computing the decoding probability curves for 30 source packets transmitted with pre-computed “good” set of checksums of degree 7 [ cpp | out | xls ]
- Decoding with the LT method is commented out in the previous version; the uncommented version is here [cpp.txt]
- Classes and functions [ graph.cpp | graph.h | linear.cpp | linear.h | lt.cpp | lt.h ]
- MS Word version of this page [doc]
- Michael Luby, “LT codes”, Proceedings of the 43rd Annual IEEE Symposium on Foundations of Computer Science (FOCS’02), November 16–19, 2002, pages 271–280 [Luby02]
- David J. C. MacKay, “Fountain codes”, IEE Proceedings Communications, volume 152, issue 6, December 2005, pages 1062–1068 [MacKay05]
- Gaussian elimination [ MathWorld | PlanetMath | Wikipedia ]
- Comparison of random linear fountain with LT method [ ch | us ]
* * *
© Switzernet (www.switzernet.com)