Unpletter is a C and C++ version of the unpacker for the Pletter compression scheme.
By Sylvain Glaize, 2024.
Distributed under the same license as the Pletter packer. See license.txt in the root folder.
Pletter format explained
Pletter is a compression scheme for the Z80 CPU, similar to ZX0 but with different choices in terms of compression and a slightly different format.
The format intertwines two streams of data:
- the byte bitstream, which only provides complete bytes (read 8 bits at a time),
- the variable bitstream, which contains numbers encoded with a variable number of bits.
Thus, the algorithm maintains a pair of pointers: one for the byte bitstream, and one for the variable bitstream (the pointer to the variable bitstream is actually itself two pointers, one for the pointed byte and one for the bit inside that byte).
When reading a byte from the byte bitstream, the algorithm will get it and advance the pointer to the next byte.
When reading a number from the variable bitstream, the algorithm will read the necessary bits. When it depletes the bits of the current byte, it jumps to the next byte pointed to by the byte stream pointer, and adjusts that other pointer accordingly (moves it one byte further).
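The pointer handling described above can be sketched in C as follows. This is only a minimal illustration, not the code from unpletter_c.c: the `reader_t` type, its fields and the function names are hypothetical, and the sketch assumes that bits are consumed from the most significant bit first, which the description above does not specify.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reader state; names are illustrative only. */
typedef struct {
    const uint8_t *data;   /* packed input */
    size_t byte_pos;       /* next byte of the byte bitstream */
    size_t bit_byte_pos;   /* byte currently feeding the variable bitstream */
    int bits_left;         /* unread bits remaining in that byte */
} reader_t;

/* Read one complete byte and advance the byte stream pointer. */
static uint8_t read_byte(reader_t *r) {
    return r->data[r->byte_pos++];
}

/* Read one bit of the variable bitstream. */
static int read_bit(reader_t *r) {
    if (r->bits_left == 0) {
        /* Current byte is depleted: claim the next byte of the byte
           stream and move the byte stream pointer one byte further. */
        r->bit_byte_pos = r->byte_pos++;
        r->bits_left = 8;
    }
    r->bits_left--;
    /* Assumption: most significant bit first. */
    return (r->data[r->bit_byte_pos] >> r->bits_left) & 1;
}
```

In this sketch, starting with `bits_left` set to 0 makes the first call to `read_bit` claim the first byte of the input, so both kinds of reads draw from the same buffer without any separate bookkeeping.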
Numbers in the variable bitstream are encoded with an interlaced Elias gamma code. Elias gamma code is described here: https://en.wikipedia.org/wiki/Elias_gamma_coding.
The interlaced encoding interleaves the bits of the first part (the unary encoded power of 2) with the bits of the second part (the binary encoded remainder). Also, in Pletter, the bits of the unary part are inverted compared to what is described on the Wikipedia page.
Example:
- The encoding of 5 is `00 1 01` (2 to the power of 2, to which 1 is added).
- Inverted, it is `11 0 01` (the bits of the unary part are inverted).
- Interlaced, it is `10 11 0`.

This way, the algorithm can read the bits in pairs and stop when it encounters a 0 as the first bit of a pair.
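The pair-wise reading can be sketched in C as below. It is a simplified illustration rather than the actual unpacker code: `decode_interlaced_gamma` is a hypothetical helper, and for readability it takes the bits from an array holding one bit per element instead of a packed bitstream.

```c
#include <stddef.h>

/* Hypothetical helper: bits[] holds one bit (0 or 1) per element,
   in stream order; *pos is advanced past the decoded number. */
static unsigned decode_interlaced_gamma(const unsigned char *bits, size_t *pos) {
    unsigned value = 1;               /* implicit leading 1 of the binary part */
    while (bits[(*pos)++] == 1) {     /* first bit of the pair: 1 means a data bit follows */
        value = (value << 1) | bits[(*pos)++];   /* second bit of the pair: data bit */
    }
    return value;                     /* a 0 in first position ends the number */
}
```

With the bits of the example, `1 0 1 1 0`, the loop reads the pairs `10` and `11`, collects the data bits 0 and 1 behind the implicit leading 1, and stops on the final 0, giving binary 101, which is 5.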
In Pletter, the variable bitstream contains the information on the next part to decode (0 means the next byte is to be copied as is; 1 means it is a back reference block). It also encodes the length of the back reference block (the distance is taken from the byte bitstream, but can be extended by extra bits when the distance is 128 or greater).
Changes in the version found in dsk2rom
In this version, the three bits normally written at the start of the stream to encode the number of additional bits (q_value) for the back reference distance are omitted.
The algorithm fixes this value to 2, and thus starts by copying the first byte as a literal.