Discussion:
Block compression
Jeff Johnson via bitcoin-dev
2017-11-27 02:11:27 UTC
I'm new to this mailing list and apologize if this has been suggested
before. I was directed from the Bitcoin core github to this mailing list
for suggestions.

I'd just like to post a possible solution that increases the amount of data
in a block without actually increasing the size on disk or the size in
memory or the size transmitted over the Internet. Simply applying various
compression algorithms, I was able to achieve about a 50% compression
ratio. Here are my findings on a recent Bitcoin block using max compression
for all methods:

- Raw block: 998,198 bytes
- Gzip: 521,212 bytes (52% ratio), needs 2MB to decompress
- LZMA: 415,308 bytes (41% ratio), 1MB dictionary, needs 3MB to decompress
- ZStandard: 469,179 bytes (47% ratio), 1MB memory to decompress
- LZ4: 641,063 bytes (64% ratio), 32-64K to decompress
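
A comparison like the one above can be reproduced roughly with Python's standard library. This is only a minimal sketch, not the commands used for the figures in this post: it covers just the codecs that ship with Python (zlib's DEFLATE, the algorithm gzip uses, and LZMA at its default preset), the function name `compression_report` is made up for illustration, and the sample input is synthetic rather than a real block.

```python
import lzma
import zlib


def compression_report(data: bytes) -> dict:
    """Return compressed size and ratio (compressed / raw) per codec."""
    candidates = {
        "zlib-9": zlib.compress(data, 9),  # DEFLATE at its highest level
        "lzma": lzma.compress(data),       # XZ container, default preset
    }
    report = {}
    for name, blob in candidates.items():
        report[name] = {"bytes": len(blob), "ratio": len(blob) / len(data)}
    return report


if __name__ == "__main__":
    # Synthetic, highly repetitive input; a real serialized block would go here.
    sample = bytes(range(256)) * 4000
    for name, stats in compression_report(sample).items():
        print(f"{name}: {stats['bytes']:,} bytes ({stats['ratio']:.0%} ratio)")
```

Ratios measured this way depend heavily on the input, so they only mean something when the bytes fed in are the actual binary serialization of a block.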

The compression time on my modest laptop (2 years old) was "instant". I ran
everything from the command line and did not notice any lag as I pressed
enter to do the compression, so easily less than a second. But compression
time doesn't matter; decompression time is what matters, as blocks will be
decompressed billions of times more than they will be compressed.
Decompression speed for LZ4 is the fastest of the above methods, at 3.3GB /
second, slightly less than half the speed of memcpy; see the chart at
https://github.com/lz4/lz4.

If decompression speed, CPU and memory usage are a concern, LZ4 is a
no-brainer. At a 64% ratio, the same block size holds about 56% more data
for "free". But ZStandard, in my opinion, makes the most sense, as it cuts
the size by more than 50% with a very good decompression speed of 900MB /
second.
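
As a rough check on what a compression ratio means for capacity, the sizes quoted in this post can be turned into "space saved" and "extra data that fits" figures. This is a small arithmetic sketch; `capacity_gain` is a name introduced here for illustration, and it assumes additional transactions would compress about as well as the ones already in the block.

```python
def capacity_gain(raw_bytes: int, compressed_bytes: int) -> float:
    """Extra data that fits in the same transmitted size, as a fraction of the original."""
    return raw_bytes / compressed_bytes - 1.0


RAW = 998_198  # raw block size quoted in this post
SIZES = [("Gzip", 521_212), ("LZMA", 415_308),
         ("ZStandard", 469_179), ("LZ4", 641_063)]

for name, size in SIZES:
    saved = 1.0 - size / RAW
    gain = capacity_gain(RAW, size)
    print(f"{name}: {saved:.0%} smaller, room for {gain:.0%} more data")
```

Note that the space saved and the capacity gained are different numbers: halving the size doubles the capacity.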

If this were implemented in the Bitcoin protocol, there would need to be a
place to specify the compression type in a set of bits somewhere, so that
future compression algorithms could potentially be added.
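
To make the idea of a compression type field concrete, here is a minimal sketch of a one-byte tag in front of the payload. Everything in it is hypothetical: the `CompressionType` values, the envelope layout, and the `encode_block` / `decode_block` helpers are invented for illustration and do not correspond to anything in the Bitcoin protocol. Only codecs from Python's standard library are used.

```python
import lzma
import zlib
from enum import IntEnum


class CompressionType(IntEnum):
    # Hypothetical identifiers; new codecs would be given new values.
    NONE = 0
    ZLIB = 1
    LZMA = 2


# Map each type to its (compress, decompress) pair.
_CODECS = {
    CompressionType.NONE: (lambda b: b, lambda b: b),
    CompressionType.ZLIB: (zlib.compress, zlib.decompress),
    CompressionType.LZMA: (lzma.compress, lzma.decompress),
}


def encode_block(raw: bytes, ctype: CompressionType) -> bytes:
    """Prefix the (possibly compressed) block with a one-byte type tag."""
    compress, _ = _CODECS[ctype]
    return bytes([ctype]) + compress(raw)


def decode_block(payload: bytes) -> bytes:
    """Read the type tag and undo the matching compression."""
    ctype = CompressionType(payload[0])  # raises ValueError for unknown types
    _, decompress = _CODECS[ctype]
    return decompress(payload[1:])
```

A receiver that does not recognise the tag fails cleanly with `ValueError`, which is roughly the behaviour a node would need when it sees a compression type it does not support.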

Miners could do nothing and keep sending blocks as is, and these blocks
would have "no compression" as the type of compression, just as today. Or
they could opt in to compress blocks and choose how many transactions they
want to stuff into the block, keeping the compressed size under the limit.
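
The opt-in behaviour described here could look something like the following sketch, in which a miner keeps adding transactions while the compressed block stays under the limit. It is an illustration under stated assumptions, not mining code: `fill_block` and `MAX_COMPRESSED_BYTES` are hypothetical names, transactions are treated as opaque byte strings, zlib stands in for whichever codec would be chosen, and block headers and other consensus limits are ignored.

```python
import zlib

MAX_COMPRESSED_BYTES = 1_000_000  # stand-in for the block size limit


def fill_block(transactions, limit=MAX_COMPRESSED_BYTES):
    """Greedily take transactions in order while the compressed block fits."""
    chosen = []
    for tx in transactions:
        candidate = b"".join(chosen + [tx])
        if len(zlib.compress(candidate, 9)) > limit:
            break  # adding this transaction would exceed the limit
        chosen.append(tx)
    return chosen
```

Recompressing the whole candidate on every step is quadratic in the number of transactions; it keeps the sketch short, but a real implementation would want an incremental approach.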

The Bitcoin client code would also need to handle the appropriate
compression bits, and the limits on signature data, etc. would need to be
modified to account for the compression.

I understand Schnorr signatures are on the roadmap as a 25% compression
gain, which is great. I suspect that Schnorr signatures would compress even
further when combined with the above compression methods.

Here is a link to the block that I compressed:
https://mega.nz/#!YPIF2KTa!4FxxLvqzjqIftrkhXwSC2h4G4Dolk8dLteNUolEtq98

Thanks for reading, best wishes to all.

-- Jeff Johnson
lonsdale aseaday via bitcoin-dev
2017-11-27 02:32:04 UTC
Hi, block compression brings some problems which need to be checked. You can read about them here:
https://bitcointalk.org/index.php?topic=88208.0 and https://bitcointalk.org/index.php?topic=204283.0

Marco Pontello via bitcoin-dev
2017-11-27 12:08:05 UTC
Hi Jeff!


On Mon, Nov 27, 2017 at 3:11 AM, Jeff Johnson via bitcoin-dev wrote:
> Raw block
> 998,198 bytes
> Gzip
> 521,212 bytes (52% ratio)
> (needs 2MB to decompress).

I don't know how you got that raw block, but it seems a bit odd.
If you look at it in a hex editor, you'll notice that every odd byte is 0,
and that explains the unusually high compression ratio.
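
A quick way to check a file for this pattern is to count how many of the bytes at odd offsets are zero. The helper below is a small sketch; the name `zero_fraction_at_odd_offsets` is made up for illustration, and what counts as suspiciously high is left to the reader.

```python
def zero_fraction_at_odd_offsets(data: bytes) -> float:
    """Fraction of bytes at odd offsets (1, 3, 5, ...) that are 0x00."""
    odd = data[1::2]
    if not odd:
        return 0.0
    return odd.count(0) / len(odd)


if __name__ == "__main__":
    padded = b"a\x00b\x00c\x00d\x00"  # every second byte is zero
    print(zero_fraction_at_odd_offsets(padded))
    print(zero_fraction_at_odd_offsets(b"abcdefgh"))
```

A value close to 1.0 on a supposedly raw block means roughly half of the file is padding, so any compressor will appear to do far better than it would on the real serialization.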

Bye!
Jonas Schnelli via bitcoin-dev
2017-11-27 20:49:07 UTC
Hi Jeff

There were previous discussions about similar approaches [1] [2].

I’m not sure if compression should be built into the protocol.
My humble understanding is that it should be built into different layers.

If bandwidth is a concern, then on-the-fly gzip compression like Apache's mod_deflate could be an option. But I expect fast propagation is often more important than a ~30% bandwidth reduction.
Bandwidth may be a concern for transmitting historical blocks. If you continue the proposal, I think you should focus on historical blocks.
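
On-the-fly compression at the transport layer, in the spirit of mod_deflate, can be sketched with zlib's streaming objects. This is an illustration only: `deflate_stream` and `inflate_stream` are hypothetical helpers, not part of Bitcoin Core or Apache, and a real connection would also need framing and flush points so the receiver is never left waiting for buffered data.

```python
import zlib


def deflate_stream(chunks, level=6):
    """Yield compressed bytes for an iterable of byte chunks."""
    compressor = zlib.compressobj(level)
    for chunk in chunks:
        out = compressor.compress(chunk)
        if out:  # the compressor may buffer and return nothing yet
            yield out
    yield compressor.flush()  # emit whatever is still buffered


def inflate_stream(chunks):
    """Yield decompressed bytes for an iterable of compressed chunks."""
    decompressor = zlib.decompressobj()
    for chunk in chunks:
        out = decompressor.decompress(chunk)
        if out:
            yield out
    tail = decompressor.flush()
    if tail:
        yield tail
```

Because one compressor is shared across all chunks, later messages benefit from redundancy with earlier ones, which fits the historical block download case better than the latency-sensitive relay of new blocks.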

If disk space is a concern, then the database layer should handle the compression.

Thanks
—
</jonas>


[1] https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2015-November/011692.html
[2] https://github.com/bitcoin/bitcoin/pull/6973