Discussion: [bitcoin-dev] "Compressed" headers stream
Riccardo Casatta via bitcoin-dev
2017-08-28 15:50:23 UTC
Hi everyone,

the Bitcoin headers are probably the most condensed and important piece of
data in the world, and demand for them is expected to grow.

When sending a stream of contiguous block headers, a common case in IBD and
for disconnected clients, I think there is a possible optimization of the
transmitted data: every header after the first could omit the previous-hash
field, because the receiver can compute it by double-hashing the previous
header (an operation it needs to perform anyway to verify PoW).
In a long stream, for example 2016 headers, the savings in bandwidth are
about 32/80 ~= 40%:
without compressed headers: 2016*80 = 161280 bytes
with compressed headers: 80 + 2015*48 = 96800 bytes
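
A minimal sketch of the reconstruction in Python (illustrative helper names,
assuming the standard 80-byte header layout with the prev-hash at bytes 4..36):

    import hashlib

    def dsha256(data: bytes) -> bytes:
        """Bitcoin's double SHA256."""
        return hashlib.sha256(hashlib.sha256(data).digest()).digest()

    def decompress_headers(first_header: bytes, compressed: bytes) -> list:
        """Rebuild full 80-byte headers from one full header followed by
        48-byte headers that omit the 32-byte prev-hash field.

        Full layout: version(4) | prev_hash(32) | merkle_root(32)
                     | timestamp(4) | bits(4) | nonce(4)
        """
        assert len(first_header) == 80 and len(compressed) % 48 == 0
        headers = [first_header]
        for i in range(0, len(compressed), 48):
            c = compressed[i:i + 48]
            prev_hash = dsha256(headers[-1])  # needed anyway for the PoW check
            headers.append(c[:4] + prev_hash + c[4:])
        return headers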

What do you think?


In OpenTimestamps calendars we are going to use this compression to give
light clients reasonably secure proofs (a full node gives higher security,
but that isn't feasible in all situations, for example for in-browser
verification).
To speed up the sync of a new client, Electrum starts with the download of a
file <https://headers.electrum.org/blockchain_headers> (~36MB) containing the
first 477637 headers.
For this kind of client, a common HTTP API serving fixed-position chunks could
be useful to leverage HTTP caching. For example, /headers/2016/0 returns the
headers from the genesis block up to and including header 2015, while
/headers/2016/1 gives headers 2016 through 4031.
Other endpoints could serve chunks of 20160 or 201600 headers, so that with
about 10 HTTP requests a client could fast-sync the headers, as sketched
below.
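
As an illustration, fetching such fixed chunks could look like this (the base
URL and endpoint shape are the hypothetical API proposed above, not an
existing service):

    import urllib.request

    BASE = "https://example.org"  # hypothetical server exposing the proposed API

    def fetch_chunk(chunk_size: int, index: int) -> bytes:
        """Fetch one fixed-position chunk of raw headers, e.g. /headers/2016/0.

        Fixed boundaries make every response except the last chunk immutable,
        so plain HTTP caches can serve them indefinitely.
        """
        url = "%s/headers/%d/%d" % (BASE, chunk_size, index)
        with urllib.request.urlopen(url) as resp:
            return resp.read()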
--
Riccardo Casatta - @RCasatta <https://twitter.com/RCasatta>
Greg Sanders via bitcoin-dev
2017-08-28 16:13:11 UTC
Is there any reason to believe that you need Bitcoin "full security" at all
for timestamping?

Greg Sanders via bitcoin-dev
2017-08-28 16:26:48 UTC
Well, if anything my question may bolster your use-case. If there's a
heavier chain that is invalid, I kind of doubt it matters for timestamping
reasons.

/digression

Post by Riccardo Casatta
Post by Greg Sanders via bitcoin-dev
Is there any reason to believe that you need Bitcoin "full security" at
all for timestamping?
This is a little off the main topic of the email, which is the bandwidth
savings in transmitting headers; any comments on that?
P.S. In my experience, timestamping is nowadays used to prove the date
and integrity of private databases containing a lot of value, so yes, in
those cases I would go with Bitcoin "full security"
Peter Todd via bitcoin-dev
2017-09-04 14:10:17 UTC
Post by Greg Sanders via bitcoin-dev
Well, if anything my question may bolster your use-case. If there's a
heavier chain that is invalid, I kind of doubt it matters for timestamping
reasons.
Timestamping can easily be *more* vulnerable to malicious miners than financial
applications, for a number of reasons, including the fact that there's no
financial feedback loop of miners destroying the value of the coins they
produce: timestamping is a non-financial piggy-back application that doesn't
directly interact with the Bitcoin economy, beyond a trivial number of
timestamp transactions.
--
https://petertodd.org 'peter'[:-1]@petertodd.org
Gregory Maxwell via bitcoin-dev
2017-08-28 17:12:15 UTC
You are leaving a lot of bytes on the table.

The bits field can only change every 2016 blocks (saves 4 bytes per header),
the timestamp cannot be less than the median of the last 11 and is
usually only a small amount over the last one (saves 2 bytes per
header), and the block version is usually one of the last few (saves 3
bytes per header).
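
To make the counts concrete, here is a rough sketch of such a field-level
encoding (a hypothetical wire format for illustration, not a concrete
proposal): with the prev-hash dropped, bits sent only at retarget boundaries,
a 2-byte timestamp delta, and a 1-byte index into recently seen versions, the
common case is 39 bytes per header:

    import struct

    def compress_header(h: bytes, prev: bytes, recent_versions: list,
                        height: int) -> bytes:
        """Encode header h relative to its predecessor prev."""
        version = struct.unpack_from("<i", h, 0)[0]
        timestamp = struct.unpack_from("<I", h, 68)[0]
        prev_time = struct.unpack_from("<I", prev, 68)[0]

        out = bytes([recent_versions.index(version)])  # 1 byte instead of 4
        out += h[36:68]                                # merkle root, 32 bytes
        # 2 bytes instead of 4; a real format would need an escape for
        # deltas that don't fit in a signed 16-bit integer
        out += struct.pack("<h", timestamp - prev_time)
        if height % 2016 == 0:                         # bits only at retarget
            out += h[72:76]
        out += h[76:80]                                # nonce, 4 bytes
        return out                                     # typically 39 bytes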

But all these improvements are just a constant factor. I think
you want the compact SPV proofs described in the appendix of the
sidechains whitepaper, which create log-scaling proofs.
Kalle Rosenbaum via bitcoin-dev
2017-08-28 17:54:59 UTC
Post by Gregory Maxwell via bitcoin-dev
The bits field can only change every 2016 blocks (saves 4 bytes per header),
the timestamp cannot be less than the median of the last 11 and is
usually only a small amount over the last one (saves 2 bytes per
header), and the block version is usually one of the last few (saves 3
bytes per header).
... and I guess the nonce can be arbitrarily truncated as well, just brute
force the missing bits :-P.
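
Taking the joke semi-seriously, recovering a truncated nonce is indeed just a
bounded search; a toy sketch, assuming the low 16 bits were dropped:

    import hashlib
    import struct

    def dsha256(data: bytes) -> bytes:
        return hashlib.sha256(hashlib.sha256(data).digest()).digest()

    def recover_nonce(header76: bytes, nonce_hi: int, target: int) -> bytes:
        """header76 is the header minus its 4-byte nonce; nonce_hi is the
        transmitted high 16 bits. Try the 2^16 dropped values until the
        PoW check passes."""
        for lo in range(1 << 16):
            candidate = header76 + struct.pack("<I", (nonce_hi << 16) | lo)
            # PoW check: hash read as a little-endian 256-bit integer
            if int.from_bytes(dsha256(candidate), "little") <= target:
                return candidate
        raise ValueError("no valid nonce found")

At ~65k double-SHA256 calls per header this trades a lot of CPU for 2 bytes,
hence the :-P.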
Post by Gregory Maxwell via bitcoin-dev
But all these improvements are just a constant factor. I think
you want the compact SPV proofs described in the appendix of the
sidechains whitepaper, which create log-scaling proofs.
I think my blog post on compact SPV proofs can also be helpful. It
tries to make the rather terse formulations in the sidechains paper a bit
easier to grasp for normal people.

http://popeller.io/index.php/2016/09/15/compact-spv-proofs/

Kalle
Peter Todd via bitcoin-dev
2017-09-04 14:06:44 UTC
Post by Gregory Maxwell via bitcoin-dev
But all these improvements are just a constant factor. I think
you want the compact SPV proofs described in the appendix of the
sidechains whitepaper, which create log-scaling proofs.
Note that I'm already planning on OpenTimestamps having infrastructure for
trusted validity attestations; log scaling proofs alone only prove total work,
not validity. Timestamping has all kinds of very dubious security properties
when done via proof-of-work, due to various ways that miners can get away with
inaccurate block times. In particular, setting a block time backwards is
something that miners can do, particularly with majority hashing power, which
is the exact thing we're trying to prevent in a timestamp proof.

This all makes me dubious about risking further weakening of this already weak
security with compact SPV proofs; we'd need a lot more analysis to understand
what we're risking. Also note that we can ship a known-good
sum-merkle-tree tip hash with the software, which further reduces the initial
download bandwidth needed to get the block headers, on top of the obviously
safe eliding of redundant hashes.
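
A simplified sketch of that last idea, with a plain tip-hash checkpoint
instead of the sum-merkle-tree tip (the constants are placeholders):

    import hashlib

    # Placeholders; a release would ship the real height and hash.
    KNOWN_GOOD_HEIGHT = 477636
    KNOWN_GOOD_HASH = bytes.fromhex("00" * 32)

    def dsha256(data: bytes) -> bytes:
        return hashlib.sha256(hashlib.sha256(data).digest()).digest()

    def matches_checkpoint(headers: list) -> bool:
        """After linking and PoW-checking the headers, one comparison
        against the shipped constant rules out a fabricated chain up to
        the checkpoint height."""
        if len(headers) <= KNOWN_GOOD_HEIGHT:
            return False
        return dsha256(headers[KNOWN_GOOD_HEIGHT]) == KNOWN_GOOD_HASH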
--
https://petertodd.org 'peter'[:-1]@petertodd.org