If the 20-byte SHA1 is now considered insecure (with good reason), what about RIPEMD-160, which is the foundation of Bitcoin addresses?
Date: Fri, 24 Feb 2017 11:04:54 +0100
Subject: Re: [bitcoin-dev] SHA1 collisions make Git vulnerable to attacks by third-parties, not just repo maintainers
Content-Type: text/plain; charset="UTF-8"
Post by Aymeric Vitte via bitcoin-dev:
I have not worked on this for some time, so these are just thoughts,
but maybe it can render things much more difficult
than computing two files until the same hash is found.
You basically rely on the idea that specific collisions are more
difficult to find. This trick or similar tricks will not help. (And
actually, the more files you add to the hash, the more freedom you give
the attacker.)
Even if certain collisions are more difficult to find today (which is
certainly true), the general rule is that someone will prove you wrong
in a year.
Even if we ignore security entirely, switching to a new hash function is
much simpler than trying to fix the usage of a broken hash function.
Relying on SHA1 is hopeless. We have to get rid of it.
Best,
Tim
------------------------------
Message: 2
Date: Fri, 24 Feb 2017 16:18:43 +0100
Subject: Re: [bitcoin-dev] SHA1 collisions make Git vulnerable to attacks by third-parties, not just repo maintainers
Content-Type: text/plain; charset=utf-8
I am not sure that you really read in depth what I sent: the claim that
hashing files continuously, rather than hashing each step independently,
just gives more latitude to the attacker cannot be true when the attacker
has absolutely no control over the past files.
I did not write this as a workaround to fix SHA1, which will be dead
sooner or later, but as a general concept that could possibly help with
whatever hash function you are using, for objects that are not frozen but
still growing (the original email stating that trees might be worse
candidates for collisions reminded me of this). Indeed it makes no sense
to patch SHA1 or play around with it, but this kind of proposal could
accompany even the defunct one.
The drawback is that you have to keep the hash state when you close the
latest hash computation, in order to start the next one from it.
Then the question is: knowing the hash state, is it as easy to find a
collision between two files that will be hashed in the next round as it is
to find a collision between two files alone? Especially since you can
probably also modify the hash state with some unpredictable patterns.
Most likely the answer is: no, it's (astronomically?) more difficult
Please take this as a suggestion that might be explored (ps: I have the
code for this if needed) rather than an affirmation. I am still amazed,
as shown in the few links provided (among others), that each time I raise
this subject nobody really pays attention ("what's the use case?", etc.),
and by the fact that it's apparently used by only one project in the
world and not supported by any library.
Post by Aymeric Vitte via bitcoin-dev:
I have not worked on this for some time, so these are just thoughts,
but maybe it can render things much more difficult
than computing two files until the same hash is found.
You basically rely on the idea that specific collisions are more
difficult to find. This trick or similar tricks will not help. (And
actually, the more files you add to the hash, the more freedom you give
the attacker.)
Even if certain collisions are more difficult to find today (which is
certainly true), the general rule is that someone will prove you wrong
in a year.
Even if we ignore security entirely, switching to a new hash function is
much simpler than trying to fix the usage of a broken hash function.
Relying on SHA1 is hopeless. We have to get rid of it.
Best,
Tim
_______________________________________________
bitcoin-dev mailing list
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
--
Zcash wallets made simple: https://github.com/Ayms/zcash-wallets
Bitcoin wallets made simple: https://github.com/Ayms/bitcoin-wallets
Get the torrent dynamic blocklist: http://peersm.com/getblocklist
Check the 10 M passwords list: http://peersm.com/findmyass
Anti-spies and private torrents, dynamic blocklist: http://torrent-live.org
Peersm : http://www.peersm.com
torrent-live: https://github.com/Ayms/torrent-live
node-Tor : https://www.github.com/Ayms/node-Tor
GitHub : https://www.github.com/Ayms
------------------------------
Message: 3
Date: Fri, 24 Feb 2017 17:30:49 +0100
Subject: Re: [bitcoin-dev] SHA1 collisions make Git vulnerable to attacks by third-parties, not just repo maintainers
Content-Type: text/plain; charset="UTF-8"
Post by Aymeric Vitte via bitcoin-dev:
I am not sure that you really read in depth what I sent: the claim that
hashing files continuously, rather than hashing each step independently,
just gives more latitude to the attacker cannot be true when the attacker
has absolutely no control over the past files.
What prevents the attacker from providing different past files when talking
to parties who are still in the initial state?
Post by Aymeric Vitte via bitcoin-dev:
Then the question is: knowing the hash state, is it as easy to find a
collision between two files that will be hashed in the next round as it is
to find a collision between two files alone?
With the original usage of the hash function, the hash state is always
the initial state. Here the attacker even has some control over the hash
state. In other words, if the original use of the hash function
was vulnerable, then your scheme is vulnerable for the initial state.
Concrete attack: If you can find x != y with H(x) = H(y), then you can
also find m, x != y, with H(m||x) = H(m||y), just by setting m = "".
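To make the argument concrete, here is a minimal Python sketch (my own
illustration, not part of the original mail): continuing an incremental
hash from the state reached after some past input m and then absorbing x
is the same as hashing m||x, so with m = "" the chained scheme degenerates
to a plain hash and inherits any ordinary collision.

    import hashlib

    def chained_digest(past, new):
        """Continue the hash from the state reached after the past files,
        then absorb the newly supplied file."""
        h = hashlib.sha1()   # SHA-1 only because it is the hash under discussion
        h.update(past)       # state after the past files
        h.update(new)        # attacker-supplied file, hashed from that state
        return h.hexdigest()

    # A party still in the initial state has absorbed nothing yet (past = b""),
    # so the scheme is just H(new); any x != y with H(x) == H(y) (e.g. the
    # shattered.io PDFs) is therefore also a collision for the chained scheme.
    assert chained_digest(b"", b"x") == hashlib.sha1(b"x").hexdigest()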
Not sure if this is the right place to discuss that issue though...
Best,
Tim
------------------------------
Message: 4
Date: Fri, 24 Feb 2017 18:29:50 +0100
Subject: Re: [bitcoin-dev] SHA1 collisions make Git vulnerable to attacks by third-parties, not just repo maintainers
Content-Type: text/plain; charset=windows-1252
Apparently we are not discussing the same thing.
Maybe I did not provide the right links (reading them again, I myself
don't find them so clear); see again
https://github.com/whatwg/streams/issues/33#issuecomment-28045860
Given a stream of chunks a - b - c - d, the successive digests are:
hash(a)
hash(a+b)
etc.
update(a) --> keep the remaining bytes a_ (+ hash state 1) --> digest(a) = hash(a)
update(a_ + b) from hash state 1 --> keep the remaining bytes b_ (+ hash state 2) --> digest(a_ + b) = hash(a+b)
etc.
Basically this is similar to a real-time progressive hash of the chunks of
a file that you are streaming, where you don't know what will come next
(as opposed to hashing a file that you already have). The same idea could
apply to trees:
hash(a)
hash(hash(a) + hash(b))
etc.
There is no initial state, and the attacker can't modify what was
already hashed; to make it more difficult, you can probably also modify
hash state N.
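For what it's worth, the scheme above maps directly onto the incremental
API of common hash libraries. A minimal Python sketch of the idea (my own
illustration, not the code referred to above), using copy() to snapshot
the running state so that each round's digest can be closed without losing
the state needed for the next round:

    import hashlib

    def rolling_digests(chunks):
        """Progressively hash a stream of chunks: after each chunk, emit the
        digest of everything seen so far while keeping the running hash state
        for the next round."""
        h = hashlib.sha1()                        # SHA-1 only as the hash under discussion
        digests = []
        for chunk in chunks:
            h.update(chunk)                       # absorb the new chunk into the running state
            digests.append(h.copy().hexdigest())  # close this round's digest, keep the state
        return digests

    # hash(a), hash(a+b), hash(a+b+c), hash(a+b+c+d)
    print(rolling_digests([b"a", b"b", b"c", b"d"]))

The drawback mentioned above is visible here: the live object h has to be
kept around between rounds, and a library like Python's hashlib gives you
no way to export that internal state to disk or to another party.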
Post by Aymeric Vitte via bitcoin-dev:
I am not sure that you really read in depth what I sent: the claim that
hashing files continuously, rather than hashing each step independently,
just gives more latitude to the attacker cannot be true when the attacker
has absolutely no control over the past files.
What prevents the attacker from providing different past files when talking
to parties who are still in the initial state?
Post by Aymeric Vitte via bitcoin-dev:
Then the question is: knowing the hash state, is it as easy to find a
collision between two files that will be hashed in the next round as it is
to find a collision between two files alone?
With the original usage of the hash function, the hash state is always
the initial state. Here the attacker even has some control over the hash
state. In other words, if the original use of the hash function
was vulnerable, then your scheme is vulnerable for the initial state.
Concrete attack: If you can find x != y with H(x) = H(y), then you can
also find m, x != y, with H(m||x) = H(m||y), just by setting m = "".
Not sure if this is the right place to discuss that issue though...
Best,
Tim
------------------------------
Message: 5
Date: Fri, 24 Feb 2017 14:20:19 -0800
Cc: Bitcoin Protocol Discussion
Subject: Re: [bitcoin-dev] A Better MMR Definition
Content-Type: text/plain; charset="utf-8"
So your idea is to cluster entries by entry time because newer things are
more likely to leave and updating multiple things near each other is
cheaper?
That can be done with my tool. Instead of using hashes for the values being
stored, you use position entries. The first entry gets a value of all
zeros, the next one a one followed by all zeros, then the next two
correspond to the first two with the second bit flipped to one, then the
next four the first four with the third bit flipped to one, etc. It
probably performs a little bit better to do it two bits at a time instead
of one so that the entries are 00, 01, 10, 11, 0001, 0010, 0011, 0101,
0110, 0111, 1001, etc. If you were to really use this you'd probably want
to add some optimizations to use the fact that the terminals fit in 64
bits instead of 256, but it mostly works unchanged, and gets whatever
benefits there are to this clustering plus the high performance
implementation tricks I've built which I keep complaining that nobody's
giving feedback on.
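For illustration, here is a small Python sketch of one literal reading of
the one-bit-at-a-time construction above (my own interpretation, not code
from the tool itself): key 0 is all zeros, key 1 flips the first bit, and
each subsequent doubling repeats all previous keys with the next bit
flipped to one.

    def position_keys(n, width=8):
        """Generate the first n position entries as bit strings: start from
        all zeros and, for each doubling, repeat every previous key with the
        next bit set to one."""
        keys = ["0" * width]
        bit = 0
        while len(keys) < n:
            keys += [k[:bit] + "1" + k[bit + 1:] for k in keys]  # flip bit `bit` in all previous keys
            bit += 1
        return keys[:n]

    print(position_keys(8, width=4))
    # -> ['0000', '1000', '0100', '1100', '0010', '1010', '0110', '1110']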
I'm not sold on this being a win: The empirical access patterns are
unknown, it requires an extra cache miss per lookup to find the entry
number, it may be that everything is optimized well enough without it for
there to be no meaningful gains, and it's a bunch of extra complexity. What
should be done is that a plain vanilla UTXO set solution is optimized as
well as it can be first, and then the insertion ordering trick is tried as
an optimization to see if it's an improvement. Without that baseline
there's no meaningful basis for comparison, and I'm quite confident that a
naive implementation which just allocates individual nodes will
underperform the thing I've come up with, even without adding optimizations
related to fitting in 64 bits.
Post by Aymeric Vitte via bitcoin-dev:
Glad we're on the same page with regard to what's possible in TXO
commitments.
Secondly, am I correct in saying your UTXO commitments scheme requires
random access? While you describe it as a "merkle set", obviously to be
merkelized it'll have to have an ordering of some kind. What do you propose
that ordering to be?
The ordering is by the bits in the hash. Technically it's a Patricia
Trie.
Post by Aymeric Vitte via bitcoin-dev:
I'm using 'merkle tree' to refer to basically anything with a hash root.
The hash of what? The values in the set?
Post by Aymeric Vitte via bitcoin-dev:
Maybe more specifically, what exact values do you propose to be in the set?
Post by Aymeric Vitte via bitcoin-dev:
That is unspecified in the implementation, it just takes a 256 bit value
which is presumably a hash of something. The intention is to nail down a
simple format and demonstrate good performance and leave those semantics to
a higher layer. The simplest thing would be to hash together the txid and
output number.
Ok, so let's assume the values in the set are the unspent outpoints.
Since we're ordering by the hash of the values in the set, outpoints will
be
distributed uniformly in the set, and thus the access pattern of data in
the
set is uniform.
Now let's fast-forward 10 years. For the sake of argument, assume that for
every 1 UTXO in the set that corresponds to funds in someone's wallet that
are
likely to be spent, there are 2^12 = 4096 UTXO's that have been permanently
lost (and/or created in spam attacks) and thus will never be spent.
Since lost UTXO's are *also* uniformly distributed, if I'm processing a new
block that spends 2^12 = 4096 UTXO's, on average for each UTXO spent, I'll
have to update log2(4096) = 12 more digests than I would have had those
"dead"
UTXO's not existed.
Concretely, imagine our UTXO set had just 8 values in it, and we were
updating two of them (the leaves marked X below):
                #
               / \
              /   \
             /     \
            /       \
           /         \
          #           #
         / \         / \
        /   \       /   \
       #     .     .     #
      / \   / \   / \   / \
     .   X .   . .   . X   .
To mark two coins as spent, we've had to update 5 inner nodes.
Now let's look at what happens in an insertion-ordered TXO commitment
scheme.
For the sake of argument, let's assume the best possible case, where every
UTXO spent in that same block was recently created. Since the UTXO's are
recently created, chances are almost every single one of those "dead"
UTXO's will have been created further in the past. Thus, since this is an
insertion-ordered data structure, those UTXO's exist in an older part of
the data structure that our new block doesn't need to modify at all.
Concretely, again let's imagine a TXO commitment with 8 values in it, and
two recently created coins being spent (marked X below):
                #
               / \
              /   \
             /     \
            /       \
           /         \
          .           #
         / \         / \
        /   \       /   \
       .     .     .     #
      / \   / \   / \   / \
     .   . .   . .   . X   X
To mark two coins as spent, we've only had to update 3 inner nodes; while
our
tree is higher with those lost coins, those extra inner nodes are amortised
across all the coins we have to update.
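A quick way to check the two diagrams is to count, for a perfect binary
Merkle tree, how many distinct inner-node digests have to be recomputed
for a given set of modified leaves. A minimal Python sketch (my own
illustration of the counting argument, not code from either proposal):

    def inner_nodes_touched(leaf_indices, depth):
        """Count the distinct inner nodes of a perfect binary Merkle tree of
        the given depth whose digests must be recomputed when the given
        leaves change."""
        touched = set()
        for leaf in leaf_indices:
            node = leaf
            for level in range(depth - 1, -1, -1):
                node //= 2                  # step up to the parent
                touched.add((level, node))  # identify the ancestor by (level, index)
        return len(touched)

    # The two 8-leaf (depth 3) examples above:
    print(inner_nodes_touched([1, 6], depth=3))  # uniformly spread spends         -> 5
    print(inner_nodes_touched([6, 7], depth=3))  # adjacent, recently created TXOs -> 3

Running the same count with larger trees shows the amortisation effect
described above: uniformly spread updates each touch close to log2(n)
distinct nodes, while a batch of adjacent updates shares most of its path
to the root.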
The situation gets even better when we look at the *new* UTXO's that our
block creates. Suppose our UTXO set has size n. To mark a single coin as
spent, we have to update log2(n) inner nodes. We do get to amortise this a
bit at the top levels in the tree, but even if we assume the amortisation
is totally free, we're updating at least log2(n) - log2(m) inner nodes
"under" the amortised nodes at the top of the tree for *each* new node
(where m is the number of coins the block adds).
Meanwhile with an insertion-ordered TXO commitment, each new UTXO added to
the
data set goes in the same place - the end. So almost none of the existing
data
needs to be touched to add the new UTXOs. Equally, the hashing required
for the
new UTXO's can be done in an incremental fashion that's very L1/L2 cache
friendly.
tl;dr: Precisely because access patterns in TXO commitments are *not*
uniform,
I think we'll find that from an L1/L2/etc cache perspective alone, TXO
commitments will result in better performance than UTXO commitments.
Now it is true that Bitcoin's current design means we'll need a map of
confirmed outpoints to TXO insertion order indexes. But it's not
particularly
hard to add that "metadata" to transactions on the P2P layer in the same
way
that segwit added witnesses to transactions without modifying how txids
were
calculated; if you only connect to peers who provide you with TXO index
information in blocks and transactions, you don't need to keep that map
yourself.
It's just an 8-byte max index rather than a 40-byte outpoint.
------------------------------
_______________________________________________
bitcoin-dev mailing list
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
End of bitcoin-dev Digest, Vol 21, Issue 34
*******************************************