TL;DR I'll be updating the fast Merkle-tree spec to use a different

IV, using (for infrastructure compatibility reasons) the scheme

provided by Peter Todd.

This is a specific instance of a general problem where you cannot

trust scripts given to you by another party. Notice that we run into

the same sort of problem when doing key aggregation, in which you must

require the other party to prove knowledge of the discrete log before

using their public key, or else key cancellation can occur.
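
To make the key cancellation point concrete, here is a minimal toy sketch. It does not use secp256k1: a multiplicative group modulo a prime stands in for the curve, aggregation is a plain product of public keys, and the modulus, base, and secrets are all hypothetical values chosen for illustration.

```python
# Toy illustration only: integers modulo a prime stand in for secp256k1,
# and "aggregation" is a plain product of public keys (assumptions).
P = 2**127 - 1   # a Mersenne prime used as the toy modulus
G = 3            # hypothetical base element


def pubkey(secret: int) -> int:
    """Public key for a secret exponent in the toy group."""
    return pow(G, secret, P)


def aggregate(*keys: int) -> int:
    """Naive key aggregation: multiply the public keys together."""
    out = 1
    for k in keys:
        out = out * k % P
    return out


alice_secret = 0x1234567890ABCDEF
alice_pub = pubkey(alice_secret)

# The attacker sees alice_pub first, then announces a "key" whose discrete
# log they do not know: their real key divided by Alice's key.
attacker_secret = 0xFEDCBA0987654321
rogue_pub = pubkey(attacker_secret) * pow(alice_pub, -1, P) % P

joint = aggregate(alice_pub, rogue_pub)

# Alice's contribution cancels, so the attacker alone controls the joint key.
assert joint == pubkey(attacker_secret)
```

The attacker could not produce a proof of knowledge of the discrete log of rogue_pub, which is why demanding that proof before aggregating blocks this.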

With script it is a little bit more complicated as you might want

zero-knowledge proofs of hash pre-images for HTLCs as well as proofs

of DL knowledge (signatures), but the basic idea is the same. Multi-

party wallet level protocols for jointly constructing scriptPubKeys

should require a 'delinearization' step that proves knowledge of

information necessary to complete each part of the script, as part of

proving the safety of a construct.

I think my hangup before in understanding the attack you describe was

in actualizing it into a practical attack that actually escalates the

attacker's capabilities. If the attacker can get you to agree to a

MAST policy that is nothing more than a CHECKSIG over a key they

presumably control, then they don't need to do any complicated

grinding. The attacker in that scenario would just actually specify a

key they control and take the funds that way.

Where this presumably leads to an actual exploit is when you specify a

script that a curious counter-party actually takes the time to

investigate and believes to be secure. For example, a script that

requires a signature or pre-image revelation from that counter-party.

That would require grinding not a few bytes, but at minimum 20-33

bytes for either a HASH160 image or the counter-party's key.

If I understand the revised attack description correctly, then there

is a small window in which the attacker can create a script less than

55 bytes in length, where nearly all of the first 32 bytes are

selected by the attacker, yet nevertheless the script seems safe to

the counter-party. The smallest such script I was able to construct

was the following:

<fake-pubkey> CHECKSIGVERIFY HASH160 <preimage> EQUAL

This is 56 bytes and requires only 7 bits of grinding in the fake

pubkey. But 56 bytes is too large. Switching to secp256k1 serialized

32-byte pubkeys (in a script version upgrade, for example) would

reduce this to the necessary 55 bytes with 0 bits of grinding. A

smaller variant is possible:

DUP HASH160 <fake-pubkey-hash> EQUALVERIFY CHECKSIGVERIFY HASH160 <preimage> EQUAL

This is 46 bytes, but requires grinding 96 bits, which is a bit less

plausible.

Belts and suspenders are not so terrible together, however, and I

think there is enough of a justification here to look into modifying

the scheme to use a different IV for hash tree updates. This would

prevent even the above implausible attacks.

*Post by Mark Friedenbach via bitcoin-dev*

I've been puzzling over your email since receiving it. I'm not sure it

is possible to perform the attack you describe with the tree structure

specified in the BIP. If I may rephrase your attack, I believe you are

Want: An innocuous script and a malign script for which

double-SHA256(innocuous)

is equal to either

fast-SHA256(double-SHA256(malign) || r) or

fast-SHA256(r || double-SHA256(malign))

or fast-SHA256(fast-SHA256(double-SHA256(malign) || r1) || r0)

or fast-SHA256(fast-SHA256(r1 || double-SHA256(malign)) || r0)

or ...

where r is a freely chosen 32-byte nonce. This would allow the

attacker to reveal the innocuous script before funds are sent to the

MAST, then use the malign script to spend.

Because of the double-SHA256 construction I do not see how this can be

accomplished without a full break of SHA256.

The particular scenario I'm imagining is a collision between

double-SHA256(innocuous)

and

fast-SHA256(fast-SHA256(fast-SHA256(double-SHA256(malign) || r2) || r1) || r0).

where innocuous is a Bitcoin Script that is between 32 and 55 bytes long.

Observe that when data is less than 55 bytes then double-SHA256(data) = fast-SHA256(fast-SHA256(padding-SHA256(data)) || 0x8000...100) (which is really the crux of the matter).
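
The identity above can be checked numerically. The following is a minimal sketch, assuming that fast-SHA256 means a single SHA-256 compression of a 64-byte block from the standard IV with no padding, and that padding-SHA256 means the standard single-block SHA-256 padding (which fits any input of at most 55 bytes). The function names and the sample data are hypothetical; the constants are derived from the first primes as FIPS 180-4 defines them rather than typed in.

```python
import hashlib
import math
import struct

MASK = 0xFFFFFFFF


def _primes(n):
    """First n primes by trial division."""
    out, c = [], 2
    while len(out) < n:
        if all(c % p for p in out):
            out.append(c)
        c += 1
    return out


def _icbrt(x):
    """Floor of the cube root of a non-negative integer (binary search)."""
    lo, hi = 0, 1 << (x.bit_length() // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= x:
            lo = mid
        else:
            hi = mid - 1
    return lo


# Fractional bits of square roots (IV) and cube roots (K) of the first primes.
PRIMES = _primes(64)
IV = [math.isqrt(p << 64) & MASK for p in PRIMES[:8]]
K = [_icbrt(p << 96) & MASK for p in PRIMES]


def _rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK


def fast_sha256(block: bytes, iv=IV) -> bytes:
    """One SHA-256 compression of a 64-byte block, with no padding."""
    assert len(block) == 64
    w = list(struct.unpack(">16I", block))
    for i in range(16, 64):
        s0 = _rotr(w[i - 15], 7) ^ _rotr(w[i - 15], 18) ^ (w[i - 15] >> 3)
        s1 = _rotr(w[i - 2], 17) ^ _rotr(w[i - 2], 19) ^ (w[i - 2] >> 10)
        w.append((w[i - 16] + s0 + w[i - 7] + s1) & MASK)
    a, b, c, d, e, f, g, h = iv
    for i in range(64):
        big_s1 = _rotr(e, 6) ^ _rotr(e, 11) ^ _rotr(e, 25)
        ch = (e & f) ^ (~e & g)
        t1 = (h + big_s1 + ch + K[i] + w[i]) & MASK
        big_s0 = _rotr(a, 2) ^ _rotr(a, 13) ^ _rotr(a, 22)
        maj = (a & b) ^ (a & c) ^ (b & c)
        t2 = (big_s0 + maj) & MASK
        h, g, f, e, d, c, b, a = g, f, e, (d + t1) & MASK, c, b, a, (t1 + t2) & MASK
    state = [(x + y) & MASK for x, y in zip(iv, (a, b, c, d, e, f, g, h))]
    return struct.pack(">8I", *state)


def padding_sha256(data: bytes) -> bytes:
    """Pad data of at most 55 bytes into a single 64-byte SHA-256 block."""
    assert len(data) <= 55
    zeros = b"\x00" * (55 - len(data))
    return data + b"\x80" + zeros + struct.pack(">Q", 8 * len(data))


# The 0x8000...100 tail: padding that follows any 32-byte message.
TAIL = b"\x80" + b"\x00" * 29 + b"\x01\x00"

data = bytes(range(40))  # hypothetical 40-byte stand-in for a script
inner = fast_sha256(padding_sha256(data))
outer = fast_sha256(inner + TAIL)

assert inner == hashlib.sha256(data).digest()
assert outer == hashlib.sha256(hashlib.sha256(data).digest()).digest()
```

Because the outer step is itself a fast-SHA256 over a 64-byte input, a double-SHA256 leaf hash of a short script has the same shape as an interior node whose right child happens to be the 0x8000...100 tail.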

Therefore, to get our collision it suffices to find a collision between

padding-SHA256(innocuous)

and

fast-SHA256(double-SHA256(malign) || r2) || r1

r1 can freely be set to the second half of padding-SHA256(innocuous), so it suffices to find a collision between

fast-SHA256(double-SHA256(malign) || r2)

and the first half of padding-SHA256(innocuous) which is equal to the first 32 bytes of innocuous.

Imagine the first opcode of innocuous is the push of a value that the attacker claims to be his 33-byte public key.

So long as the attacker doesn't need to prove that they know the discrete log of this pubkey, they can grind r2 until the result of fast-SHA256(double-SHA256(malign) || r2) contains the correct first couple of bytes for the script header and the opcode for a 33-byte push. I believe that is only about 3 or 4 bytes that they need to grind out.
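
The grinding step can be sketched as a simple search loop. This is only an illustration under stated assumptions: ordinary SHA-256 from hashlib stands in for fast-SHA256 (the cost per attempt is comparable, but the outputs differ), the malign script is a placeholder string, and the two-byte target prefix (0x21 for a 33-byte push, then 0x02 as a compressed-pubkey lead byte) is a hypothetical choice, shorter than the 3 or 4 bytes estimated above so that the search finishes quickly.

```python
import hashlib
from itertools import count


def stand_in_hash(block: bytes) -> bytes:
    """Stand-in for fast-SHA256 (assumption): ordinary SHA-256 of the block."""
    return hashlib.sha256(block).digest()


def grind(leaf_hash: bytes, prefix: bytes):
    """Try successive 32-byte nonces until the node hash starts with prefix."""
    for tries in count(1):
        r2 = tries.to_bytes(32, "big")
        node = stand_in_hash(leaf_hash + r2)
        if node.startswith(prefix):
            return r2, node, tries


# Hypothetical malign script; only its double-SHA256 enters the tree.
malign = b"hypothetical malign script"
malign_hash = hashlib.sha256(hashlib.sha256(malign).digest()).digest()

# Hypothetical target: push-33 opcode followed by a compressed-pubkey byte.
prefix = b"\x21\x02"
r2, node, tries = grind(malign_hash, prefix)
print(f"found after {tries} attempts: {node.hex()}")
```

Each extra byte of required prefix multiplies the expected work by 256, so a 2-byte target takes about 2^16 attempts on average and a 4-byte target about 2^32.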