Discussion:
Merkle branch verification & tail-call semantics for generalized MAST
Mark Friedenbach via bitcoin-dev
2017-09-07 00:38:55 UTC
I would like to propose two new script features to be added to the
bitcoin protocol by means of soft-fork activation. These features are
a new opcode, MERKLE-BRANCH-VERIFY (MBV) and tail-call execution
semantics.

In brief summary, MERKLE-BRANCH-VERIFY allows script authors to force
redemption to use values selected from a pre-determined set committed
to in the scriptPubKey, but without requiring revelation of unused
elements in the set for both enhanced privacy and smaller script
sizes. Tail-call execution semantics allows a single level of
recursion into a subscript, providing properties similar to P2SH while
at the same time being more flexible.
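A minimal Python sketch of the check MERKLE-BRANCH-VERIFY would perform, recomputing a root from a leaf and its sibling path. Plain SHA-256 is used here for illustration; the actual proposal specifies a different fast Merkle tree hash, so treat this as a model of the idea rather than the BIP's exact construction:

```python
import hashlib

def H(data: bytes) -> bytes:
    # Plain SHA-256 stands in for the BIP's fast Merkle hash.
    return hashlib.sha256(data).digest()

def merkle_root(leaf: bytes, path: list, index: int) -> bytes:
    """Recompute the root from a leaf, its sibling hashes, and its position.

    At each level, the low bit of `index` says whether the running hash
    is the right (1) or left (0) child.
    """
    h = H(leaf)
    for sibling in path:
        if index & 1:
            h = H(sibling + h)
        else:
            h = H(h + sibling)
        index >>= 1
    return h

# Build a tiny 4-leaf tree and verify a branch for leaf index 2,
# without revealing the other leaves themselves.
leaves = [b'key-a', b'key-b', b'key-c', b'key-d']
hs = [H(l) for l in leaves]
l01, l23 = H(hs[0] + hs[1]), H(hs[2] + hs[3])
root = H(l01 + l23)

# The proof for leaf 2 is just two sibling hashes: hs[3], then l01.
assert merkle_root(b'key-c', [hs[3], l01], 2) == root
```

The privacy property follows directly: the spender reveals one leaf and a logarithmic number of sibling hashes, and the unused set elements never appear on chain.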

These two features together are enough to enable a range of
applications such as tree signatures (minus Schnorr aggregation) as
described by Pieter Wuille [1], and a generalized MAST useful for
constructing private smart contracts. It also brings privacy and
fungibility improvements to users of counter-signing wallet/vault
services as unique redemption policies need only be revealed if/when
exceptional circumstances demand it, leaving most transactions looking
the same as any other MAST-enabled multi-sig script.

I believe that the implementation of these features is simple enough,
and the use cases compelling enough, that we could see a BIP 8/9
rollout of these features in relatively short order, perhaps before
the end of the year.

I have written three BIPs to describe these features, and their
associated implementation, for which I now invite public review and
discussion:

Fast Merkle Trees
BIP: https://gist.github.com/maaku/41b0054de0731321d23e9da90ba4ee0a
Code: https://github.com/maaku/bitcoin/tree/fast-merkle-tree

MERKLEBRANCHVERIFY
BIP: https://gist.github.com/maaku/bcf63a208880bbf8135e453994c0e431
Code: https://github.com/maaku/bitcoin/tree/merkle-branch-verify

Tail-call execution semantics
BIP: https://gist.github.com/maaku/f7b2e710c53f601279549aa74eeb5368
Code: https://github.com/maaku/bitcoin/tree/tail-call-semantics

Note: I have circulated this idea privately among a few people, and I
will note that there is one piece of feedback which I agree with but
is not incorporated yet: there should be a multi-element MBV opcode
that allows verifying multiple items are extracted from a single
tree. It is not obvious how MBV could be modified to support this
without sacrificing important properties, or whether there should be a
separate multi-MBV opcode instead.

Kind regards,
Mark Friedenbach
Johnson Lau via bitcoin-dev
2017-09-08 09:21:22 UTC
Some comments with the tail-call execution semantics BIP:

Tail-call execution semantics require “unclean stake”, i.e. final stake with more than one item. However, “unclean stake” is invalid (not just non-standard) in BIP141, so you could only use it with legacy P2SH (which is totally pointless….). A different design like OP_EVAL might be needed, or you need a new witness script version.

I think you have also missed the sigOp counting of the executed script. As you can’t count it without executing the script, the current static analysability is lost. This was one of the reasons for OP_EVAL being rejected. Since sigOp is a per-block limit, any OP_EVAL-like operation means block validity will depend on the precise outcome of script execution (instead of just pass or fail), which is a layer violation.

(An alternative is to make sigOp a per-input limit instead of per-block limit, just like the 201 nOp limit. But this is a very different security model)

Witness script versioning is by design fully compatible with P2SH and BIP173, so there will be no hurdle for existing wallets to pay to BIP114. Actually it should be completely transparent to them.

For code complexity, the minimal BIP114 could be really simple, like <30 lines of code? It looks complex now because it does much more than simply hiding scripts in a hash.
Mark Friedenbach via bitcoin-dev
2017-09-12 02:03:42 UTC
My apologies for a delay in responding to emails on this list; I have
been fighting a cold.

(Also my apologies to Johnson Lau, as I forgot to forward this to the list.)
Tail-call execution semantics require "unclean stake" , i.e. final
stake with more than one item. However, "unclean stake" is invalid
(not just non-standard) in BIP141, so you could only use it with
legacy P2SH (which is totally pointless....). A different design
like OP_EVAL might be needed, or you need a new witness script
version.
I believe you meant "unclean stack," and you are correct. This was
also pointed out last Tuesday by a participant at the in-person
CoreDev meetup where the idea was presented.

This doesn't kill the idea, it just complicates the implementation
slightly. A simple fix would be to allow tail-recursion to occur if
the stack is not clean (as can happen with legacy P2SH as you point
out, or yet to be defined version 1+ forms of segwit script), OR if
there is a single item on the stack and the alt-stack is not empty.
For segwit v0 scripts you then have to move any arguments to the alt
stack before ending the redeem script, leaving just the policy script
on the main stack.
I think you have also missed the sigOp counting of the executed
script. As you can't count it without executing the script, the
current static analysability is lost. This was one of the reasons
for OP_EVAL being rejected. Since sigOp is a per-block limit, any
OP_EVAL-like operation means block validity will depend on the
precise outcome of script execution (instead of just pass or fail),
which is a layer violation.
I disagree with this design requirement.

The SigOp counting method used by bitcoin is flawed. It incorrectly
limits not the number of signature operations necessary to validate a
block, but rather the number of CHECKSIGs potentially encountered in
script execution, even if in an unexecuted branch. (Admittedly MAST
makes this less of an issue, but there are still useful compact
scripts that use if/else constructs to elide a CHECKSIG.) Nor will it
account for aggregation when that feature is added, or properly
differentiate between signature operations that can be batched and
those that can not.

Additionally there are other resources used by script that should be
globally limited, such as hash operations, which are not accounted for
at this time and cannot be statically assessed, even by the flawed
mechanism by which SigOps are counted. I have maintained for some time
that bitcoin should move from having multiple separate global limits
(weight and sigops, hashed bytes in XT/Classic/BCH) to a single linear
metric that combines all of these factors with appropriate
coefficients.
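A sketch of what such a single linear metric might look like. The coefficients and the function itself are invented for illustration; in practice they would be derived from validation benchmarks, not picked by hand:

```python
# Hypothetical coefficients; the email argues for *some* single linear
# metric, and the real weights would come from measured validation cost.
COST_PER_BASE_BYTE    = 4   # mirrors segwit's witness discount
COST_PER_WITNESS_BYTE = 1
COST_PER_SIGOP        = 50  # per signature operation actually executed
COST_PER_HASHED_BYTE  = 1   # hashing cost scales with bytes hashed

def tx_cost(base_bytes, witness_bytes, sigops_executed, bytes_hashed):
    """Single linear validation-cost metric combining transaction size,
    executed signature operations, and bytes fed into hash operations."""
    return (base_bytes      * COST_PER_BASE_BYTE
          + witness_bytes   * COST_PER_WITNESS_BYTE
          + sigops_executed * COST_PER_SIGOP
          + bytes_hashed    * COST_PER_HASHED_BYTE)

# A block-level rule would then bound the sum of tx_cost(...) over the
# block, instead of enforcing separate weight and sigop limits.
print(tx_cost(250, 150, 2, 500))  # -> 1750
```

Note that counting `sigops_executed` rather than CHECKSIGs present in the script is exactly what breaks static analysis, which is the trade-off debated in the following messages.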

A better way of handling this problem, which works for both SigOps and
HashOps, is to have the witness commit to the maximum resources
consumed by validation of the spend of the coin, to relay this data
with the transaction and include it in the SigHash, and to use the
committed maximum for block validation. This could be added in a
future script version upgrade. This change would also resolve the
issue that led to the clean stack rule in segwit, allowing future
versions of script to use tail-call recursion without involving the
alt-stack.

Nevertheless it is constructive feedback that the current draft of the
BIP and its implementation do not count SigOps at all. There are a
couple of ways this could be fixed: by evaluating the top-level script
and then doing static analysis of the resulting policy script, or by
running the script and counting operations actually performed.

Additionally, it is possible that we take this time to re-evaluate
whether we should be counting SigOps other than for legacy consensus
rule compliance. The speed of verification in secp256k1 has made
signature operations no longer the chief concern in block validation
times.
Witness script versioning is by design fully compatible with P2SH
and BIP173, so there will be no hurdle for existing wallets to pay
to BIP114. Actually it should be completely transparent to them.
This is correct. Your feedback will be incorporated.
For code complexity, the minimal BIP114 could be really simple, like
<30 lines of code? It looks complex now because it does much more
than simply hiding scripts in a hash.
Is there a repo that contains the latest implementation of BIP 114,
for comparison purposes?

Kind regards,
Mark Friedenbach
Bryan Bishop via bitcoin-dev
2017-09-12 02:13:24 UTC
On Mon, Sep 11, 2017 at 9:03 PM, Mark Friedenbach via bitcoin-dev wrote:
Post by Mark Friedenbach via bitcoin-dev
I believe you meant "unclean stack," and you are correct. This was
also pointed out last tuesday by a participant at the in-person
CoreDev meetup where the idea was presented.
http://diyhpl.us/wiki/transcripts/bitcoin-core-dev-tech/2017-09-07-merkleized-abstract-syntax-trees/
Post by Mark Friedenbach via bitcoin-dev
Post by Johnson Lau via bitcoin-dev
For code complexity, the minimal BIP114 could be really simple, like
<30 lines of code? It looks complex now because it does much more
than simply hiding scripts in a hash.
Is there a repo that contains the latest implementation of BIP 114,
for comparison purposes?
original bip114:
https://github.com/bitcoin/bips/blob/775f26c02696e772dac4060aa092d35dedbc647c/bip-0114.mediawiki
revised bip114: https://github.com/jl2012/bips/blob/vault/bip-0114.mediawiki
https://github.com/jl2012/bitcoin/commits/vault
from https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2017-September/014963.html

- Bryan
http://heybryan.org/
1 512 203 0507
Johnson Lau via bitcoin-dev
2017-09-12 08:55:59 UTC
Post by Mark Friedenbach via bitcoin-dev
My apologies for a delay in responding to emails on this list; I have
been fighting a cold.
(Also my apologies to Johnson Lau, as I forgot to forward this to the list.)
Tail-call execution semantics require "unclean stake" , i.e. final
stake with more than one item. However, "unclean stake" is invalid
(not just non-standard) in BIP141, so you could only use it with
legacy P2SH (which is totally pointless....). A different design
like OP_EVAL might be needed, or you need a new witness script
version.
I believe you meant "unclean stack," and you are correct. This was
also pointed out last tuesday by a participant at the in-person
CoreDev meetup where the idea was presented.
This doesn't kill the idea, it just complicates the implementation
slightly. A simple fix would be to allow tail-recursion to occur if
the stack is not clean (as can happen with legacy P2SH as you point
out, or yet to be defined version 1+ forms of segwit script), OR if
there is a single item on the stack and the alt-stack is not empty.
For segwit v0 scripts you then have to move any arguments to the alt
stack before ending the redeem script, leaving just the policy script
on the main stack.
This is ugly and actually broken, as different script paths may require different numbers of stack items, so you don't know how many OP_TOALTSTACK operations you need. It is easier to just use a new witness version.
Post by Mark Friedenbach via bitcoin-dev
I think you have also missed the sigOp counting of the executed
script. As you can't count it without executing the script, the
current static analysability is lost. This was one of the reasons
for OP_EVAL being rejected. Since sigOp is a per-block limit, any
OP_EVAL-like operation means block validity will depend on the
precise outcome of script execution (instead of just pass or fail),
which is a layer violation.
I disagree with this design requirement.
The SigOp counting method used by bitcoin is flawed. It incorrectly
limits not the number of signature operations necessary to validate a
block, but rather the number of CHECKSIGs potentially encountered in
script execution, even if in an unexecuted branch. (Admittedly MAST
makes this less of an issue, but there are still useful compact
scripts that use if/else constructs to elide a CHECKSIG.) Nor will it
account for aggregation when that feature is added, or properly
differentiate between signature operations that can be batched and
those that can not.
Additionally there are other resources used by script that should be
globally limited, such as hash operations, which are not accounted for
at this time and cannot be statically assessed, even by the flawed
mechanism by which SigOps are counted. I have maintained for some time
that bitcoin should move from having multiple separate global limits
(weight and sigops, hashed bytes in XT/Classic/BCH) to a single linear
metric that combines all of these factors with appropriate
coefficients.
I like the idea to have an unified global limit and suggested a way to do it (https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2017-January/013472.html). But I think this is off-topic here.
Post by Mark Friedenbach via bitcoin-dev
A better way of handling this problem, which works for both SigOps and
HashOps, is to have the witness commit to the maximum resources
consumed by validation of the spend of the coin, to relay this data
with the transaction and include it in the SigHash, and to use the
committed maximum for block validation. This could be added in a
future script version upgrade. This change would also resolve the
issue that led to the clean stack rule in segwit, allowing future
versions of script to use tail-call recursion without involving the
alt-stack.
Nevertheless it is constructive feedback that the current draft of the
BIP and its implementation do not count SigOps, at all. There are a
couple of ways this can be fixed by evaluating the top-level script
and then doing static analysis of the resulting policy script, or by
running the script and counting operations actually performed.
In any case, I think maintaining static analysability for global limit(s) is very important. Ethereum had to give up their DAO softfork plan at the last minute, exactly due to the lack of this: http://hackingdistributed.com/2016/06/28/ethereum-soft-fork-dos-vector/

Otherwise, one could attack relay and mining nodes by sending many small size txs with many sigops, forcing them to validate, and discard due to insufficient fees.

Technically it might be ok if we commit the total validation cost (sigop + hashop + whatever) as the first witness stack item, but that’d take more space and I’m not sure if it is desirable. Anyway, giving up static analysability for scripts is a fundamental change to our existing model.
Post by Mark Friedenbach via bitcoin-dev
Additionally, it is possible that we take this time to re-evaluate
whether we should be counting SigOps other than for legacy consensus
rule compliance. The speed of verification in secp256k1 has made
signature operations no longer the chief concern in block validation
times.
Without the limit I think we would be DoS-ed to death
Post by Mark Friedenbach via bitcoin-dev
Witness script versioning is by design fully compatible with P2SH
and BIP173, so there will be no hurdle for existing wallets to pay
to BIP114. Actually it should be completely transparent to them.
This is correct. Your feedback will be incorporated.
For code complexity, the minimal BIP114 could be really simple, like
<30 lines of code? It looks complex now because it does much more
than simply hiding scripts in a hash.
Is there a repo that contains the latest implementation of BIP 114,
for comparison purposes?
You can find it here: https://github.com/jl2012/bitcoin/commits/vault
https://github.com/jl2012/bitcoin/commit/f3f201d232d3995db38e09b171e4d1dea8d04ad2

But this does more than your proposal as it allows users adding extra scripts when spending a coin. The rationale is described in the revised BIP114:
https://github.com/jl2012/bips/blob/vault/bip-0114.mediawiki#Additional_scripts_in_witness

So to make it functionally comparable with your proposal, the IsMSV0Stack() function is not needed. The new 249-254 lines in interpreter.cpp could be removed. The new 1480-1519 lines could be replaced by a few lines copied from the existing P2WSH code. I can make a minimal version if you want to see what it looks like.
Post by Mark Friedenbach via bitcoin-dev
Kind regards,
Mark Friedenbach
Mark Friedenbach via bitcoin-dev
2017-09-12 19:57:10 UTC
Post by Johnson Lau via bitcoin-dev
This is ugly and actually broken, as different script path may
require different number of stack items, so you don't know how many
OP_TOALTSTACK do you need. Easier to just use a new witness version
DEPTH makes this relatively easy to do. Just repeat the following for
the maximum number of stack elements that might be used:

DEPTH 1SUB IF SWAP TOALTSTACK ENDIF

There are probably more compact alternatives.
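As a sanity check, the repeated gadget can be simulated with a toy stack machine. Python lists stand in for Script's main and alt stacks (top at the end); byte-vector semantics and minimal-encoding rules are ignored, so this only models the stack shuffling:

```python
def run_gadget(stack, altstack, repetitions):
    """Simulate `DEPTH 1SUB IF SWAP TOALTSTACK ENDIF` repeated N times.

    While more than one item remains, each pass moves the item *below*
    the top onto the altstack, so the top element (the policy script)
    survives on the main stack. Extra repetitions are harmless no-ops.
    """
    for _ in range(repetitions):
        if len(stack) - 1:                               # DEPTH 1SUB IF
            stack[-1], stack[-2] = stack[-2], stack[-1]  # SWAP
            altstack.append(stack.pop())                 # TOALTSTACK
    return stack, altstack

# Three witness arguments sitting under the policy script:
main, alt = run_gadget(['arg1', 'arg2', 'arg3', 'policy'], [], 5)
assert main == ['policy']
assert alt == ['arg3', 'arg2', 'arg1']
```

Running it with more repetitions than stack items shows why the gadget can simply be repeated for the maximum number of elements that might be used: once the stack is down to one item, further passes do nothing.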

Using a new script version is easier, but not faster. There's a number
of things that might be fixed in a v1 upgrade, and various design
decisions to sort out regarding specification of a witness version
(version in the witness rather than the scriptPubKey).

Tree signatures and MAST are immediately useful to many services,
however, and I would hate to delay usage by six months to a year or
more by serializing dependencies instead of doing them in parallel.
Post by Johnson Lau via bitcoin-dev
Otherwise, one could attack relay and mining nodes by sending many
small size txs with many sigops, forcing them to validate, and
discard due to insufficient fees.
Technically it might be ok if we commit the total validation cost
(sigop + hashop + whatever) as the first witness stack item
That is what I'm suggesting. And yes, there are changes that would
have to be made to the p2p layer and transaction processing to handle
this safely. I'm arguing that the cost of doing so is worth it, and a
better path forward.
Post by Johnson Lau via bitcoin-dev
Without the limit I think we would be DoS-ed to death
4MB of secp256k1 signatures takes 10s to validate on my 5 year old
laptop (125,000 signatures, ignoring public keys and other things that
would consume space). That's much less than bad blocks that can be
constructed using other vulnerabilities.
Post by Johnson Lau via bitcoin-dev
So to make it functionally comparable with your proposal, the
IsMSV0Stack() function is not needed. The new 249-254 lines in
interpreter.cpp could be removed. The new 1480-1519 lines could be
replaced by a few lines copied from the existing P2WSH code. I can
make a minimal version if you want to see how it looks like
That's alright, I don't think it's necessary to purposefully restrict
one to compare them head to head with the same features. They are
different proposals with different pros and cons.

Kind regards,
Mark Friedenbach
Karl Johan Alm via bitcoin-dev
2017-09-12 23:27:36 UTC
On Wed, Sep 13, 2017 at 4:57 AM, Mark Friedenbach via bitcoin-dev wrote:
Post by Mark Friedenbach via bitcoin-dev
Post by Johnson Lau via bitcoin-dev
Without the limit I think we would be DoS-ed to death
4MB of secp256k1 signatures takes 10s to validate on my 5 year old
laptop (125,000 signatures, ignoring public keys and other things that
would consume space). That's much less than bad blocks that can be
constructed using other vulnerabilities.
Sidenote-ish, but I also believe it would be fairly trivial to keep a
per UTXO tally and demand additional fees when trying to respend a
UTXO which was previously "spent" with an invalid op count. I.e. if
you sign off on an input for a tx that you know is bad, the UTXO in
question will be penalized proportionately to the wasted ops when
included in another transaction later. That would probably kill that
DoS attack as the attacker would effectively lose bitcoin every time,
even if it was postponed until they spent the UTXO. The only thing
clients would need to do is to add a fee rate penalty ivar and a
mapping of outpoint to penalty value, probably stored as a separate
.dat file. I think.
Peter Todd via bitcoin-dev
2017-09-13 09:41:07 UTC
Post by Karl Johan Alm via bitcoin-dev
Sidenote-ish, but I also believe it would be fairly trivial to keep a
per UTXO tally and demand additional fees when trying to respend a
UTXO which was previously "spent" with an invalid op count. I.e. if
you sign off on an input for a tx that you know is bad, the UTXO in
question will be penalized proportionately to the wasted ops when
included in another transaction later. That would probably kill that
DoS attack as the attacker would effectively lose bitcoin every time,
even if it was postponed until they spent the UTXO. The only thing
clients would need to do is to add a fee rate penalty ivar and a
mapping of outpoint to penalty value, probably stored as a separate
.dat file. I think.
Ethereum does something quite like this; it's a very bad idea for a few
reasons:

1) If you bailed out of verifying a script due to wasted ops, how did you know the
transaction trying to spend that txout did in fact come from the owner of it?

2) How do you verify that transactions were penalized correctly without *all*
nodes re-running the DoS script?

3) If the DoS is significant enough to matter on a per-node level, you're going
to have serious problems anyway, quite possibly so serious that the attacker
manages to cause consensus to fail. They can then spend the txouts in a block
that does *not* penalize their outputs, negating the deterrent.
--
https://petertodd.org 'peter'[:-1]@petertodd.org
Adán Sánchez de Pedro Crespo via bitcoin-dev
2017-09-11 20:37:55 UTC
Coincidentally, the kind of Merkle tree that Mark describes in his
proposal is exactly the one that we use at Stampery.

The Stampery BTA whitepaper[1] includes pseudocode for many of the
algorithms outlined by this proposal, including fast-SHA256, the tree
building process and the inclusion proving routine.

The wording is slightly different but the logic is just the same, so I
hope it helps future implementations in case of eventual adoption.


[1]
https://s3.amazonaws.com/stampery-cdn/docs/Stampery-BTA-v6-whitepaper.pdf


Best,
--
Adán Sánchez de Pedro Crespo
CTO, Stampery Inc.
San Francisco - Madrid
T: +34 663 163 375
Mark Friedenbach via bitcoin-dev
2017-09-19 00:46:30 UTC
As some of you may know, the MAST proposal I sent to the mailing list
on September 6th was discussed at the in-person CoreDev meetup in
San Francisco. In this email I hope to summarize the outcome of that
discussion. As the Chatham House Rule was in effect, I will refrain
from attributing names to this summary.

* An introductory overview of the BIPs was presented, for the purpose
of familiarizing the audience with what they are attempting to
accomplish and how they do so.

* There was a discussion of a single vs multi-element MBV opcode. It
was put forward that there should perhaps be different opcodes for
the sake of script analysis, since a multi-element MBV will
necessarily consume a variable number of inputs. However it was
countered that if the script encodes the number of elements as an
integer push to the top of the stack immediately before the opcode,
then static analyzability is maintained in such instances. I took
the action item to investigate what an ideal serialization format
would be for a multi-element proof, which is the only thing holding
back a multi-element MBV proposal.

* It was pointed out that the non-clean-stack tail-call semantics is
not compatible with segwit's consensus-enforcement of the clean
stack rule. Some alternatives were suggested, such as changing
deployment mechanisms. After the main discussion session it was
observed that tail-call semantics could still be maintained if the
alt stack is used for transferring arguments to the policy script. I
will be updating the BIP and example implementation accordingly.

* The observation was made that single-layer tail-call semantics can
be thought of as really being P2SH with user-specified hashing. If
the P2SH script template had been constructed slightly differently
such as to not consume the script, it would even have been fully
compatible with tail-call semantics.

* It was mentioned that using script versioning to deploy a MAST
template allows for saving 32 bytes of witness per input, as the
root hash is contained directly in the output being spent. The
downside however is losing the permissionless innovation that comes
with a programmable hashing mechanism.

* The discussion generally drifted into a wider discussion about
script version upgrades and related issues, such as whether script
versions should exist in the witness as well, and the difference in
meaning between the two. This is an important subject, but only of
relevance insofar as using a script version upgrade to deploy MAST
would add significant delay from having to sort through these issues
first.

This feedback led to some minor tweaks to the proposal, which I will
be making, as well as the major feature request of a multi-element
MERKLE-BRANCH-VERIFY opcode, which requires a little bit more effort to
accomplish. I will report back to this list again when that work is
done.

Sincerely,
Mark Friedenbach
Luke Dashjr via bitcoin-dev
2017-09-19 03:09:08 UTC
On Tuesday 19 September 2017 12:46:30 AM, Mark Friedenbach via bitcoin-dev wrote:
After the main discussion session it was observed that tail-call semantics
could still be maintained if the alt stack is used for transferring
arguments to the policy script.
Isn't this a bug in the cleanstack rule?

(Unrelated...)

Another thing that came up during the discussion was the idea of replacing all
the NOPs and otherwise-unallocated opcodes with a new OP_RETURNTRUE
implementation, in future versions of Script. This would immediately exit the
program (perhaps performing some semantic checks on the remainder of the
Script) with a successful outcome.

This is similar to CVE-2010-5141 in a sense, but since signatures are no
longer Scripts themselves, it shouldn't be exploitable.

The benefit of this is that it allows softforking in ANY new opcode, not only
the -VERIFY opcode variants we've been doing. That is, instead of merely
terminating the Script with a failure, the new opcode can also remove or push
stack items. This is because old nodes, upon encountering the undefined
opcode, will always succeed immediately, allowing the new opcode to do
literally anything from that point onward.
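A toy interpreter can illustrate why this is a soft fork. The opcode name and the upgraded behaviour below are invented for illustration; the point is only the success-on-sight rule for old nodes:

```python
def eval_script(script, upgraded):
    """Evaluate a list of ops; return True if the script succeeds."""
    stack = []
    for op in script:
        if op == 'OP_1':
            stack.append(1)
        elif op == 'OP_ADD':
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == 'OP_RETURNTRUE_X':   # an as-yet-undefined opcode
            if not upgraded:
                return True             # old nodes: succeed immediately
            stack.append(2)             # new nodes: some new behaviour
        else:
            return False
    return bool(stack) and bool(stack[-1])

script = ['OP_1', 'OP_RETURNTRUE_X', 'OP_ADD']
# Old nodes accept unconditionally on reaching the opcode; upgraded
# nodes enforce the new semantics. Since upgraded nodes only ever
# reject a subset of what old nodes accept, this is a soft fork.
assert eval_script(script, upgraded=False) is True
assert eval_script(script, upgraded=True) is True
```

Because the old-node behaviour is "succeed immediately", the redefined opcode is free to pop, push, or transform stack items rather than being restricted to the -VERIFY pattern.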

Luke
Mark Friedenbach via bitcoin-dev
2017-09-19 07:33:54 UTC
Post by Luke Dashjr via bitcoin-dev
On Tuesday 19 September 2017 12:46:30 AM Mark Friedenbach via bitcoin-dev
After the main discussion session it was observed that tail-call semantics
could still be maintained if the alt stack is used for transferring
arguments to the policy script.
Isn't this a bug in the cleanstack rule?
Well in the sense that "cleanstack" doesn't do what it says, sure.

However cleanstack was introduced as a consensus rule to prevent a
possible denial of service vulnerability where a third party could
intercept any* transaction broadcast and arbitrarily add data to the
witness stack, since witness data is not covered by a checksig.

Cleanstack as-is accomplishes this because any extra items on the
stack would pass through all realistic scripts, remaining on the stack
and thereby violating the rule. There is no reason to prohibit extra
items on the altstack as those items can only arrive there
purposefully as an action of the script itself, not a third party
malleation of witness data. You could of course use DEPTH to write a
script that takes a variable number of parameters and sends them to
the altstack. Such a script would be malleable if those extra
parameters are not used. But that is predicated on the script being
specifically written in such a way as to be vulnerable; why protect
against that?

There are other solutions to this problem that could have been taken
instead, such as committing to the number of items or maximum size of
the stack as part of the sighash data, but cleanstack was the approach
taken. Arguably for a future script version upgrade one of these other
approaches should be taken to allow for shorter tail-call scripts.

Mark

* Well, almost any. You could end the script with DEPTH EQUAL and that
is a compact way of ensuring the stack is clean (assuming the script
finished with just "true" on the stack). Nobody does this however
and burning two witness bytes of every redeem script going forward
as a protective measure seems like an unnecessary ask.
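For concreteness, a toy model of that DEPTH EQUAL guard, assuming the script's final success value is exactly 1 (lists model the stack, top at the end):

```python
def depth_equal_guard(stack):
    """Model a script ending in `DEPTH EQUAL`.

    A run that finishes with the single item 1 passes; any extra
    third-party-injected items change DEPTH so the comparison fails.
    """
    stack = list(stack)
    stack.append(len(stack))         # DEPTH pushes the current depth
    b, a = stack.pop(), stack.pop()  # EQUAL pops two items...
    stack.append(1 if a == b else 0) # ...and pushes the comparison
    return bool(stack[-1])

assert depth_equal_guard([1]) is True           # clean: 1 == depth 1
assert depth_equal_guard(['junk', 1]) is False  # injected extra item
```

This is the two-byte cost the footnote refers to: appending these two opcodes to every redeem script buys the same malleation protection as the consensus cleanstack rule.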
Sergio Demian Lerner via bitcoin-dev
2017-09-22 20:32:56 UTC
Post by Mark Friedenbach via bitcoin-dev
There are other solutions to this problem that could have been taken
instead, such as committing to the number of items or maximum size of
the stack as part of the sighash data, but cleanstack was the approach
taken.
The lack of a signed maximum segwit stack size was one of the objections to
segwit I presented last year, together with the unlimited segwit stack
size.

However, committing to the maximum stack size (in bytes) for an input is
tricky. The only place where this could be packed is in sequence_no, with a
soft-fork, e.g. only when the transaction version is 2 and lock_time is
zero.

For transactions with locktime >0, we could soft-fork so transactions add a
last zero-satoshi output whose scriptPubKey contains OP_RETURN followed by
N VarInts, containing the maximum stack size of each input.
Normally, for a 400 byte, 2-input transaction, this will add 11 bytes, or a
2.5% overhead.
Post by Mark Friedenbach via bitcoin-dev
Arguably for a future script version upgrade one of these other
approaches should be taken to allow for shorter tail-call scripts.
Mark
* Well, almost any. You could end the script with DEPTH EQUAL and that
is a compact way of ensuring the stack is clean (assuming the script
finished with just "true" on the stack). Nobody does this however
and burning two witness bytes of every redeem script going forward
as a protective measure seems like an unnecessary ask.
_______________________________________________
bitcoin-dev mailing list
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
Mark Friedenbach via bitcoin-dev
2017-09-22 21:11:03 UTC
Post by Mark Friedenbach via bitcoin-dev
There are other solutions to this problem that could have been taken
instead, such as committing to the number of items or maximum size of
the stack as part of the sighash data, but cleanstack was the approach
taken.
The lack of a signed maximum segwit stack size was one of the objections to segwit I presented last year, together with the unlimited segwit stack size.
However, committing to the maximum stack size (in bytes) for an input is tricky. The only place where this could be packed is in sequence_no, with a soft-fork, e.g. only when the transaction version is 2 and lock_time is zero.
For transactions with locktime >0, we could soft-fork so transactions add a last zero-satoshi output whose scriptPubKey contains OP_RETURN followed by N VarInts, containing the maximum stack size of each input.
Normally, for a 400 byte, 2-input transaction, this will add 11 bytes, or a 2.5% overhead.
There’s no need to put it in the transaction itself. You put it in the witness and it is either committed to as part of the witness (in which case it has to hold for all possible spend paths), or at spend time by including it in the data signed by CHECKSIG.
Sergio Demian Lerner via bitcoin-dev
2017-09-22 21:32:00 UTC
But generally before one signs a transaction one does not know the
signature size (which may be variable). One can only estimate the maximum
size.
On Sep 22, 2017, at 1:32 PM, Sergio Demian Lerner <
Post by Mark Friedenbach via bitcoin-dev
There are other solutions to this problem that could have been taken
instead, such as committing to the number of items or maximum size of
the stack as part of the sighash data, but cleanstack was the approach
taken.
The lack of signed maximum segwit stack size was one of the objections to
segwit I presented last year. This together with the unlimited segwit stack
size.
However, committing to the maximum stack size (in bytes) for an input is
tricky. The only place where this could be packed is in sequence_no, with a
soft-fork. E.g. when transaction version is 2 and and only when lock_time
is zero.
For transactions with locktime >0, we could soft-fork so transactions add
a last zero-satoshi output whose scriptPub contains OP_RETURN and followed
by N VarInts, containing the maximum stack size of each input.
Normally, for a 400 byte, 2-input transaction, this will add 11 bytes, or a 2.5% overhead.
There’s no need to put it in the transaction itself. You put it in the
witness and it is either committed to as part of the witness (in which case
it has to hold for all possible spend paths), or at spend time by including
it in the data signed by CHECKSIG.
Mark Friedenbach via bitcoin-dev
2017-09-22 21:39:45 UTC
You generally know the witness size to within a few bytes right before signing. Why would you not? You know the size of ECDSA signatures. You can be told the size of a hash preimage by the other party. It takes some contriving to come up with a scheme where one party has variable-length signatures of their choosing.
But generally before one signs a transaction one does not know the signature size (which may be variable). One can only estimate the maximum size.
Sergio Demian Lerner via bitcoin-dev
2017-09-22 21:54:39 UTC
If the variable size increase is only a few bytes, then three possibilities
arise:

- one should allow signatures to be zero padded (to reach the maximum size)
and abandon strict DER encoding

- one should allow spare witness stack elements (to pad the size to match
the maximum size) and remove the cleanstack rule. But this is tricky
because empty stack elements must be counted as 1 byte.

- signers must loop the generation of signatures until the signature
generated is of its maximum size.
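For context on why signature size varies at all, here is a minimal sketch of DER integer encoding (illustrative Python, not libsecp256k1): a value whose leading byte has its high bit set needs an extra 0x00 pad byte, so the encodings of r and s, and hence the whole signature, vary in length. That variability is what motivates the loop-until-maximum-size option above.

```python
# Sketch of DER encoding for an ECDSA (r, s) pair, to show where the
# length variability comes from. Not a validation-grade encoder.
def der_int(n: int) -> bytes:
    b = n.to_bytes((n.bit_length() + 7) // 8 or 1, "big")
    if b[0] & 0x80:            # high bit set: DER requires a 0x00 pad byte
        b = b"\x00" + b
    return bytes([0x02, len(b)]) + b   # INTEGER tag, length, value

def der_sig(r: int, s: int) -> bytes:
    body = der_int(r) + der_int(s)
    return bytes([0x30, len(body)]) + body   # SEQUENCE tag, length, body

# A 256-bit value with its top bit clear encodes in 32 bytes...
short = der_sig((1 << 255) - 1, (1 << 255) - 1)
# ...but with the top bit set it needs a pad byte, and a small r shrinks it.
long = der_sig((1 << 256) - 1, (1 << 255) - 1)
tiny = der_sig(1, (1 << 255) - 1)
print(len(short), len(long), len(tiny))   # prints: 70 71 39
```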
Post by Mark Friedenbach via bitcoin-dev
You generally know the witness size to within a few bytes right before
signing. Why would you not? You know the size of ECDSA signatures. You can
be told the size of a hash preimage by the other party. It takes some
contriving to come up with a scheme where one party has variable-length
signatures of their chosing
On Sep 22, 2017, at 2:32 PM, Sergio Demian Lerner <
But generally before one signs a transaction one does not know the
signature size (which may be variable). One can only estimate the maximum
size.
Mark Friedenbach via bitcoin-dev
2017-09-22 22:07:33 UTC
There is no harm in the value being a maximum that is off by a few bytes.
- one should allow signatures to be zero padded (to reach the maximum size) and abandon strict DER encoding
- one should allow spare witness stack elements (to pad the size to match the maximum size) and remove the cleanstack rule. But this is tricky because empty stack elements must be counted as 1 byte.
- signers must loop the generation of signatures until the signature generated is of its maximum size.
Pieter Wuille via bitcoin-dev
2017-09-22 22:09:07 UTC
On Fri, Sep 22, 2017 at 2:54 PM, Sergio Demian Lerner via bitcoin-dev <
Post by Sergio Demian Lerner via bitcoin-dev
If the variable size increase is only a few bytes, then three possibilities arise:
- one should allow signatures to be zero padded (to reach the maximum
size) and abandon strict DER encoding
- one should allow spare witness stack elements (to pad the size to match
the maximum size) and remove the cleanstack rule. But this is tricky
because empty stack elements must be counted as 1 byte.
- signers must loop the generation of signatures until the signature
generated is of its maximum size.
Or (my preference):

- Get rid of DER encoding altogether and switch to fixed-size signatures.

Cheers,
--
Pieter
Johnson Lau via bitcoin-dev
2017-09-20 05:13:04 UTC
Post by Luke Dashjr via bitcoin-dev
On Tuesday 19 September 2017 12:46:30 AM Mark Friedenbach via bitcoin-dev
After the main discussion session it was observed that tail-call semantics
could still be maintained if the alt stack is used for transferring
arguments to the policy script.
Isn't this a bug in the cleanstack rule?
(Unrelated...)
Another thing that came up during the discussion was the idea of replacing all
the NOPs and otherwise-unallocated opcodes with a new OP_RETURNTRUE
implementation, in future versions of Script. This would immediately exit the
program (perhaps performing some semantic checks on the remainder of the
Script) with a successful outcome.
This is similar to CVE-2010-5141 in a sense, but since signatures are no
longer Scripts themselves, it shouldn't be exploitable.
The benefit of this is that it allows softforking in ANY new opcode, not only
the -VERIFY opcode variants we've been doing. That is, instead of merely
terminating the Script with a failure, the new opcode can also remove or push
stack items. This is because old nodes, upon encountering the undefined
opcode, will always succeed immediately, allowing the new opcode to do
literally anything from that point onward.
Luke
I have implemented OP_RETURNTRUE in an earlier version of MAST (BIP114) but have given up on the idea, for 2 reasons:

1. I’ve updated BIP114 to allow inclusion of scripts in witness, and require them to be signed. In this way users could add additional conditions for the validity of a signature. For example, with OP_CHECKBLOCKHASH, it is possible to make the transaction valid only in the specified chain. (More discussion in https://github.com/jl2012/bips/blob/vault/bip-0114.mediawiki#Additional_scripts_in_witness )

2. OP_RETURNTRUE does not work well with signature aggregation. Signature aggregation will collect (pubkey, message) pairs in a tx, combine them, and verify with one signature. However, consider the following case:

OP_RETURNTRUE OP_IF <pubkey> OP_CHECKSIGVERIFY OP_ENDIF OP_TRUE

For old nodes, the script terminates at OP_RETURNTRUE, and it will not collect the (pubkey, message) pair.

If we use a softfork to transform OP_RETURNTRUE into OP_17 (pushing the number 17 to the stack), new nodes will collect the (pubkey, message) pair and try to aggregate with other pairs. This becomes a hardfork.

--------
Technically, we could create ANY opcode with an OP_NOP. For example, if we want OP_MUL, we could have OP_MULVERIFY, which verifies that the 3rd stack item is the product of the top 2 stack items. Therefore, OP_MULVERIFY OP_2DROP is functionally the same as OP_MUL, which removes the top 2 items and returns the product. The problem is it takes more witness space.
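The MULVERIFY construction can be sketched with a toy interpreter (hypothetical opcodes, with a Python list standing in for the script stack):

```python
# Toy illustration (hypothetical opcodes, not a real interpreter) of how
# OP_MULVERIFY OP_2DROP reproduces OP_MUL's effect at a higher witness
# cost: the product must be supplied as a third stack item, not computed.
def op_mulverify(stack):
    a, b, claimed = stack[-1], stack[-2], stack[-3]
    if claimed != a * b:
        raise ValueError("MULVERIFY failed")   # script aborts

def op_2drop(stack):
    stack.pop(); stack.pop()

def op_mul(stack):
    b, a = stack.pop(), stack.pop()
    stack.append(a * b)

# OP_MUL computes 6*7 directly from two witness items...
s1 = [6, 7]
op_mul(s1)
# ...while the soft-fork-safe emulation needs the product 42 in the witness.
s2 = [42, 6, 7]
op_mulverify(s2)
op_2drop(s2)
assert s1 == s2 == [42]
```

The extra witness item (the claimed product) is exactly the "more witness space" cost named above.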

If we don’t want this ugliness, we could use a new script version for every new opcode we add. In the new BIP114 (see link above), I suggest moving the script version to the witness, which is cheaper.
Mark Friedenbach via bitcoin-dev
2017-09-20 19:29:17 UTC
If we don’t want this ugliness, we could use a new script version for every new op code we add. In the new BIP114 (see link above), I suggest to move the script version to the witness, which is cheaper.
To be clear, I don’t think it is so much that the version should be moved to the witness, but rather that there are two separate version values here — one in the scriptPubKey which specifies the format and structure of the segwit commitment itself, and another in the witness which gates functionality in script or whatever else is used by that witness type. Segwit just unfortunately didn’t include the latter, an oversight that should be corrected at the next upgrade opportunity.

The address-visible “script version” field should probably be renamed “witness type” as it will only be used in the future to encode how to check the witness commitment in the scriptPubKey against the data provided in the witness. Upgrades and improvements to the features supported by those witness types won’t require new top-level witness types to be defined. Defining a new opcode, even one which modifies the stack, doesn’t change the hashing scheme used by the witness type.

v0,32-bytes is presently defined to calculate the double-SHA256 hash of the top-most serialized item on the stack, and compare that against the 32-byte commitment value. Arguably it probably should have hashed the top two values, one of which would have been the real script version. This could be fixed however, even without introducing a new witness type. Do a soft-fork upgrade that checks if the witness redeem script is push-only, and if so then pop the last push off as the script version (>= 1), and concatenate the rest to form the actual redeem script. We inherit a little technical debt from having to deal with push limits, but we avoid burning v0 in an upgrade to v1 that does little more than add a script version.
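The proposed upgrade path might look like the following sketch (an assumption about one possible implementation, not deployed consensus; only direct pushes are handled, not OP_PUSHDATA1/2/4):

```python
# Sketch of the proposed v0 upgrade path (an assumption, not deployed
# consensus): if the witness redeem script is push-only, pop its final
# push off as a script version >= 1 and concatenate the remaining pushes
# into the actual redeem script. Only direct pushes (opcodes 0x01..0x4b)
# are handled; real code would also need OP_PUSHDATA1/2/4 and the push
# limits mentioned above.
def split_versioned_script(script: bytes):
    pushes, i = [], 0
    while i < len(script):
        op = script[i]
        if not (0x01 <= op <= 0x4b) or i + 1 + op > len(script):
            return None, script      # not push-only: keep legacy v0 rules
        pushes.append(script[i + 1:i + 1 + op])
        i += 1 + op
    if len(pushes) < 2:
        return None, script
    version = int.from_bytes(pushes[-1], "little")
    if version < 1:
        return None, script          # version 0 keeps the old semantics
    return version, b"".join(pushes[:-1])

# Two pushed script fragments, then a pushed version byte of 0x01.
ver, redeem = split_versioned_script(
    bytes([2, 0xAA, 0xBB, 3, 0xCC, 0xDD, 0xEE, 1, 0x01]))
assert ver == 1 and redeem == bytes([0xAA, 0xBB, 0xCC, 0xDD, 0xEE])
```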

v1,32-bytes would then be used for a template version of MAST, or whatever other idea comes along that fundamentally changes the way the witness commitment is calculated.

Mark
Johnson Lau via bitcoin-dev
2017-09-21 03:58:05 UTC
Post by Mark Friedenbach via bitcoin-dev
If we don’t want this ugliness, we could use a new script version for every new op code we add. In the new BIP114 (see link above), I suggest to move the script version to the witness, which is cheaper.
To be clear, I don’t think it is so much that the version should be moved to the witness, but rather that there are two separate version values here — one in the scriptPubKey which specifies the format and structure of the segwit commitment itself, and another in the witness which gates functionality in script or whatever else is used by that witness type. Segwit just unfortunately didn’t include the latter, an oversight that should be corrected at the next upgrade opportunity.
The address-visible “script version” field should probably be renamed “witness type” as it will only be used in the future to encode how to check the witness commitment in the scriptPubKey against the data provided in the witness. Upgrades and improvements to the features supported by those witness types won’t require new top-level witness types to be defined. Defining a new opcode, even one which modifies the stack, doesn’t change the hashing scheme used by the witness type.
v0,32-bytes is presently defined to calculate the double-SHA256 hash of the top-most serialized item on the stack, and compare that against the 32-byte commitment value. Arguably it probably should have hashed the top two values, one of which would have been the real script version. This could be fixed however, even without introducing a new witness type. Do a soft-fork upgrade that checks if the witness redeem script is push-only, and if so then pop the last push off as the script version (>= 1), and concatenate the rest to form the actual redeem script. We inherit a little technical debt from having to deal with push limits, but we avoid burning v0 in an upgrade to v1 that does little more than add a script version.
v1,32-bytes would then be used for a template version of MAST, or whatever other idea comes along that fundamentally changes the way the witness commitment is calculated.
Mark
This is exactly what I suggest in BIP114: using v1,32-bytes to define the basic structure of Merklized Script, and defining the script version inside the witness.

Johnson
Luke Dashjr via bitcoin-dev
2017-09-21 04:11:49 UTC
Post by Johnson Lau via bitcoin-dev
2. OP_RETURNTRUE does not work well with signature aggregation. Signature
aggregation will collect (pubkey, message) pairs in a tx, combine them,
OP_RETURNTRUE OP_IF <pubkey> OP_CHECKSIGVERIFY OP_ENDIF OP_TRUE
For old nodes, the script terminates at OP_RETURNTRUE, and it will not
collect the (pubkey, message) pair.
If we use a softfork to transform OP_RETURNTRUE into OP_17 (pushing the
number 17 to the stack), new nodes will collect the (pubkey, message) pair
and try to aggregate with other pairs. This becomes a hardfork.
This seems like a problem for signature aggregation to address, not a problem
for OP_RETURNTRUE... In any case, I don't think it's insurmountable. Signature
aggregation can simply be set up upfront, and have the Script verify inclusion
of keys in the aggregation?
Post by Johnson Lau via bitcoin-dev
Technically, we could create ANY op code with an OP_NOP. For example, if we
want OP_MUL, we could have OP_MULVERIFY, which verifies if the 3rd stack
item is the product of the top 2 stack items. Therefore, OP_MULVERIFY
OP_2DROP is functionally same as OP_MUL, which removes the top 2 items and
returns the product. The problem is it takes more witness space.
This is another approach, and one that seems like a good idea in general. I'm
not sure it actually needs to take more witness space - in theory, such stack
items could be implied if the Script engine is designed for it upfront. Then
it would behave as if it were non-verify, while retaining backward
compatibility.

Luke
Johnson Lau via bitcoin-dev
2017-09-21 08:02:42 UTC
Post by Luke Dashjr via bitcoin-dev
Post by Johnson Lau via bitcoin-dev
2. OP_RETURNTRUE does not work well with signature aggregation. Signature
aggregation will collect (pubkey, message) pairs in a tx, combine them,
OP_RETURNTRUE OP_IF <pubkey> OP_CHECKSIGVERIFY OP_ENDIF OP_TRUE
For old nodes, the script terminates at OP_RETURNTRUE, and it will not
collect the (pubkey, message) pair.
If we use a softfork to transform OP_RETURNTRUE into OP_17 (pushing the
number 17 to the stack), new nodes will collect the (pubkey, message) pair
and try to aggregate with other pairs. This becomes a hardfork.
This seems like a problem for signature aggregation to address, not a problem
for OP_RETURNTRUE... In any case, I don't think it's insurmountable. Signature
aggregation can simply be setup upfront, and have the Script verify inclusion
of keys in the aggregation?
I think it’s possible only if you spend more witness space to store the (pubkey, message) pairs, so that old clients could understand the aggregation produced by new clients. But this completely defeats the purpose of doing aggregation.

We use different tricks to save space. For example, we use a 1-byte SIGHASH flag to imply the 32-byte message. For maximal space saving, sig aggregation will also rely on such tricks. However, the assumption is that all signatures aggregated must follow exactly the same set of rules.
Post by Luke Dashjr via bitcoin-dev
Post by Johnson Lau via bitcoin-dev
Technically, we could create ANY op code with an OP_NOP. For example, if we
want OP_MUL, we could have OP_MULVERIFY, which verifies if the 3rd stack
item is the product of the top 2 stack items. Therefore, OP_MULVERIFY
OP_2DROP is functionally same as OP_MUL, which removes the top 2 items and
returns the product. The problem is it takes more witness space.
This is another approach, and one that seems like a good idea in general. I'm
not sure it actually needs to take more witness space - in theory, such stack
items could be implied if the Script engine is designed for it upfront. Then
it would behave as if it were non-verify, while retaining backward
compatibility.
Sounds interesting but I don’t get it. For example, how could you make an OP_MUL out of an OP_NOP?
Post by Luke Dashjr via bitcoin-dev
Luke
Luke Dashjr via bitcoin-dev
2017-09-21 16:33:16 UTC
Post by Johnson Lau via bitcoin-dev
I think it’s possible only if you spend more witness space to store the
(pubkey, message) pairs, so that old clients could understand the
aggregation produced by new clients. But this completely defeats the
purpose of doing aggregation.
SigAgg is a softfork, so old clients *won't* understand it... am I missing
something?

For example, perhaps the lookup opcode could have a data payload itself (eg,
like pushdata opcodes do), and the script can be parsed independently from
execution to collect the applicable ones.
Post by Johnson Lau via bitcoin-dev
Post by Luke Dashjr via bitcoin-dev
This is another approach, and one that seems like a good idea in general.
I'm not sure it actually needs to take more witness space - in theory,
such stack items could be implied if the Script engine is designed for
it upfront. Then it would behave as if it were non-verify, while
retaining backward compatibility.
Sounds interesting but I don’t get it. For example, how could you make a
OP_MUL out of OP_NOP?
The same as your OP_MULVERIFY at the consensus level, except new clients would
execute it as an OP_MUL, and inject pops/pushes when sending such a
transaction to older clients. The hash committed to for the script would
include the inferred values, but not the actual on-chain data. This would
probably need to be part of some kind of MAST-like softfork to be viable, and
maybe not even then.

Luke
Johnson Lau via bitcoin-dev
2017-09-21 17:38:01 UTC
Post by Luke Dashjr via bitcoin-dev
Post by Johnson Lau via bitcoin-dev
I think it’s possible only if you spend more witness space to store the
(pubkey, message) pairs, so that old clients could understand the
aggregation produced by new clients. But this completely defeats the
purpose of doing aggregation.
SigAgg is a softfork, so old clients *won't* understand it... am I missing
something?
For example, perhaps the lookup opcode could have a data payload itself (eg,
like pushdata opcodes do), and the script can be parsed independently from
execution to collect the applicable ones.
I think the current idea of sigagg is something like this: the new OP_CHECKSIG still has 2 arguments: the top stack item must be a 33-byte public key, and the 2nd top stack item is the signature. Depending on the sig size, it returns a different value:

If the sig size is 0, it pushes a 0 onto the stack
If the sig size is 1, it is treated as a SIGHASH flag, and the SignatureHash() “message” is calculated. It sends the (pubkey, message) pair to the aggregator, and always pushes a 1 onto the stack
If the sig size is >1, it is treated as the aggregated signature. The last byte is the SIGHASH flag. It sends the (pubkey, message) pair and the aggregated signature to the aggregator, and always pushes a 1 onto the stack.

If all scripts pass, the aggregator will combine all pairs to obtain the aggkey and aggmsg, and verify against aggsig. A tx may have at most 1 aggsig.

(The version I presented above is somewhat simplified but should be enough to illustrate my point)
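That dispatch can be sketched as follows (illustrative Python; the Aggregator container and function names are hypothetical, and the real proposal's sighash details are omitted):

```python
# Sketch of the size-based OP_CHECKSIG dispatch described above
# (hypothetical semantics, simplified as noted in the text).
class Aggregator:
    def __init__(self):
        self.pairs = []     # collected (pubkey, message) pairs
        self.aggsig = None  # the one aggregated signature per tx

def agg_checksig(stack, agg, message):
    pubkey = stack.pop()    # top stack item: 33-byte public key
    sig = stack.pop()       # 2nd item: 0, 1, or >1 bytes
    if len(sig) == 0:
        stack.append(0)     # empty signature: push 0, collect nothing
    elif len(sig) == 1:
        # a lone SIGHASH flag: collect the pair for aggregation, push 1
        agg.pairs.append((pubkey, message))
        stack.append(1)
    else:
        # the aggregated signature itself; last byte is the SIGHASH flag
        agg.pairs.append((pubkey, message))
        agg.aggsig = sig[:-1]
        stack.append(1)

agg = Aggregator()
stack = [b"\x01", b"\x02" * 33]   # sig = SIGHASH flag only, then pubkey
agg_checksig(stack, agg, b"msg")
assert stack == [1] and agg.pairs == [(b"\x02" * 33, b"msg")]
```

An old node that terminates at OP_RETURNTRUE never reaches this dispatch, so its aggregator collects a different pair set than a new node's, which is the divergence described below.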

So if we have this script:

OP_1 OP_RETURNTRUE <pubkey> OP_CHECKSIG

Old clients would stop at the OP_RETURNTRUE, and will not send the pubkey to the aggregator

If we softfork OP_RETURNTRUE to something else, even as OP_NOP11, new clients will send the (key, msg) pair to the aggregator. Therefore, the aggregator of old and new clients will see different data, leading to a hardfork.

OTOH, an OP_NOP-based softfork would not have this problem because it won’t terminate the script and return true.
Post by Luke Dashjr via bitcoin-dev
Post by Johnson Lau via bitcoin-dev
Post by Luke Dashjr via bitcoin-dev
This is another approach, and one that seems like a good idea in general.
I'm not sure it actually needs to take more witness space - in theory,
such stack items could be implied if the Script engine is designed for
it upfront. Then it would behave as if it were non-verify, while
retaining backward compatibility.
Sounds interesting but I don’t get it. For example, how could you make a
OP_MUL out of OP_NOP?
The same as your OP_MULVERIFY at the consensus level, except new clients would
execute it as an OP_MUL, and inject pops/pushes when sending such a
transaction to older clients. The hash committed to for the script would
include the inferred values, but not the actual on-chain data. This would
probably need to be part of some kind of MAST-like softfork to be viable, and
maybe not even then.
Luke
I don’t think it’s worth the code complexity just to save a few bytes of data sent over the wire; and to remain a soft fork, it still takes up the block space.

Maybe we could create many OP_DROPs and OP_2DROPs, so new VERIFY operations could pop the stack. This saves 1 byte and also looks cleaner.

Another approach is to use a new script version for every new non-verify type operation. Problem is we will end up with many versions. Also, signatures from different versions can’t be aggregated. (We may have multiple aggregators in a transaction)