Discussion:
Soft-forks and schnorr signature aggregation
Add Reply
Anthony Towns via bitcoin-dev
2018-03-21 04:06:18 UTC
Reply
Permalink
Raw Message
Hello world,

There was a lot of discussion on Schnorr sigs and key and signature
aggregation at the recent core-dev-tech meeting (one relevant conversation
is transcribed at [0]).

Quick summary, with more background detail in the corresponding footnotes:
signature aggregation is awesome [1], and the possibility of soft-forking
in new opcodes via OP_RETURN_VALID opcodes (instead of OP_NOP) is also
awesome [2].

Unfortunately doing both of these together may turn out to be awful.

RETURN_VALID and Signature Aggregation
--------------------------------------

Bumping segwit script versions and redefining OP_NOP opcodes are
fairly straightforward to deal with even with signature aggregation,
the straightforward implementation of both combined is still a soft-fork.

RETURN_VALID, unfortunately, has a serious potential pitfall: any
aggregatable signature operations that occur after it have to go into
separate buckets.

As an example of why this is the case, imagine introducing a covenant
opcode that pulls a potentially complicated condition from the stack
(perhaps, "an output pays at least 50000 satoshi to address xyzzy"),
checks the condition against the transaction, and then pushes 1 (or 0)
back onto the stack indicating compliance with the covenant (or not).
You might then write a script allowing a single person to spend the coins
if they comply with the covenant, and allow breaking the covenant with
someone else's sign-off in addition. You could write this as:

pubkey1 CHECKSIGVERIFY
cond CHECKCOVENANT IFDUP NOTIF pubkey2 CHECKSIG ENDIF

If you pass the covenant, you supply "SIGHASHALL|BUCKET_1" and aggregate
the signature for pubkey1 into bucket1 and you're set; otherwise you supply
"SIGHASHALL|BUCKET_1 SIGHASHALL|BUCKET_1" and aggregate signatures for both
pubkey1 and pubkey2 into bucket1 and you're set. Great!

But this isn't a soft-fork: old nodes would see this script as:

pubkey1 CHECKSIGVERIFY
cond RETURN_VALID IFDUP NOTIF pubkey2 CHECKSIG ENDIF

which it would just interpret as:

pubkey1 CHECKSIGVERIFY cond RETURN_VALID

which is fine if the covenant was passing; but no good if the covenant
didn't pass -- they'd be expecting the aggregted sig to just be for
pubkey1 when it's actually pubkey1+pubkey2, so old nodes would fail the
tx and new nodes would accept it, making it a hard fork.

Solution 0a / 0b
----------------

There are two obvious solutions here:

0a) Just be very careful to ensure any aggregated signatures that
are conditional on an redefined RETURN_VALID opcode go into later
buckets, but be careful about having separate sets of buckets every
time a soft-fork introduces a new redefined opcode. Probably very
complicated to implement correctly, and essentially doubles the
number of buckets you have to potentially deal with every time you
soft fork in a new opcode.

0b) Alternatively, forget about the hope that RETURN_VALID
opcodes could be converted to anything, and just reserve OP_NOP
opcodes and convert them to CHECK_foo_VERIFY opcodes just as we
have been doing, and when we can't do that bump the segwit witness
version for a whole new version of script. Or in twitter speak:
"non-verify upgrades should be done with new script versions" [3]

I think with a little care we can actually salvage RETURN_VALID though!

Solution 1
----------

You don't actually have to write your scripts in ways that can cause
this problem, as long as you're careful. In particular, the problem only
occurs if you do aggregatable CHECKSIG operations after "RETURN_VALID"
-- if you do all the CHECKSIGs first, then all nodes will be checking
for the same signatures, and there's no problem. So you could rewrite
the script above as:

pubkey1 CHECKSIGVERIFY
IF pubkey2 CHECKSIG ENDIF
cond CHECKCOVENANT OR

which is redeemable either by:

sig1 0 [and covenant is met]
sig1 1 sig2 [covenant is not checked]

The witness in this case is essentially committing to the execution path
that would have been taken in the first script by a fully validating node,
then the new script checks all the signatures, and then validates that the
committed execution path was in fact the one that was meant to be taken.

If people are clever enough to write scripts this way, I believe you
can make RETURN_VALID soft-fork safe simply by having every soft-forked
RETURN_VALID operation set a state flag that makes every subsequent
CHECKSIG operation require a non-aggregated sig.

The drawback of this approach is that if the script is complicated
(eg it has multiple IF conditions, some of which are nested), it may be
difficult to write the script to ensure the signatures are checked in the
same combination as the later logic actually requires -- you might have
to store the flag indicating whether you checked particular signatures
on the altstack, or use DUP and PICK/ROLL to organise it on the stack.

Solution 2
----------

We could make that simpler for script authors by making dedicated opcodes
to help with "do all the signatures first" and "check the committed
execution path against reality" steps. I think a reasonable approach
would be something like:

0b01 pubkey2 pubkey1 2 CHECK_AGGSIG_VERIFY
cond CHECKCOVENANT 0b10 CHECK_AGG_SIGNERS OR

which is redeemed either by:

sighash1 0 [and passing the covenant cond]
sighash2 sighash1 0b10

(I'm using the notation 0b10110 to express numbers as binary bitfields;
0b10110 = 22 eg)

That is, two new opcodes, namely:

CHECK_AGGSIG_VERIFY which takes from the stack:
- N: a count of pubkeys
- pubkey1..pubkeyN: N pubkeys
- REQ: a bitmask of which pubkeys are required to sign
- OPT: a bitmask of which optional pubkeys have signed
- sighashes: M sighashes for the pubkeys corresponding to the set
bits of (REQ|OPT)

CHECK_AGGSIG_VERIFY fails if:
- the stack doesn't have enough elements
- the aggregated signature doesn't pass
- a redefined RETURN_VALID opcode has already been seen
- a previous CHECK_AGGSIG_VERIFY has already been seen in this script

REQ|OPT is stored as state

CHECK_AGG_SIGNERS takes from the stack:
- B: a bitmask of which pubkeys are being queried
and it pushes to the stack 1 or 0 based on:
- (REQ|OPT) & B == B ? 1 : 0

A possible way to make sure the "no agg sigs after an upgraded
RETURN_VALID" behaviour works right might be to have "RETURN_VALID"
fail if CHECK_AGGSIG_VERIFY hasn't already been seen. That way once you
redefine RETURN_VALID in a soft-fork, if you have a CHECK_AGGSIG_VERIFY
after a RETURN_VALID you've either already failed (because the
RETURN_VALID wasn't after a CHECK_AGGSIG_VERIFY), or you automatically
fail (because you've already seen a CHECK_AGGSIG_VERIFY).

There would be no need to make CHECKSIG, CHECKSIGVERIFY, CHECKMULTISIG
and CHECKMULTISIGVERIFY do signature aggregation in this case. They could
be left around to allow script authors to force non-aggregate signatures
or could be dropped entirely, I think.

This construct would let you do M-of-N aggregated multisig in a fairly
straightforward manner without needing an explicit opcode, eg:

0 pubkey5 pubkey4 pubkey3 pubkey2 pubkey1 5 CHECK_AGGSIG_VERIFY
0b10000 CHECK_AGG_SIGNERS
0b01000 CHECK_AGG_SIGNERS ADD
0b00100 CHECK_AGG_SIGNERS ADD
0b00010 CHECK_AGG_SIGNERS ADD
0b00001 CHECK_AGG_SIGNERS ADD
3 NUMEQUAL

redeemable by, eg:

0b10110 sighash5 sighash3 sighash2

and a single aggregate signature by the private keys corresponding to
pubkey{2,3,5}.

Of course, another way of getting M-of-N aggregated multisig is via MAST,
which brings us to another approach...

Solution 3
----------

All we're doing above is committing to an execution path and validating
signatures for that path before checking the path was the right one. But
MAST is a great way of committing to an execution path, so another
approach would just be "don't have alternative execution paths, just have
MAST and CHECK/VERIFY codes". Taking the example I've been running with,
that would be:

branch1: 2 pubkey2 pubkey1 2 CHECKMULTISIG
branch2: pubkey1 CHECKSIGVERIFY cond CHECKCOVENANT

So long as MAST is already supported when signature aggregation becomes
possible, that works fine. The drawback is MAST can end up with lots of
branches, eg the 3-of-5 multisig check has 10 branches:

branch1: 3 pubkey3 pubkey2 pubkey1 3 CHECKMULTISIG
branch2: 3 pubkey4 pubkey2 pubkey1 3 CHECKMULTISIG
branch3: 3 pubkey5 pubkey2 pubkey1 3 CHECKMULTISIG
branch4: 3 pubkey4 pubkey3 pubkey1 3 CHECKMULTISIG
branch5: 3 pubkey5 pubkey3 pubkey1 3 CHECKMULTISIG
branch6: 3 pubkey5 pubkey4 pubkey1 3 CHECKMULTISIG
branch7: 3 pubkey4 pubkey3 pubkey2 3 CHECKMULTISIG
branch8: 3 pubkey5 pubkey3 pubkey2 3 CHECKMULTISIG
branch9: 3 pubkey5 pubkey4 pubkey2 3 CHECKMULTISIG
branch10: 3 pubkey5 pubkey4 pubkey3 3 CHECKMULTISIG

while if you want, say, 6-of-11 multisig you get 462 branches, versus
just:

0 pubkey11 pubkey10 pubkey9 pubkey8 pubkey7 pubkey6
pubkey5 pubkey4 pubkey3 pubkey2 pubkey1 11 CHECK_AGGSIG_VERIFY
0b10000000000 CHECK_AGG_SIGNERS
0b01000000000 CHECK_AGG_SIGNERS ADD
0b00100000000 CHECK_AGG_SIGNERS ADD
0b00010000000 CHECK_AGG_SIGNERS ADD
0b00001000000 CHECK_AGG_SIGNERS ADD
0b00000100000 CHECK_AGG_SIGNERS ADD
0b00000010000 CHECK_AGG_SIGNERS ADD
0b00000001000 CHECK_AGG_SIGNERS ADD
0b00000000100 CHECK_AGG_SIGNERS ADD
0b00000000010 CHECK_AGG_SIGNERS ADD
0b00000000001 CHECK_AGG_SIGNERS ADD
6 NUMEQUAL

Provided doing lots of hashes to calculate merkle paths is cheaper than
publishing to the blockchain, MAST will likely still be better though:
you'd be doing 6 pubkeys and 9 steps in the merkle path for about 15*32
bytes in MAST, versus showing off all 11 pubkeys above for 11*(32+4)
bytes, and the above is roughly the worst case for m-of-11 multisig
via MAST.

If everyone's happy to use MAST, then it could be the only solution:
drop OP_IF and friends, and require all the CHECKSIG ops to occur before
any RETURN_VALID ops: since there's no branching, that's just a matter of
reordering your script a bit and should be pretty easy for script authors.

I think there's a couple of drawbacks to this approach that it shouldn't
be the only solution:

a) we don't have a lot of experience with using MAST
b) MAST is a bit more complicated than just dealing with branches in
a script (probably solvable once (a) is no longer the case)
c) some useful scripts might be a bit cheaper expressed with
of branches and be better expressed without MAST

If other approaches than MAST are still desirable, then MAST works fine
in combination with either of the earlier solutions as far as I can see.

Summary
-------

I think something along the lines of solution 2 makes the most sense,
so I think a good approach for aggregate signatures is:

- introduce a new segwit witness version, which I'll call v2 (but which
might actually be v1 or v3 etc, of course)

- v2 must support Schnorr signature verification.

- v2 should have a "pay to public key (hash?)" witness format. direct
signatures of the transaction via the corresponding private key should
be aggregatable.

- v2 should have a "pay to script hash" witness format: probably via
taproot+MAST, possibly via graftroot as well

- v2 should support MAST scripts: again, probably via taproot+MAST

- v2 taproot shouldn't have a separate script version (ie,
the pubkey shouldn't be P+H(P,version,scriptroot)), as signatures
for later-versioned scripts couldn't be aggregated, so there's no
advantage over bumping the segwit witness version

- v2 scripts should have a CHECK_AGG_SIG_VERIFY opcode roughly as
described above for aggregating signatures, along with CHECK_AGG_SIGNERS

- CHECK{MULTI,}SIG{VERIFY,} in v2 scripts shouldn't support aggregated
signatures, and possibly shouldn't be present at all?

- v2 signers should be able to specify an aggregation bucket for each
signature, perhaps in the range 0-7 or so?

- v2 scripts should have a bunch of RETURN_VALID opcodes for future
soft-forks, constrained so that CHECK_AGG_SIG_VERIFY doesn't appear
after them. the currently disabled opcodes should be redefined as
RETURN_VALID eg.

For soft-fork upgrades from that point:

- introducing new opcodes just means redefining an RETURN_VALID opcode

- introducing new sighash versions requires bumping the segwit witness
version (to v3, etc)

- if non-interactive half-signature aggregation isn't ready to go, it
would likewise need a bump in the segwit witness version when
introduced

I think it's worth considering bundling a hard-fork upgrade something
like:

- ~5 years after v2 scripts are activated, existing p2pk/p2pkh UTXOs
(either matching the pre-segwit templates or v0 segwit p2wpkh) can
be spent via a v2-aggregated-signature (but not via taproot)
[4]

- core will maintain a config setting that allows users to prevent
that hard fork from activating via UASF up until the next release
after activation (probably with UASF-enforced miner-signalling that
the hard-fork will not go ahead)

This is already very complicated of course, but note that there's still
*more* things that need to be considered for signature aggregation:

- whether to use Bellare-Neven or muSig in the consensus-critical
aggregation algorithm

- whether to assign the aggregate sigs to inputs and plunk them in the
witness data somewhere, or to add a new structure and commitment and
worry about p2p impact

- whether there are new sighash options that should go in at the same time

- whether non-interactive half-sig aggregation can go in at the same time

That leads me to think that interactive signature aggregation is going to
take a lot of time and work, and it would make sense to do a v1-upgrade
that's "just" Schnorr (and taproot and MAST and re-enabling opcodes and
...) in the meantime. YMMV.

Cheers,
aj

[0] http://diyhpl.us/wiki/transcripts/bitcoin-core-dev-tech/2018-03-06-taproot-graftroot-etc/

[1] Signature aggregation:

Signature aggregation is cool because it lets you post a transaction
spending many inputs, but only providing a single 64 byte signature
that proves authorisation by the holders of all the private keys
for all the inputs. So the witnesses for your inputs might be:

p2wpkh: pubkey1 SIGHASH_ALL
p2wpkh: pubkey2 SIGHASH_ALL
p2wsh: "3 pubkey1 pubkey3 pubkey4 3 CHECKMULTISIG" SIGHASH_ALL SIGHASH_ALL SIGHASH_ALL

where instead of including full 65-byte signature for each CHECKSIG
operation in each input witness, you just include the ~1-byte sighash,
and provide a single 64-byte signature elsewhere, calculated either
according to the Bellare-Neven algorithm, or the muSig algorithm.

In the above case, that means going from about 500 witness bytes
for 5 public keys and 5 signatures, to about 240 witness bytes for
5 public keys and just 1 signature.

A complication here is that because the signatures are aggregated,
in order to validate *any* signature you have to be able to validate
*every* signature.

It's possible to limit that a bit, and have aggregation
"buckets". This might be something you just choose when signing, eg:

p2wpkh: pubkey1 SIGHASH_ALL|BUCKET_1
p2wpkh: pubkey2 SIGHASH_ALL|BUCKET_2
p2wsh: "3 pubkey1 pubkey3 pubkey4 3 CHECKMULTISIG" SIGHASH_ALL|BUCKET_1 SIGHASH_ALL|BUCKET_2 SIGHASH_ALL|BUCKET_2

bucket1: 64 byte sig for (pubkey1, pubkey1)
bucket2: 64 byte sig for (pubkey2, pubkey3, pubkey4)

That way you get the choice to verify both of the pubkey1 signatures
or all of the pubkey{2,3,4} signatures or all the signatures (or
none of the signatures).

This might be useful if the private key for pubkey1 is essentially
offline, and can't easily participate in an interactive protocol
-- with separate buckets the separate signatures can be generated
independently at different times, while with only one bucket,
everyone has to coordinate to produce the signature)

(For clarity: each bucket corresponds to many CHECKSIG operations,
but only contains a single 64-byte signature)

Different buckets will also be necessary when dealing with new
segwit script versions: if there are any aggregated signatures for
v1 addresses that go into bucket X, then aggregate signatures for
v2 addresses cannot go into bucket X, as that would prevent nodes
that support v1 addresses but not v2 addresses from validating
bucket X, which would prevent them from validating the v1 addresses
corresponding to that bucket, which would make the v2 upgrade a hard
fork rather than a soft fork. So each segwit version will need to
introduce a new set of aggregation buckets, which in turn reduces
the benefit you get from signature aggregation.

Note that it's obviously fine to use an aggregated signature in
response to CHECKSIGVERIFY or n-of-n CHECKMULTISIGVERIFY -- when
processing the script you just assume it succeeds, relying on the
fact that the aggregated signature will fail the entire transaction
if there was a problem. However it's also fine to use an aggregated
signature in response to CHECKSIG for most plausible scripts, since:

sig key CHECKSIG

can be treated as equivalent to

sig DUP IF key CHECKSIGVERIFY OP_1 FI

provided invalid signatures are supplied as a "false" value. So
for the purpose of this email, I'll mostly be treating CHECKSIG and
n-of-n CHECKMULTISIG as if they support aggregation.

[2] Soft-forks and RETURN_VALID:

There are two approaches for soft-forking in new opcodes that are
reasonably well understood:

1) We can bump the segwit script version, introducing a new class of
bc1 bech32 addresses, which behave however we like, but can't be
validated at all by existing nodes. This has the downside that it
effectively serialises upgrades.

2) We can redefine OP_NOP opcodes as OP_CHECK_foo_VERIFY
opcodes, along the same lines as OP_CHECKLOCKTIMEVERIFY or
OP_CHECKSEQUENCEVERIFY. This has the downside that it's pretty
restrictive in what new opcodes you can introduce.

A third approach seems possible as well though, which would combine
the benefits of both approaches: allowing any new opcode to be
introduced, and allowing different opcodes to be introduced in
concurrent soft-forks. Namely:

3) If we introduce some RETURN_VALID opcodes (in script for a new
segwit witness version), we can then redefine those as having any
behaviour we might want, including ones that manipulate the stack,
and have the change simply be a soft-fork. RETURN_VALID would
force the script to immediately succeed, in contrast to OP_RETURN
which forces the script to immediately fail.

[3] https://twitter.com/bramcohen/status/972205820275388416

[4] https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-January/015580.html
Anthony Towns via bitcoin-dev
2018-03-21 11:21:19 UTC
Reply
Permalink
Raw Message
Good morning aj,
Good evening Zeeman!

[pulled from the bottom of your mail]
This way, rather than gathering signatures, we gather public keys for aggregate signature checking.
Sorry, I probably didn't explain it well (or at all): during the script,
you're collecting public keys and messages (ie, BIP 143 style digests)
which then go into the signing/verification algorithm to produce/check
the signature.

You do need to gather signatures from each private key holder when
producing the aggregate signature, but that happens at the wallet/p2p
level, rather than the consensus level.
I am probably wrong, but could solution 2 be simplified by using the below opcodes for aggregated signatures?
OP_ADD_AGG_PUBKEY - Adds a public key for verification of an aggregated signature.
OP_CHECK_AGG_SIG[VERIFY] - Check that the gathered public keys matches the aggregated signature.
Checking the gathered public keys match the aggregated signature is
something that only happens for the entire transaction as a whole, so
you don't need an opcode for it in the scripts, since they're per-input.

Otherwise, I think that's pretty similar to what I was already saying;
having:

SIGHASH_ALL|BUCKET_1 pubkey OP_CHECKSIG

would be adding "pubkey" and a message hash calculated via the SIGHASH_ALL
hashing rules to the list of things that the signature for bucket 1 verifies.

FWIW, the Bellare-Neven verification algorithm looks something like:

s*G = R + K (s,R is the signature)
K = sum( H(R, L, i, m) * X_i ) for i corresponding to each pubkey X_i
L = the concatenation of all the pubkeys, X_0..X_n
m = the concatenation of all the message hashes, m_0..m_n

So the way I look at it is each input puts a public key and a message hash
(X_i, m_i) into the bucket via a CHECKSIG operation (or similar), and once
you're done, you look into the bucket and there's just a single signature
(s,R) left to verify. You can't start verifying any of it until you've
looked through all the scripts because you need to know L and m before
you can do anything, and both of those require info from every part of
the aggregation.

[0] [1]
The effect is that in the OP_CHECKCOVENANT case, pre-softfork nodes will not actually do any checking.
Pre-softfork nodes not doing any checking doesn't work with cross-input
signature aggregation as far as I can see. If it did, all you would have
to do to steal people's funds is mine a non-standard transaction:

inputs:
my-millions:
pay-to-pubkey pubkey1
witness=SIGHASH_ALL|BUCKET_1
your-two-cents:
pay-to-script-hash script=[1 OP_RETURN_TRUE pubkey2 CHECKSIG]
witness=SIGHASH_ALL|BUCKET_1

bucket1: 64-random-bytes
output:
all-the-money: you

Because there's no actual soft-fork at this point every node is an "old"
node, so they all see the OP_RETURN_TRUE and stop validating signatures,
accepting the transaction as valid, and giving you all my money, despite
you being unable to actually produce my signature.

Make sense?

Cheers,
aj

[0] For completeness: constructing the signature for Bellare-Neven
requires two communication phases amongst the signers, and looks
roughly like:

1. each party generates a random variable r_i, and sharing the
corresponding curve point R_i=r_i*G and their sighash choice
(ie, m_i) with the other signers.

2. this allows each party to calculate R=sum(R_i) and m,
and hence H(R,L,i,m), at which point each party calculates a
partial signature using their respective private key, x_i:

s_i = r_i + H(R,L,i,m)*x_i

all these s_i values are then communicated to each signer.

3. these combine to give the final signature (s,R),
with s=sum(s_i), allowing each signer to verify that the signing
protocol completed successfully, and any signer can broadcast
the transaction to the blockchain

[1] muSig differs in the details, but is basically the same.
ZmnSCPxj via bitcoin-dev
2018-03-21 23:28:00 UTC
Reply
Permalink
Raw Message
Good morning aj,




​Sent with ProtonMail Secure Email.​

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
Post by Anthony Towns via bitcoin-dev
Good morning aj,
Good evening Zeeman!
[pulled from the bottom of your mail]
This way, rather than gathering signatures, we gather public keys for aggregate signature checking.
Sorry, I probably didn't explain it well (or at all): during the script,
you're collecting public keys and messages (ie, BIP 143 style digests)
which then go into the signing/verification algorithm to produce/check
the signature.
Yes, I think this is indeed what OP_CHECK_AGG_SIG really does.

What I propose is that we have two places where we aggregate public keys: one at the script level, and one at the transaction level. OP_ADD_AGG_PUBKEY adds to the script-level aggregate, then OP_CHECK_AGG_SIG adds the script-level aggregate to the transaction-level aggregate.

Unfortunately it will not work since transaction-level aggregate (which is actually what gets checked) is different between pre-fork and post-fork nodes.

It looks like signature aggregation is difficult to reconcile with script...

Regards,
ZmnSCPxj
Andrew Poelstra via bitcoin-dev
2018-03-21 12:45:21 UTC
Reply
Permalink
Raw Message
Post by Anthony Towns via bitcoin-dev
That leads me to think that interactive signature aggregation is going to
take a lot of time and work, and it would make sense to do a v1-upgrade
that's "just" Schnorr (and taproot and MAST and re-enabling opcodes and
...) in the meantime. YMMV.
Unfortunately I agree. Another complication with aggregate signatures is
that they complicate blind signature protocols such as [1]. In particular
they break the assumption "one signature can spend at most one UTXO"
meaning that a blind signer cannot tell how many coins they're authorizing
with a given signature, even if they've ensured that the key they're using
only controls UTXOs of a fixed value.

This seems solvable with creative use of ZKPs, but the fact that it's even
a problem caught me off guard, and makes me think that signature aggregation
is much harder to think about than e.g. Taproot which does not change
signature semantics at all.


Andrew



[1] https://github.com/jonasnick/scriptless-scripts/blob/blind-swaps/md/partially-blind-swap.md
--
Andrew Poelstra
Mathematics Department, Blockstream
Email: apoelstra at wpsoftware.net
Web: https://www.wpsoftware.net/andrew

"A goose alone, I suppose, can know the loneliness of geese
who can never find their peace,
whether north or south or west or east"
--Joanna Newsom
Bram Cohen via bitcoin-dev
2018-03-22 00:47:01 UTC
Reply
Permalink
Raw Message
Regarding the proposed segwit v2 with reclaiming most things as
RETURN_VALID, the net result for what's being proposed in the near future
for supporting aggregated signatures in the not-so-near future is to punt.
A number of strategies are possible for how to deal with new opcodes being
added later on, and the general strategy of making unused opcodes be
RETURN_VALID for now and figuring out how to handle it later works for all
of them. I think this is the right approach, but wanted to clarify that it
is in fact the approach being proposed.

That said, there are some subtleties to getting it right which the last
message doesn't really cover. Most unused opcodes should be reclaimed as
RETURN_VALID, but there should still be one OP_NOP and there should be a
'real' RETURN_VALID, which (a) is guaranteed to not be soft forked into
something else in the future, and (b) doesn't have any parsing weirdness.
The parsing weirdness of all the unclaimed opcodes is interesting. Because
everything in an IF clause needs to be parsed in order to find where the
ELSE is, you have a few options for dealing with an unknown opcode getting
parsed in an unexecuted section of code. They are (a) avoid the problem
completely by exterminating IF and MASTing (b) avoid the problem completely
by getting rid of IF and adding IFJUMP, IFNJUMP, and JUMP which specify a
number of bytes (this also allows for script merkleization) (c) require all
new opcodes have fixed length 1, even after they're soft forked, (d) do
almost like (c) but require that on new soft forks people hack their old
scripts to still parse properly by avoiding the OP_ELSE in inopportune
places (yuck!) (e) make it so that the unknown opcodes case a RETURN_VALID
even when they're parsed, regardless of whether they're being executed.

By far the most expedient option is (e) cause a RETURN_VALID at parse time.
There's even precedent for this sort of behavior in the other direction
with disabled opcodes causing failure at parse time even if they aren't
being executed.

A lot can be said about all the options, but one thing I feel like snarking
about is that if you get rid of IFs using MAST, then it's highly unclear
whether OP_DEPTH should be nuked as well. My feeling is that it should and
that strict parsing should require that the bottom thing in the witness
gets referenced at some point.

Hacking in a multisig opcode isn't a horrible idea, but it is very stuck
specifically on m-of-n and doesn't support more complex formulas for how
signatures can be combined, which makes it feel hacky and weird.

Also it may make sense to seriously consider BLS signatures, which have a
lot of practical benefits starting with them being noninteractively
aggregatable so you can always assume that they're aggregated instead of
requiring complex semantics to specify what's aggregated with what. My team
is working on an implementation which has several advantages over what's
currently in the published literature but it isn't quite ready for public
consumption yet. This should probably go on the pile of reasons why it's
premature to finalize a plan for aggregation at this point.
Anthony Towns via bitcoin-dev
2018-03-27 06:34:33 UTC
Reply
Permalink
Raw Message
[...] Most unused opcodes should be reclaimed as RETURN_VALID,
but there should still be one OP_NOP and there should be a 'real' RETURN_VALID,
which (a) is guaranteed to not be soft forked into something else in the
future, and (b) doesn't have any parsing weirdness.
What's the reason for those? I could see an argument for RETURN_VALID, I guess:

confA IF condB IF condC IF [pathA] RETURN_VALID ENDIF ENDIF ENDIF [pathB]

is probably simpler and saves 3 bytes compared to:

1 condA IF condB IF condC IF [pathA] NOT ENDIF ENDIF ENDIF IF [pathB] ENDIF

but that doesn't seem crazy compelling? I don't see a reason to just keep
one OP_NOP though.
The parsing weirdness of
all the unclaimed opcodes is interesting. Because everything in an IF clause
needs to be parsed in order to find where the ELSE is, you have a few options
for dealing with an unknown opcode getting parsed in an unexecuted section of
code. They are (a) avoid the problem completely by exterminating IF and MASTing
(b) avoid the problem completely by getting rid of IF and adding IFJUMP,
IFNJUMP, and JUMP which specify a number of bytes (this also allows for script
merkleization) (c) require all new opcodes have fixed length 1, even after
they're soft forked, (d) do almost like (c) but require that on new soft forks
people hack their old scripts to still parse properly by avoiding the OP_ELSE
in inopportune places (yuck!) (e) make it so that the unknown opcodes case a
RETURN_VALID even when they're parsed, regardless of whether they're being
executed.
I was figuring (c), fwiw, and assuming that opcodes will just be about
manipulating stack values and marking the script as invalid, rather than,
say, introducing new flow control ops.
By far the most expedient option is (e) cause a RETURN_VALID at parse time.
There's even precedent for this sort of behavior in the other direction with
disabled opcodes causing failure at parse time even if they aren't being
executed.
You're probably right. That still doesn't let you implement intercal's
COMEFROM statement as a new opcode, of course. :)
A lot can be said about all the options, but one thing I feel like snarking
about is that if you get rid of IFs using MAST, then it's highly unclear
whether OP_DEPTH should be nuked as well. My feeling is that it should and that
strict parsing should require that the bottom thing in the witness gets
referenced at some point.
I guess when passing the script you could perhaps check if each witness
item could have been replaced with OP_FALSE or OP_1 and still get the
same result, and consider the transaction non-standard if so?
Hacking in a multisig opcode isn't a horrible idea, but it is very stuck
specifically on m-of-n and doesn't support more complex formulas for how
signatures can be combined, which makes it feel hacky and weird.
Hmm? The opcode I suggested works just as easily with arbitrary formulas,
eg, "There must be at least 1 signer from pka{1,2,3}, and 3 signers all
up, except each of pkb{1,2,3,4,5,6} only counts for half":

0 pkb6 pkb5 pkb4 pkb3 pkb2 pkb1 pka3 pka2 pka1 9 CHECK_AGGSIG_VERIFY
(declare pubkeys)
0b111 CHECK_AGG_SIGNERS VERIFY
(one of pka{1,2,3} must sign)
0b001 CHECK_AGG_SIGNERS
0b010 CHECK_AGG_SIGNERS ADD
0b100 CHECK_AGG_SIGNERS ADD
DUP ADD
(pka{1,2,3} count double)
0b000001000 CHECK_AGG_SIGNERS ADD
0b000010000 CHECK_AGG_SIGNERS ADD
0b000100000 CHECK_AGG_SIGNERS ADD
0b001000000 CHECK_AGG_SIGNERS ADD
0b010000000 CHECK_AGG_SIGNERS ADD
0b100000000 CHECK_AGG_SIGNERS ADD
(pkb{1..6} count single)
6 EQUAL
(summing to a total of 3 doubled)

Not sure that saves it from being "hacky and weird" though...

(There are different ways you could do "CHECK_AGG_SIGNERS": for
instance, take a bitmask of keys and return the bitwise-and with the
keys that signed, or take a bitmask and just return the number of keys
matching that bitmask that signed, or take a pubkey index and return a
boolean whether that key signed)

Cheers,
aj
Bram Cohen via bitcoin-dev
2018-03-28 03:19:48 UTC
Reply
Permalink
Raw Message
Post by Anthony Towns via bitcoin-dev
[...] Most unused opcodes should be reclaimed as RETURN_VALID,
but there should still be one OP_NOP and there should be a 'real'
RETURN_VALID,
which (a) is guaranteed to not be soft forked into something else in the
future, and (b) doesn't have any parsing weirdness.
confA IF condB IF condC IF [pathA] RETURN_VALID ENDIF ENDIF ENDIF [pathB]
1 condA IF condB IF condC IF [pathA] NOT ENDIF ENDIF ENDIF IF [pathB] ENDIF
but that doesn't seem crazy compelling?
Mostly yes it's for that case and also for:

condA IF RETURN_VALID ENDIF condb IF RETURN_VALID ENDIF condc

Technically that can be done with fewer opcodes using OP_BOOLOR but maybe
in the future there will be some incentive for short circuit evaluation

But there's also the general principle that it's only one opcode and if
there are a lot of things which look like RETURN_VALID there should be one
thing which actually is RETURN_VALID
Post by Anthony Towns via bitcoin-dev
I don't see a reason to just keep one OP_NOP though.
Mostly based on momentum because there are several of them there right now.
If noone else wants to defend it I won't either.
Post by Anthony Towns via bitcoin-dev
By far the most expedient option is (e) cause a RETURN_VALID at parse
time.
There's even precedent for this sort of behavior in the other direction
with
disabled opcodes causing failure at parse time even if they aren't being
executed.
You're probably right. That still doesn't let you implement intercal's
COMEFROM statement as a new opcode, of course. :)
That can be in the hardfork wishlist :-)
Post by Anthony Towns via bitcoin-dev
A lot can be said about all the options, but one thing I feel like
snarking
about is that if you get rid of IFs using MAST, then it's highly unclear
whether OP_DEPTH should be nuked as well. My feeling is that it should
and that
strict parsing should require that the bottom thing in the witness gets
referenced at some point.
I guess when passing the script you could perhaps check if each witness
item could have been replaced with OP_FALSE or OP_1 and still get the
same result, and consider the transaction non-standard if so?
Essentially all opcodes including OP_PICK make clear at runtime how deep
they go and anything below the max depth can be safely eliminated (or used
as grounds for rejecting in strict mode). The big exception is OP_DEPTH
which totally mangles the assumptions. It's trivial to make scripts which
use OP_DEPTH which become invalid with things added below the stack then go
back to being valid again with more things added even though the individual
items are never even accessed.
Post by Anthony Towns via bitcoin-dev
Hacking in a multisig opcode isn't a horrible idea, but it is very stuck
specifically on m-of-n and doesn't support more complex formulas for how
signatures can be combined, which makes it feel hacky and weird.
Hmm? The opcode I suggested works just as easily with arbitrary formulas,
eg, "There must be at least 1 signer from pka{1,2,3}, and 3 signers all
0 pkb6 pkb5 pkb4 pkb3 pkb2 pkb1 pka3 pka2 pka1 9 CHECK_AGGSIG_VERIFY
(declare pubkeys)
0b111 CHECK_AGG_SIGNERS VERIFY
(one of pka{1,2,3} must sign)
0b001 CHECK_AGG_SIGNERS
0b010 CHECK_AGG_SIGNERS ADD
0b100 CHECK_AGG_SIGNERS ADD
DUP ADD
(pka{1,2,3} count double)
0b000001000 CHECK_AGG_SIGNERS ADD
0b000010000 CHECK_AGG_SIGNERS ADD
0b000100000 CHECK_AGG_SIGNERS ADD
0b001000000 CHECK_AGG_SIGNERS ADD
0b010000000 CHECK_AGG_SIGNERS ADD
0b100000000 CHECK_AGG_SIGNERS ADD
(pkb{1..6} count single)
6 EQUAL
(summing to a total of 3 doubled)
Not sure that saves it from being "hacky and weird" though...
That is very hacky and weird. Doing MAST on lots of possibilities is always
reasonably elegant, and it only gets problematic when the number of
possibilities is truly massive.

It's also the case that BLS can support complex key agreement schemes
without even giving away that it isn't a simple single signature. Just
saying.

Loading...