Discussion:
[bitcoin-dev] Building Blocks of the State Machine Approach to Consensus
Peter Todd via bitcoin-dev
2016-06-20 08:56:49 UTC
In light of Ethereum's recent problems with its imperative, account-based,
programming model, I thought I'd do a quick writeup outlining the building
blocks of the state-machine approach to so-called "smart contract" systems, an
extension of Bitcoin's own design that I personally have been developing for a
number of years now as my Proofchains/Dex research work.


# Deterministic Code / Deterministic Expressions

We need to be able to run code on different computers and get identical
results; without this consensus is impossible and we might as well just use a
central authoritative database. Traditional languages and surrounding
frameworks make determinism difficult to achieve, as they tend to be filled
with undefined and underspecified behavior, ranging from signed integer
overflow in C/C++ to non-deterministic behavior in databases. While some
successful systems like Bitcoin are based on such languages, their success is
attributable to heroic efforts by their developers.

Deterministic expression systems such as Bitcoin's scripting system and the
author's Dex project improve on this by allowing expressions to be precisely
specified by hash digest, and executed against an environment with
deterministic results. In the case of Bitcoin's script, the expression is a
Forth-like stack-based program; in Dex the expression takes the form of a
lambda calculus expression.
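
As an illustration of the idea - not the actual Dex implementation - here's a minimal Python sketch of expressions identified by hash digest and evaluated deterministically against an environment; the `serialize`, `expr_id`, and `evaluate` helpers are hypothetical:

```python
import hashlib

# Hypothetical miniature deterministic expression system: expressions are
# nested tuples of (op, args...), identified by the hash digest of a
# canonical serialization, and evaluated with no ambient state.

def serialize(expr):
    """Canonical, deterministic serialization of a nested-tuple expression."""
    if isinstance(expr, tuple):
        return b"(" + b" ".join(serialize(e) for e in expr) + b")"
    return str(expr).encode()

def expr_id(expr):
    """The expression's globally unique identifier: its hash digest."""
    return hashlib.sha256(serialize(expr)).hexdigest()

def evaluate(expr, env):
    """Evaluate against an environment; same inputs always give same output."""
    if isinstance(expr, tuple):
        op, *args = expr
        vals = [evaluate(a, env) for a in args]
        if op == "+":
            return sum(vals)
        if op == "==":
            return vals[0] == vals[1]
        raise ValueError("unknown op: %r" % (op,))
    if isinstance(expr, str):
        return env[expr]  # free variables are looked up in the environment
    return expr           # literals evaluate to themselves

expr = ("==", ("+", "a", "b"), 3)
assert evaluate(expr, {"a": 1, "b": 2}) is True
assert expr_id(expr) == expr_id(("==", ("+", "a", "b"), 3))  # stable identity
```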


## Proofs

So far the most common use for deterministic expressions is to specify
conditions upon which funds can be spent, as seen in Bitcoin (particularly
P2SH, and the upcoming Segwit). But we can generalize their use to precisely
defining consensus protocols in terms of state machines, with each state
defined in terms of a deterministic expression that must return true for the
state to have been reached. The data that causes a given expression to return
true is then a "proof", and that proof can be passed from one party to another
to prove desired states in the system have been reached.

An important implication of this model is that we need deterministic, and
efficient, serialization of proof data.


## Pruning

Often the evaluation of an expression against a proof doesn't require all the
data in the proof. For example, to prove to a lite client that a given block
contains a transaction, we only need the merkle path from the transaction to
the block header. Systems like Proofchains and Dex generalize this process -
called "pruning" - with built-in support to both keep track of what data is
accessed by what operations, as well as support in their underlying
serialization schemes for unneeded data to be elided and replaced by the hash
digest of the pruned data.
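
The lite-client example above can be sketched as follows; `merkle_root`, `merkle_path`, and `verify_path` are illustrative helpers over a plain binary merkle tree (power-of-two leaf count assumed), not Bitcoin's exact tree construction:

```python
import hashlib

def H(data):
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    """Root of a simple binary merkle tree (power-of-two leaf count)."""
    level = [H(l) for l in leaves]
    while len(level) > 1:
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def merkle_path(leaves, index):
    """The sibling hashes needed to recompute the root from one leaf."""
    level = [H(l) for l in leaves]
    path = []
    while len(level) > 1:
        path.append((index % 2, level[index ^ 1]))  # (am-I-right?, sibling)
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return path

def verify_path(leaf, path, root):
    """Everything except the path itself has been pruned away."""
    h = H(leaf)
    for is_right, sibling in path:
        h = H(sibling + h) if is_right else H(h + sibling)
    return h == root

txs = [b"tx-a", b"tx-b", b"tx-c", b"tx-d"]
root = merkle_root(txs)
assert verify_path(b"tx-c", merkle_path(txs, 2), root)
assert not verify_path(b"tx-x", merkle_path(txs, 2), root)
```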


# Transactions

A common type of state machine is the transaction. A transaction history is a
directed acyclic graph of transactions, with one or more genesis transactions
having no inputs (ancestors), and one or more outputs, and zero or more
non-genesis transactions with one or more inputs, and zero or more outputs. The
edges of the graph connect inputs to outputs, with every input connected to
exactly one output. Outputs with an associated input are known as spent
outputs; outputs without an associated input are unspent.

Outputs have conditions attached to them (e.g. a pubkey for which a valid
signature must be produced), and may also be associated with other values such
as "# of coins". We consider a transaction valid if we have a set of proofs,
one per input, that satisfy the conditions associated with each output.
Secondly, validity may also require additional constraints to be true, such as
requiring the coins spent to be >= the coins created on the outputs. Input
proofs also must uniquely commit to the transaction itself to be secure - if
they don't the proofs can be reused in a replay attack.

A non-genesis transaction is valid if:

1. Any protocol-specific rules such as coins spent >= coins output are
followed.

2. For every input a valid proof exists.

3. Every input transaction is itself valid.
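
A toy sketch of the three rules, with a hypothetical `Tx` class and condition closures standing in for real scripts and proofs:

```python
# Illustrative only: conditions are closures over a secret, proofs are the
# secrets themselves; a real system would use scripts and signatures.

class Tx:
    def __init__(self, inputs, outputs):
        self.inputs = inputs    # list of (prev_tx, vout, proof)
        self.outputs = outputs  # list of (condition, amount)

def tx_valid(tx):
    if not tx.inputs:                         # genesis transactions are valid
        return True
    in_value = 0
    for prev_tx, vout, proof in tx.inputs:
        condition, amount = prev_tx.outputs[vout]
        if not condition(proof):              # 2. a valid proof per input
            return False
        if not tx_valid(prev_tx):             # 3. input txs themselves valid
            return False
        in_value += amount
    out_value = sum(amount for _, amount in tx.outputs)
    return in_value >= out_value              # 1. coins spent >= coins output

# A trivial spend condition; a real one would also commit to the spending
# transaction to prevent replay, as noted above.
cond = lambda secret: (lambda proof: proof == secret)

genesis = Tx([], [(cond("open sesame"), 10)])
good = Tx([(genesis, 0, "open sesame")], [(cond("next"), 8)])
bad = Tx([(genesis, 0, "wrong")], [(cond("next"), 8)])
assert tx_valid(good)
assert not tx_valid(bad)
```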

A practical implementation of the above for value-transfer systems like Bitcoin
could use two merkle-sum trees, one for the inputs, and one for the outputs,
with inputs simply committing to the previous transaction's txid and output #
(outpoint), and outputs committing to a scriptPubKey and output amount.
Witnesses can be provided separately, and would provide a signature committing
to the transaction or, optionally, a subset of inputs and/or outputs (with
merkle trees we can easily avoid the exponential signature validation problems
bitcoin currently has).
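
A minimal sketch of the merkle-sum node construction, assuming SHA256 and a hypothetical 8-byte big-endian amount encoding:

```python
import hashlib

# Every node commits to both a hash and the sum of the values beneath it, so
# one root binds the outputs' structure and their total amount.

def H(data):
    return hashlib.sha256(data).digest()

def sum_leaf(amount, payload):
    """E.g. an output: payload would be the scriptPubKey."""
    return (H(amount.to_bytes(8, "big") + payload), amount)

def sum_branch(left, right):
    (lh, lsum), (rh, rsum) = left, right
    total = lsum + rsum
    return (H(total.to_bytes(8, "big") + lh + rh), total)

outputs = [sum_leaf(3, b"scriptPubKey-A"), sum_leaf(7, b"scriptPubKey-B")]
root_hash, total_out = sum_branch(outputs[0], outputs[1])
assert total_out == 10
assert len(root_hash) == 32
```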

So long as all genesis transactions are unique, and our hash function is
secure, all transaction outputs can be uniquely identified (prior to BIP34 the
Bitcoin protocol actually failed at this!).


## Proof Distribution

How does Alice convince Bob that she has done a transaction that puts the
system into the state that Bob wanted? The obvious answer is she gives Bob data
proving that the system is now in the desired state; in a transactional system
that proof is some or all of the transaction history. Systems like Bitcoin
provide a generic flood-fill messaging layer where all participants have the
opportunity to get a copy of all proofs in the system; however, we can also
implement more fine-grained solutions based on peer-to-peer message passing -
one could imagine Alice proving to Bob that she transferred title to her house
to him by giving him a series of proofs, not unlike the same way that property
title transfer can be demonstrated by providing the buyer with a series of deed
documents (though note the double-spend problem!).


# Uniqueness and Single-Use Seals

In addition to knowing that a given transaction history is valid, we also want
to know if it's unique. By that we mean that every spent output in the
transaction history is associated with exactly one input, and no other valid
spends exist; we want to ensure no output has been double-spent.

Bitcoin (and pretty much every other cryptocurrency like it) achieves this goal
by defining a method of achieving consensus over the set of all (valid)
transactions, and then defining that consensus as valid if and only if no
output is spent more than once.

A more general approach is to introduce the idea of a cryptographic Single-Use
Seal, analogous to the tamper-evident single-use seals commonly used for
protecting goods during shipment and storage. Each individual seal is
associated with a globally unique identifier, and has two states, open and
closed. A secure seal can be closed exactly once, producing a proof that the
seal was closed.

All practical single-use seals will be associated with some kind of condition,
such as a pubkey, or deterministic expression, that needs to be satisfied for
the seal to be closed. Secondly, the contents of the proof will be able to
commit to new data, such as the transaction spending the output associated with
the seal.

Additionally some implementations of single-use seals may be able to also
generate a proof that a seal was _not_ closed as of a certain
time/block-height/etc.
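
Putting the above together, the abstract seal interface might be sketched like this (class and method names are illustrative, not from any real implementation):

```python
class SealClosedError(Exception):
    pass

class SingleUseSeal:
    """A seal with a globally unique id and a condition; closable exactly once."""

    def __init__(self, seal_id, condition):
        self.seal_id = seal_id
        self.condition = condition   # e.g. a pubkey or deterministic expression
        self.closed_over = None      # the committed contents, once closed

    def close(self, witness, message):
        if self.closed_over is not None:
            raise SealClosedError(self.seal_id)   # can only be closed once
        if not self.condition(witness):
            raise ValueError("authorization condition not satisfied")
        self.closed_over = message   # the close proof commits to this message
        return (self.seal_id, message)            # stand-in for the close proof

seal = SingleUseSeal("seal-1", lambda w: w == "secret")
proof = seal.close("secret", b"spend tx digest")
assert proof == ("seal-1", b"spend tx digest")
try:
    seal.close("secret", b"another message")
    assert False, "seal closed twice"
except SealClosedError:
    pass
```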


## Implementations

### Transactional Blockchains

A transaction output on a system like Bitcoin can be used as a single-use seal.
In this implementation, the outpoint (txid:vout #) is the seal's identifier,
the authorization mechanism is the scriptPubKey of the output, and the proof
is the transaction spending the output. The proof can commit to additional
data as needed in a variety of ways, such as an OP_RETURN output, or
unspendable output.

This implementation approach is resistant to miner censorship if the seal's
identifier isn't made public, and the protocol (optionally) allows for the
proof transaction to commit to the sealed contents with unspendable outputs;
unspendable outputs can't be distinguished from transactions that move funds.


### Unbounded Oracles

A trusted oracle P can maintain a set of closed seals, and produce signed
messages attesting to the fact that a seal was closed. Specifically, the seal
is identified by the tuple (P, q), with q being the per-seal authorization
expression that must be satisfied for the seal to be closed. The first time the
oracle is given a valid signature for the seal, it adds that signature and seal
ID to its closed seal set, and makes available a signed message attesting to
the fact that the seal has been closed. The proof is that message (and
possibly the signature, or a second message signed by it).
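
A toy model of such an oracle, with a shared-secret check standing in for the authorization expression q and an HMAC standing in for the oracle's signature:

```python
import hashlib
import hmac

class UnboundedOracle:
    def __init__(self, key):
        self._key = key
        self.closed = {}                  # grows forever: q -> attestation

    def _sign(self, msg):
        # HMAC as a stand-in for a real digital signature.
        return hmac.new(self._key, msg, hashlib.sha256).digest()

    def close_seal(self, q, authorization, contents):
        """q names the per-seal authorization condition; contents is the
        data the close proof commits to."""
        if q in self.closed:
            raise ValueError("seal already closed")
        if authorization != "satisfies:" + q:  # stand-in authorization check
            raise ValueError("authorization does not satisfy q")
        attestation = self._sign(b"closed|" + q.encode() + b"|" + contents)
        self.closed[q] = attestation      # published for transparency/audit
        return attestation

oracle = UnboundedOracle(b"oracle-secret")
proof = oracle.close_seal("alice-seal", "satisfies:alice-seal", b"tx digest")
assert proof == oracle.closed["alice-seal"]
```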

The oracle can publish the set of all closed seals for transparency/auditing
purposes. A good way to do this is to make a merkelized key:value set, with the
seal identifiers as keys, and the value being the proofs, and in turn create a
signed certificate transparency log of that set over time. Merkle-paths from
this log can also serve as the closed seal proof, and for that matter, as
proof of the fact that a seal has not been closed.


### Bounded Oracles

The above has the problem of unbounded storage requirements as the closed seal
set grows without bound. We can fix that problem by requiring users of the
oracle to allocate seals in advance, analogous to the UTXO set in Bitcoin.

To allocate a seal the user provides the oracle P with the authorization
expression q. The oracle then generates a nonce n and adds (q,n) to the set of
unclosed seals, and tells the user that nonce. The seal is then uniquely
identified by (P, q, n)

To close a seal, the user provides the oracle with a valid signature over (P,
q, n). If the open seal set contains that seal, the seal is removed from the
set and the oracle provides the user with a signed message attesting to the
valid close.

A practical implementation would be to have the oracle publish a transparency
log, with each entry in the log committing to the set of all open seals with a
merkle set, as well as any seals closed during that entry. Again, merkle paths
for this log can serve as proofs to the open or closed state of a seal.
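
The allocate/close protocol can be sketched as follows (the signature check is a stand-in; nonce generation and identifiers are illustrative):

```python
import os

class BoundedOracle:
    def __init__(self, name):
        self.name = name              # the oracle P
        self.open_seals = set()       # bounded: only allocated, unclosed seals

    def allocate(self, q):
        n = os.urandom(8).hex()       # the oracle-chosen nonce
        self.open_seals.add((q, n))
        return (self.name, q, n)      # the seal's unique identifier (P, q, n)

    def close(self, seal_id, signature):
        P, q, n = seal_id
        if P != self.name or (q, n) not in self.open_seals:
            raise ValueError("no such open seal")
        if signature != "sig-over-" + q + n:   # stand-in for checking q
            raise ValueError("bad signature")
        self.open_seals.remove((q, n))         # storage is freed on close
        return ("closed", seal_id)             # stand-in for a signed attestation

oracle = BoundedOracle("P")
seal_id = oracle.allocate("alice-pubkey")
_, q, n = seal_id
assert oracle.close(seal_id, "sig-over-alice-pubkey" + n) == ("closed", seal_id)
assert not oracle.open_seals
```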

Note how with (U)TXO commitments, Bitcoin itself is a bounded oracle
implementation that can produce compact proofs.


### Group Seals

Multiple seals can be combined into one, by having the open seal commit to a
set of sub-seals, and then closing the seal over a second set of closed seal
proofs. Seals that didn't need to be closed can be closed over a special
re-delegation message, re-delegating the seal to a new open seal.

Since the closed sub-seal proof can additionally include a proof of
authorization, we have a protocol where the entity with authorization to close
the master seal has the ability to DoS attack sub-seal owners, but not the
ability to fraudulently close the seals over contents of their choosing. This
may be useful in cases where actions on the master seal are expensive - such as
seals implemented on top of decentralized blockchains - by amortising the cost
over all sub-seals.


## Atomicity

Often protocols will require multiple seals to be closed for a transaction to
be valid. If a single entity controls all seals, this is no problem: the
transaction simply isn't valid until the last seal is closed.

However if multiple parties control the seals, a party could attack another
party by failing to go through with the transaction, after another party has
closed their seal, leaving the victim with an invalid transaction that they
can't reverse.

We have a few options to resolve this problem:

### Use a single oracle

The oracle can additionally guarantee that a seal will be closed iff some other
set of seals are also closed; seals implemented with Bitcoin can provide this
guarantee. If the parties to a transaction aren't already all on the same
oracle, they can add an additional transaction reassigning their outputs to a
common oracle.

Equally, a temporary consensus between multiple mutually trusting oracles can
be created with a consensus protocol they share; this option doesn't need to
change the proof verification implementation.


### Two-phase Timeouts

If a proof to the fact that a seal is open can be generated, even under
adversarial conditions, we can make the seal protocol allow a close to be
undone after a timeout if evidence can be provided that the other seal(s) were
not also closed (in the specified way).

Depending on the implementation - especially in decentralized systems - the
next time the seal is closed, the proof it has been closed may in turn provide
proof that a previous close was in fact invalid.


# Proof-of-Publication and Proof-of-Non-Publication

Often we need to be able to prove that a specified audience was able to receive
a specific message. For example, the author's PayPub protocol[^paypub],
Todd/Taaki's timelock encryption protocol[^timelock], Zero-Knowledge Contingent
Payments[^zkcp], and Lightning, among others work by requiring a secret key to
be published publicly in the Bitcoin blockchain as a condition of collecting a
payment. At a much smaller scale - in terms of audience - in certain FinTech
applications for regulated environments a transaction may be considered invalid
unless it was provably published to a regulatory agency. Another example is
Certificate Transparency, where we consider an SSL certificate to be invalid
unless it has been provably published to a transparency log maintained by a
third-party.

Secondly, many proof-of-publication schemes also can prove that a message was
_not_ published to a specific audience. With this type of proof single-use
seals can be implemented, by having the proof consist of proof that a specified
message was not published between the time the seal was created, and the time
it was closed (a proof-of-publication of the message).

## Implementations

### Decentralized Blockchains

Here the audience is all participants in the system. However miner censorship
can be a problem, and compact proofs of non-publication aren't yet available
(requires (U)TXO commitments).

The author's treechains proposal is a particularly generic and scalable
implementation, with the ability to make trade-offs between the size of the
audience (security) and publication cost.

### Centralized Public Logs

Certificate Transparency works this way, with trusted (but auditable) logs run
by well known parties acting as the publication medium, who promise to allow
anyone to obtain copies of the logs.

The logs themselves may be indexed in a variety of ways; CT simply indexes logs
by time, however more efficient schemes are possible by having the operator
commit to a key:value mapping of "topics", to allow publication (and
non-publication) proofs to be created for specified topics or topic prefixes.

Auditing the logs is done by verifying that queries to the state of the log
return the same state at the same time for different requesters.

### Receipt Oracles

Finally, publication can be proven by a receipt from the oracle, attesting
to the fact that the oracle has successfully received the message. This is
particularly appropriate in cases where the required audience is the oracle
itself, as in the FinTech regulator case.


# Validity Oracles

As transaction histories grow longer, they may become impractical to move from
one party to another. Validity oracles can solve this problem by attesting to
the validity of transactions, allowing history prior to the attested
transactions to be discarded.

A particularly generic validity oracle can be created using deterministic
expressions systems. The user gives the oracle an expression, and the oracle
returns a signed message attesting to the validity of the expression.
Optionally, the expression may be incomplete, with parts of the expression
replaced by previously generated attestations. For example, an expression that
returns true if a transaction is valid could in turn depend on the previous
transaction also being valid - a recursive call of itself - and that recursive
call can be proven with a prior attestation.

## Implementations

### Proof-of-Work Decentralized Consensus

Miners in decentralized consensus systems act as a type of validity oracle, in
that the economic incentives in the system are (supposed to be) designed to
encourage only the mining of valid blocks; a user who trusts the majority of
hashing power can trust that any transaction with a valid merkle path to a
block header in the most-work chain is valid. Existing decentralized consensus
systems like Bitcoin and Ethereum conflate the roles of validity oracle and
single-use seal/anti-replay oracle, however in principle that need not be true.


### Trusted Oracles

As the name suggests. Remote-attestation-capable trusted hardware is a
particularly powerful implementation - a conspiracy theory is that the reason
why essentially zero secure true remote attestation implementations exist is
because they'd immediately make untraceable digital currency systems easy to
implement (Finney's RPOW[^rpow] is a rare counter-example).

Note how a single-use seal oracle that supports a generic deterministic
expressions scheme for seal authorization can be easily extended to provide a
validity oracle service as well. The auditing mechanisms for a single-use seal
oracle can also be applied to validity oracles.


# Fraud Proofs

Protocols specified with deterministic expressions can easily generate "fraud
proofs", showing that claimed states/proof in the system are actually invalid.
Additionally many protocols can be specified with expressions of k*log2(n)
depth, allowing these fraud proofs to be compact.

A simple example is proving fraud in a merkle-sum tree, where the validity
expression would be something like:

(defun valid? (node)
  (or (== node.type leaf)
      (and (== node.sum (+ node.left.sum node.right.sum))
           (and (valid? node.left)
                (valid? node.right)))))

To prove the above expression evaluates to true, we'll need the entire contents
of the tree. However, to prove that it evaluates to false, we only need a
subset of the tree as proving an and expression evaluates to false only
requires one side, and requires log2(n) data. Secondly, with pruning, the
deterministic expressions evaluator can automatically keep track of exactly
what data was needed to prove that result, and prune all other data when
serializing the proof.
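
A sketch of such a fraud proof for the merkle-sum rule: one inner node plus its two children suffices to refute validity, with the rest of the tree pruned away (the `Node` class is hypothetical):

```python
class Node:
    def __init__(self, sum, left=None, right=None):
        self.sum, self.left, self.right = sum, left, right
        self.type = "leaf" if left is None else "inner"

def fraud_proof_valid(node):
    """True iff the supplied node really demonstrates fraud: an inner node
    whose claimed sum doesn't equal the sum of its children. Only this node
    and its two children are needed - everything else can be pruned."""
    return (node.type == "inner"
            and node.sum != node.left.sum + node.right.sum)

# An inner node that overstates its children's sums:
bad = Node(100, Node(3), Node(7))
assert fraud_proof_valid(bad)
# A consistent node is not evidence of fraud:
assert not fraud_proof_valid(Node(10, Node(3), Node(7)))
```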


## Validity Challenges

However how do you guarantee it will be possible to prove fraud in the first
place? If pruning is allowed, you may simply not have access to the data
proving fraud - an especially severe problem in transactional systems where a
single fraudulent transaction can counterfeit arbitrary amounts of value out of
thin air.

A possible approach is the validity challenge: a subset of proof data, with
part of the data marked as "potentially fraudulent". The challenge can be
satisfied by providing the marked data and showing that the proof in question
is in fact valid; if the challenge is unmet participants in the system can
choose to take action, such as refusing to accept additional transactions.

Of course, this raises a whole host of so-far unsolved issues, such as DoS
attacks and lost data.


# Probabilistic Validation

Protocols that can tolerate some fraud can make use of probabilistic
verification techniques to prove that the percentage of undetected fraud within
the system is less than a certain amount, with a specified probability.

A common way to do this is the Fiat-Shamir transform, which repeatedly samples
a data structure deterministically, using the data's own hash digest as a seed
for a PRNG. Let's apply this technique to our merkle-sum tree example. We'll
first need a recursive function to check a sample, weighted by value:

(defun prefix-valid? (node nonce)
  (or (== node.type leaf)
      (and (and (== node.sum (+ node.left.sum node.right.sum))
                (> node.sum 0)) ; mod by 0 is invalid, just like division by zero
                                ; also could guarantee this with a type system
           (if (< node.left.sum (mod nonce node.sum))
               (prefix-valid? node.right (hash nonce))
               (prefix-valid? node.left (hash nonce))))))

Now we can combine multiple invocations of the above, in this case 256
invocations:

(defun prob-valid? (node)
  (and (and (and .... (prefix-valid? node (digest (cons (digest node) 0)))
       (and (and ....
                 (prefix-valid? node (digest (cons (digest node) 255)))

As an exercise for a reader: generalize the above with a macro, or a suitable
types/generics system.

If we assume our attacker can grind up to 128 bits, that leaves us with 128
random samples that they can't control. If the (value-weighted) probability
that a given node is fraudulent is q, then the chance of the attacker getting
away with fraud is (1-q)^128 - for q=5% that works out to about 0.14%.

(Note that the above analysis isn't particularly well done - do a better
analysis before implementing this in production!)
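
A quick numeric sanity check of the estimate, under the stated assumption of 128 ungrindable samples:

```python
# With fraud at value-weighted fraction q, each uncontrolled sample misses
# the fraud with probability (1 - q); all 128 miss with (1 - q) ** 128.

def p_undetected(q, samples=128):
    return (1 - q) ** samples

assert p_undetected(0.05) < 0.0015              # q = 5%: about 0.14%
assert p_undetected(0.01) > p_undetected(0.05)  # smaller fraud hides better
```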


## Random Beacons and Transaction History Linearization

The Fiat-Shamir transform requires a significant number of samples to defeat
grinding attacks; if we have a random beacon available we can significantly
reduce the size of our probabilistic proofs. PoW blockchains can themselves act
as random beacons, as it is provably expensive for miners to manipulate the
hash digests of blocks they produce - to do so requires discarding otherwise
valid blocks.

An example where this capability is essential is the author's transaction
history linearization technique. In value transfer systems such as Bitcoin, the
history of any given coin grows quasi-exponentially as coins are mixed across
the entire economy. We can linearize the growth of history proofs by redefining
coin validity to be probabilistic.

Suppose we have a transaction with n inputs. Of those inputs, the total value
of real inputs is p, and the total claimed value of fake inputs is q. The
transaction commits to all inputs in a merkle sum tree, and we define the
transaction as valid if a randomly chosen input - weighted by value - can
itself be proven valid. Finally, we assume that creating a genuine input is an
irrevocable action which irrevocably commits to the set of all inputs, real and
fake.

If all inputs are real, 100% of the time the transaction will be valid; if all
inputs are fake, 100% of the time the transaction will be invalid. In the case
where some inputs are real and some are fake the probability that the fraud
will be detected is:

q / (q + p)

The expected value of the fake inputs is then the sum of the potential upside -
the fraud goes undetected - and the potential downside - the fraud is detected
and the real inputs are destroyed:

E = q(1 - q/(q + p)) - p(q/(q + p))
  = q(p/(q + p)) - p(q/(q + p))
  = (qp - pq)/(q + p)
  = 0

Thus so long as the random beacon is truly unpredictable, there's no economic
advantage to creating fake inputs, and it is sufficient for validity to only
require one input to be proven, giving us O(n) scaling for transaction history
proofs.
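
A numeric check of the derivation above, using exact rational arithmetic:

```python
from fractions import Fraction

# Real input value p, fake claimed value q; detection probability q/(q+p)
# makes cheating a zero-expected-value gamble.

def expected_fake_value(p, q):
    detect = Fraction(q, q + p)
    return q * (1 - detect) - p * detect   # upside minus downside

assert expected_fake_value(7, 3) == 0
assert expected_fake_value(1, 99) == 0
```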


### Inflationary O(1) History Proofs

We can further improve our transaction history proof scalability by taking
advantage of inflation. We do this by occasionally allowing a transaction proof
to be considered valid without validating _any_ of the inputs; every time a
transaction is allowed without proving any inputs the size of the transaction
history proof is reset. Of course, this can be a source of inflation, but
provided the probability of this happening can be limited we can limit the
maximum rate of inflation to the chosen value.

For example, in Bitcoin as of writing every block inflates the currency supply
by 25BTC, and contains a maximum of 1MB of transaction data, 0.025BTC/KB. If we
check the prior input proof with probability p, then the expected value of a
transaction claiming to spend x BTC is:

E = x(1-p)

We can rewrite that in terms of the block reward per-byte R, and the transaction size l:

lR = x(1-p)

And solving for p:

p = 1 - lR/x

For example, for a 1KB transaction proof claiming to spend 10BTC we can omit
checking the input 0.25% of the time without allowing more monetary inflation
than the block reward already does. Secondly, this means that after n
transactions, the probability that proof shortening will _not_ happen is p^n,
which reaches 1% after 1840 transactions.
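
The arithmetic above can be checked directly (R, l, x as in the example):

```python
import math

# Block reward per KB R, proof size l, claimed value x give the check
# probability p = 1 - l*R/x; the chance of n transactions with no proof
# shortening is p**n.

R = 0.025        # BTC per KB (25 BTC reward / 1000 KB), as of the writing
l = 1.0          # transaction proof size in KB
x = 10.0         # claimed BTC spent

p = 1 - l * R / x
assert abs(p - 0.9975) < 1e-12        # inputs skipped 0.25% of the time

n = math.log(0.01) / math.log(p)      # n at which p**n has fallen to 1%
assert 1830 < n < 1850                # ~1840 transactions, as quoted
```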

In a system like Bitcoin where miners are expected to validate, a transaction
proof could consist of just a single merkle path showing that a single-use seal
was closed in some kind of TXO commitment - probably under 10KB of data. That
gives us a history proof less than 18.4MB in size, 99% of the time, and less
than 9.2MB in size 90% of the time.

An interesting outcome of this kind of design is that we can institutionalize
inflation fraud: the entire block reward can be replaced by miners rolling the
dice, attempting to create valid "fake" transactions. However, such a pure
implementation would put a floor on the lowest transaction fee possible, so
better to allow both transaction fee and subsidy collection at the same time.


# References

[^paypub] https://github.com/unsystem/paypub
[^timelock] https://github.com/petertodd/timelock
[^zkcp] https://bitcoincore.org/en/2016/02/26/zero-knowledge-contingent-payments-announcement/
[^rpow] https://cryptome.org/rpow.htm
--
https://petertodd.org 'peter'[:-1]@petertodd.org
Police Terror via bitcoin-dev
2016-06-20 13:26:22 UTC
Bitcoin could embed a lisp interpreter such as Scheme, reverse engineer
the current protocol into lisp (inside C++), run this alternative engine
alongside the current one as an option for some years (only for fine
tuning), then eventually phase in this lisp-written validation code in
place of the current one.

Scheme is small and minimal, and embeds easily in C++. This could be a
better option than the libconsensus library - validation in a functional
scripting language.

That doesn't mean people can't program the validation code in other
languages (maybe they'd want to optimize), but this code would be the
standard.

It's really good how you are thinking deeply how Bitcoin can be used,
and the implications of everything. Also there's a lot of magical utopic
thinking in Ethereum, which is transhumanist nonsense that is life
denying. Bitcoin really speaks to me because it is real and a great tool
following the UNIX principle.

I wouldn't be so quick to deride good engineering over systematic
provable systems for all domains. Bitcoin being written in C++ is not a
defect. It's actually a strong language for what it does. Especially
when used correctly (which is not often and takes years to master).

With the seals idea - am I understanding this correctly? Every transaction
has a number (essentially the index starting from 0 upwards) depending
on where it is in the blockchain.

Then there is an array (probably an on disk array mapping transaction
indexes to hashes). Each hash entry in the array must be unique (the
hashes) otherwise the transaction will be denied. This is a great idea
to solve transaction hash collisions and simple to implement.

Probabilistic validation is a good idea, although the real difficulty
now seems to be writing and indexing all the blockchain data for
lookups. And validation is disabled for most of the blocks. Pruning is
only a stop gap measure (which loses data) that doesn't solve the issue
of continually growing resource consumption. Hardware and implementation
can only mitigate this so much. If only there was a way to simplify the
underlying protocol to make it more resource efficient...
Post by Peter Todd via bitcoin-dev
In light of Ethereum's recent problems with its imperative, account-based,
programming model, I thought I'd do a quick writeup outlining the building
blocks of the state-machine approach to so-called "smart contract" systems, an
extension of Bitcoin's own design that I personally have been developing for a
number of years now as my Proofchains/Dex research work.
# Deterministic Code / Deterministic Expressions
We need to be able to run code on different computers and get identical
results; without this consensus is impossible and we might as well just use a
central authoritative database. Traditional languages and surrounding
frameworks make determinism difficult to achieve, as they tend to be filled
with undefined and underspecified behavior, ranging from signed integer
overflow in C/C++ to non-deterministic behavior in databases. While some
successful systems like Bitcoin are based on such languages, their success is
attributable to heroic efforts by their developers.
Deterministic expression systems such as Bitcoin's scripting system and the
author's Dex project improve on this by allowing expressions to be precisely
specified by hash digest, and executed against an environment with
deterministic results. In the case of Bitcoin's script, the expression is a
Forth-like stack-based program; in Dex the expression takes the form of a
lambda calculus expression.
## Proofs
So far the most common use for deterministic expressions is to specify
conditions upon which funds can be spent, as seen in Bitcoin (particularly
P2SH, and the upcoming Segwit). But we can generalize their use to precisely
defining consensus protocols in terms of state machines, with each state
defined in terms of a deterministic expression that must return true for the
state to have been reached. The data that causes a given expression to return
true is then a "proof", and that proof can be passed from one party to another
to prove desired states in the system have been reached.
An important implication of this model is that we need deterministic, and
efficient, serialization of proof data.
## Pruning
Often the evaluation of an expression against a proof doesn't require all all
data in the proof. For example, to prove to a lite client that a given block
contains a transaction, we only need the merkle path from the transaction to
the block header. Systems like Proofchains and Dex generalize this process -
called "pruning" - with built-in support to both keep track of what data is
accessed by what operations, as well as support in their underlying
serialization schemes for unneeded data to be elided and replaced by the hash
digest of the pruned data.
# Transactions
A common type of state machine is the transaction. A transaction history is a
directed acyclic graph of transactions, with one or more genesis transactions
having no inputs (ancestors), and one or more outputs, and zero or more
non-genesis transactions with one or more inputs, and zero or more outputs. The
edges of the graph connect inputs to outputs, with every input connected to
exactly one output. Outputs with an associated input are known as spent
outputs; outputs with out an associated input are unspent.
Outputs have conditions attached to them (e.g. a pubkey for which a valid
signature must be produced), and may also be associated with other values such
as "# of coins". We consider a transaction valid if we have a set of proofs,
one per input, that satisfy the conditions associated with each output.
Secondly, validity may also require additional constraints to be true, such as
requiring the coins spent to be >= the coins created on the outputs. Input
proofs also must uniquely commit to the transaction itself to be secure - if
they don't the proofs can be reused in a replay attack.
1. Any protocol-specific rules such as coins spent >= coins output are
followed.
2. For every input a valid proof exists.
3. Every input transaction is itself valid.
A practical implementation of the above for value-transfer systems like Bitcoin
could use two merkle-sum trees, one for the inputs, and one for the outputs,
with inputs simply committing to the previous transaction's txid and output #
(outpoint), and outputs committing to a scriptPubKey and output amount.
Witnesses can be provided separately, and would sign a signature committing to
the transaction or optionally, a subset of of inputs and/or outputs (with
merkle trees we can easily avoid the exponential signature validation problems
bitcoin currently has).
So long as all genesis transactions are unique, and our hash function is
secure, all transaction outputs can be uniquely identified (prior to BIP34 the
Bitcoin protocol actually failed at this!).
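The three validity rules above can be sketched in code. The following Python
is purely illustrative - the `Transaction`/`Output` structures and the
recursive `tx_valid` are assumptions for exposition, not Bitcoin's actual data
model:

```python
from dataclasses import dataclass

@dataclass
class Output:
    condition: object   # deterministic expression: proof -> bool
    amount: int         # "# of coins"

@dataclass
class Transaction:
    inputs: list        # (prev_tx, vout, proof) triples; empty for genesis
    outputs: list       # list of Output

def tx_valid(tx):
    """Rules 1-3: no inflation, a valid proof per input, valid input txs."""
    coins_in = 0
    for prev_tx, vout, proof in tx.inputs:
        out = prev_tx.outputs[vout]
        if not out.condition(proof):   # rule 2: proof satisfies the condition
            return False
        if not tx_valid(prev_tx):      # rule 3: input tx is itself valid
            return False
        coins_in += out.amount
    if tx.inputs and coins_in < sum(o.amount for o in tx.outputs):
        return False                   # rule 1: coins spent >= coins created
    return True
```

A real implementation would memoize validity and identify outputs by outpoint
rather than by object reference.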
## Proof Distribution
How does Alice convince Bob that she has done a transaction that puts the
system into the state that Bob wanted? The obvious answer is she gives Bob data
proving that the system is now in the desired state; in a transactional system
that proof is some or all of the transaction history. Systems like Bitcoin
provide a generic flood-fill messaging layer where all participants have the
opportunity to get a copy of all proofs in the system; however, we can also
implement more fine-grained solutions based on peer-to-peer message passing -
one could imagine Alice proving to Bob that she transferred title to her house
to him by giving him a series of proofs, much as property title transfer can
be demonstrated by providing the buyer with a series of deed documents (though
note the double-spend problem!).
# Uniqueness and Single-Use Seals
In addition to knowing that a given transaction history is valid, we also want
to know if it's unique. By that we mean that every spent output in the
transaction history is associated with exactly one input, and no other valid
spends exist; we want to ensure no output has been double-spent.
Bitcoin (and pretty much every other cryptocurrency like it) achieves this goal
by defining a method of achieving consensus over the set of all (valid)
transactions, and then defining that consensus as valid if and only if no
output is spent more than once.
A more general approach is to introduce the idea of a cryptographic Single-Use
Seal, analogous to the tamper-evidence single-use seals commonly used for
protecting goods during shipment and storage. Each individual seal is
associated with a globally unique identifier, and has two states, open and
closed. A secure seal can be closed exactly once, producing a proof that the
seal was closed.
All practical single-use seals will be associated with some kind of condition,
such as a pubkey, or deterministic expression, that needs to be satisfied for
the seal to be closed. Secondly, the contents of the proof will be able to
commit to new data, such as the transaction spending the output associated with
the seal.
Additionally some implementations of single-use seals may be able to also
generate a proof that a seal was _not_ closed as of a certain
time/block-height/etc.
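As a concrete (if toy) illustration of the abstraction - identifier,
authorization condition, close-exactly-once, proof committing to new data -
here's a minimal in-memory sketch; all names are hypothetical:

```python
class SingleUseSeal:
    def __init__(self, seal_id, condition):
        self.seal_id = seal_id      # globally unique identifier
        self.condition = condition  # e.g. signature check: (witness, data) -> bool
        self.close_proof = None     # can be set exactly once

    def close(self, witness, committed_data):
        """Close the seal over committed_data, producing a proof of closure;
        fails if the seal is already closed or authorization fails."""
        if self.close_proof is not None:
            raise ValueError("seal already closed")
        if not self.condition(witness, committed_data):
            raise ValueError("authorization failed")
        self.close_proof = (self.seal_id, witness, committed_data)
        return self.close_proof
```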
## Implementations
### Transactional Blockchains
A transaction output on a system like Bitcoin can be used as a single-use seal.
In this implementation, the outpoint (txid:vout #) is the seal's identifier,
the authorization mechanism is the scriptPubKey of the output, and the proof
is the transaction spending the output. The proof can commit to additional
data as needed in a variety of ways, such as an OP_RETURN output, or
unspendable output.
This implementation approach is resistant to miner censorship if the seal's
identifier isn't made public, and the protocol (optionally) allows for the
proof transaction to commit to the sealed contents with unspendable outputs;
unspendable outputs can't be distinguished from transactions that move funds.
### Unbounded Oracles
A trusted oracle P can maintain a set of closed seals, and produce signed
messages attesting to the fact that a seal was closed. Specifically, the seal
is identified by the tuple (P, q), with q being the per-seal authorization
expression that must be satisfied for the seal to be closed. The first time the
oracle is given a valid signature for the seal, it adds that signature and seal
ID to its closed seal set, and makes available a signed message attesting to
the fact that the seal has been closed. The proof is that message (and
possibly the signature, or a second message signed by it).
The oracle can publish the set of all closed seals for transparency/auditing
purposes. A good way to do this is to make a merkleized key:value set, with the
seal identifiers as keys and the proofs as values, and in turn create a
signed certificate transparency log of that set over time. Merkle paths from
this log can also serve as the closed-seal proof, and, for that matter, as
proof that a seal has not been closed.
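A toy version of such an oracle, with an HMAC standing in for the oracle's
signature (an assumption for brevity; a real oracle would use a public-key
signature so anyone can verify the attestation):

```python
import hashlib
import hmac

class UnboundedSealOracle:
    """Oracle P: seals are identified by (P, q); the first valid close wins."""
    def __init__(self, key: bytes):
        self._key = key
        self.closed = {}   # q -> (user signature, attestation)

    def close_seal(self, q: bytes, user_sig, check) -> bytes:
        if q not in self.closed:
            if not check(q, user_sig):   # per-seal authorization expression q
                raise ValueError("authorization failed")
            att = hmac.new(self._key, b"closed:" + q, hashlib.sha256).digest()
            self.closed[q] = (user_sig, att)
        return self.closed[q][1]         # repeated closes return the same proof
```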
### Bounded Oracles
The above has the problem of unbounded storage requirements as the closed seal
set grows without bound. We can fix that problem by requiring users of the
oracle to allocate seals in advance, analogous to the UTXO set in Bitcoin.
To allocate a seal the user provides the oracle P with the authorization
expression q. The oracle then generates a nonce n and adds (q,n) to the set of
unclosed seals, and tells the user that nonce. The seal is then uniquely
identified by (P, q, n).
To close a seal, the user provides the oracle with a valid signature over (P,
q, n). If the open seal set contains that seal, the seal is removed from the
set and the oracle provides the user with a signed message attesting to the
valid close.
A practical implementation would be to have the oracle publish a transparency
log, with each entry in the log committing to the set of all open seals with a
merkle set, as well as any seals closed during that entry. Again, merkle paths
for this log can serve as proofs of the open or closed state of a seal.
Note how with (U)TXO commitments, Bitcoin itself is a bounded oracle
implementation that can produce compact proofs.
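A sketch of the allocate/close flow, again with illustrative names; the nonce
n makes (P, q, n) unique even if the same expression q is reused:

```python
import os

class BoundedSealOracle:
    """Storage is bounded by the set of currently open seals."""
    def __init__(self):
        self.open_seals = set()   # {(q, n)}

    def allocate(self, q) -> bytes:
        n = os.urandom(8)         # oracle-chosen nonce
        self.open_seals.add((q, n))
        return n                  # seal identified by (P, q, n)

    def close(self, q, n, user_sig, check):
        if (q, n) not in self.open_seals:
            raise ValueError("seal unknown or already closed")
        if not check(q, n, user_sig):
            raise ValueError("authorization failed")
        self.open_seals.remove((q, n))
        return ("closed", q, n)   # stand-in for the oracle's signed attestation
```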
### Group Seals
Multiple seals can be combined into one, by having the open seal commit to a
set of sub-seals, and then closing the seal over a second set of closed seal
proofs. Seals that didn't need to be closed can be closed over a special
re-delegation message, re-delegating the seal to a new open seal.
Since the closed sub-seal proof can additionally include a proof of
authorization, we have a protocol where the entity with authorization to close
the master seal has the ability to DoS attack sub-seal owners, but not the
ability to fraudulently close the seals over contents of their choosing. This
may be useful in cases where actions on the master seal are expensive - such as
seals implemented on top of decentralized blockchains - by amortising the cost
over all sub-seals.
## Atomicity
Often protocols will require multiple seals to be closed for a transaction to
be valid. If a single entity controls all seals, this is no problem: the
transaction simply isn't valid until the last seal is closed.
However if multiple parties control the seals, one party could attack another
by failing to go through with the transaction after the victim has closed
their seal, leaving the victim with an invalid transaction that they can't
reverse.
### Use a single oracle
The oracle can additionally guarantee that a seal will be closed iff some other
set of seals are also closed; seals implemented with Bitcoin can provide this
guarantee. If the parties to a transaction aren't already all on the same
oracle, they can add an additional transaction reassigning their outputs to a
common oracle.
Equally, a temporary consensus between multiple mutually trusting oracles can
be created with a consensus protocol they share; this option doesn't need to
change the proof verification implementation.
### Two-phase Timeouts
If a proof to the fact that a seal is open can be generated, even under
adversarial conditions, we can make the seal protocol allow a close to be
undone after a timeout if evidence can be provided that the other seal(s) were
not also closed (in the specified way).
Depending on the implementation - especially in decentralized systems - the
next time the seal is closed, the proof it has been closed may in turn provide
proof that a previous close was in fact invalid.
# Proof-of-Publication and Proof-of-Non-Publication
Often we need to be able to prove that a specified audience was able to receive
a specific message. For example, the author's PayPub protocol[^paypub],
Todd/Taaki's timelock encryption protocol[^timelock], Zero-Knowledge Contingent
Payments[^zkcp], and Lightning, among others, work by requiring a secret key to
be published publicly in the Bitcoin blockchain as a condition of collecting a
payment. At a much smaller scale - in terms of audience - in certain FinTech
applications for regulated environments a transaction may be considered invalid
unless it was provably published to a regulatory agency. Another example is
Certificate Transparency, where we consider an SSL certificate to be invalid
unless it has been provably published to a transparency log maintained by a
third-party.
Secondly, many proof-of-publication schemes can also prove that a message was
_not_ published to a specific audience. With this type of proof, single-use
seals can be implemented by having the proof consist of proof that a specified
message was not published between the time the seal was created and the time
it was closed (a proof-of-publication of the message).
## Implementations
### Decentralized Blockchains
Here the audience is all participants in the system. However miner censorship
can be a problem, and compact proofs of non-publication aren't yet available
(requires (U)TXO commitments).
The author's treechains proposal is a particularly generic and scalable
implementation, with the ability to make trade-offs between the size of
audience (security) and publication cost.
### Centralized Public Logs
Certificate Transparency works this way, with trusted (but auditable) logs run
by well known parties acting as the publication medium, who promise to allow
anyone to obtain copies of the logs.
The logs themselves may be indexed in a variety of ways; CT simply indexes logs
by time, however more efficient schemes are possible by having the operator
commit to a key:value mapping of "topics", to allow publication (and
non-publication) proofs to be created for specified topics or topic prefixes.
Auditing the logs is done by verifying that queries to the state of the log
return the same state at the same time for different requesters.
### Receipt Oracles
Finally, publication can be proven with a receipt proof from the oracle,
attesting to the fact that the oracle has successfully received the message.
This is particularly appropriate in cases where the required audience is the
oracle itself, as in the FinTech regulator case.
# Validity Oracles
As transaction histories grow longer, they may become impractical to move from
one party to another. Validity oracles can solve this problem by attesting to
the validity of transactions, allowing history prior to the attested
transactions to be discarded.
A particularly generic validity oracle can be created using deterministic
expressions systems. The user gives the oracle an expression, and the oracle
returns a signed message attesting to the validity of the expression.
Optionally, the expression may be incomplete, with parts of the expression
replaced by previously generated attestations. For example, an expression that
returns true if a transaction is valid could in turn depend on the previous
transaction also being valid - a recursive call of itself - and that recursive
call can be proven with a prior attestation.
## Implementations
### Proof-of-Work Decentralized Consensus
Miners in decentralized consensus systems act as a type of validity oracle, in
that the economic incentives in the system are (supposed to be) designed to
encourage only the mining of valid blocks; a user who trusts the majority of
hashing power can trust that any transaction with a valid merkle path to a
block header in the most-work chain is valid. Existing decentralized consensus
systems like Bitcoin and Ethereum conflate the roles of validity oracle and
single-use seal/anti-replay oracle, however in principle that need not be true.
### Trusted Oracles
As the name suggests. Remote-attestation-capable trusted hardware is a
particularly powerful implementation - a conspiracy theory is that the reason
why essentially zero secure true remote attestation implementations exist is
because they'd immediately make untraceable digital currency systems easy to
implement (Finney's RPOW[^rpow] is a rare counter-example).
Note how a single-use seal oracle that supports a generic deterministic
expressions scheme for seal authorization can be easily extended to provide a
validity oracle service as well. The auditing mechanisms for a single-use seal
oracle can also be applied to validity oracles.
# Fraud Proofs
Protocols specified with deterministic expressions can easily generate "fraud
proofs", showing that claimed states/proofs in the system are actually invalid.
Additionally many protocols can be specified with expressions of k*log2(n)
depth, allowing these fraud proofs to be compact.
A simple example is proving fraud in a merkle-sum tree, where the validity of
a node is defined by the following expression:
(defun valid? (node)
(or (== node.type leaf)
(and (== node.sum (+ node.left.sum node.right.sum))
(and (valid? node.left)
(valid? node.right)))))
To prove the above expression evaluates to true, we'll need the entire contents
of the tree. However, to prove that it evaluates to false, we only need a
subset of the tree: proving that an and expression evaluates to false only
requires one side, so the fraud proof needs just log2(n) data. Secondly, with
pruning, the deterministic expressions evaluator can automatically keep track
of exactly what data was needed to prove that result, and prune all other data
when serializing the proof.
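A Python rendering of the check above (the `Node` layout is an assumption),
returning a path to the first bad node, so that a fraud proof can be pruned to
one branch per level:

```python
from dataclasses import dataclass

@dataclass
class Node:
    sum: int
    left: "Node" = None
    right: "Node" = None   # leaves have no children

def check(node, path=()):
    """Return (valid, fraud_path); fraud_path locates the offending node."""
    if node.left is None:                          # leaf: nothing to check
        return True, None
    if node.sum != node.left.sum + node.right.sum:
        return False, path                         # one bad node suffices
    ok, p = check(node.left, path + ("L",))
    if not ok:
        return False, p
    return check(node.right, path + ("R",))
```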
## Validity Challenges
How do you guarantee it will be possible to prove fraud in the first place?
If pruning is allowed, you may simply not have access to the data proving
fraud - an especially severe problem in transactional systems, where a single
fraudulent transaction can counterfeit arbitrary amounts of value out of thin
air.
A possible approach is the validity challenge: a subset of proof data, with
part of the data marked as "potentially fraudulent". The challenge can be
satisfied by providing the marked data and showing that the proof in question
is in fact valid; if the challenge is unmet, participants in the system can
choose to take action, such as refusing to accept additional transactions.
Of course, this raises a whole host of so-far unsolved issues, such as DoS
attacks and lost data.
# Probabilistic Validation
Protocols that can tolerate some fraud can make use of probabilistic
verification techniques to prove that the percentage of undetected fraud within
the system is less than a certain amount, with a specified probability.
A common way to do this is the Fiat-Shamir transform, which repeatedly samples
a data structure deterministically, using the data's own hash digest as a seed
for a PRNG. Let's apply this technique to our merkle-sum tree example. We'll
define an expression that checks a single randomly chosen path through the
tree, weighted by sum:
(defun prefix-valid? (node nonce)
    (or (== node.type leaf)
        (and (and (== node.sum (+ node.left.sum node.right.sum))
                  (> node.sum 0)) ; mod by 0 is invalid, just like division by zero
                                  ; also could guarantee this with a type system
             (if (< node.left.sum (mod nonce node.sum))
                 (prefix-valid? node.right (hash nonce))
                 (prefix-valid? node.left (hash nonce))))))
Now we can combine multiple invocations of the above, in this case 256
invocations seeded by the node's digest:
(defun prob-valid? (node)
(and (and (and .... (prefix-valid? node (digest (cons (digest node) 0)))
(and (and ....
(prefix-valid? node (digest (cons (digest node) 255)))
As an exercise for a reader: generalize the above with a macro, or a suitable
types/generics system.
If we assume our attacker can grind up to 128 bits, that leaves us with 128
random samples that they can't control. If the (value-weighted) probability
that a given node is fraudulent is q, then the chance of the attacker getting
away with fraud is (1-q)^128 - for q=5% that works out to roughly 0.1%.
(Note that the above analysis isn't particularly well done - do a better
analysis before implementing this in production!)
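A sketch of the sampling in Python, reusing a simple sum-tree node; sha256 as
the PRNG and the node layout are assumptions of this sketch:

```python
import hashlib
from dataclasses import dataclass

@dataclass
class Node:
    sum: int
    left: "Node" = None
    right: "Node" = None   # leaves have no children

def h(data: bytes) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def prefix_valid(node, nonce: int) -> bool:
    """Check one sum-weighted random path from node down to a leaf."""
    if node.left is None:
        return True
    if node.sum <= 0:                  # mod by zero/negative is invalid
        return False
    if node.sum != node.left.sum + node.right.sum:
        return False
    # descend with probability proportional to each child's sum
    child = node.left if (nonce % node.sum) < node.left.sum else node.right
    return prefix_valid(child, h(nonce.to_bytes(32, "big")))

def prob_valid(node, samples: int = 256) -> bool:
    seed = str(node).encode()          # stand-in for the node's hash digest
    return all(prefix_valid(node, h(seed + i.to_bytes(2, "big")))
               for i in range(samples))
```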
## Random Beacons and Transaction History Linearization
The Fiat-Shamir transform requires a significant number of samples to defeat
grinding attacks; if we have a random beacon available we can significantly
reduce the size of our probabilistic proofs. PoW blockchains can themselves act
as random beacons, as it is provably expensive for miners to manipulate the
hash digests of blocks they produce - to do so requires discarding otherwise
valid blocks.
An example where this capability is essential is the author's transaction
history linearization technique. In value transfer systems such as Bitcoin, the
history of any given coin grows quasi-exponentially as coins are mixed across
the entire economy. We can linearize the growth of history proofs by redefining
coin validity to be probabilistic.
Suppose we have a transaction with n inputs. Of those inputs, the total value
of real inputs is p, and the total claimed value of fake inputs is q. The
transaction commits to all inputs in a merkle sum tree, and we define the
transaction as valid if a randomly chosen input - weighted by value - can
itself be proven valid. Finally, we assume that creating a genuine input is an
irrevocable action which irrevocably commits to the set of all inputs, real and
fake.
If all inputs are real, 100% of the time the transaction will be valid; if all
inputs are fake, 100% of the time the transaction will be invalid. In the case
where some inputs are real and some are fake, the probability that the fraud
will be detected is:
    q / (q + p)
The expected value of the fake inputs is then the sum of the potential upside -
the fraud goes undetected - and the potential downside - the fraud is detected
and the real inputs are lost:
    E = q(1 - q/(q + p)) - p(q/(q + p))
      = q(p/(q + p)) - p(q/(q + p))
      = (qp - pq)/(q + p)
      = 0
Thus so long as the random beacon is truly unpredictable, there's no economic
advantage to creating fake inputs, and it is sufficient for validity to only
require one input to be proven, giving us O(n) scaling for transaction history
proofs.
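The zero-expectation derivation above can be checked numerically - E is
identically zero for any positive real value p and fake value q:

```python
def expected_fake_value(p: float, q: float) -> float:
    """E = q*(1 - q/(q+p)) - p*(q/(q+p)): the upside of undetected fraud
    minus the downside of losing the real inputs p when caught."""
    detect = q / (q + p)   # chance the sum-weighted sample hits a fake input
    return q * (1 - detect) - p * detect

for p, q in [(1.0, 1.0), (10.0, 0.5), (3.0, 7.0)]:
    assert abs(expected_fake_value(p, q)) < 1e-12
```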
### Inflationary O(1) History Proofs
We can further improve our transaction history proof scalability by taking
advantage of inflation. We do this by occasionally allowing a transaction proof
to be considered valid without validating _any_ of the inputs; every time a
transaction is allowed without proving any inputs the size of the transaction
history proof is reset. Of course, this can be a source of inflation, but
provided the probability of this happening can be limited we can limit the
maximum rate of inflation to the chosen value.
For example, in Bitcoin as of writing every block inflates the currency supply
by 25BTC, and contains a maximum of 1MB of transaction data - 0.025BTC/KB. If
we check the prior input proof with probability p, then the expected value of a
fake transaction claiming to spend x coins is:
    E = x(1-p)
Limiting that to the inflation allowed for a proof of size l at rate R gives:
    lR = x(1-p)
    p = 1 - lR/x
For example, for a 1KB transaction proof claiming to spend 10BTC we can omit
checking the input 0.25% of the time without allowing more monetary inflation
than the block reward already does. Secondly, this means that after n
transactions, the probability that proof shortening will _not_ happen is p^n,
which reaches 1% after 1840 transactions.
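Checking the arithmetic, with R, l, and x as given above:

```python
import math

R = 0.025   # BTC of allowed inflation per KB (25 BTC / 1000 KB blocks)
l = 1.0     # proof size in KB
x = 10.0    # claimed value in BTC

p = 1 - l * R / x                 # check-probability capping inflation at R
assert abs(p - 0.9975) < 1e-12    # i.e. skip the input check 0.25% of the time

# transactions until the chance of *no* proof-shortening drops to 1%
n = math.log(0.01) / math.log(p)
assert 1835 < n < 1845            # ~1840 transactions
```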
In a system like Bitcoin where miners are expected to validate, a transaction
proof could consist of just a single merkle path showing that a single-use seal
was closed in some kind of TXO commitment - probably under 10KB of data. That
gives us a history proof less than 18.4MB in size, 99% of the time, and less
than 9.2MB in size 90% of the time.
An interesting outcome of this kind of design is that we can institutionalize
inflation fraud: the entire block reward can be replaced by miners rolling the
dice, attempting to create valid "fake" transactions. However, such a pure
implementation would put a floor on the lowest transaction fee possible, so
better to allow both transaction fee and subsidy collection at the same time.
# References
[^paypub] https://github.com/unsystem/paypub
[^timelock] https://github.com/petertodd/timelock
[^zkcp] https://bitcoincore.org/en/2016/02/26/zero-knowledge-contingent-payments-announcement/
[^rpow] https://cryptome.org/rpow.htm
_______________________________________________
bitcoin-dev mailing list
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
zaki--- via bitcoin-dev
2016-06-20 16:21:39 UTC
Permalink
Hi Peter,

I didn't entirely understand the process of transaction linearization.

What I see is a potential process where, when the miner assembles the block,
he strips all but one sigscript per tx. The selection of which sigscript
is retained is determined by the random oracle. Is this the primary benefit
you are suggesting?

It appears to me that blocks still need to contain a list of full TX Input
and Tx Outputs with your approach. Some of the description seems to
indicate that there are opportunities to elide further data but it's
unclear to me how.

On Mon, Jun 20, 2016 at 7:14 AM Police Terror via bitcoin-dev <
Post by Police Terror via bitcoin-dev
Bitcoin could embed a lisp interpreter such as Scheme, reverse engineer
the current protocol into lisp (inside C++), run this alternative engine
alongside the current one as an option for some years (only for fine
tuning), then eventually adopt this lisp-written validation code in place
of the current one.
Scheme is small and minimal, and embeds easily in C++. This could be a
better option than the libconsensus library - validation in a functional
scripting language.
That doesn't mean people can't program the validation code in other
languages (maybe they'd want to optimize), but this code would be the
standard.
It's really good how you are thinking deeply how Bitcoin can be used,
and the implications of everything. Also there's a lot of magical utopic
thinking in Ethereum, which is transhumanist nonsense that is life
denying. Bitcoin really speaks to me because it is real and a great tool
following the UNIX principle.
I wouldn't be so quick to deride good engineering over systematic
provable systems for all domains. Bitcoin being written in C++ is not a
defect. It's actually a strong language for what it does. Especially
when used correctly (which is not often and takes years to master).
With the seals idea - am I understanding this correctly?: Every transaction
has a number (essentially the index starting from 0 upwards) depending
on where it is in the blockchain.
Then there is an array (probably an on-disk array mapping transaction
indexes to hashes). Each hash entry in the array must be unique,
otherwise the transaction will be denied. This is a great idea
to solve transaction hash collisions and simple to implement.
Probabilistic validation is a good idea, although the real difficulty
now seems to be writing and indexing all the blockchain data for
lookups. And validation is disabled for most of the blocks. Pruning is
only a stop gap measure (which loses data) that doesn't solve the issue
of continually growing resource consumption. Hardware and implementation
can only mitigate this so much. If only there was a way to simplify the
underlying protocol to make it more resource efficient...
Post by Peter Todd via bitcoin-dev
In light of Ethereum's recent problems with its imperative,
account-based,
Post by Peter Todd via bitcoin-dev
programming model, I thought I'd do a quick writeup outlining the
building
Post by Peter Todd via bitcoin-dev
blocks of the state-machine approach to so-called "smart contract"
systems, an
Post by Peter Todd via bitcoin-dev
extension of Bitcoin's own design that I personally have been developing
for a
Post by Peter Todd via bitcoin-dev
number of years now as my Proofchains/Dex research work.
# Deterministic Code / Deterministic Expressions
We need to be able to run code on different computers and get identical
results; without this consensus is impossible and we might as well just
use a
Post by Peter Todd via bitcoin-dev
central authoritative database. Traditional languages and surrounding
frameworks make determinism difficult to achieve, as they tend to be
filled
Post by Peter Todd via bitcoin-dev
with undefined and underspecified behavior, ranging from signed integer
overflow in C/C++ to non-deterministic behavior in databases. While some
successful systems like Bitcoin are based on such languages, their
success is
Post by Peter Todd via bitcoin-dev
attributable to heroic efforts by their developers.
Deterministic expression systems such as Bitcoin's scripting system and
the
Post by Peter Todd via bitcoin-dev
author's Dex project improve on this by allowing expressions to be
precisely
Post by Peter Todd via bitcoin-dev
specified by hash digest, and executed against an environment with
deterministic results. In the case of Bitcoin's script, the expression
is a
Post by Peter Todd via bitcoin-dev
Forth-like stack-based program; in Dex the expression takes the form of a
lambda calculus expression.
## Proofs
So far the most common use for deterministic expressions is to specify
conditions upon which funds can be spent, as seen in Bitcoin
(particularly
Post by Peter Todd via bitcoin-dev
P2SH, and the upcoming Segwit). But we can generalize their use to
precisely
Post by Peter Todd via bitcoin-dev
defining consensus protocols in terms of state machines, with each state
defined in terms of a deterministic expression that must return true for
the
Post by Peter Todd via bitcoin-dev
state to have been reached. The data that causes a given expression to
return
Post by Peter Todd via bitcoin-dev
true is then a "proof", and that proof can be passed from one party to
another
Post by Peter Todd via bitcoin-dev
to prove desired states in the system have been reached.
An important implication of this model is that we need deterministic, and
efficient, serialization of proof data.
## Pruning
Often the evaluation of an expression against a proof doesn't require
all all
Post by Peter Todd via bitcoin-dev
data in the proof. For example, to prove to a lite client that a given
block
Post by Peter Todd via bitcoin-dev
contains a transaction, we only need the merkle path from the
transaction to
Post by Peter Todd via bitcoin-dev
the block header. Systems like Proofchains and Dex generalize this
process -
Post by Peter Todd via bitcoin-dev
called "pruning" - with built-in support to both keep track of what data
is
Post by Peter Todd via bitcoin-dev
accessed by what operations, as well as support in their underlying
serialization schemes for unneeded data to be elided and replaced by the
hash
Post by Peter Todd via bitcoin-dev
digest of the pruned data.
# Transactions
A common type of state machine is the transaction. A transaction history
is a
Post by Peter Todd via bitcoin-dev
directed acyclic graph of transactions, with one or more genesis
transactions
Post by Peter Todd via bitcoin-dev
having no inputs (ancestors), and one or more outputs, and zero or more
non-genesis transactions with one or more inputs, and zero or more
outputs. The
Post by Peter Todd via bitcoin-dev
edges of the graph connect inputs to outputs, with every input connected
to
Post by Peter Todd via bitcoin-dev
exactly one output. Outputs with an associated input are known as spent
outputs; outputs with out an associated input are unspent.
Outputs have conditions attached to them (e.g. a pubkey for which a valid
signature must be produced), and may also be associated with other
values such
Post by Peter Todd via bitcoin-dev
as "# of coins". We consider a transaction valid if we have a set of
proofs,
Post by Peter Todd via bitcoin-dev
one per input, that satisfy the conditions associated with each output.
Secondly, validity may also require additional constraints to be true,
such as
Post by Peter Todd via bitcoin-dev
requiring the coins spent to be >= the coins created on the outputs.
Input
Post by Peter Todd via bitcoin-dev
proofs also must uniquely commit to the transaction itself to be secure
- if
Post by Peter Todd via bitcoin-dev
they don't the proofs can be reused in a replay attack.
1. Any protocol-specific rules such as coins spent >= coins output are
followed.
2. For every input a valid proof exists.
3. Every input transaction is itself valid.
A practical implementation of the above for value-transfer systems like
Bitcoin
Post by Peter Todd via bitcoin-dev
could use two merkle-sum trees, one for the inputs, and one for the
outputs,
Post by Peter Todd via bitcoin-dev
with inputs simply committing to the previous transaction's txid and
output #
Post by Peter Todd via bitcoin-dev
(outpoint), and outputs committing to a scriptPubKey and output amount.
Witnesses can be provided separately, and would sign a signature
committing to
Post by Peter Todd via bitcoin-dev
the transaction or optionally, a subset of of inputs and/or outputs (with
merkle trees we can easily avoid the exponential signature validation
problems
Post by Peter Todd via bitcoin-dev
bitcoin currently has).
As so long as all genesis transactions are unique, and our hash function
is
Post by Peter Todd via bitcoin-dev
secure, all transaction outputs can be uniquely identified (prior to
BIP34 the
Post by Peter Todd via bitcoin-dev
Bitcoin protocol actually failed at this!).
## Proof Distribution
How does Alice convince Bob that she has done a transaction that puts the
system into the state that Bob wanted? The obvious answer is she gives
Bob data
Post by Peter Todd via bitcoin-dev
proving that the system is now in the desired state; in a transactional
system
Post by Peter Todd via bitcoin-dev
that proof is some or all of the transaction history. Systems like
Bitcoin
Post by Peter Todd via bitcoin-dev
provide a generic flood-fill messaging layer where all participants have
the
Post by Peter Todd via bitcoin-dev
opportunity to get a copy of all proofs in the system, however we can
also
Post by Peter Todd via bitcoin-dev
implement more fine grained solutions based on peer-to-peer message
passing -
Post by Peter Todd via bitcoin-dev
one could imagine Alice proving to Bob that she transferred title to her
house
Post by Peter Todd via bitcoin-dev
to him by giving him a series of proofs, not unlike the same way that
property
Post by Peter Todd via bitcoin-dev
title transfer can be demonstrated by providing the buyer with a series
of deed
Post by Peter Todd via bitcoin-dev
documents (though note the double-spend problem!).
# Uniqueness and Single-Use Seals
In addition to knowing that a given transaction history is valid, we
also want
Post by Peter Todd via bitcoin-dev
to know if it's unique. By that we mean that every spent output in the
transaction history is associated with exactly one input, and no other
valid
Post by Peter Todd via bitcoin-dev
spends exist; we want to ensure no output has been double-spent.
Bitcoin (and pretty much every other cryptocurrency like it) achieves
this goal
Post by Peter Todd via bitcoin-dev
by defining a method of achieving consensus over the set of all (valid)
transactions, and then defining that consensus as valid if and only if no
output is spent more than once.
A more general approach is to introduce the idea of a cryptographic
Single-Use
Post by Peter Todd via bitcoin-dev
Seal, analogous to the tamper-evidence single-use seals commonly used for
protecting goods during shipment and storage. Each individual seals is
associated with a globally unique identifier, and has two states, open
and
Post by Peter Todd via bitcoin-dev
closed. A secure seal can be closed exactly once, producing a proof that
the
Post by Peter Todd via bitcoin-dev
seal was closed.
All practical single-use seals will be associated with some kind of
condition,
Post by Peter Todd via bitcoin-dev
such as a pubkey, or deterministic expression, that needs to be
satisfied for
Post by Peter Todd via bitcoin-dev
the seal to be closed. Secondly, the contents of the proof will be able
to
Post by Peter Todd via bitcoin-dev
commit to new data, such as the transaction spending the output
associated with
Post by Peter Todd via bitcoin-dev
the seal.
Additionally some implementations of single-use seals may be able to also
generate a proof that a seal was _not_ closed as of a certain
time/block-height/etc.
## Implementations
### Transactional Blockchains
A transaction output on a system like Bitcoin can be used as a single-use
seal. In this implementation, the outpoint (txid:vout #) is the seal's
identifier, the authorization mechanism is the scriptPubKey of the output,
and the proof is the transaction spending the output. The proof can commit
to additional data as needed in a variety of ways, such as an OP_RETURN
output, or an unspendable output.
This implementation approach is resistant to miner censorship if the seal's
identifier isn't made public, and the protocol (optionally) allows for the
proof transaction to commit to the sealed contents with unspendable
outputs; unspendable outputs can't be distinguished from transactions that
move funds.
### Unbounded Oracles
A trusted oracle P can maintain a set of closed seals, and produce signed
messages attesting to the fact that a seal was closed. Specifically, the
seal is identified by the tuple (P, q), with q being the per-seal
authorization expression that must be satisfied for the seal to be closed.
The first time the oracle is given a valid signature for the seal, it adds
that signature and seal ID to its closed seal set, and makes available a
signed message attesting to the fact that the seal has been closed. The
proof is that message (and possibly the signature, or a second message
signed by it).
The oracle can publish the set of all closed seals for
transparency/auditing purposes. A good way to do this is to make a
merkelized key:value set, with the seal identifiers as keys and the proofs
as values, and in turn create a signed certificate transparency log of that
set over time. Merkle paths from this log can also serve as the closed seal
proof, and for that matter, as proof of the fact that a seal has not been
closed.
### Bounded Oracles
The above has the problem of unbounded storage requirements, as the closed
seal set grows without bound. We can fix that problem by requiring users of
the oracle to allocate seals in advance, analogous to the UTXO set in
Bitcoin.
To allocate a seal the user provides the oracle P with the authorization
expression q. The oracle then generates a nonce n, adds (q, n) to the set
of unclosed seals, and tells the user that nonce. The seal is then uniquely
identified by (P, q, n).
To close a seal, the user provides the oracle with a valid signature over
(P, q, n). If the open seal set contains that seal, the seal is removed
from the set and the oracle provides the user with a signed message
attesting to the valid close.
A practical implementation would be to have the oracle publish a
transparency log, with each entry in the log committing to the set of all
open seals with a merkle set, as well as any seals closed during that
entry. Again, merkle paths for this log can serve as proofs of the open or
closed state of a seal.
Note how with (U)TXO commitments, Bitcoin itself is a bounded oracle
implementation that can produce compact proofs.
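A toy model of the allocate/close protocol above, in Python. This is a
sketch under stated assumptions: a hash-preimage check stands in for
signature verification against q, and the returned dict stands in for a
signed attestation; a real oracle would also publish the transparency log
described above.

```python
import os
import hashlib

class BoundedOracle:
    """Users allocate seals in advance, so storage is bounded by the set
    of currently open seals - analogous to Bitcoin's UTXO set."""

    def __init__(self):
        self.open_seals = set()  # {(q, n)}: only open seals are stored

    def allocate(self, q: str) -> str:
        n = os.urandom(16).hex()        # fresh per-seal nonce
        self.open_seals.add((q, n))
        return n                        # seal is identified by (P, q, n)

    def close(self, q: str, n: str, witness: bytes):
        # Stand-in authorization check: witness must hash to q.
        if (q, n) not in self.open_seals:
            return None                 # unknown, or already closed
        if hashlib.sha256(witness).hexdigest() != q:
            return None                 # authorization failed
        self.open_seals.remove((q, n))  # the set shrinks again
        return {"closed": (q, n)}       # stand-in for a signed message

oracle = BoundedOracle()
q = hashlib.sha256(b"authorized witness").hexdigest()
n = oracle.allocate(q)
attestation = oracle.close(q, n, b"authorized witness")
```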
### Group Seals
Multiple seals can be combined into one by having the open seal commit to a
set of sub-seals, and then closing the seal over a second set of closed
seal proofs. Seals that didn't need to be closed can be closed over a
special re-delegation message, re-delegating the seal to a new open seal.
Since the closed sub-seal proof can additionally include a proof of
authorization, we have a protocol where the entity with authorization to
close the master seal has the ability to DoS attack sub-seal owners, but
not the ability to fraudulently close the seals over contents of their
choosing. This may be useful in cases where actions on the master seal are
expensive - such as seals implemented on top of decentralized blockchains -
by amortising the cost over all sub-seals.
## Atomicity
Often protocols will require multiple seals to be closed for a transaction
to be valid. If a single entity controls all seals, this is no problem: the
transaction simply isn't valid until the last seal is closed.
However if multiple parties control the seals, a party could attack another
party by failing to go through with the transaction after another party has
closed their seal, leaving the victim with an invalid transaction that they
can't reverse.
### Use a single oracle
The oracle can additionally guarantee that a seal will be closed iff some
other set of seals are also closed; seals implemented with Bitcoin can
provide this guarantee. If the parties to a transaction aren't already all
on the same oracle, they can add an additional transaction reassigning
their outputs to a common oracle.
Equally, a temporary consensus between multiple mutually trusting oracles
can be created with a consensus protocol they share; this option doesn't
need to change the proof verification implementation.
### Two-phase Timeouts
If a proof of the fact that a seal is open can be generated, even under
adversarial conditions, we can make the seal protocol allow a close to be
undone after a timeout if evidence can be provided that the other seal(s)
were not also closed (in the specified way).
Depending on the implementation - especially in decentralized systems - the
next time the seal is closed, the proof it has been closed may in turn
provide proof that a previous close was in fact invalid.
# Proof-of-Publication and Proof-of-Non-Publication
Often we need to be able to prove that a specified audience was able to
receive a specific message. For example, the author's PayPub
protocol[^paypub], Todd/Taaki's timelock encryption protocol[^timelock],
Zero-Knowledge Contingent Payments[^zkcp], and Lightning, among others,
work by requiring a secret key to be published publicly in the Bitcoin
blockchain as a condition of collecting a payment. At a much smaller scale
- in terms of audience - in certain FinTech applications for regulated
environments a transaction may be considered invalid unless it was provably
published to a regulatory agency. Another example is Certificate
Transparency, where we consider an SSL certificate to be invalid unless it
has been provably published to a transparency log maintained by a third
party.
Secondly, many proof-of-publication schemes can also prove that a message
was _not_ published to a specific audience. With this type of proof
single-use seals can be implemented, by having the proof consist of proof
that a specified message was not published between the time the seal was
created and the time it was closed (a proof-of-publication of the message).
## Implementations
### Decentralized Blockchains
Here the audience is all participants in the system. However miner
censorship can be a problem, and compact proofs of non-publication aren't
yet available (this requires (U)TXO commitments).
The author's treechains proposal is a particularly generic and scalable
implementation, with the ability to make trade-offs between the size of the
audience (security) and publication cost.
### Centralized Public Logs
Certificate Transparency works this way, with trusted (but auditable) logs
run by well-known parties acting as the publication medium, who promise to
allow anyone to obtain copies of the logs.
The logs themselves may be indexed in a variety of ways; CT simply indexes
logs by time, however more efficient schemes are possible by having the
operator commit to a key:value mapping of "topics", to allow publication
(and non-publication) proofs to be created for specified topics or topic
prefixes. Auditing the logs is done by verifying that queries to the state
of the log return the same state at the same time for different requesters.
### Receipt Oracles
Finally, publication can be proven by a receipt proof from the oracle,
attesting to the fact that the oracle has successfully received the
message. This is particularly appropriate in cases where the required
audience is the oracle itself, as in the FinTech regulator case.
# Validity Oracles
As transaction histories grow longer, they may become impractical to move
from one party to another. Validity oracles can solve this problem by
attesting to the validity of transactions, allowing history prior to the
attested transactions to be discarded.
A particularly generic validity oracle can be created using deterministic
expression systems. The user gives the oracle an expression, and the oracle
returns a signed message attesting to the validity of the expression.
Optionally, the expression may be incomplete, with parts of the expression
replaced by previously generated attestations. For example, an expression
that returns true if a transaction is valid could in turn depend on the
previous transaction also being valid - a recursive call of itself - and
that recursive call can be proven with a prior attestation.
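A sketch of that recursive-attestation idea in Python, with HMAC standing
in for the oracle's signature (an assumption of this sketch, as are the
field names): a transaction's validity check consumes a prior attestation
instead of re-walking the whole history.

```python
import hashlib
import hmac

ORACLE_KEY = b"oracle-secret"   # stand-in for the oracle's signing key

def attest(claim: bytes) -> bytes:
    """Signed message attesting that `claim` evaluated to true
    (HMAC stands in for a real signature in this sketch)."""
    return hmac.new(ORACLE_KEY, claim, hashlib.sha256).digest()

def check_attestation(claim: bytes, tag: bytes) -> bool:
    return hmac.compare_digest(attest(claim), tag)

def tx_valid(tx, prev_attestation=None):
    """A transaction is valid if it is well-formed and its ancestor
    history is valid; a prior attestation replaces the recursive call."""
    well_formed = tx["out"] <= tx["in"]
    if prev_attestation is not None:
        ancestors_ok = check_attestation(tx["prev"], prev_attestation)
    else:
        ancestors_ok = tx["prev"] is None   # genesis: nothing before it
    return well_formed and ancestors_ok

# The oracle attests tx1; tx2's validity check uses that attestation
# instead of re-checking tx1's whole history.
tx1 = {"in": 10, "out": 10, "prev": None}
tag1 = attest(b"tx1-valid")
tx2 = {"in": 10, "out": 9, "prev": b"tx1-valid"}
```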
## Implementations
### Proof-of-Work Decentralized Consensus
Miners in decentralized consensus systems act as a type of validity oracle,
in that the economic incentives in the system are (supposed to be) designed
to encourage only the mining of valid blocks; a user who trusts the
majority of hashing power can trust that any transaction with a valid
merkle path to a block header in the most-work chain is valid. Existing
decentralized consensus systems like Bitcoin and Ethereum conflate the
roles of validity oracle and single-use seal/anti-replay oracle; however,
in principle that need not be true.
### Trusted Oracles
As the name suggests. Remote-attestation-capable trusted hardware is a
particularly powerful implementation - a conspiracy theory is that the
reason why essentially zero secure true remote attestation implementations
exist is that they'd immediately make untraceable digital currency systems
easy to implement (Finney's RPOW[^rpow] is a rare counter-example).
Note how a single-use seal oracle that supports a generic deterministic
expressions scheme for seal authorization can be easily extended to provide
a validity oracle service as well. The auditing mechanisms for a single-use
seal oracle can also be applied to validity oracles.
# Fraud Proofs
Protocols specified with deterministic expressions can easily generate
"fraud proofs", showing that claimed states/proofs in the system are
actually invalid. Additionally many protocols can be specified with
expressions of k*log2(n) depth, allowing these fraud proofs to be compact.
A simple example is proving fraud in a merkle-sum tree, where the validity
expression is:

    (defun valid? (node)
        (or (== node.type leaf)
            (and (== node.sum (+ node.left.sum node.right.sum))
                 (and (valid? node.left)
                      (valid? node.right)))))
To prove the above expression evaluates to true, we'll need the entire
contents of the tree. However, to prove that it evaluates to false, we only
need a subset of the tree: proving an and expression false only requires
one side of it, which takes log2(n) data. Secondly, with pruning, the
deterministic expressions evaluator can automatically keep track of exactly
what data was needed to prove that result, and prune all other data when
serializing the proof.
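The asymmetry between proving validity and proving fraud can be
illustrated in Python. This is a hypothetical sketch, not Dex's pruning
evaluator: the fraud witness for a bad inner node is just that node and
its children's sums, so a verifier needs only the O(log n) path to it,
not the whole tree.

```python
class Leaf:
    def __init__(self, value):
        self.sum = value

class Node:
    def __init__(self, left, right, claimed_sum):
        self.left, self.right, self.sum = left, right, claimed_sum

def find_fraud(node):
    """Return a node whose claimed sum != sum of its children, or None.
    The returned node (with its children's sums) is the fraud proof:
    checking it needs none of the rest of the tree."""
    if isinstance(node, Leaf):
        return None
    if node.sum != node.left.sum + node.right.sum:
        return node
    return find_fraud(node.left) or find_fraud(node.right)

honest = Node(Leaf(4), Leaf(5), 9)
# Root fraudulently claims 10, but its children only sum to 9:
fraudulent = Node(Leaf(4), Leaf(5), 10)
```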
## Validity Challenges
However how do you guarantee it will be possible to prove fraud in the
first place? If pruning is allowed, you may simply not have access to the
data proving fraud - an especially severe problem in transactional systems
where a single fraudulent transaction can counterfeit arbitrary amounts of
value out of thin air.
A possible approach is the validity challenge: a subset of proof data, with
part of the data marked as "potentially fraudulent". The challenge can be
satisfied by providing the marked data and showing that the proof in
question is in fact valid; if the challenge is unmet, participants in the
system can choose to take action, such as refusing to accept additional
transactions.
Of course, this raises a whole host of so-far unsolved issues, such as DoS
attacks and lost data.
# Probabilistic Validation
Protocols that can tolerate some fraud can make use of probabilistic
verification techniques to prove that the percentage of undetected fraud
within the system is less than a certain amount, with a specified
probability.
A common way to do this is the Fiat-Shamir transform, which repeatedly
samples a data structure deterministically, using the data's own hash
digest as a seed for a PRNG. Let's apply this technique to our merkle-sum
tree example. We'll check a single path from the root to a leaf, chosen
pseudo-randomly, weighted by value:

    (defun prefix-valid? (node nonce)
        (or (== node.type leaf)
            (and (and (== node.sum (+ node.left.sum node.right.sum))
                      (> node.sum 0)) ; mod by 0 is invalid, just like
                                      ; division by zero; could also
                                      ; guarantee this with a type system
                 (if (< node.left.sum (mod nonce node.sum))
                     (prefix-valid? node.right (hash nonce))
                     (prefix-valid? node.left (hash nonce))))))
Now we can combine multiple invocations of the above, in this case 256:

    (defun prob-valid? (node)
        (and (and (and .... (prefix-valid? node (digest (cons (digest node) 0)))
             (and (and ....
                       (prefix-valid? node (digest (cons (digest node) 255)))

As an exercise for the reader: generalize the above with a macro, or a
suitable types/generics system.
If we assume our attacker can grind up to 128 bits, that leaves us with 128
random samples that they can't control. If the (value-weighted) probability
of a given node being fraudulent is q, then the chance of the attacker
getting away with fraud is (1-q)^128 - for q=5% that works out to 0.1%.
(Note that the above analysis isn't particularly well done - do a better
analysis before implementing this in production!)
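The rough arithmetic behind those numbers, restated in Python (this just
reproduces the text's back-of-the-envelope estimate, not a rigorous
analysis):

```python
def undetected_fraud_probability(q: float, samples: int = 128) -> float:
    """Chance that every uncontrolled value-weighted sample misses the
    fraud, when a fraction q of the value is fraudulent."""
    return (1 - q) ** samples

# 256 invocations minus 128 bits of assumed grinding leaves 128 samples;
# for q = 5% the attacker escapes detection roughly 0.1% of the time.
p = undetected_fraud_probability(0.05)
```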
## Random Beacons and Transaction History Linearization
The Fiat-Shamir transform requires a significant number of samples to
defeat grinding attacks; if we have a random beacon available we can
significantly reduce the size of our probabilistic proofs. PoW blockchains
can themselves act as random beacons, as it is provably expensive for
miners to manipulate the hash digests of blocks they produce - to do so
requires discarding otherwise valid blocks.
An example where this capability is essential is the author's transaction
history linearization technique. In value transfer systems such as Bitcoin,
the history of any given coin grows quasi-exponentially as coins are mixed
across the entire economy. We can linearize the growth of history proofs by
redefining coin validity to be probabilistic.
Suppose we have a transaction with n inputs. Of those inputs, the total
value of real inputs is p, and the total claimed value of fake inputs is q.
The transaction commits to all inputs in a merkle sum tree, and we define
the transaction as valid if a randomly chosen input - weighted by value -
can itself be proven valid. Finally, we assume that creating a genuine
input is an irrevocable action which irrevocably commits to the set of all
inputs, real and fake.
If all inputs are real, 100% of the time the transaction will be valid; if
all inputs are fake, 100% of the time the transaction will be invalid. In
the case where some inputs are real and some are fake, the probability that
the fraud will be detected is:

    q / (q + p)
The expected value of the fake inputs is then the sum of the potential
upside - the fraud goes undetected - and the potential downside - the fraud
is detected and the real inputs are lost:

    E = q(1 - q/(q + p)) - p(q/(q + p))
      = q(p/(q + p)) - p(q/(q + p))
      = (q - q)(p/(q + p))
      = 0
Thus so long as the random beacon is truly unpredictable, there's no
economic advantage to creating fake inputs, and it is sufficient for
validity to only require one input to be proven, giving us O(n) scaling for
transaction history proofs.
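The zero-expected-value argument can be sanity-checked with a quick Monte
Carlo simulation; the payoff model (gain the fake value q when the sample
misses, lose the real value p when it lands on a fake input) is taken from
the derivation above:

```python
import random

def fake_input_ev(p: float, q: float, trials: int = 500_000,
                  seed: int = 1) -> float:
    """Monte Carlo expected value of committing to fake inputs worth q
    alongside real inputs worth p, with one value-weighted input
    sampled."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        if rng.random() < q / (q + p):   # sample lands on a fake input
            total -= p                   # fraud detected: real value lost
        else:
            total += q                   # fraud undetected: q counterfeited
    return total / trials

# E = q*p/(q+p) - p*q/(q+p) = 0, so the estimate hovers near zero:
ev = fake_input_ev(p=7.0, q=3.0)
```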
### Inflationary O(1) History Proofs
We can further improve our transaction history proof scalability by taking
advantage of inflation. We do this by occasionally allowing a transaction
proof to be considered valid without validating _any_ of the inputs; every
time a transaction is allowed without proving any inputs the size of the
transaction history proof is reset. Of course, this can be a source of
inflation, but provided the probability of this happening can be limited we
can limit the maximum rate of inflation to the chosen value.
For example, in Bitcoin as of writing every block inflates the currency
supply by 25BTC, and contains a maximum of 1MB of transaction data, for
0.025BTC/KB. If we check the prior input proof with probability p, then the
expected value of a fake transaction claiming to be worth x is:

    E = x(1-p)

We can rewrite that in terms of the block reward per-byte R and the proof
size l, setting the expected counterfeit value equal to the reward the
proof's bytes already add to the supply:

    lR = x(1-p)
    p = 1 - lR/x

For example, for a 1KB transaction proof claiming to spend 10BTC we can
omit checking the input 0.25% of the time without allowing more monetary
inflation than the block reward already does. Secondly, this means that
after n transactions, the probability that proof shortening will _not_
happen is p^n, which reaches 1% after 1840 transactions.
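Those figures follow directly from the formula; a quick check in Python
with the numbers assumed in the text:

```python
import math

R = 0.025   # block reward density: 25 BTC per 1000 KB block, in BTC/KB
l = 1.0     # transaction proof size in KB
x = 10.0    # value the proof claims to spend, in BTC

p = 1 - l * R / x     # probability we must check the prior input proof
skip = 1 - p          # how often checking may be skipped inflation-free

# Probability the proof has never been shortened after n transactions is
# p**n; solving p**n = 1% gives the expected proof-length bound:
n = math.log(0.01) / math.log(p)

print(round(skip * 100, 2), "% skip rate,", round(n), "transactions")
```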
In a system like Bitcoin where miners are expected to validate, a
transaction proof could consist of just a single merkle path showing that a
single-use seal was closed in some kind of TXO commitment - probably under
10KB of data. That gives us a history proof less than 18.4MB in size 99% of
the time, and less than 9.2MB in size 90% of the time.
An interesting outcome of this kind of design is that we can
institutionalize inflation fraud: the entire block reward can be replaced
by miners rolling the dice, attempting to create valid "fake" transactions.
However, such a pure implementation would put a floor on the lowest
transaction fee possible, so it's better to allow both transaction fee and
subsidy collection at the same time.
# References
[^paypub] https://github.com/unsystem/paypub
[^timelock] https://github.com/petertodd/timelock
[^zkcp] https://bitcoincore.org/en/2016/02/26/zero-knowledge-contingent-payments-announcement/
[^rpow] https://cryptome.org/rpow.htm
_______________________________________________
bitcoin-dev mailing list
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
Peter Todd via bitcoin-dev
2016-06-21 22:42:25 UTC
Permalink
Post by zaki--- via bitcoin-dev
Hi Peter,
I didn't entirely understand the process of transaction linearization.
What I see is a potential process where when the miner assembles the block,
he strips all but one sigscript per tx. The selection of which sigscript
is retained is determined by the random oracle. Is this the primary benefit
you are suggesting?
It appears to me that blocks still need to contain a list of full Tx Inputs
and Tx Outputs with your approach. Some of the description seems to
indicate that there are opportunities to elide further data but it's
unclear to me how.
I think you've misunderstood what I'm proposing. The state machine approach I
described doesn't necessarily require blocks or even miners to exist at all.
Rather, it assumes that a single-use seal primitive is available, and a random
beacon primitive for tx linearization, and then builds a system on top of those
primitives. Transaction data - the proofs that certain states have been reached
in the system - does not need to be broadcast publicly; if Alice wants to
convince Bob that she has given him money, the only person who needs that
transaction (and transactions prior to it in the tx history) is Bob.

So as to your question about miners assembling blocks, and what blocks
contain: there doesn't need to be blocks at all! Transaction history
linearization is something your wallet would do for you.
--
https://petertodd.org 'peter'[:-1]@petertodd.org
Peter Todd via bitcoin-dev
2016-06-23 11:21:16 UTC
Permalink
Post by Police Terror via bitcoin-dev
Bitcoin could embed a lisp interpreter such as Scheme, reverse engineer
the current protocol into lisp (inside C++), run this alternative engine
alongside the current one as an option for some years (only for fine
tuning) then eventually fade this lisp written validation code instead
of the current one.
You know, I'm kinda regretting not making it sufficiently clear that Dex isn't
Lisp... It may look like it with all the braces, but expressions in it are
evaluated without any global state (they can be evaluated in parallel) and I've
got a lot of work ahead of me in type safety.
Post by Police Terror via bitcoin-dev
Scheme is small and minimal, and embeds easily in C++. This could be a
better option than the libconsensus library - validation in a functional
scripting language.
I'd be surprised if you could find a scheme interpreter that's sufficiently
well defined to be suitable for that; starting with an existing one and
whipping it into shape would very likely be more work than starting from
scratch.
Post by Police Terror via bitcoin-dev
That doesn't mean people can't program the validation code in other
languages (maybe they'd want to optimize), but this code would be the
standard.
Yeah, in general I'd expect most of these systems to be layered to a degree;
after all even in something like MAST you need tooling to manage the fact that
the opcodes that end up public, on-chain, are only a subset of the script.
Post by Police Terror via bitcoin-dev
I wouldn't be so quick to deride good engineering over systematic
provable systems for all domains. Bitcoin being written in C++ is not a
defect. It's actually a strong language for what it does. Especially
when used correctly (which is not often and takes years to master).
It's probably the best of a lot of bad alternatives... We use C++ not
because it's good, but because there's no other option.

In particular, we have enormous cost and risk in moving to other things due
to consensus, so making use of other languages is very difficult; my work
with dex/proofchains does not have that constraint.
Post by Police Terror via bitcoin-dev
With the seals idea - am I understanding this correctly?: Every transaction
has a number (essentially the index starting from 0 upwards) depending
on where it is in the blockchain.
Then there is an array (probably an on disk array mapping transaction
indexes to hashes). Each hash entry in the array must be unique (the
hashes) otherwise the transaction will be denied. This is a great idea
to solve transaction hash collisions and simple to implement.
No, I think you've very much misunderstood things. The abstract notion of a
single-use seal doesn't even need global consensus on anything to
implement; it does not require transactions to have "indexes".
--
https://petertodd.org 'peter'[:-1]@petertodd.org
Alex Mizrahi via bitcoin-dev
2016-06-20 22:28:48 UTC
Permalink
Post by Peter Todd via bitcoin-dev
All practical single-use seals will be associated with some kind of condition,
such as a pubkey, or deterministic expression, that needs to be satisfied for
the seal to be closed.
I think it would be useful to classify systems w.r.t. what data is
available to condition.
I imagine it might be useful if status of other seals is available.
Post by Peter Todd via bitcoin-dev
Secondly, the contents of the proof will be able to
commit to new data, such as the transaction spending the output associated with
the seal.
So basically a "condition" returns that "new data", right?
If it commits to data in a recognizable way, then it's practically a
function which yields a tuple (valid, new_data).
If an oracle doesn't care about the data then you can convert it to a
predicate using a simple projection.
But from the point of view of a client, it is a function which returns a
tuple.

It might help if you describe a type of the condition function.

Some related work on UTXO-based smart contracts:

1. Typecoin, described in the paper "Peer-to-peer Affine Commitment using
Bitcoin", Karl Crary and Michael J. Sullivan, Carnegie Mellon University,
PLDI '15, Portland, June 17, 2015.

I don't see the paper in open access and I've lost my copy, but there are
slides: https://www.msully.net/stuff/typecoin-slides.pdf

The paper is written by programming language researchers, and thus uses
fairly complex constructs. The idea is to use the language of linear logic,
but it's actually implemented using type-oriented programming. So,
basically, they associate logical propositions with transaction outputs.
Transactions prove that output-propositions logically follow from
input-propositions. The paper first describes a colored-coin kind of
system, where color values are propositions/types. But in the
implementation part it becomes more like a metacoin, as it uses the
complete transaction history.
A setup with a trusted server is also mentioned.

The interesting thing about Typecoin is that its contract language is based
on logic, which makes it powerful and -- I guess -- analyzable. However,
the paper doesn't mention any performance details, and I guess performance
isn't good. Another problem is that it looks very unusual to people who
aren't used to type-oriented programming.

2. Generic coins
Seeing how much Typecoin people had to struggle to describe a Bitcoin-style
system I decided to describe a generalized Bitcoin-style system, so it can
be easily referenced in research. Sadly all I got so far is a draft of an
introduction/definition sections:
https://github.com/chromaway/ngcccbase/wiki/gc

In the first section I described a transaction graph model which is
supposed to be general enough to describe any kind of transaction graph
system with explicit dependencies and no "spooky action at a distance". As
it turns out, any such system can be defined in terms of a few predicate
functions; however, using these functions directly might be very
inefficient.

The next section introduces a coin-based model. A coin-based system can be
described using a single function called coin kernel which is applied to a
transaction and a list of input coinstates.
It is then described how to go from a coin-based model to a
transaction-graph model.
The reverse should also be possible if we add additional restrictions on a
transaction-graph model; it's probably enough to require that a coin can be
spent only once. (Partial coin spends were described in Freimarkets.)

There is a fairly shitty prototype in Haskell:
https://github.com/baldmaster/ColorCoin

3. flexichains
This is a prototype done by me more recently; the interesting thing about
it is that it unifies account-based and UTXO-based models in a single
model.

We first introduce the notion of a record. A record can be of an arbitrary
type; the only restriction is that it must have a key which must be unique
within the system. The transaction model can then be introduced using two
functions: txDependencies returns a list of keys of records a transaction
depends on; applyTx takes a transaction and a list of records it depends on
and returns either a list of records or an error.

A list of records includes
* new records which are created by a transaction
* updated records will have the same key but different content

A simple account-based system can be implemented using tuples (pubkey,
balance, last_update) as records.
In a UTXO-based system records are transaction outputs, and they should
include a spent flag. (Obviously, records with the spent flag set can be
pruned.)
A system with custom smart contracts can be implemented by adding some sort
of function or bytecode to records.
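As a sketch of the account-based instantiation: txDependencies and applyTx
follow the description above, while the concrete record layout and field
names are assumptions for illustration only:

```python
# Toy flexichains-style record model. Records are keyed tuples; a
# transaction is interpreted by two functions, txDependencies and
# applyTx, as described above.

def tx_dependencies(tx):
    """Keys of the records this transaction depends on."""
    return [tx["from"], tx["to"]]

def apply_tx(tx, records):
    """Given the records named by tx_dependencies(tx), return the
    updated records (same keys, new content) or raise an error."""
    sender, receiver = records
    pk_s, bal_s, _ = sender
    pk_r, bal_r, _ = receiver
    if bal_s < tx["amount"]:
        raise ValueError("insufficient balance")
    # Updated records keep their keys (the pubkeys) but change content.
    return [(pk_s, bal_s - tx["amount"], tx["height"]),
            (pk_r, bal_r + tx["amount"], tx["height"])]

# Account records are (pubkey, balance, last_update) tuples keyed by pubkey.
state = {"A": ("A", 50, 0), "B": ("B", 10, 0)}
tx = {"from": "A", "to": "B", "amount": 30, "height": 1}
for rec in apply_tx(tx, [state[k] for k in tx_dependencies(tx)]):
    state[rec[0]] = rec
print(state)  # {'A': ('A', 20, 1), 'B': ('B', 40, 1)}
```

A UTXO-style system would fit the same two functions, with records being
outputs carrying a spent flag instead of mutable balances.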

A Haskell prototype is here:
https://bitbucket.org/chromawallet/flexichains/src/21059080bed6?at=develop
(It's kinda broken and incomplete, though.)
Peter Todd via bitcoin-dev
2016-06-23 11:11:52 UTC
Permalink
Post by Alex Mizrahi via bitcoin-dev
Post by Peter Todd via bitcoin-dev
All practical single-use seals will be associated with some kind of condition,
such as a pubkey, or deterministic expression, that needs to be satisfied for
the seal to be closed.
I think it would be useful to classify systems w.r.t. what data is
available to the condition.
I imagine it might be useful if the status of other seals is available.
Useful yes, but actually implementing that often results in systems that are
too tightly coupled to scale well.
Post by Alex Mizrahi via bitcoin-dev
Post by Peter Todd via bitcoin-dev
Secondly, the contents of the proof will be able to
commit to new data, such as the transaction spending the output associated with
the seal.
So basically a "condition" returns that "new data", right?
If it commits to data in a recognizable way, then it's practically a
function which yields a tuple (valid, new_data).
If an oracle doesn't care about the data then you can convert it to a
predicate using a simple projection.
But from the point of view of a client, it is a function which returns a
tuple.
What do you mean by "new data"?

The point I'm making is simply that to be useful, when you close a seal you
have to be able to close it over some data, in particular another seal. That's
the key thing that makes the idea a useful construct for smart contracts, value
transfer/currency systems, etc.
Post by Alex Mizrahi via bitcoin-dev
It might help if you describe a type of the condition function.
I did describe some seal authorization condition functions in my more recent
post; the key thing is you'd have some kind of "checksig" operator that checks
a cryptographic signature.
<snip>

Thanks for the links! Not at all surprising to me that there's a whole bunch of
projects working along those same lines; it's the obvious way to build this
kind of stuff once you realise that the imperative, stateful, model isn't
viable.
--
https://petertodd.org 'peter'[:-1]@petertodd.org
Alex Mizrahi via bitcoin-dev
2016-06-23 12:58:29 UTC
Permalink
Post by Peter Todd via bitcoin-dev
The point I'm making is simply that to be useful, when you close a seal you
have to be able to close it over some data, in particular another seal. That's
the key thing that makes the idea a useful construct for smart contracts, value
transfer/currency systems, etc.
OK, your second post ("Closed Seal Sets and Truth Lists for Better Privacy
and Censorship Resistance") seems to clarify that this data is one of
arguments to the condition function.
Frankly this stuff is rather hard to follow. (Or maybe I'm dumb.)

Now I don't get the scalability properties. Let's consider the simplest
scenario, where Alice creates some token, sends it to Bob, who sends it to
Claire. So now Claire needs to get both a proof that Alice sent it to Bob
and a proof that Bob sent it to Claire, right? So Claire needs to verify 2
proofs, and for a chain of N transfers one would need to verify N proofs,
right?

And how it works in general:

1. Alice creates a token. To do that she constructs a unique expression
which checks her signature and signs a message "This token has such and
such meaning and its ownership is originally associated with seal <hash of
the expression>" with her PGP key.
2. To transfer this token to Bob, she asks Bob for his auth expression and
sends a seal oracle a message (Alice_expression (Bob_expression .
signature)) where the signature is constructed in such a way that it
evaluates as true. The oracle stores this in a map: Alice_expression ->
(Bob_expression . signature)
3. Bob sends the token to Claire in the same way: (Bob_expression
(Claire_expression . signature))
4. Now Claire asks if Alice_expression->(Bob_expression . _) and
Bob_expression->(Claire_expression . _) are in the oracle's map. She might
trust the oracle to verify signatures, but the oracle doesn't understand
token semantics. Thus she needs to check that these entries were added.
If I understand correctly, the Alice_expression->(Bob_expression . _)
record can be communicated in just 3 * size_of_hash_digest bytes.
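The four steps above can be modelled with a trusted oracle that just keeps
a map from each closed seal's auth expression to whatever it was closed
over; Claire then walks the chain, which is why verification cost grows
linearly with the number of transfers. All names here are illustrative and
signature checking is elided:

```python
# Toy model of the oracle-map protocol sketched above: the oracle maps
# each closed seal (its auth expression) to the data it was closed over
# (the next owner's expression).

oracle = {}  # auth_expression -> next_expression

def close_seal(current_expr, next_expr):
    """Close a seal over the next owner's expression (signatures elided)."""
    assert current_expr not in oracle, "single-use: seal already closed"
    oracle[current_expr] = next_expr

# Alice -> Bob -> Claire
close_seal("Alice_expression", "Bob_expression")
close_seal("Bob_expression", "Claire_expression")

def verify_chain(genesis_expr, my_expr):
    """Walk the chain from the genesis expression to mine: N transfers
    mean N lookups, hence linearly growing history proofs."""
    expr, hops = genesis_expr, 0
    while expr != my_expr:
        expr = oracle[expr]  # KeyError -> broken or unclosed link
        hops += 1
    return hops

print(verify_chain("Alice_expression", "Claire_expression"))  # 2
```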

So this seems to have rather bad scalability even with trusted oracles.
Am I missing something?
Peter Todd via bitcoin-dev
2016-06-24 22:23:16 UTC
Permalink
Post by Alex Mizrahi via bitcoin-dev
Post by Peter Todd via bitcoin-dev
The point I'm making is simply that to be useful, when you close a seal you
have to be able to close it over some data, in particular, another seal. That's
the key thing that makes the idea a useful construct for smart contacts, value
transfer/currency systems, etc.
OK, your second post ("Closed Seal Sets and Truth Lists for Better Privacy
and Censorship Resistance") seems to clarify that this data is one of
arguments to the condition function.
Frankly this stuff is rather hard to follow. (Or maybe I'm dumb.)
Now I don't get the scalability properties. Let's consider the simplest
scenario, where Alice creates some token, sends it to Bob, who sends it to
Claire. So now Claire needs to get both a proof that Alice sent it to Bob
and a proof that Bob sent it to Claire, right? So Claire needs to verify 2
proofs, and for a chain of N transfers one would need to verify N proofs,
right?
Not necessarily. In my writeup I outlined two ways that those chains can be
shortened: trusted validity oracles and the probabilistic, inflationary,
history proof concept.

Equally, even if history grows over time, that's no worse than Bitcoin.
Post by Alex Mizrahi via bitcoin-dev
1. Alice creates a token. To do that she constructs a unique expression
which checks her signature and signs a message "This token has such and
such meaning and its ownership is originally associated with seal <hash of
the expression>" with her PGP key.
Alice isn't _creating_ a token, she's _defining_ a token.
Post by Alex Mizrahi via bitcoin-dev
2. To transfer this token to Bob, she asks Bob for his auth expression and
sends a seal oracle a message (Alice_expression (Bob_expression .
signature)) where the signature is constructed in such a way that it
evaluates as true. The oracle stores this in a map: Alice_expression ->
(Bob_expression . signature)
Nope.

In Alice's token definition, the genesis state of the token is defined to be
associated with a specific single-use seal. To transfer the token to Bob, she
asks Bob for the seal he wishes to use, and then closes the genesis seal over a
new state committing to Bob's seal.

Now Alice could construct the seal for Bob, in which case she'd just need to
know the auth expression Bob wants to use, but that's not the most fundamental
way of implementing this.

Regardless, the seal oracle doesn't need to know that any of the above is
happening; all it needs to do is spit out seal-closure witnesses when the
authorization expressions are satisfied appropriately; the oracle does not and
should not know what the seals have been closed over. Whether or not the oracle
stores anything when seals are closed is an implementation decision - see my
original writeup on the unbounded vs. bounded oracle case. And of course, seals
implemented with decentralized blockchains are a different matter entirely.
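The distinction drawn here, that the oracle only witnesses a seal being
closed over an opaque commitment and never sees the token state itself, can
be sketched as follows. The hash-based commitment is the only essential
ingredient; the function and seal names are illustrative assumptions:

```python
# Sketch of closing a single-use seal over a commitment: the oracle sees
# only (seal, digest) pairs, never the token state itself.
import hashlib

witnesses = {}  # seal id -> hex digest the seal was closed over

def close_seal(seal, commitment):
    """Oracle-side: witness that `seal` was closed over `commitment`,
    an opaque digest. Authorization checking is elided."""
    assert seal not in witnesses, "single-use: already closed"
    witnesses[seal] = commitment

def commit(state: bytes) -> str:
    """Commit to a token state without revealing it to the oracle."""
    return hashlib.sha256(state).hexdigest()

# Alice closes the genesis seal over a new state committing to Bob's seal.
new_state = b"token state: next seal = bob_seal"
close_seal("alice_genesis_seal", commit(new_state))

# Anyone holding the state can later check the witness; the oracle never
# learned what the seal was closed over.
assert witnesses["alice_genesis_seal"] == commit(new_state)
print("witnessed")
```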
Post by Alex Mizrahi via bitcoin-dev
3. Bob sends the token to Claire in the same way: (Bob_expression
(Claire_expression . signature))
4. Now Claire asks if Alice_expression->(Bob_expression . _) and
Bob_expression->(Claire_expression . _) are in the oracle's map. She might
trust the oracle to verify signatures, but the oracle doesn't understand
token semantics. Thus she needs to check that these entries were added.
If I understand correctly, the Alice_expression->(Bob_expression . _)
record can be communicated in just 3 * size_of_hash_digest bytes.
So this seems to have rather bad scalability even with trusted oracles.
Am I missing something?
Yes; as I mentioned above, there exist multiple techniques that can shorten
history proofs in a variety of ways, depending on what kinds of tradeoffs
your application needs.
--
https://petertodd.org 'peter'[:-1]@petertodd.org