Discussion:
[Bitcoin-development] [BIP] Normalized Transaction IDs
Christian Decker
2015-05-13 12:48:04 UTC
Hi All,

I'd like to propose a BIP to normalize transaction IDs in order to address
transaction malleability and facilitate higher level protocols.

The normalized transaction ID is an alias used in parallel to the current
(legacy) transaction IDs to address outputs in transactions. It is
calculated by removing (zeroing) the scriptSig before computing the hash,
which ensures that only data whose integrity is also guaranteed by the
signatures influences the hash. Thus if anything causes the normalized ID
to change it automatically invalidates the signature. A client supporting
this BIP would use both the normalized tx ID and the legacy tx ID when
validating transactions.
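
A minimal sketch of the computation described above, assuming the standard legacy transaction serialization and assuming that "removing (zeroing)" means replacing each scriptSig with an empty script. The dict layout and the function names (`serialize`, `legacy_txid`, `normalized_txid`) are illustrative only and are not taken from the draft:

```python
import hashlib
import struct


def varint(n: int) -> bytes:
    """Bitcoin CompactSize encoding (values below 2**32 only, for brevity)."""
    if n < 0xFD:
        return bytes([n])
    if n <= 0xFFFF:
        return b"\xfd" + struct.pack("<H", n)
    return b"\xfe" + struct.pack("<I", n)


def serialize(tx: dict, strip_script_sigs: bool = False) -> bytes:
    """Serialize a transaction; optionally replace every scriptSig with b''."""
    out = struct.pack("<i", tx["version"])
    out += varint(len(tx["inputs"]))
    for txin in tx["inputs"]:
        script_sig = b"" if strip_script_sigs else txin["script_sig"]
        out += txin["prev_id"] + struct.pack("<I", txin["prev_index"])
        out += varint(len(script_sig)) + script_sig
        out += struct.pack("<I", txin["sequence"])
    out += varint(len(tx["outputs"]))
    for txout in tx["outputs"]:
        out += struct.pack("<q", txout["value"])
        out += varint(len(txout["script_pubkey"])) + txout["script_pubkey"]
    out += struct.pack("<I", tx["locktime"])
    return out


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def legacy_txid(tx: dict) -> str:
    # Hash of the full serialization, shown in the usual reversed hex form.
    return double_sha256(serialize(tx))[::-1].hex()


def normalized_txid(tx: dict) -> str:
    # Same hash, but computed with all scriptSigs emptied (assumption).
    return double_sha256(serialize(tx, strip_script_sigs=True))[::-1].hex()


example_tx = {
    "version": 1,
    "inputs": [{"prev_id": bytes(32), "prev_index": 0,
                "script_sig": b"\x02\xaa\xbb", "sequence": 0xFFFFFFFF}],
    "outputs": [{"value": 50_000, "script_pubkey": b"\x51"}],
    "locktime": 0,
}
# Same transaction with the push re-encoded via OP_PUSHDATA1, a classic
# way to change the scriptSig bytes without touching what was signed.
malleated_tx = dict(example_tx, inputs=[
    dict(example_tx["inputs"][0], script_sig=b"\x4c\x02\xaa\xbb")])

print(legacy_txid(example_tx) == legacy_txid(malleated_tx))
print(normalized_txid(example_tx) == normalized_txid(malleated_tx))
```

The first comparison is false and the second is true: the re-encoded scriptSig changes the legacy ID but leaves the normalized ID alone.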

The detailed writeup can be found here:
https://github.com/cdecker/bips/blob/normalized-txid/bip-00nn.mediawiki.

@gmaxwell: I'd like to request a BIP number, unless there is something
really wrong with the proposal.

In addition to being a simple alternative that solves transaction
malleability it also hugely simplifies higher level protocols. We can now
use template transactions upon which sequences of transactions can be built
before signing them.

I hesitated quite a while to propose it since it does require a hardfork
(old clients would not find the prevTx identified by the normalized
transaction ID and deem the spending transaction invalid), but it seems
that hardforks are no longer the dreaded boogeyman nobody talks about.
I left out the details of how the hardfork is to be done, as it does not
really matter and we may have a good mechanism to apply a bunch of
hardforks concurrently in the future.

I'm sure it'll take time to implement and upgrade, but I think it would be
a nice addition to the functionality and would solve a long standing
problem :-)

Please let me know what you think, the proposal is definitely not set in
stone at this point and I'm sure we can improve it further.

Regards,
Christian
Tier Nolan
2015-05-13 13:12:43 UTC
I think this is a good way to handle things, but as you say, it is a hard
fork.

CHECKLOCKTIMEVERIFY covers many of the use cases, but it would be nice to
fix malleability once and for all.

This has the effect of doubling the size of the UTXO database. At minimum,
there needs to be a legacy txid to normalized txid map in the database.

An addition to the BIP would eliminate the need for the 2nd index. You
could require a SPV proof of the spending transaction to be included with
legacy transactions. This would allow clients to verify that the
normalized txid matched the legacy id.

The OutPoint would be {LegacyId | SPV Proof to spending tx | spending tx |
index}. This allows a legacy transaction to be upgraded. OutPoints which
use a normalized txid don't need the SPV proof.

The hard fork would be followed by a transitional period, in which both
txids could be used. Afterwards, legacy transactions have to have the SPV
proof added. This means that old transactions with locktimes years in the
future can be upgraded for spending, without nodes needing to maintain two
indexes.
Gavin Andresen
2015-05-13 13:41:44 UTC
I think this needs more details before it gets a BIP number; for example,
which opcodes does this affect, and how, exactly, does it affect them? Is
the merkle root in the block header computed using normalized transaction
ids or legacy ids?

I think there might actually be two or three or four BIPs here:

+ Overall "what is trying to be accomplished"
+ Changes to the OP_*SIG* opcodes
+ Changes to the bloom-filtering SPV support
+ ...eventually, hard fork rollout plan

I also think that it is a good idea to have actually implemented a proposal
before getting a BIP number. At least, I find that actually writing the
code often turns up issues I hadn't considered when thinking about the
problem at a high level. And I STRONGLY believe BIPs should be descriptive
("here is how this thing works") not prescriptive ("here's how I think we
should all do it").

Finally: I like the idea of moving to a normalized txid. But it might make
sense to bundle that change with a bigger change to OP_CHECKSIG; see Greg
Maxwell's excellent talk about his current thoughts on that topic:
http://youtu.be/Gs9lJTRZCDc
------------------------------------------------------------------------------
One dashboard for servers and applications across Physical-Virtual-Cloud
Widest out-of-the-box monitoring support with 50+ applications
Performance metrics, stats and reports that give you Actionable Insights
Deep dive visibility with transaction tracing using APM Insight.
http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
_______________________________________________
Bitcoin-development mailing list
https://lists.sourceforge.net/lists/listinfo/bitcoin-development
--
Gavin Andresen
Christian Decker
2015-05-13 15:24:34 UTC
Glad you like it, I was afraid that I missed something obvious :-)

The points the two of you raised are valid and I will address them as soon
as possible. I certainly will implement this proposal so that it becomes
more concrete, but my C++ is a bit rusty and it'll take some time, so I
wanted to gauge interest first.
Post by Tier Nolan
This has the effect of doubling the size of the UTXO database. At
minimum, there needs to be a legacy txid to normalized txid map in the
database.
Post by Tier Nolan
An addition to the BIP would eliminate the need for the 2nd index. You
could require a SPV proof of the spending transaction to be included with
legacy transactions. This would allow clients to verify that the
normalized txid matched the legacy id.
Post by Tier Nolan
The OutPoint would be {LegacyId | SPV Proof to spending tx | spending tx
| index}. This allows a legacy transaction to be upgraded. OutPoints
which use a normalized txid don't need the SPV proof.

It does, and I should have mentioned it in the draft. According to my
calculations a mapping legacy ID -> normalized ID is about 256 MB in size,
or at least it was at height 330'000; things might have changed a bit and
I'll recompute that. I omitted the deprecation of legacy IDs on purpose
since we don't know whether we will migrate completely or keep both
options viable.
Post by Gavin Andresen
I think this needs more details before it gets a BIP number; for example,
which opcodes does this affect, and how, exactly, does it affect them? Is
the merkle root in the block header computed using normalized transaction
ids or legacy ids?

I think both IDs can be used in the merkle tree: since we look up an ID in
both indices, we can use either to address a transaction and we will find it
either way.

As for the opcodes I'll have to check, but I currently don't see how they
could be affected. The OP_*SIG* codes calculate their own (more
complicated) stripped transaction before hashing and checking the
signature. The input of the stripped transaction simply contains whatever
hash was used to reference the output, so we do not replace IDs during the
operation. The stripped format used by OP_*SIG* operations does not have to
adhere to the hashes used to reference a transaction in the input.
Post by Gavin Andresen
+ Overall "what is trying to be accomplished"
+ Changes to the OP_*SIG* opcodes
+ Changes to the bloom-filtering SPV support
+ ...eventually, hard fork rollout plan
I also think that it is a good idea to have actually implemented a
proposal before getting a BIP number. At least, I find that actually
writing the code often turns up issues I hadn't considered when thinking
about the problem at a high level. And I STRONGLY believe BIPs should be
descriptive ("here is how this thing works") not proscriptive ("here's how
I think we should all do it").

We can certainly split the proposal should it get too large, for now it
seems manageable, since opcodes are not affected. Bloom-filtering is
resolved by adding the normalized transaction IDs and checking for both IDs
in the filter. Since you mention bundling the change with other changes
that require a hard-fork it might be a good idea to build a separate
proposal for a generic hard-fork rollout mechanism.

If there are no obvious roadblocks and the change seems generally a good
thing I will implement it in Bitcoin Core :-)

Regards,
Chris
Tier Nolan
2015-05-13 16:18:24 UTC
On Wed, May 13, 2015 at 4:24 PM, Christian Decker wrote:
Post by Christian Decker
It does and I should have mentioned it in the draft, according to my
calculations a mapping legacy ID -> normalized ID is about 256 MB in size,
or at least it was at height 330'000, things might have changed a bit and
I'll recompute that. I omitted the deprecation of legacy IDs on purpose
since we don't know whether we will migrate completely or keep both
options viable.
There are around 20 million UTXOs. At 2*32 bytes per entry, that is more
than 1GB. There are more UTXOs than transactions, but 256MB seems a little
low.
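
The two estimates can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the 64-byte entry size comes from the "2*32 bytes" figure above, and reading "256 MB" as 256 MiB is an assumption:

```python
ENTRY_BYTES = 2 * 32  # one legacy txid plus one normalized txid, 32 bytes each


def map_size_bytes(entries: int) -> int:
    """Raw size of a legacy-id -> normalized-id map, ignoring index overhead."""
    return entries * ENTRY_BYTES


# Roughly 20 million UTXOs, one map entry each.
print(map_size_bytes(20_000_000) / 1e9, "GB")

# How many entries a 256 MiB map could hold at 64 bytes per entry.
entries_in_256mb = (256 * 1024**2) // ENTRY_BYTES
print(entries_in_256mb, "entries")
```

So 256 MB corresponds to about 4 million entries, which would fit a per-transaction map better than a per-output one.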

Post by Christian Decker
I think both IDs can be used in the merkle tree, since we lookup an ID in
both indices we can use both to address them and we will find them either
way.
The id that is used to sign should be used in the merkle tree. The hard
fork should simply be to allow transactions that use the normalized
transaction hash.
Post by Christian Decker
As for the opcodes I'll have to check, but I currently don't see how they
could be affected.
Agreed, the transaction is simply changed and all the standard rules apply.
Post by Christian Decker
We can certainly split the proposal should it get too large, for now it
seems manageable, since opcodes are not affected.
Right it is just a database update. The undo info also needs to be changed
so that both txids are included.
Post by Christian Decker
Bloom-filtering is resolved by adding the normalized transaction IDs and
checking for both IDs in the filter.
Yeah, if a transaction spends with a legacy txid, it should still match if
the normalized txid is included in the filter.
Post by Christian Decker
Since you mention bundling the change with other changes that require a
hard-fork it might be a good idea to build a separate proposal for a
generic hard-fork rollout mechanism.

That would be useful. On the other hand, we don't want to make them too
easy.

I think this is a good choice for a hard fork test, since it is
uncontroversial. With a time machine, it would have been done this way at
the start.

What about the following:

The reference client is updated so that it uses version 2 transactions by
default (but it can be changed by user). A pop-up could appear for the GUI.

There is no other change.

All transactions in blocks 375000 to 385000 are considered votes and
weighted by bitcoin days destroyed (max 60 days).

If > 75% of the transactions by weight are version 2, then the community
are considered to support the hard fork.
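
A rough sketch of how such a tally could be computed, assuming each vote is a `(tx_version, amount_btc, coin_age_days)` tuple and that "version 2" means exactly version 2. The names `vote_weight` and `community_supports` are made up for illustration:

```python
MAX_COIN_AGE_DAYS = 60      # cap on coin age from the proposal above
SUPPORT_THRESHOLD = 0.75    # strictly more than 75% by weight


def vote_weight(amount_btc: float, coin_age_days: float) -> float:
    """Bitcoin days destroyed, with the coin age capped at 60 days."""
    return amount_btc * min(coin_age_days, MAX_COIN_AGE_DAYS)


def community_supports(votes) -> bool:
    """votes: iterable of (tx_version, amount_btc, coin_age_days) tuples."""
    total = 0.0
    in_favor = 0.0
    for tx_version, amount_btc, coin_age_days in votes:
        weight = vote_weight(amount_btc, coin_age_days)
        total += weight
        if tx_version == 2:
            in_favor += weight
    return total > 0 and in_favor / total > SUPPORT_THRESHOLD


# 1 BTC aged 100 days counts as 60; 1 BTC aged 10 days counts as 10.
print(community_supports([(2, 1.0, 100), (1, 1.0, 10)]))
```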

There would need to be a way to protect against miners censoring
transactions/votes.

Users could submit their transactions directly to a p2p tallying system.
The coin would be aged based on the age in block 375000 unless included in
the blockchain. These votes don't need to be ordered and multiple votes
for the same coin would only count once.

In fact, votes could just be based on holding in block X.

This is an opinion poll rather than a referendum though.

Assuming support of the community, the hard fork can then proceed in a
similar way to the way a soft fork does.

Devs update the reference client to produce version 4 blocks and version 3
transactions. Miners could watch version 3 transactions to gauge user
interest and use that to help decide if they should update.

If 750 of the last 1000 blocks are version 4 or higher, reject blocks with
transactions of less than version 3 in version 4 blocks

This means that legacy clients will be slow to confirm their
transactions, since their transactions cannot go into version 4 blocks.
This is encouragement to upgrade.

If 950 of the last 1000 blocks are version 4 or higher, reject blocks with
transactions of less than version 3 in all blocks

This means that legacy nodes can no longer send transactions but can
still receive. Transactions received from other legacy nodes would remain
unconfirmed.

If 990 of the last 1000 blocks are version 4 or higher, reject version 3 or
lower blocks

This is the point of no return. Rejecting version 3 blocks means that
the next rule is guaranteed to activate within the next 2016 blocks.
Legacy nodes remain on the main chain, but cannot send. Miners mining with
legacy clients are (soft) forked off the chain.

If 1000 of the last 1000 blocks are version 4 or higher and the difficulty
retarget has just happened, activate hard fork rule

This hard forks legacy nodes off the chain. 99% of miners support this
change and users have been encouraged to update. The block rate for the
non-forked chain is at most 1% of normal. Blocks happen about every 16 hours.
By timing activation after a difficulty retarget, it makes it harder for
the other fork to adapt to the reduced hash rate.
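
The staged thresholds above can be summarized in a short sketch. The rule names and the functions `rollout_rules` and `minority_block_interval_hours` are hypothetical labels chosen for this illustration; only the 750/950/990/1000 thresholds and the 1000-block window come from the description:

```python
WINDOW = 1000  # number of most recent blocks inspected


def rollout_rules(last_block_versions, retarget_just_happened: bool) -> set:
    """Return the set of rules in force given recent block versions."""
    window = last_block_versions[-WINDOW:]
    v4_count = sum(1 for version in window if version >= 4)
    rules = set()
    if v4_count >= 750:
        # Version 4 blocks may not contain transactions below version 3.
        rules.add("reject_old_tx_in_v4_blocks")
    if v4_count >= 950:
        # No block may contain transactions below version 3.
        rules.add("reject_old_tx_in_all_blocks")
    if v4_count >= 990:
        # Blocks of version 3 or lower are rejected outright.
        rules.add("reject_v3_blocks")
    if v4_count == WINDOW and retarget_just_happened:
        rules.add("hard_fork_active")
    return rules


def minority_block_interval_hours(hash_share: float,
                                  target_minutes: float = 10.0) -> float:
    """Expected block interval for a chain left with a fraction of the hash rate."""
    return target_minutes / hash_share / 60.0


print(sorted(rollout_rules([4] * 960 + [3] * 40, False)))
print(round(minority_block_interval_hours(0.01), 1), "hours")
```

With 1% of the hash rate and an unchanged difficulty, the expected interval is 1000 minutes, which is where the figure of about 16 hours comes from.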
Luke Dashjr
2015-05-13 16:34:52 UTC
I think this hardfork is dead-on-arrival given the ideas for OP_CHECKSIG
softforking. Instead of referring to previous transactions by a normalised
hash, it makes better sense to simply change the outpoints in the signed data
and allow nodes to hotfix dependent transactions when/if they are malleated.
Furthermore, the approach of using a hash of scriptPubKey in the input rather
than an outpoint also solves dependencies in the face of intentional
malleability (respending with a higher fee, or CoinJoin, for a few examples).

These aren't barriers to making the proposal or being assigned a BIP number if
you want to go forward with that, but you may wish to reconsider spending time
on it.

Luke
Pieter Wuille
2015-05-13 17:14:07 UTC
Normalized transaction ids are only effectively non-malleable when all
inputs they refer to are also non-malleable (or you can have malleability
in 2nd level dependencies), so I do not believe it makes sense to allow
mixed usage of the txids at all. They do not provide the actual benefit of
guaranteed non-malleability before it becomes disallowed to use the old
mechanism. That, together with the roughly doubled resources needed for the
UTXO set (as mentioned earlier) and the fact that an alternative which is only
a softfork is available, makes this a bad idea IMHO.

Unsure to what extent this has been presented on the mailinglist, but the
softfork idea is this:
* Transactions get 2 txids, one used to reference them (computed as
before), and one used in an (extended) sighash.
* The txins keep using the normal txid, so no structural changes to
Bitcoin.
* The ntxid is computed by replacing the scriptSigs in inputs by the empty
string, and by replacing the txids in txins by their corresponding ntxids.
* A new checksig operator is softforked in, which uses the ntxids in its
sighashes rather than the full txid.
* To support efficiently computing ntxids, every tx in the utxo set
(currently around 6M) stores the ntxid, but still only supports lookup by
txid.

This does result in a system where a changed dependency indeed invalidates
the spending transaction, but the fix is trivial and can be done without
access to the private key.
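
The ntxid rule and the "trivial fix" can be sketched as follows. This is an illustration under assumptions, not the actual design: it uses the legacy serialization, models the UTXO set's stored ntxids as a plain dict keyed by txid (`ntxid_by_txid`), and leaves a previous-output hash unchanged when it is not in that dict (for example the null hash of a coinbase input). All names are hypothetical:

```python
import hashlib
import struct

NULL_ID = bytes(32)


def dsha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def varint(n: int) -> bytes:
    if n < 0xFD:
        return bytes([n])
    if n <= 0xFFFF:
        return b"\xfd" + struct.pack("<H", n)
    return b"\xfe" + struct.pack("<I", n)


def serialize(tx, prev_ids, script_sigs) -> bytes:
    out = struct.pack("<i", tx["version"]) + varint(len(tx["inputs"]))
    for txin, prev_id, sig in zip(tx["inputs"], prev_ids, script_sigs):
        out += prev_id + struct.pack("<I", txin["prev_index"])
        out += varint(len(sig)) + sig + struct.pack("<I", txin["sequence"])
    out += varint(len(tx["outputs"]))
    for value, script in tx["outputs"]:
        out += struct.pack("<q", value) + varint(len(script)) + script
    return out + struct.pack("<I", tx["locktime"])


def txid(tx) -> bytes:
    ins = tx["inputs"]
    return dsha256(serialize(tx, [i["prev_id"] for i in ins],
                             [i["script_sig"] for i in ins]))


def ntxid(tx, ntxid_by_txid) -> bytes:
    # Empty scriptSigs, and each referenced txid swapped for its ntxid.
    ins = tx["inputs"]
    prev = [ntxid_by_txid.get(i["prev_id"], i["prev_id"]) for i in ins]
    return dsha256(serialize(tx, prev, [b""] * len(ins)))


def make_tx(prev_id: bytes, script_sig: bytes) -> dict:
    return {"version": 1, "locktime": 0, "outputs": [(50_000, b"\x51")],
            "inputs": [{"prev_id": prev_id, "prev_index": 0,
                        "script_sig": script_sig, "sequence": 0xFFFFFFFF}]}


parent = make_tx(NULL_ID, b"\x02\xaa\xbb")
parent_malleated = make_tx(NULL_ID, b"\x4c\x02\xaa\xbb")
index = {txid(parent): ntxid(parent, {}),
         txid(parent_malleated): ntxid(parent_malleated, {})}

child = make_tx(txid(parent), b"")
# The fix: re-point the input at the malleated parent, no re-signing needed.
child_fixed = make_tx(txid(parent_malleated), b"")

print(txid(child) == txid(child_fixed))
print(ntxid(child, index) == ntxid(child_fixed, index))
```

The child's txid changes when it is re-pointed at the malleated parent, but its ntxid does not, so a signature that commits to ntxids would remain valid.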
Christian Decker
2015-05-13 18:04:54 UTC
If the inputs to my transaction have been long confirmed I can be
reasonably safe in assuming that the transaction hash does not change
anymore. It's true that I have to be careful not to build on top of
transactions that use legacy references to transactions that are
unconfirmed or have few confirmations, however that does not invalidate the
utility of the normalized transaction IDs.

The resource doubling is not optimal, I agree, but compare that to dragging
around malleability and subsequent hacks to sort-of fix it forever.
Additionally if we were to decide to abandon legacy transaction IDs we
could eventually drop the legacy index after a sufficient transition period.

I remember reading about the SIGHASH proposal somewhere. It feels really
hackish to me: It is a substantial change to the way signatures are
verified, I cannot really see how this is a softfork if clients that did
not update are unable to verify transactions using that SIGHASH Flag and it
is adding more data (the normalized hash) to the script, which has to be
stored as part of the transaction. It may be true that a node observing
changes in the input transactions of a transaction using this flag could
fix the problem, however it requires the node's intervention.

Compare that to the simple and clean solution in the proposal, which does
not add extra data to be stored, keeps the OP_*SIG* semantics as they are
and where once you sign a transaction it does not have to be monitored or
changed in order to be valid.

There certainly are merits using the SIGHASH approach in the short term (it
does not require a hard fork), however I think the normalized transaction
ID is a cleaner and simpler long-term solution, even though it requires a
hard-fork.

Regards,
Christian
Pieter Wuille
2015-05-13 18:40:34 UTC
On Wed, May 13, 2015 at 11:04 AM, Christian Decker wrote:
Post by Christian Decker
If the inputs to my transaction have been long confirmed I can be
reasonably safe in assuming that the transaction hash does not change
anymore. It's true that I have to be careful not to build on top of
transactions that use legacy references to transactions that are
unconfirmed or have few confirmations, however that does not invalidate the
utility of the normalized transaction IDs.
Sufficient confirmations help of course, but make systems like this less
useful for more complex interactions where you have multiple unconfirmed
transactions waiting on each other. I think being able to rely on this
problem being solved unconditionally is what makes the proposal attractive.
For the simple cases, see BIP62.

Post by Christian Decker
I remember reading about the SIGHASH proposal somewhere. It feels really
hackish to me: It is a substantial change to the way signatures are
verified, I cannot really see how this is a softfork if clients that did
not update are unable to verify transactions using that SIGHASH Flag and it
is adding more data (the normalized hash) to the script, which has to be
stored as part of the transaction. It may be true that a node observing
changes in the input transactions of a transaction using this flag could
fix the problem, however it requires the node's intervention.
I think you misunderstand the idea. This is related, but orthogonal to the
ideas about extending the sighash flags that have been discussed here before.

All it's doing is adding a new CHECKSIG operator to script, which, in its
internally used signature hash, 1) removes the scriptSigs from transactions
before hashing 2) replaces the txids in txins by their ntxid. It does not
add any data to transactions, and it is a softfork, because it only impacts
scripts which actually use the new CHECKSIG operator. Wallets that don't
support signing with this new operator would not give out addresses that
use it.
Post by Christian Decker
Compare that to the simple and clean solution in the proposal, which does
not add extra data to be stored, keeps the OP_*SIG* semantics as they are
and where once you sign a transaction it does not have to be monitored or
changed in order to be valid.
OP_*SIG* semantics don't change here either, we're just adding a superior
opcode (which in most ways behaves the same as the existing operators). I
agree with the advantage of not needing to monitor transactions afterwards
for malleated inputs, but I think you underestimate the deployment costs.
If you want to upgrade the world (eventually, after the old index is
dropped, which is IMHO the only point where this proposal becomes superior
to the alternatives) to this, you're changing *every single piece of
Bitcoin software on the planet*. This is not just changing some validation
rules that are opt-in to use, you're fundamentally changing how
transactions refer to each other.

Also, what do blocks commit to? Do you keep using the old transaction ids
for this? Because if you don't, any relayer on the network can invalidate a
block (and have the receiver mark it as invalid) by changing the txids. You
need to somehow commit to the scriptSig data in blocks still so the POW of
a block is invalidated by changing a scriptSig.

Post by Christian Decker
There certainly are merits using the SIGHASH approach in the short term (it
does not require a hard fork), however I think the normalized transaction
ID is a cleaner and simpler long-term solution, even though it requires a
hard-fork.
It requires a hard fork, but more importantly, it requires the whole world
to change their software (not just validation code) to effectively use it.
That, plus large up-front deployment costs (doubling the cache size for
every full node for the same propagation speed is not a small thing) which
may not end up being effective.
--
Pieter
Christian Decker
2015-05-13 19:14:57 UTC
Permalink
Post by Pieter Wuille
On Wed, May 13, 2015 at 11:04 AM, Christian Decker <
Post by Christian Decker
If the inputs to my transaction have been long confirmed I can be
reasonably safe in assuming that the transaction hash does not change
anymore. It's true that I have to be careful not to build on top of
transactions that use legacy references to transactions that are
unconfirmed or have few confirmations, however that does not invalidate the
utility of the normalized transaction IDs.
Sufficient confirmations help of course, but make systems like this less
useful for more complex interactions where you have multiple unconfirmed
transactions waiting on each other. I think being able to rely on this
problem being solved unconditionally is what makes the proposal attractive.
For the simple cases, see BIP62.
If we are building a long running contract using a complex chain of
transactions, or multiple transactions that depend on each other, there is
no point in ever using any malleable legacy transaction IDs and I would
simply stop cooperating if you tried. I don't think your argument applies.
If we build our contract using only normalized transaction IDs there is no
way of suffering any losses due to malleability.

The reason I mentioned the confirmation is that all protocols I can think
of start by collaboratively creating a transaction that locks in funds into
a multisig output, that is committed to the blockchain. Everything built on
this initial setup transaction would use normalized transaction IDs and
would therefore not be susceptible to malleability.
Post by Pieter Wuille
Post by Christian Decker
I remember reading about the SIGHASH proposal somewhere. It feels really
hackish to me: It is a substantial change to the way signatures are
verified, I cannot really see how this is a softfork if clients that did
not update are unable to verify transactions using that SIGHASH Flag and it
is adding more data (the normalized hash) to the script, which has to be
stored as part of the transaction. It may be true that a node observing
changes in the input transactions of a transaction using this flag could
fix the problem, however it requires the node's intervention.
I think you misunderstand the idea. This is related, but orthogonal to the
ideas about extending the sighash flags that have been discussed here before.
All it's doing is adding a new CHECKSIG operator to script, which, in its
internally used signature hash, 1) removes the scriptSigs from transactions
before hashing 2) replaces the txids in txins by their ntxid. It does not
add any data to transactions, and it is a softfork, because it only impacts
scripts which actually use the new CHECKSIG operator. Wallets that don't
support signing with this new operator would not give out addresses that
use it.
In that case I don't think I heard this proposal before, and I might be
missing out :-)
So if transaction B spends an output from A, then the input from B contains
the CHECKSIG operator telling the validating client to do what exactly? It
appears that it wants us to go and fetch A, normalize it, put the
normalized hash in the txIn of B and then continue the validation? Wouldn't
that also need a mapping from the normalized transaction ID to the legacy
transaction ID that was confirmed?

A client that did not update still would have no clue on how to handle
these transactions, since it simply does not understand the CHECKSIG
operator. If such a transaction ends up in a block I cannot even catch up
with the network since the transaction does not validate for me.

Could you provide an example of how this works?
Post by Pieter Wuille
Post by Christian Decker
Compare that to the simple and clean solution in the proposal, which does
not add extra data to be stored, keeps the OP_*SIG* semantics as they are
and where once you sign a transaction it does not have to be monitored or
changed in order to be valid.
OP_*SIG* semantics don't change here either, we're just adding a superior
opcode (which in most ways behaves the same as the existing operators). I
agree with the advantage of not needing to monitor transactions afterwards
for malleated inputs, but I think you underestimate the deployment costs.
If you want to upgrade the world (eventually, after the old index is
dropped, which is IMHO the only point where this proposal becomes superior
to the alternatives) to this, you're changing *every single piece of
Bitcoin software on the planet*. This is not just changing some validation
rules that are opt-in to use, you're fundamentally changing how
transactions refer to each other.
As I mentioned before, this is a really long term strategy, hoping to get
the cleanest and easiest solution, so that we do not further complicate the
inner workings of Bitcoin. I don't think that it is completely out of the
question to eventually upgrade to use normalized transactions, after all
the average lifespan of hardware is a few years tops.
Post by Pieter Wuille
Also, what do blocks commit to? Do you keep using the old transaction ids
for this? Because if you don't, any relayer on the network can invalidate a
block (and have the receiver mark it as invalid) by changing the txids. You
need to somehow commit to the scriptSig data in blocks still so the POW of
a block is invalidated by changing a scriptSig.
How could I change the transaction IDs if I am a relayer? The miner decides
which flavor of IDs it is adding into its merkle tree, the block hash locks
in the choice. If we saw a transaction having a valid scriptSig, it does
not matter how we reference it in the block.
Post by Pieter Wuille
Post by Christian Decker
There certainly are merits using the SIGHASH approach in the short term
(it does not require a hard fork), however I think the normalized
transaction ID is a cleaner and simpler long-term solution, even though it
requires a hard-fork.
It requires a hard fork, but more importantly, it requires the whole world
to change their software (not just validation code) to effectively use it.
That, plus large up-front deployment costs (doubling the cache size for
every full node for the same propagation speed is not a small thing) which
may not end up being effective.
Yes, hard forks are hard, I'm under no illusion that pushing such a change
through takes time, but in the end the advantages will prevail.

I didn't want to put it in the initial proposal, but we could also increase
the transaction version which signals to the client that the transaction
may only be referenced by the normalized transaction ID. So every
transaction would be either in one index or the other, reducing the
deployment cost to almost nothing.
Post by Pieter Wuille
--
Pieter
Pieter Wuille
2015-05-13 19:40:54 UTC
Permalink
On Wed, May 13, 2015 at 12:14 PM, Christian Decker <
Post by Christian Decker
Post by Pieter Wuille
On Wed, May 13, 2015 at 11:04 AM, Christian Decker <
Post by Christian Decker
If the inputs to my transaction have been long confirmed I can be
reasonably safe in assuming that the transaction hash does not change
anymore. It's true that I have to be careful not to build on top of
transactions that use legacy references to transactions that are
unconfirmed or have few confirmations, however that does not invalidate the
utility of the normalized transaction IDs.
Sufficient confirmations help of course, but make systems like this less
useful for more complex interactions where you have multiple unconfirmed
transactions waiting on each other. I think being able to rely on this
problem being solved unconditionally is what makes the proposal attractive.
For the simple cases, see BIP62.
If we are building a long running contract using a complex chain of
transactions, or multiple transactions that depend on each other, there is
no point in ever using any malleable legacy transaction IDs and I would
simply stop cooperating if you tried. I don't think your argument applies.
If we build our contract using only normalized transaction IDs there is no
way of suffering any losses due to malleability.
That's correct as long as you stay within your contract, but you likely
want compatibility with other software, without waiting an age before and
after your contract settles on the chain. It's a weaker argument, though, I
agree.

Post by Christian Decker
Post by Pieter Wuille
Post by Christian Decker
I remember reading about the SIGHASH proposal somewhere. It feels really
hackish to me: It is a substantial change to the way signatures are
verified, I cannot really see how this is a softfork if clients that did
not update are unable to verify transactions using that SIGHASH Flag and it
is adding more data (the normalized hash) to the script, which has to be
stored as part of the transaction. It may be true that a node observing
changes in the input transactions of a transaction using this flag could
fix the problem, however it requires the node's intervention.
I think you misunderstand the idea. This is related, but orthogonal to
the ideas about extending the sighash flags that have been discussed here
before.
All it's doing is adding a new CHECKSIG operator to script, which, in its
internally used signature hash, 1) removes the scriptSigs from transactions
before hashing 2) replaces the txids in txins by their ntxid. It does not
add any data to transactions, and it is a softfork, because it only impacts
scripts which actually use the new CHECKSIG operator. Wallets that don't
support signing with this new operator would not give out addresses that
use it.
In that case I don't think I heard this proposal before, and I might be
missing out :-)
So if transaction B spends an output from A, then the input from B
contains the CHECKSIG operator telling the validating client to do what
exactly? It appears that it wants us to go and fetch A, normalize it, put
the normalized hash in the txIn of B and then continue the validation?
Wouldn't that also need a mapping from the normalized transaction ID to the
legacy transaction ID that was confirmed?
There would just be an OP_CHECKAWESOMESIG, which can do anything. It can be
identical to how OP_CHECKSIG works now, but with a changed algorithm for its
signature hash. Optionally (and likely in practice, I think), it
can do various other proposed improvements, like using Schnorr signatures,
having a smaller signature encoding, supporting batch validation, having
extended sighash flags, ...

It wouldn't fetch A and normalize it; that's impossible as you would need
to go fetch all of A's dependencies too and recurse until you hit the
coinbases that produced them. Instead, your UTXO set contains the
normalized txid for every normal txid (which adds around 26% to the UTXO
set size now), but lookups in it remain only by txid.
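
To make that bookkeeping concrete, here is a rough Python sketch of a UTXO
set that stores an ntxid next to every txid but is still keyed by txid
only. The `h` helper, the `connect` function and the string-based
"serialization" are all made up for illustration and are not Bitcoin's real
formats; spent entries are never pruned in this toy.

```python
import hashlib

def h(*parts):
    # Toy hash over string fields; NOT Bitcoin's real serialization.
    return hashlib.sha256("|".join(parts).encode()).hexdigest()

def connect(utxo, prev_txids, script_sigs, outputs):
    """Add a transaction to the toy UTXO set and return its txid."""
    txid = h("in", *prev_txids, "sig", *script_sigs, "out", outputs)
    # Parents' ntxids are read from the set, so no walk back to the
    # coinbases is needed. Unknown parents (e.g. a coinbase's null
    # input) are used as-is.
    parent_ntxids = [utxo[p]["ntxid"] if p in utxo else p for p in prev_txids]
    ntxid = h("in", *parent_ntxids, "out", outputs)
    utxo[txid] = {"ntxid": ntxid, "outputs": outputs}  # keyed by txid only
    return txid

# The same chain twice; in the second run the parent's scriptSig is malleated.
utxo_a, utxo_b = {}, {}
parent_a = connect(utxo_a, ["coinbase"], ["sig"], "50 to Alice")
parent_b = connect(utxo_b, ["coinbase"], ["sig-malleated"], "50 to Alice")
child_a = connect(utxo_a, [parent_a], ["sig2"], "50 to Bob")
child_b = connect(utxo_b, [parent_b], ["sig2"], "50 to Bob")

print("child txids differ: ", child_a != child_b)
print("child ntxids match: ",
      utxo_a[child_a]["ntxid"] == utxo_b[child_b]["ntxid"])
```

The point of the extra field is that the ntxid of a new transaction can be
computed from its direct parents alone, at the cost of one more hash per
entry in the set.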

You don't need a ntxid->txid mapping, as transactions and blocks keep
referring to transactions by txid. Only the OP_CHECKAWESOMESIG operator
would do the conversion, and at most once.

Post by Christian Decker
A client that did not update still would have no clue on how to handle
these transactions, since it simply does not understand the CHECKSIG
operator. If such a transaction ends up in a block I cannot even catch up
with the network since the transaction does not validate for me.
As for every softfork, it works by redefining an OP_NOP operator, so old
nodes simply consider these checksigs unconditionally valid. That does mean
you don't want to use them before the consensus rule is forked in
(=enforced by a majority of the hashrate), and that you suffer from the
temporary security reduction that an old full node is unknowingly reduced
to SPV security for these opcodes. However, as a full node wallet, this
problem does not affect you, as your wallet would simply not give out
addresses using the new opcode (and thus, wouldn't receive coins using it),
unless it was upgraded to support it.
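
A toy illustration of the OP_NOP trick, in case it helps. This is only a
sketch: the opcode name `OP_NOP_NEWSIG`, the `toy_verify` check and the
list-based script format are hypothetical stand-ins, not the real script
interpreter.

```python
def toy_verify(pubkey, sig):
    # Stand-in for a real signature check (hypothetical).
    return sig == "signed-by-" + pubkey

def run_script(items, upgraded):
    """Evaluate a toy script; returns True if the spend is accepted."""
    stack = []
    for item in items:
        if item == "OP_NOP_NEWSIG":  # hypothetical redefined OP_NOP
            if upgraded:
                # New rule: the top two stack items must be <sig> <pubkey>.
                if len(stack) < 2 or not toy_verify(stack[-1], stack[-2]):
                    return False
            # Old nodes do nothing here. The stack is left untouched in
            # both cases, so both kinds of node continue identically.
        else:
            stack.append(item)
    return bool(stack) and bool(stack[-1])

good = ["signed-by-alice", "alice", "OP_NOP_NEWSIG"]
bad = ["garbage", "alice", "OP_NOP_NEWSIG"]

for name, script in (("good", good), ("bad", bad)):
    print(name, "old node:", run_script(script, upgraded=False),
          "new node:", run_script(script, upgraded=True))
```

Everything an upgraded node accepts is also accepted by an old node, which
is what makes it a softfork. The `bad` script passing on the old node is the
reason not to rely on the opcode before the rule is enforced.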

Could you provide an example of how this works?
Post by Christian Decker
Post by Pieter Wuille
Post by Christian Decker
Compare that to the simple and clean solution in the proposal, which
does not add extra data to be stored, keeps the OP_*SIG* semantics as they
are and where once you sign a transaction it does not have to be monitored
or changed in order to be valid.
OP_*SIG* semantics don't change here either, we're just adding a superior
opcode (which in most ways behaves the same as the existing operators). I
agree with the advantage of not needing to monitor transactions afterwards
for malleated inputs, but I think you underestimate the deployment costs.
If you want to upgrade the world (eventually, after the old index is
dropped, which is IMHO the only point where this proposal becomes superior
to the alternatives) to this, you're changing *every single piece of
Bitcoin software on the planet*. This is not just changing some validation
rules that are opt-in to use, you're fundamentally changing how
transactions refer to each other.
As I mentioned before, this is a really long term strategy, hoping to get
the cleanest and easiest solution, so that we do not further complicate the
inner workings of Bitcoin. I don't think that it is completely out of the
question to eventually upgrade to use normalized transactions, after all
the average lifespan of hardware is a few years tops.
Fair enough, I definitely agree the end result is superior in this case.

Post by Christian Decker
Post by Pieter Wuille
Also, what do blocks commit to? Do you keep using the old transaction ids
for this? Because if you don't, any relayer on the network can invalidate a
block (and have the receiver mark it as invalid) by changing the txids. You
need to somehow commit to the scriptSig data in blocks still so the POW of
a block is invalidated by changing a scriptSig.
How could I change the transaction IDs if I am a relayer? The miner
decides which flavor of IDs it is adding into its merkle tree, the block
hash locks in the choice. If we saw a transaction having a valid scriptSig,
it does not matter how we reference it in the block.
If the merkle tree of a block only commits to a transaction's normalized
hash, that means that the block hash does not change when the scriptSig is
altered. So, anyone on the network can take a random valid block, and
modify one of its scriptSigs, and the block will become invalid _without_
invalidating the block header. This means that nodes on the network will
now classify that block header as having invalid transactions, and reject
it. Not having the ability anymore to mark blocks as invalid opens
significant DoS risks.

So yes, seeing a block with valid scriptSigs is indeed a proof the
transaction was legitimately authored. But the opposite is no longer true,
and we need that. The correct solution is to either keep using the old full
transaction ids in blocks, but ntxids everywhere else, or having some
alternative means to commit to the scriptSigs inside the block (for example
in the coinbase or using one of the more efficient block commitment
proposals), and have that enforced as consensus rule.
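
A small Python sketch of the commitment problem, under toy assumptions:
transactions are plain `(prev_ref, script_sig, outputs)` tuples, `full_id`
and `norm_id` are illustrative hashes rather than real txids, and the merkle
tree is a simplified pairwise construction.

```python
import hashlib

def dsha(data):
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root(leaves):
    # Simplified pairwise tree; duplicates the last leaf on odd levels.
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [dsha(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def full_id(tx):
    # Commits to everything, including the scriptSig.
    return dsha(repr(tx).encode())

def norm_id(tx):
    # Drops the scriptSig before hashing.
    prev_ref, _script_sig, outputs = tx
    return dsha(repr((prev_ref, outputs)).encode())

block = [("prev0", "sig0", "out0"),
         ("prev1", "sig1", "out1"),
         ("prev2", "sig2", "out2")]
# A relayer swaps one scriptSig for garbage.
tampered = [block[0], ("prev1", "not-a-valid-sig", "out1"), block[2]]

same_norm_root = merkle_root(map(norm_id, block)) == merkle_root(map(norm_id, tampered))
same_full_root = merkle_root(map(full_id, block)) == merkle_root(map(full_id, tampered))
print("root over normalized ids unchanged:", same_norm_root)
print("root over full ids unchanged:      ", same_full_root)
```

With only normalized ids in the tree, the tampered block still matches the
header, so the header alone can no longer be blamed for the invalid
contents.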
--
Pieter
Tier Nolan
2015-05-13 18:11:30 UTC
Permalink
Post by Pieter Wuille
Normalized transaction ids are only effectively non-malleable when all
inputs they refer to are also non-malleable (or you can have malleability
in 2nd level dependencies), so I do not believe it makes sense to allow
mixed usage of the txids at all.
The txid or txid-norm is signed, so can't be changed after signing.

The hard fork is to allow transactions to refer to their inputs by txid or
txid-norm. You pick one before signing.
Post by Pieter Wuille
They do not provide the actual benefit of guaranteed non-malleability
before it becomes disallowed to use the old mechanism.
A signed transaction cannot have its txid changed. It is true that users
of the system would have to use txid-norm.

The basic refund transaction is as follows.

A creates TX1: "Pay w BTC to <B's public key> if signed by A & B"

A creates TX2: "Pay w BTC from TX1-norm to <A's public key>, locked 48
hours in the future, signed by A"

A sends TX2 to B

B signs TX2 and returns to A

A broadcasts TX1. It is mutated before entering the chain to become
TX1-mutated.

A can still submit TX2 to the blockchain, since TX1 and TX1-mutated have
the same txid-norm.
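
The refund flow above can be sketched in a few lines of Python. This is a
toy model only: the dict layout, the `h`, `txid` and `txid_norm` helpers and
the placeholder strings are assumptions for illustration, and the index that
resolves either kind of id stands in for what a hard-forked node would keep.

```python
import hashlib

def h(obj):
    return hashlib.sha256(repr(obj).encode()).hexdigest()

def txid(tx):
    # Legacy id: covers the scriptSigs.
    return h((tx["inputs"], tx["outputs"]))

def txid_norm(tx):
    # Normalized id: scriptSigs are blanked before hashing.
    stripped = [(prev, index, "") for prev, index, _script_sig in tx["inputs"]]
    return h((stripped, tx["outputs"]))

tx1 = {"inputs": [("funding-tx", 0, "sig-by-A")],
       "outputs": ["w BTC, spendable if signed by A & B"]}
tx1_mutated = {"inputs": [("funding-tx", 0, "sig-by-A-malleated")],
               "outputs": tx1["outputs"]}
# TX2 is built and signed against TX1-norm before TX1 is broadcast.
tx2 = {"inputs": [(txid_norm(tx1), 0, "sig-by-A sig-by-B")],
       "outputs": ["w BTC to A, locked 48 hours"]}

# Only the mutated TX1 confirms; the chain indexes it under both ids.
index = {txid(tx1_mutated): tx1_mutated,
         txid_norm(tx1_mutated): tx1_mutated}

refund_parent = index.get(tx2["inputs"][0][0])
print("legacy txid of TX1 found:", txid(tx1) in index)
print("TX2 still resolves:      ", refund_parent is tx1_mutated)
```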
Post by Pieter Wuille
That, together with the +- resource doubling needed for the UTXO set (as
earlier mentioned) and the fact that an alternative which is only a
softfork is available, makes this a bad idea IMHO.
Unsure to what extent this has been presented on the mailinglist, but the
* Transactions get 2 txids, one used to reference them (computed as
before), and one used in an (extended) sighash.
* The txins keep using the normal txid, so no structural changes to
Bitcoin.
* The ntxid is computed by replacing the scriptSigs in inputs by the empty
string, and by replacing the txids in txins by their corresponding ntxids.
* A new checksig operator is softforked in, which uses the ntxids in its
sighashes rather than the full txid.
* To support efficiently computing ntxids, every tx in the utxo set
(currently around 6M) stores the ntxid, but only supports lookup by txid
still.
This does result in a system where a changed dependency indeed invalidates
the spending transaction, but the fix is trivial and can be done without
access to the private key.
The problem with this is that 2 level malleability is not protected against.

C spends B which spends A.

A is mutated before it hits the chain. The only change in A is in the
scriptSig.

B can be converted to B-new without breaking the signature. This is
because the only change to A was in the scriptSig, which is dropped when
computing the txid-norm.

B-new spends A-mutated. B-new is different from B in a different place.
The txid it uses to refer to the previous output is changed.

The signed transaction C cannot be converted to a valid C-new. The txid of
the input points to B. It is updated to point at B-new. B-new and B don't
have the same txid-norm, since the change is outside the scriptSig. This
means that the signature for C is invalid.

The txid replacements should be done recursively. All input txids should
be replaced by txid-norms when computing the txid-norm for the
transaction. I think this repairs the problem with only allowing one level?

Computing txid-norm:

- replace all txids in inputs with txid-norms of those transactions
- replace all input scriptSigs with empty scripts
- transaction hash is txid-norm for that transaction
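
The three steps above can be sketched in Python as follows. It is only a
sketch: the `Tx`/`TxIn` classes, the JSON "serialization" and the `known`
lookup table are assumptions for illustration, not actual Bitcoin
serializations.

```python
import hashlib
import json
from dataclasses import dataclass

def dsha(data):
    return hashlib.sha256(hashlib.sha256(data).digest()).hexdigest()

@dataclass
class TxIn:
    prev_txid: str    # legacy txid of the transaction being spent
    index: int
    script_sig: str

@dataclass
class Tx:
    inputs: list      # list of TxIn
    outputs: list     # list of [value, script_pubkey] pairs

def txid(tx):
    ins = [[i.prev_txid, i.index, i.script_sig] for i in tx.inputs]
    return dsha(json.dumps([ins, tx.outputs]).encode())

def txid_norm(tx, known):
    """known maps legacy txid -> Tx for every transaction seen so far."""
    ins = []
    for i in tx.inputs:
        parent = known.get(i.prev_txid)
        # Step 1: swap the parent's txid for its txid-norm (recursively).
        # Inputs without a known parent (coinbase-style) are left alone.
        ref = txid_norm(parent, known) if parent is not None else i.prev_txid
        # Step 2: replace the scriptSig with an empty script.
        ins.append([ref, i.index, ""])
    # Step 3: the hash of the result is the txid-norm.
    return dsha(json.dumps([ins, tx.outputs]).encode())

a = Tx([TxIn("00" * 32, 0, "sigA")], [[50, "pay-to-B"]])
a_mut = Tx([TxIn("00" * 32, 0, "sigA-mutated")], [[50, "pay-to-B"]])
b = Tx([TxIn(txid(a), 0, "sigB")], [[50, "pay-to-C"]])
b_new = Tx([TxIn(txid(a_mut), 0, "sigB")], [[50, "pay-to-C"]])
known = {txid(t): t for t in (a, a_mut, b, b_new)}

print("txid(B) == txid(B-new):          ", txid(b) == txid(b_new))
print("txid-norm(B) == txid-norm(B-new):",
      txid_norm(b, known) == txid_norm(b_new, known))
```

The naive recursion here walks all the way back on every call; a real node
would presumably cache the result per transaction instead.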

The same situation as above is not fatal now.

C spends B which spends A.

A is mutated before it hits the chain. The only change in A is in the
scriptSig.

B can be converted to B-new without breaking the signature. This is
because the only change to A was in the scriptSig, which is dropped when
computing the txid-norm (as before).

B-new spends A-mutated. B-new differs from B in the txid it uses to refer
to the previous output.

The input for B-new points to A-mutated. When computing the txid-norm,
that would be replaced with the txid-norm for A.

Similarly, the input for B points to A and that would have been replaced
with the txid-norm for A.

This means that B and B-new have the same txid-norm.

The signed transaction C can be converted to a valid C-new. The txid of
the input points to B. It is updated to point at B-new. B-new and B now
have the same txid-norm and so C is valid.

I think this reasoning is valid, but probably needs writing out actual
serializations.
Tier Nolan
2015-05-13 20:27:14 UTC
Permalink
After more thought, I think I came up with a clearer description of the
recursive version.

The simple definition is that the hash for the new signature opcode should
simply assume that the normalized txid system was used since the
beginning. All txids in the entire blockchain should be replaced with the
"correct" values.

This requires a full re-index of the blockchain. You can't work out what
the TXID-N of a transaction is without knowing the TXID-N of its parents,
in order to do the replacement.

The non-recursive version can only handle refunds one level deep.

A:
from: IN
sigA: based on hash(...)

B:
from A
sig: based on hash(from: TXID-N(A) | "") // sig removed

C:
from B
sig: based on hash(from: TXID-N(B) | "") // sig removed

If A is mutated before being added into the chain, then B can be modified
to a valid transaction (B-new).

A-mutated:
from: IN
sig_mutated: based on hash(...) with some mutation

B has to be modified to B-new to make it valid.

B-new:
from A-mutated
sig: based on hash(from: TXID-N(A-mutated), "")

Since TXID-N(A-mutated) is equal to TXID-N(A), the signature from B is
still valid.

However, C-new cannot be created.

C-new:
from B-new
sig: based on hash(from: TXID-N(B-new), "")

TXID-N(B-new) is not the same as TXID-N(B). Since the from field is not
removed by the TXID-N operation, differences in that field mean that the
TXID-Ns are different.

This means that the signature for C is not valid for C-new.

The recursive version repairs this problem.

Rather than simply deleting the scriptSig from the transaction, all txids
must also be replaced with their TXID-N versions.

Again, A is mutated before being added into the chain and B-new is produced.

A-mutated:
from: IN
sig_mutated: based on hash(...) with some mutation
TXID-N: TXID-N(A)

B has to be modified to B-new to make it valid.

B-new:
from A-mutated
sig: based on hash(from: TXID-N(A-mutated), "")
TXID-N: TXID-N(B)

Since TXID-N(A-mutated) is equal to TXID-N(A), the signature from B is
still valid.

Likewise the TXID-N(B-new) is equal to TXID-N(B).

The from field is replaced by the TXID-N from A-mutated which is equal to
TXID-N(A) and the sig is the same.

C-new:
from B-new
sig: based on hash(from: TXID-N(B-new), "")

The signature is still valid, since TXID-N(B-new) is the same as TXID-N(B).

This means that multi-level refunds are possible.
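
Here is the A/B/C walk-through above as a runnable Python sketch, comparing
the two variants side by side. The dict layout with `from`/`sig`/`out`
fields mirrors the notation used above, but the hashing and the
`sig_message` helper are toy assumptions, not real signature hashes.

```python
import hashlib

def h(*fields):
    return hashlib.sha256("|".join(fields).encode()).hexdigest()

def txid(tx):
    return h(tx["from"], tx["sig"], tx["out"])

def txid_n(tx, chain, recursive):
    src = tx["from"]
    parent = chain.get(src)
    if recursive and parent is not None:
        # Recursive variant: the from field is itself normalized.
        src = txid_n(parent, chain, recursive)
    return h(src, "", tx["out"])  # "" stands for the removed sig

def sig_message(tx, chain, recursive):
    """What tx's signature commits to: hash(from: TXID-N(parent) | "")."""
    parent = chain[tx["from"]]
    return h(txid_n(parent, chain, recursive), "", tx["out"])

A = {"from": "IN", "sig": "sigA", "out": "outA"}
A_mut = {**A, "sig": "sigA-mutated"}
B = {"from": txid(A), "sig": "sigB", "out": "outB"}
B_new = {**B, "from": txid(A_mut)}
C = {"from": txid(B), "sig": "sigC", "out": "outC"}
C_new = {**C, "from": txid(B_new)}
chain = {txid(t): t for t in (A, A_mut, B, B_new)}

for recursive in (False, True):
    b_ok = sig_message(B, chain, recursive) == sig_message(B_new, chain, recursive)
    c_ok = sig_message(C, chain, recursive) == sig_message(C_new, chain, recursive)
    print("recursive" if recursive else "non-recursive",
          "| B's sig valid for B-new:", b_ok,
          "| C's sig valid for C-new:", c_ok)
```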
Pieter Wuille
2015-05-13 20:31:06 UTC
Permalink
Post by Tier Nolan
After more thought, I think I came up with a clearer description of the
recursive version.
The simple definition is that the hash for the new signature opcode should
simply assume that the normalized txid system was used since the
beginning. All txids in the entire blockchain should be replaced with the
"correct" values.
This requires a full re-index of the blockchain. You can't work out what
the TXID-N of a transaction is without knowing the TXID-N of its parents,
in order to do the replacement.
The non-recursive version can only handle refunds one level deep.
This was what I was suggesting all along, sorry if I wasn't clear.
--
Pieter
Tier Nolan
2015-05-13 20:32:43 UTC
Permalink
Post by Pieter Wuille
This was what I was suggesting all along, sorry if I wasn't clear.
That's great. So, basically the multi-level refund problem is solved by
this?
Pieter Wuille
2015-05-14 00:37:30 UTC
Permalink
Post by Pieter Wuille
This was what I was suggesting all along, sorry if I wasn't clear.
That's great. So, basically the multi-level refund problem is solved by
this?
Yes. So to be clear, I think there are 2 desirable end-goal proposals
(ignoring difficulty of changing things for a minute):

* Transactions and blocks keep referring to other transactions by full
txid, but signature hashes are computed off normalized txids (which are
recursively defined to use normalized txids all the way back to coinbases).
Is this what you are suggesting now as well?

* Blocks commit to full transaction data, but transactions and signature
hashes use normalized txids.

The benefit of the latter solution is that it doesn't need "fixing up"
transactions whose inputs have been malleated, but comes at the cost of
doing a very invasive hard fork.
--
Pieter
Christian Decker
2015-05-14 11:01:56 UTC
Permalink
Ok, I think I got the OP_CHECKAWESOMESIG proposal: transactions keep
referencing using hashes of complete transactions (including signatures),
while the OP_CHECKAWESOMESIG looks up the previous transaction (which we
already need to do anyway in order to insert the prevOut pubkeyScript),
normalizes the prevout and calculates its normalized transaction ID. It
then inserts the normalized transaction IDs in the OutPoint before
calculating its own hash which is then signed. Is that correct so far?

Let me try to summarize the discussion so far:

I think we have consensus that transaction malleability needs to be
addressed, and normalized transaction IDs seem to be the way to go forward.

The discussion now is how to use normalized transaction IDs and we have two
approaches to implement them:

- OP_CHECKAWESOMESIG which continues to use the current hashes to
reference a specific signed instance of a class of semantically identical
transactions. Internally only the semantic class is enforced. Transactions
can be fixed to reference the correct signed instance if the transaction
has been changed along the way. It is a softfork using the "if I don't know
this opcode the TX is automatically valid" trick.
Post by Pieter Wuille
Post by Pieter Wuille
This was what I was suggesting all along, sorry if I wasn't clear.
That's great. So, basically the multi-level refund problem is solved by
this?
Yes. So to be clear, I think there are 2 desirable end-goal proposals
* Transactions and blocks keep referring to other transactions by full
txid, but signature hashes are computed off normalized txids (which are
recursively defined to use normalized txids all the way back to coinbases).
Is this what you are suggesting now as well?
* Blocks commit to full transaction data, but transactions and signature
hashes use normalized txids.
The benefit of the latter solution is that it doesn't need "fixing up"
transactions whose inputs have been malleated, but comes at the cost of
doing a very invasive hard fork.
--
Pieter
_______________________________________________
Bitcoin-development mailing list
https://lists.sourceforge.net/lists/listinfo/bitcoin-development
Christian Decker
2015-05-14 11:26:44 UTC
Permalink
s7r
2015-05-15 09:54:55 UTC
Permalink
Hello,

How exactly will this be safe against:
a) the malleability of the parent tx (2nd level malleability)
b) replays

If you strip just the scriptSig of the input(s), the txid(s) can still
be mutated (with higher probability before it gets confirmed).

If you strip both the scriptSig of the parent and the txid, nothing can
any longer be mutated but this is not safe against replays. This could
work if we were using only one scriptPubKey per tx. But this is not
enforced, and I don't think it's the proper way to do it.

Something similar can be achieved if you would use a combination of
flags from here:

https://github.com/scmorse/bitcoin-misc/blob/master/sighash_proposal.md

But this has some issues too.

I've read your draft but didn't understand how exactly will this prevent
normal malleability as we know it, second level malleability and replays
as well as how will we do the transition into mapping the txes in the
blockchain to normalized txids. Looking forward to reading more on this
topic. Thanks for the brainstorming ;)
Tier Nolan
2015-05-15 10:45:05 UTC
Permalink
Post by s7r
Hello,
a) the malleability of the parent tx (2nd level malleability)
The signature signs everything except the signature itself. The normalized
txid doesn't include that signature, so mutations of the signature don't
cause the normalized txid to change.

If the refund transaction refers to the parent using the normalised txid,
then it doesn't matter if the parent has a mutated signature. The
normalized transaction ignores the mutation.

If the parent is mutated, then the refund doesn't even have to be modified;
it still refers to the parent.

If you want a multi-level refund transaction, then all refund transactions
must use the normalized txids to refer to their parents. The "root"
transaction is submitted to the blockchain and locked down.
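
The point above can be illustrated with a minimal, self-contained sketch. Everything in it is an assumption made for illustration: `ToyTx`, `ToyInput`, and the helper names are hypothetical, and FNV-1a stands in for Bitcoin's double SHA-256. It is not Bitcoin Core code and does not follow the real transaction serialization.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Toy stand-ins for illustration only; not Bitcoin Core types.
struct ToyInput {
    std::string prevTxId;   // ID of the transaction being spent
    uint32_t    prevIndex;  // index of the output being spent
    std::string scriptSig;  // signature data (the malleable part)
};

struct ToyTx {
    std::vector<ToyInput> inputs;
    std::string outputs;    // outputs flattened to one string for brevity
};

// FNV-1a, used here only as a stand-in for double SHA-256.
uint64_t toyHash(const std::string& data) {
    uint64_t h = 14695981039346656037ULL;
    for (unsigned char c : data) {
        h ^= c;
        h *= 1099511628211ULL;
    }
    return h;
}

// Serialize the transaction, optionally leaving out every scriptSig.
std::string serialize(const ToyTx& tx, bool stripScriptSig) {
    std::string out;
    for (const ToyInput& in : tx.inputs) {
        out += in.prevTxId + ":" + std::to_string(in.prevIndex) + ":";
        if (!stripScriptSig) out += in.scriptSig;
        out += "|";
    }
    return out + tx.outputs;
}

// Legacy ID: hash of everything, signatures included.
uint64_t legacyId(const ToyTx& tx) { return toyHash(serialize(tx, false)); }

// Normalized ID: hash of everything except the scriptSigs.
uint64_t normalizedId(const ToyTx& tx) { return toyHash(serialize(tx, true)); }
```

Given a parent transaction and a copy whose `scriptSig` has been altered, `legacyId` differs between the two while `normalizedId` stays the same, so a refund that refers to the parent by its normalized ID keeps pointing at whichever variant gets confirmed.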
Post by s7r
b) replays
If there are 2 transactions which are mutations of each other, then only
one can be added to the block chain, since the other is a double spend.

The normalized txid refers to all of them, rather than a specific
transaction.
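
A rough sketch of why only one mutation can confirm, under the assumption of a toy unspent-output set. `ToyUtxoSet`, `ToySpend`, and `OutPoint` are hypothetical names for illustration, not Bitcoin Core types.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <utility>
#include <vector>

// An outpoint names one output of one transaction (toy type, illustration only).
typedef std::pair<std::string, uint32_t> OutPoint;

struct ToySpend {
    std::vector<OutPoint> spends; // outputs consumed by this transaction
    std::string scriptSig;        // differs between mutations; ignored below
};

// Toy chain state: the set of outputs that are still unspent.
struct ToyUtxoSet {
    std::set<OutPoint> unspent;

    // Accept the transaction only if every output it spends is still unspent.
    bool apply(const ToySpend& tx) {
        for (const OutPoint& op : tx.spends)
            if (unspent.count(op) == 0) return false; // already spent or unknown
        for (const OutPoint& op : tx.spends)
            unspent.erase(op);
        return true;
    }
};
```

Two mutations of the same transaction spend the same outpoints, so once `apply` has accepted one of them, it rejects the other as a double spend regardless of what its `scriptSig` looks like.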
Post by s7r
If you strip just the scriptSig of the input(s), the txid(s) can still
be mutated (with higher probability before it gets confirmed).
Mutation is only a problem if it occurs after signing. The signature signs
everything except the signature itself.
Post by s7r
If you strip both the scriptSig of the parent and the txid, nothing can
any longer be mutated but this is not safe against replays.
Correct, but normalized txids are safe against replays, so they are the
better option.

I think the new signature opcode fixes things too. The question is a hard
fork with a clean solution versus a soft fork with a little more hassle.
Luke Dashjr
2015-05-15 16:31:47 UTC
Permalink
Post by s7r
If you strip both the scriptSig of the parent and the txid, nothing can
any longer be mutated but this is not safe against replays. This could
work if we were using only one scriptPubKey per tx. But this is not
enforced, ...
Assuming you mean one output per scriptPubKey (and not limiting tx to one
output), the alternative is essentially undefined, and creates real problems
for Bitcoin today. It's not something we should go out of the way to support
or encourage. Therefore, regardless of whatever other options are available, I
would like to see a scriptPubKey-only sighash type for strong safety within
all malleability situations (including CoinJoin and other sender-respends)
that more advanced wallet software could take advantage of in the future
(while strictly enforcing no-reuse on its own wallet to avoid known replays).

Luke
Stephen
2015-05-16 03:58:56 UTC
Permalink
We should make sure to consider how BIP34 affects normalized transaction ids, since the height of the block is included in the scriptSig ensuring that the txid will be different. We wouldn't want to enable replay attacks in the form of spending coinbase outputs in the same way they were spent from a previous block.

So maybe normalized txids should strip the scriptSigs of all transactions except for coinbase transactions? This seems to make sense, since coinbase transactions are inherently not malleable anyway.

Also, s7r linked to my 'Build your own nHashType' proposal (although V2 is here: https://github.com/scmorse/bitcoin-misc/blob/master/sighash_proposal_v2.md). I just wanted to add that I think even with normalized ids, it could still be useful to be able to apply these flags to choose which parts of the transaction become signed. I've also seen vague references to some kind of a merklized abstract syntax tree, but am not fully sure how that would work. Maybe someone on here could explain it?

Best,
Stephen
Post by s7r
Hello,
a) the malleability of the parent tx (2nd level malleability)
b) replays
If you strip just the scriptSig of the input(s), the txid(s) can still
be mutated (with higher probability before it gets confirmed).
If you strip both the scriptSig of the parent and the txid, nothing can
any longer be mutated but this is not safe against replays. This could
work if we were using only one scriptPubKey per tx. But this is not
enforced, and I don't think it's the proper way to do it.
Something similar can be achieved if you would use a combination of
https://github.com/scmorse/bitcoin-misc/blob/master/sighash_proposal.md
But this has some issues too.
I've read your draft but didn't understand how exactly will this prevent
normal malleability as we know it, second level malleability and replays
as well as how will we do the transition into mapping the txes in the
blockchain to normalized txids. Looking forward to read more on this
topic. Thanks for the brainstorming ;)
Tier Nolan
2015-05-16 10:52:34 UTC
Permalink
Post by Stephen
We should make sure to consider how BIP34 affects normalized transaction
ids, since the height of the block is included in the scriptSig ensuring
that the txid will be different. We wouldn't want to enable replay attacks
in the form of spending coinbase outputs in the same way they were spent
from a previous block.
So maybe normalized txids should strip the scriptSigs of all transactions
except for coinbase transactions? This seems to make sense, since coinbase
transactions are inherently not malleable anyway.
That is a good point. Since the point of the change is to apply good
practice all the way back to the genesis block, maybe the scriptSig for
coinbases could be replaced by the height expressed as a varint. That
means that all coinbases get a unique normalized txid. The coinbases with
duplicate txids still wouldn't be spendable though.
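
As a sketch of that idea, the following assumes "varint" means Bitcoin's CompactSize encoding (the variable-length integer used in the wire format). That reading is an assumption, as is the hypothetical helper `normalizedCoinbaseScriptSig`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Encode n as a CompactSize integer: one byte for values below 0xfd,
// otherwise a marker byte followed by a little-endian payload.
std::vector<uint8_t> encodeCompactSize(uint64_t n) {
    std::vector<uint8_t> out;
    int payloadBytes = 0;
    if (n < 0xfd) {
        out.push_back(static_cast<uint8_t>(n)); // small values fit in one byte
        return out;
    } else if (n <= 0xffff) {
        out.push_back(0xfd); payloadBytes = 2;
    } else if (n <= 0xffffffffULL) {
        out.push_back(0xfe); payloadBytes = 4;
    } else {
        out.push_back(0xff); payloadBytes = 8;
    }
    for (int i = 0; i < payloadBytes; ++i)
        out.push_back(static_cast<uint8_t>((n >> (8 * i)) & 0xff));
    return out;
}

// Hypothetical helper: the bytes that would stand in for a coinbase
// scriptSig when computing the normalized txid.
std::vector<uint8_t> normalizedCoinbaseScriptSig(uint32_t height) {
    return encodeCompactSize(height);
}
```

Because every block height maps to a distinct byte string, two coinbases at different heights would hash differently even if everything else in them were identical. For example, height 356000 encodes as the five bytes `fe a0 6e 05 00`.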
Christian Decker
2015-05-19 08:28:58 UTC
Permalink
Thanks Stephen, I hadn't thought about BIP 34 and we need to address this
in both proposals. If we can avoid it I'd like not to have one transaction
hashed one way and other transactions in another way.

Since BIP 34 explicitly uses the scriptSig to make the coinbase transaction
unique, simply removing the scriptSig is not an option as it would
potentially cause collisions. I don't remember why the scriptSig was
chosen, but we also have the option of putting the blockchain height in the
sequence number of the coinbase input or the locktime of the transaction,
restoring the uniqueness constraint in normalized transaction IDs (for both
proposals). Is there a specific reason why that was not chosen at the time?
Tier Nolan
2015-05-19 09:13:17 UTC
Permalink
On Tue, May 19, 2015 at 9:28 AM, Christian Decker wrote:
Post by Christian Decker
Thanks Stephen, I hadn't thought about BIP 34 and we need to address this
in both proposals. If we can avoid it I'd like not to have one
transaction hashed one way and other transactions in another way.
The normalized TXID cannot depend on height for other transactions.
Otherwise, it gets mutated when it is added to the chain, depending on the
height.

An option would be that the height is included in the scriptSig for all
transactions, but for non-coinbase transactions, the height used is zero.

I think if height has to be an input into the normalized txid function, the
specifics of inclusion don't matter.

The previous txid of a coinbase input is required to be all zeros, so the
normalized txid could be computed by adding the height to the previous
txids of all inputs. Again, non-coinbase transactions would use a height
of zero.
Post by Christian Decker
Is there a specific reason why that was not chosen at the time?
I assumed it was because the scriptSig in the coinbase is specifically
intended to hold "random" bytes/extra nonce, so putting a restriction on it
was guaranteed to be backward compatible.
Christian Decker
2015-05-19 10:43:39 UTC
Permalink
Post by Tier Nolan
On Tue, May 19, 2015 at 9:28 AM, Christian Decker wrote:
Post by Christian Decker
Thanks Stephen, I hadn't thought about BIP 34 and we need to address this
in both proposals. If we can avoid it I'd like not to have one
transaction hashed one way and other transactions in another way.
The normalized TXID cannot depend on height for other transactions.
Otherwise, it gets mutated when it is added to the chain, depending on the
height.
Well in the case of coinbase transactions we want them to be dependent on
the height they are included in, which is not a problem since they are only
valid in conjunction with the block that mined them.
Post by Tier Nolan
An option would be that the height is included in the scriptSig for all
transactions, but for non-coinbase transactions, the height used is zero.
No need to add an extra field to the transaction just to include the
height. We can just add a rule that the height specified in the scriptSig
in coinbase transactions (and only coinbase transactions) is copied into
the locktime of the transaction before computing the normalized transaction
ID and leave the locktime untouched for all normal transactions.
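
A minimal sketch of that rule, under stated assumptions: `ToyCoinbase` and both helpers are hypothetical, and the parser only handles the plain form of the BIP 34 height (a length byte of 1 to 4 followed by that many little-endian bytes). It ignores the sign bit and other script-number details.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy coinbase for illustration only; not a Bitcoin Core type.
struct ToyCoinbase {
    std::vector<uint8_t> scriptSig; // assumed to begin with the BIP 34 height push
    uint32_t lockTime;
};

// Read the height from the leading push of the scriptSig.
// Returns false if the scriptSig is not in the simple form described above.
bool readBip34Height(const std::vector<uint8_t>& scriptSig, uint32_t& height) {
    if (scriptSig.empty()) return false;
    const std::size_t len = scriptSig[0];
    if (len < 1 || len > 4 || scriptSig.size() < 1 + len) return false;
    uint32_t h = 0;
    for (std::size_t i = 0; i < len; ++i)
        h |= static_cast<uint32_t>(scriptSig[1 + i]) << (8 * i);
    height = h;
    return true;
}

// Build the copy of the coinbase that would be hashed for the normalized ID:
// the height moves into the locktime, then the scriptSig is stripped.
bool normalizeCoinbase(const ToyCoinbase& cb, ToyCoinbase& out) {
    uint32_t height = 0;
    if (!readBip34Height(cb.scriptSig, height)) return false;
    out = cb;
    out.lockTime = height;  // carry the height over before stripping
    out.scriptSig.clear();  // strip the scriptSig as for any other input
    return true;
}
```

With this in place, two coinbases that differ only in their BIP 34 height still differ after normalization, because the height survives in `lockTime` once the `scriptSig` is gone.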
Post by Tier Nolan
I think if height has to be an input into the normalized txid function,
the specifics of inclusion don't matter.
The previous txid for coinbases are required to be all zeros, so the
normalized txid could be to add the height to the txids of all inputs.
Again, non-coinbase transactions would have heights of zero.
Post by Christian Decker
Is there a specific reason why that was not chosen at the time?
I assumed that since the scriptSig in the coinbase is specifically
intended to be "random" bytes/extra nonce, so putting a restriction on it
was guaranteed to be backward compatible.
Sounds reasonable :-)
Stephen Morse
2015-05-19 12:48:20 UTC
Permalink
Post by Tier Nolan
An option would be that the height is included in the scriptSig for all
transactions, but for non-coinbase transactions, the height used is zero.
Post by Christian Decker
No need to add an extra field to the transaction just to include the
height. We can just add a rule that the height specified in the scriptSig
in coinbase transactions (and only coinbase transactions) is copied into
the locktime of the transaction before computing the normalized transaction
ID and leave the locktime untouched for all normal transactions.
No need to replace lock times (or any other part of the transaction) at
all. If you have to, just serialize the height right before serializing the
transaction (into the same buffer). And you could pre-serialize 0 instead
of the height for all non-coinbase transactions. I don't really see what
that gets you, though, because the 0 is not really doing anything.

But, I don't see any reason you have to mess with the serialization this
much at all. Just do:

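uint256 normalized_txid(const CTransaction& txIn)
{
    // Work on a mutable copy so the caller's transaction is untouched
    CMutableTransaction tx(txIn);

    // Coinbase transactions are already normalized
    if (!txIn.IsCoinBase())
    {
        for (CTxIn& in : tx.vin)
        {
            // Point the input at the parent's normalized txid
            if (!ReplacePrevoutHashWithNormalizedHash(in.prevout))
                throw NormalizationError("Could not lookup prevout");
            // Drop the signature data, which is the malleable part
            in.scriptSig.clear();
        }
    }

    // Serialize
    CHashWriter ss(SER_GETHASH, 0);
    ss << tx;
    return ss.GetHash();
}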

An alternative could be (although I like the above option better):

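uint256 normalized_txid(const CTransaction& txIn, int nHeight)
{
    // Work on a mutable copy so the caller's transaction is untouched
    CMutableTransaction tx(txIn);

    for (CTxIn& in : tx.vin)
    {
        // A null prevout marks a coinbase input, which has no parent to look up
        if (!in.prevout.IsNull() &&
            !ReplacePrevoutHashWithNormalizedHash(in.prevout))
            throw NormalizationError("Could not lookup prevout");
        in.scriptSig.clear();
    }

    // Serialize
    CHashWriter ss(SER_GETHASH, 0);

    // Prefix coinbases with their block height so each gets a unique ID
    if (txIn.IsCoinBase())
        ss << nHeight;
    // or:
    // ss << (txIn.IsCoinBase() ? nHeight : 0);

    ss << tx;
    return ss.GetHash();
}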
