BUIP037: Hardfork SegWit
Proposer: Amaury SÉCHET
Submitted: 2016-11-12
Status: draft

Summary

The current transaction format has several problems, including
malleability and quadratic hashing. SegWit adds a new transaction
format that solves these problems, but does so in a way that does not
allow existing UTXOs to be spent with it. As a result, SegWit doesn’t
deliver on its promise. FlexTrans proposes an alternative way to solve
these problems that is compatible with existing UTXOs, which allows us
to eventually weed out the old transaction format.

This BUIP proposes to adopt a strategy similar to FlexTrans, but with
implementation details closer to SegWit. Doing so should allow actors in
the ecosystem who have already implemented SegWit to support this BUIP
with minimal effort.

Specification

A new transaction format is introduced. It is recognized by its use of
version 7.

The binary format is similar to the existing transaction format.
However, the signature script for each input is replaced by witness
data, the output is limited to a version and a hash, and options and
metadata fields are added for extensibility.

For compactness, all integers are represented via a variable-length
encoding, except the version field. The leading byte of the encoding
indicates how large the representation is. The representation is big
endian.

First byte  Bits  From                To
0xxxxxxx    7     0x0000000000000000  0x000000000000007f
10xxxxxx    14    0x0000000000000080  0x000000000000407f
110xxxxx    21    0x0000000000004080  0x000000000020407f
1110xxxx    28    0x0000000000204080  0x000000001020407f
11110xxx    35    0x0000000010204080  0x000000081020407f
111110xx    42    0x0000000810204080  0x000004081020407f
1111110x    49    0x0000040810204080  0x000204081020407f
11111110    56    0x0002040810204080  0x010204081020407f
11111111    64    0x0102040810204080  0xffffffffffffffff
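
The table can be read as a prefix code in which each length covers its own range of values, so every integer has exactly one encoding. The Python sketch below is one possible reading, not a reference implementation. It assumes that the From column is subtracted from the value before the payload bits are packed, which is what the ranges suggest but which the text does not state explicitly. The names range_start, encode_varuint and decode_varuint are hypothetical.

```python
MAX_U64 = 0xFFFFFFFFFFFFFFFF


def range_start(nbytes: int) -> int:
    """Smallest value using an nbytes-long encoding (the table's From column)."""
    return sum(1 << (7 * k) for k in range(1, nbytes))


def encode_varuint(value: int) -> bytes:
    """Encode value, assuming the From column acts as an offset (an inference)."""
    if not 0 <= value <= MAX_U64:
        raise ValueError("varuint must fit in 64 bits")
    for nbytes in range(1, 9):
        payload = value - range_start(nbytes)
        if payload < (1 << (7 * nbytes)):
            # (nbytes - 1) leading one bits, a zero bit, then 7 * nbytes payload bits.
            marker = ((1 << (nbytes - 1)) - 1) << (7 * nbytes + 1)
            return (marker | payload).to_bytes(nbytes, "big")
    # Nine-byte form: a 0xff marker followed by a full 64-bit payload.
    return b"\xff" + (value - range_start(9)).to_bytes(8, "big")


def decode_varuint(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Return (value, position of the next byte)."""
    if pos >= len(data):
        raise ValueError("no data to decode")
    first = data[pos]
    # The number of leading one bits in the first byte gives the length.
    nbytes = 1
    while nbytes < 9 and first & (0x80 >> (nbytes - 1)):
        nbytes += 1
    end = pos + nbytes
    if end > len(data):
        raise ValueError("truncated varuint")
    if nbytes == 9:
        payload = int.from_bytes(data[pos + 1:end], "big")
    else:
        payload = int.from_bytes(data[pos:end], "big") & ((1 << (7 * nbytes)) - 1)
    value = payload + range_start(nbytes)
    if value > MAX_U64:
        raise ValueError("varuint exceeds 64 bits")
    return value, end
```

For instance, under this reading 0x7f is the last value that fits in one byte, and 0x80 becomes the two bytes 0x80 0x00.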

A transaction has the following binary format:

Field size  Description   Data type  Comments
1+          version       varuint    Transaction data format version (in this case, 7)
1+          tx_in count   varuint    Number of transaction inputs
35+         tx_in         tx_in[]    A list of 1 or more transaction inputs or sources for coins
1+          tx_out count  varuint    Number of transaction outputs
22+         tx_out        tx_out[]   A list of 1 or more transaction outputs or destinations for coins
1+          option size   varuint    The size of the option field, in bytes
?           option        metadata   Optional metadata relative to this transaction

The transaction id (aka txid) is computed by hashing the transaction,
skipping over each input’s witness and metadata. See the Inputs section
for more details.

The option field is interpreted as follows:

The metadata field contains a set of entries. Each entry is made of one
varuint tag, followed by a value whose size and representation depend
on the tag. A given tag cannot appear twice.

Because the size of the metadata field is known by the parser, the
parser can skip over the remaining metadata when it encounters a tag
whose value it doesn’t know how to interpret.

This BUIP defines the following tags for transaction metadata:

Tag  Description  Data type  Comments
10   LockByBlock  varuint    lock_time support (block height)
11   LockByTime   varuint    lock_time support (timestamp)
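
The parsing rules above can be sketched in Python as follows. This is an illustration under stated assumptions rather than a normative parser: read_varuint treats the From column of the integer encoding table as an offset (an inference from that table), and parse_metadata, read_varuint and KNOWN_TAGS are hypothetical names. On an unknown tag the sketch stops and ignores the rest of the field, since the size of the unknown value cannot be determined.

```python
KNOWN_TAGS = {10: "LockByBlock", 11: "LockByTime"}  # both carry a varuint value


def read_varuint(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode one varuint at pos and return (value, next position)."""
    if pos >= len(buf):
        raise ValueError("truncated metadata")
    nbytes = 1
    while nbytes < 9 and buf[pos] & (0x80 >> (nbytes - 1)):
        nbytes += 1  # one extra byte per leading one bit
    end = pos + nbytes
    if end > len(buf):
        raise ValueError("truncated varuint")
    if nbytes == 9:
        payload = int.from_bytes(buf[pos + 1:end], "big")
    else:
        payload = int.from_bytes(buf[pos:end], "big") & ((1 << (7 * nbytes)) - 1)
    # Assumption: each length starts where the previous one ended.
    offset = sum(1 << (7 * k) for k in range(1, nbytes))
    return payload + offset, end


def parse_metadata(field: bytes) -> dict[str, int]:
    """Parse a metadata field whose size is already known (len(field))."""
    entries: dict[str, int] = {}
    pos = 0
    while pos < len(field):
        tag, pos = read_varuint(field, pos)
        name = KNOWN_TAGS.get(tag)
        if name is None:
            break  # unknown tag: skip over the remaining metadata
        if name in entries:
            raise ValueError(f"tag {tag} appears twice")
        entries[name], pos = read_varuint(field, pos)
    return entries
```

With this sketch, parse_metadata(bytes([10, 5, 11, 7])) yields a LockByBlock of 5 and a LockByTime of 7, while a field that repeats tag 10 is rejected.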

Inputs

TxIn consists of the following fields:

Field size  Description      Data type  Comments
33+         previous_output  outpoint   The previous output transaction reference, as an OutPoint structure
1+          witness count    varuint    Number of witness data
?           witness          witness    Witness data
1+          metadata size    varuint    The size of the metadata field, in bytes
?           metadata         metadata   Optional metadata relative to this input

The OutPoint structure consists of the following fields:

Field size  Description  Data type  Comments
32          hash         char[32]   The hash of the referenced transaction
1+          index        varuint    The index of the specific output in the transaction. The first output is 0, etc.

Witness consists of the following fields:

Field size  Description     Data type  Comments
1+          witness length  varuint    Witness length
?           witness         uchar[]    Witness data

The witness field represents the elements that are on the stack before
the redeem script starts to run.

This BUIP defines the following tags for input metadata:

Tag  Description  Data type  Comments
10   LockByBlock  varuint    BIP68/112/113 support
11   LockByTime   varuint    BIP68/112/113 support

Metadata tags are a BUIP039 extension point. They can be signaled with
the InMD prefix.

The signature process computes sighash as double_sha256(txid +
previous_output + TxOut + metadata + hashType)

TxOut needs to use the new format. When spending a UTXO from an older
transaction, please refer to the conversion procedure in the Outputs
section.
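
As an illustration of the sighash formula, the Python sketch below concatenates the already-serialized components and hashes them twice with SHA-256. It assumes that "+" means byte concatenation and that each argument is passed in its serialized form. The function names are hypothetical, and the width of hashType (one byte in the usage example) is an assumption, since this BUIP does not specify it.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def sighash(txid: bytes, previous_output: bytes, tx_out: bytes,
            metadata: bytes, hash_type: bytes) -> bytes:
    """Hash the fields in the order given by the formula.

    All arguments are expected to be serialized already; tx_out is the
    output being spent, in the new format.
    """
    return double_sha256(txid + previous_output + tx_out + metadata + hash_type)


# Usage with placeholder bytes (not real transaction data).
example = sighash(
    txid=bytes(32),
    previous_output=bytes(32) + b"\x00",   # 32-byte hash followed by index 0
    tx_out=b"\x01\x00" + bytes(20),        # value 1, version 0, 20-byte hash
    metadata=b"",
    hash_type=b"\x01",                     # assumed one-byte hashType
)
```

Because the input metadata is part of the hashed data, changing a tag such as LockByBlock changes the sighash and therefore invalidates existing signatures.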

Outputs

TxOut consists of the following fields:

Field size  Description  Data type  Comments
1+          value        varuint    Transaction Value
1+          version      varuint    Script versioning capabilities
20+         hash         uchar[]    Usually contains the public key as a Bitcoin script setting up conditions to claim this output

The 3 least significant bits of the version field indicate the size of
the output hash as follows:

Size class  Hash size
0           20B - 160bits
1           24B - 192bits
2           28B - 224bits
3           32B - 256bits
4           40B - 320bits
5           48B - 384bits
6           56B - 448bits
7           64B - 512bits

This BUIP defines 4 valid versions:

Version  Semantic
0        P2KH - OP_HASH160
3        P2KH - OP_HASH256
8        P2SH - OP_HASH160
11       P2SH - OP_HASH256

For both kinds of version, we’ll define OP_HASH as OP_HASH160 if the
hash size is 20 bytes and OP_HASH256 if the hash size is 32 bytes.
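
The relationship between the version field, the size class and OP_HASH can be sketched as a pair of lookups. The Python names below are hypothetical; the data is taken directly from the two tables above.

```python
# Hash size in bytes for each size class (3 least significant bits of version).
HASH_SIZE_BY_CLASS = (20, 24, 28, 32, 40, 48, 56, 64)

# The four versions this BUIP defines: version -> (kind, hash opcode).
VALID_VERSIONS = {
    0: ("P2KH", "OP_HASH160"),
    3: ("P2KH", "OP_HASH256"),
    8: ("P2SH", "OP_HASH160"),
    11: ("P2SH", "OP_HASH256"),
}


def hash_size(version: int) -> int:
    """Size in bytes of the output hash implied by the version field."""
    return HASH_SIZE_BY_CLASS[version & 0b111]


def describe_output_version(version: int) -> tuple[str, str, int]:
    """Return (kind, OP_HASH opcode, hash size) for a valid version."""
    if version not in VALID_VERSIONS:
        raise ValueError(f"version {version} is not defined by this BUIP")
    kind, op_hash = VALID_VERSIONS[version]
    return kind, op_hash, hash_size(version)
```

For example, version 11 has size class 3, so it carries a 32-byte hash and uses OP_HASH256, which matches its entry in the version table.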

If we have a P2KH version, the following redeem script is executed to
verify the signature:

OP_DUP OP_HASH <pk_script> OP_EQUALVERIFY OP_CHECKSIG

If we have a P2SH version, the topmost element of the stack is popped
and hashed using OP_HASH, and the result is compared to hash. If the
comparison fails, the transaction is invalid. If the comparison
succeeds, the popped element is executed as a script to validate the
spend.
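
The first half of the P2SH check can be sketched in Python for the OP_HASH256 case (version 11). The sketch assumes that the last witness element is the topmost stack element, and it stops before script execution, which is out of scope here. OP_HASH160 is left out only because RIPEMD-160 is not available in every Python hashlib build. The function names are hypothetical.

```python
import hashlib


def op_hash256(data: bytes) -> bytes:
    """OP_HASH256: SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def pop_and_check_script(witness: list[bytes], output_hash: bytes) -> tuple[bytes, list[bytes]]:
    """Pop the topmost element and compare its hash to the output hash.

    Returns (script, remaining stack) on success. The caller would then
    execute the script against the remaining stack.
    """
    if not witness:
        raise ValueError("empty witness: nothing to pop")
    stack = list(witness)        # do not mutate the caller's list
    script = stack.pop()         # assumed: last element is the top of the stack
    if op_hash256(script) != output_hash:
        raise ValueError("script hash mismatch: transaction is invalid")
    return script, stack
```

A spend is rejected at this stage if the revealed script does not hash to the value committed in the output, before any script is run.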

Output versions are a BUIP039 extension point. They can be signaled with
the OutV prefix.

Legacy UTXO can be converted to this new format using the following
procedure:

Script pattern                                          Version  Script
OP_DUP OP_HASH160 <pk_hash> OP_EQUALVERIFY OP_CHECKSIG  0        pk_hash
OP_DUP OP_HASH256 <pk_hash> OP_EQUALVERIFY OP_CHECKSIG  3        pk_hash
<pubkey> OP_CHECKSIG                                    3        double_sha256(pubkey)
OP_HASH160 <script_hash> OP_EQUAL                       8        script_hash
anything                                                11       double_sha256(anything)
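
The conversion table can be sketched as pattern matching on the raw legacy script bytes. The Python below is an illustration, not a normative procedure. It assumes the standard Bitcoin opcode byte values, direct pushes for the hash and public key, and that the OP_CHECKSIG row describes a pay-to-public-key script with a 33- or 65-byte key. The function name convert_legacy_script is hypothetical.

```python
import hashlib

OP_DUP, OP_HASH160, OP_HASH256 = 0x76, 0xA9, 0xAA
OP_EQUAL, OP_EQUALVERIFY, OP_CHECKSIG = 0x87, 0x88, 0xAC


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def convert_legacy_script(script: bytes) -> tuple[int, bytes]:
    """Map a legacy output script to a (version, hash) pair."""
    n = len(script)
    tail = bytes([OP_EQUALVERIFY, OP_CHECKSIG])
    # OP_DUP OP_HASH160 <20-byte pk_hash> OP_EQUALVERIFY OP_CHECKSIG
    if n == 25 and script[:3] == bytes([OP_DUP, OP_HASH160, 20]) and script[23:] == tail:
        return 0, script[3:23]
    # OP_DUP OP_HASH256 <32-byte pk_hash> OP_EQUALVERIFY OP_CHECKSIG
    if n == 37 and script[:3] == bytes([OP_DUP, OP_HASH256, 32]) and script[35:] == tail:
        return 3, script[3:35]
    # <pubkey> OP_CHECKSIG with a compressed or uncompressed key (assumed)
    if n in (35, 67) and script[0] == n - 2 and script[-1] == OP_CHECKSIG:
        return 3, double_sha256(script[1:-1])
    # OP_HASH160 <20-byte script_hash> OP_EQUAL
    if n == 23 and script[:2] == bytes([OP_HASH160, 20]) and script[22] == OP_EQUAL:
        return 8, script[2:22]
    # Anything else: commit to the whole script.
    return 11, double_sha256(script)
```

In the last case the full legacy script has to be supplied again in the witness when the output is spent, since only its hash remains in the converted output.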

NB: this may require the output script to be duplicated in the witness
in order to spend a legacy UTXO. This will happen in a very small number
of cases and is on purpose. Witness data can be put in cold storage,
while UTXO data needs to be kept hot because normal nodes query it
frequently when operating. Because of this, it is desirable to shift as
much data as possible from the UTXO set to the witness data.

Semantic

- Signature malleability is prohibited as per BIP146.
- OP_CHECKMULTISIG and OP_CHECKMULTISIGVERIFY dummy argument must be a
  null vector as per BIP147.

Rationale

This BUIP keeps things close to SegWit to minimize the sunk cost for
actors supporting it. It also reuses the tag system from FlexTrans to
ensure the format is extensible, but limits this to the metadata and
option fields in order to enable BUIP039.

In conclusion, this combines the best parts of SegWit (extensibility and
privacy using a version/hash pair as output) and FlexTrans
(extensibility, compatible UTXO) and allows for BUIP039. In addition,
this limits the sunk cost for actors who prepared for SegWit.

Revisions

EDIT: Revision, removing the lock_time field and reintroducing the
sequence one. More revisions to come as this converges toward FlexTrans.

EDIT2: Revision.

Output scripts are now just a hash. This is also an idea from SegWit and
allows data to be shifted from the UTXO set to the witness data. Because
this is a hard fork, it can be done in a way that is backward
compatible. Inputs now have a metadata field containing optional data
identified via a tag. This is a natural extension point for future
extensions and an ideal place to store optional data such as the
sequence field.

EDIT3: Use BUIP039 to extend this transaction format.

EDIT4: Add a global option field which respects the metadata. Leverage
it to implement lock_time.

EDIT5: Rework the Out structure to be more compact. Add a rationale
section to compare to FlexTrans and SegWit.

EDIT6: Remove comparison to FT and SW as post length is limited.

EDIT7: Use variable-size encoding for integers throughout. Define the
format.