BUIP037: Hardfork SegWit
Proposer: Amaury SÉCHET
Submitted: 2016-11-12
Status: draft
There are various problems with the current transaction format, including
malleability and quadratic hashing. SegWit proposes to add a new
transaction format, which solves these problems but does so in a way that
does not allow spending existing UTXOs. As a result, SegWit doesn’t
deliver on its promise. FlexTrans proposes an alternative way to solve
these problems that is compatible with existing UTXOs, which
allows us to eventually weed out the old transaction format.
This BUIP proposes to adopt a strategy similar to FlexTrans but using
implementation details more similar to SegWit. Doing so should allow
actors in the ecosystem, who have already implemented SegWit, to support
this BUIP with minimal effort.
A new transaction format is introduced. It is recognized by its use of
version 7.
The binary format is similar to the existing transaction format. However,
the signature script for each input is replaced by witness data, the
output is limited to a version and a hash, and options and metadata
fields are added for extensibility.
For compactness, all integers are represented via a variable-length
encoding, except the version field. The leading byte of the encoding
indicates how large the representation is. The representation is big
endian.
First byte | Bits | From | To |
---|---|---|---|
0xxxxxxx | 7 | 0x0000000000000000 | 0x000000000000007f |
10xxxxxx | 14 | 0x0000000000000080 | 0x000000000000407f |
110xxxxx | 21 | 0x0000000000004080 | 0x000000000020407f |
1110xxxx | 28 | 0x0000000000204080 | 0x000000001020407f |
11110xxx | 35 | 0x0000000010204080 | 0x000000081020407f |
111110xx | 42 | 0x0000000810204080 | 0x000004081020407f |
1111110x | 49 | 0x0000040810204080 | 0x000204081020407f |
11111110 | 56 | 0x0002040810204080 | 0x010204081020407f |
11111111 | 64 | 0x0102040810204080 | 0xffffffffffffffff |
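The table above can be sketched in code as follows. This is a minimal illustration, not a reference implementation. It assumes that the bits following the prefix store the value minus the "From" column of its row, in big endian order, and that the same rule applies to the 9-byte form; the table does not state this explicitly. The names `OFFSETS`, `encode_varuint` and `decode_varuint` are made up for this example.

```python
MAX_U64 = 0xFFFFFFFFFFFFFFFF

# OFFSETS[k] is the smallest value encoded with k extra bytes
# (the "From" column of the table).
OFFSETS = [0]
for k in range(1, 9):
    OFFSETS.append(OFFSETS[-1] + (1 << (7 * k)))


def encode_varuint(value):
    """Encode an unsigned 64-bit integer (assumed offset-based encoding)."""
    if not 0 <= value <= MAX_U64:
        raise ValueError("value does not fit in 64 bits")
    # Pick the row whose range contains the value.
    k = max(i for i in range(9) if OFFSETS[i] <= value)
    payload = value - OFFSETS[k]
    if k == 8:
        return b"\xff" + payload.to_bytes(8, "big")
    prefix = (0xFF << (8 - k)) & 0xFF  # k one bits followed by a zero bit
    raw = payload.to_bytes(k + 1, "big")
    return bytes([raw[0] | prefix]) + raw[1:]


def decode_varuint(data, pos=0):
    """Return (value, position of the next unread byte)."""
    if pos >= len(data):
        raise ValueError("no data to read")
    first = data[pos]
    k = 0
    while k < 8 and first & (0x80 >> k):  # count leading one bits
        k += 1
    end = pos + 1 + k
    if end > len(data):
        raise ValueError("truncated varuint")
    low_bits = first & (0xFF >> (k + 1))  # payload bits held in the first byte
    payload = int.from_bytes(bytes([low_bits]) + data[pos + 1:end], "big")
    value = OFFSETS[k] + payload
    if value > MAX_U64:
        raise ValueError("varuint exceeds 64 bits")
    return value, end
```

Under this assumption every value has exactly one encoding, which is why the ranges in the table do not overlap.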
It has the following binary format:
Field Size | Description | Data type | Comments |
---|---|---|---|
1+ | version | varuint | Transaction data format version (in this case, 7) |
1+ | tx_in count | varuint | Number of Transaction inputs |
35+ | tx_in | tx_in[] | A list of 1 or more transaction inputs or sources for coins |
1+ | tx_out count | varuint | Number of Transaction outputs |
22+ | tx_out | tx_out[] | A list of 1 or more transaction outputs or destinations for coins |
1+ | option size | varuint | The size of the option field, in bytes |
? | option | metadata | Optional metadata relative to this transaction |
Transaction id (aka txid) is computed by hashing the transaction,
skipping over input’s witness and metadata. See the inputs section for
more details.
The option field is interpreted as follows:
The metadata field contains a set of entries. Each entry is made of one
varuint tag, followed by a value, whose size and representation depend
on the tag. A given tag value cannot appear twice.
Because the size of the metadata field is known by the parser, the
parser can skip over the remaining metadata when it encounters a tag with a
value it doesn’t know how to interpret.
This BUIP defines the following tags for transaction metadata:
Tag | Description | Data type | Comments |
---|---|---|---|
10 | LockByBlock | varuint | lock_time support (block height) |
11 | LockByTime | varuint | lock_time support (timestamp) |
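The parsing rules for a metadata field can be sketched as follows. This is only an illustration under assumptions: it uses an offset-based reading of the varuint table (stored bits equal the value minus the "From" column), it treats both tags defined here as carrying a single varuint value, and the names `read_varuint`, `KNOWN_TAGS` and `parse_metadata` are hypothetical.

```python
KNOWN_TAGS = {10: "LockByBlock", 11: "LockByTime"}


def read_varuint(data, pos):
    """Decode one varuint at pos; return (value, next position)."""
    if pos >= len(data):
        raise ValueError("no data to read")
    first = data[pos]
    k = 0
    while k < 8 and first & (0x80 >> k):  # leading one bits = extra bytes
        k += 1
    end = pos + 1 + k
    if end > len(data):
        raise ValueError("truncated varuint")
    head = first & (0xFF >> (k + 1))
    payload = int.from_bytes(bytes([head]) + data[pos + 1:end], "big")
    # Assumed offset: the smallest value of the k-extra-byte range.
    offset = sum(1 << (7 * i) for i in range(1, k + 1))
    return offset + payload, end


def parse_metadata(field):
    """Parse a metadata field whose size has already been read."""
    entries = {}
    seen = set()
    pos = 0
    while pos < len(field):
        tag, pos = read_varuint(field, pos)
        if tag in seen:
            raise ValueError("tag %d appears twice" % tag)
        seen.add(tag)
        if tag not in KNOWN_TAGS:
            # The size of the value is unknown, so the rest of the
            # field is skipped.
            break
        value, pos = read_varuint(field, pos)
        entries[KNOWN_TAGS[tag]] = value
    return entries
```

For example, `parse_metadata(bytes([10, 5]))` yields a single LockByBlock entry with value 5, and any bytes following an unknown tag are ignored.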
TxIn consists of the following fields:
Field Size | Description | Data type | Comments |
---|---|---|---|
33+ | previous_output | outpoint | The previous output transaction reference, as an OutPoint structure |
1+ | witness count | varuint | Number of witness data |
? | witness | witness | Witness data |
1+ | metadata size | varuint | The size of the metadata field, in bytes |
? | metadata | metadata | Optional metadata relative to this input |
The OutPoint structure consists of the following fields:
Field Size | Description | Data type | Comments |
---|---|---|---|
32 | hash | char[32] | The hash of the referenced transaction |
1+ | index | varuint | The index of the specific output in the transaction. The first output is 0, etc. |
Witness consists of the following fields:
Field Size | Description | Data type | Comments |
---|---|---|---|
1+ | witness length | varuint | Witness length |
? | witness | uchar[] | Witness data |
The witness field represents the elements that are on the stack before the
redeem script starts to run.
This BUIP defines the following tags for input metadata:
Tag | Description | Data type | Comments |
---|---|---|---|
10 | LockByBlock | varuint | BIP68/112/113 support |
11 | LockByTime | varuint | BIP68/112/113 support |
Metadata tags are a BUIP039 extension point. They can be signaled with
the InMD prefix.
The signature process computes sighash as double_sha256(txid +
previous_output + TxOut + metadata + hashType)
TxOut needs to use the new format. When spending UTXOs from older
transactions, please refer to the conversion procedure in the Outputs
section.
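The sighash formula can be sketched as below. This is a minimal illustration that assumes every component has already been serialized to bytes in the new format and that `+` means byte concatenation. The encoding of hashType is not specified in this BUIP, so it is passed in as raw bytes. The function names are made up for this example.

```python
import hashlib


def double_sha256(data):
    """SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def compute_sighash(txid, previous_output, tx_out, metadata, hash_type):
    """double_sha256(txid + previous_output + TxOut + metadata + hashType).

    All arguments are bytes. tx_out is the output being spent, in the
    new format (after conversion if it is a legacy UTXO).
    """
    return double_sha256(txid + previous_output + tx_out + metadata + hash_type)
```

Because the digest depends on the txid plus data for a single input, the amount of data hashed per signature does not grow with the number of inputs in the transaction.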
TxOut consists of the following fields:
Field Size | Description | Data type | Comments |
---|---|---|---|
1+ | value | varuint | Transaction Value |
1+ | version | varuint | Script versioning capabilities |
20+ | hash | uchar[] | Usually contains the public key as a Bitcoin script setting up conditions to claim this output |
The 3 least significant bits of the version field indicate the size of
the output hash as follows:
Size class | Hash size |
---|---|
0 | 20B - 160bits |
1 | 24B - 192bits |
2 | 28B - 224bits |
3 | 32B - 256bits |
4 | 40B - 320bits |
5 | 48B - 384bits |
6 | 56B - 448bits |
7 | 64B - 512bits |
This BUIP defines 4 valid versions:
Version | Semantic |
---|---|
0 | P2KH - OP_HASH160 |
3 | P2KH - OP_HASH256 |
8 | P2SH - OP_HASH160 |
11 | P2SH - OP_HASH256 |
For both kinds of version, we’ll define OP_HASH as OP_HASH160 if the hash size
is 20 bytes and as OP_HASH256 if the hash size is 32 bytes.
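The relation between the version field, the hash size and OP_HASH can be sketched as follows. The names `HASH_SIZES`, `VALID_VERSIONS`, `hash_size` and `describe_version` are hypothetical and only serve this illustration.

```python
# Hash size in bytes for each size class (3 least significant bits).
HASH_SIZES = [20, 24, 28, 32, 40, 48, 56, 64]

# The 4 versions defined by this BUIP.
VALID_VERSIONS = {
    0: ("P2KH", "OP_HASH160"),
    3: ("P2KH", "OP_HASH256"),
    8: ("P2SH", "OP_HASH160"),
    11: ("P2SH", "OP_HASH256"),
}


def hash_size(version):
    """Size of the output hash, in bytes, for a given version."""
    return HASH_SIZES[version & 0b111]


def describe_version(version):
    """Return (kind, OP_HASH opcode, hash size) for a defined version."""
    if version not in VALID_VERSIONS:
        raise ValueError("version %d is not defined by this BUIP" % version)
    kind, op_hash = VALID_VERSIONS[version]
    return kind, op_hash, hash_size(version)
```

For instance, version 11 is 0b1011: its size class is 3, so the hash is 32 bytes and OP_HASH is OP_HASH256.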
If we have P2KH version, the following redeem script is executed to
verify the signature:
OP_DUP OP_HASH <pk_script> OP_EQUALVERIFY OP_CHECKSIG
If we have P2SH version, the topmost element of the stack is popped and
hashed using OP_HASH, and the result is compared to hash. If the
comparison fails, the transaction is invalid. If the comparison succeeds,
the popped element is executed as a script to validate the spend.
Out versions are a BUIP039 extension point. They can be signaled with the
OutV prefix.
Legacy UTXO can be converted to this new format using the following
procedure:
Script pattern | Version | Script |
---|---|---|
OP_DUP OP_HASH160 <pk_hash> OP_EQUALVERIFY OP_CHECKSIG | 0 | pk_hash |
OP_DUP OP_HASH256 <pk_hash> OP_EQUALVERIFY OP_CHECKSIG | 3 | pk_hash |
<pubkey> OP_CHECKSIG | 3 | double_sha256(pubkey) |
OP_HASH160 <script_hash> OP_EQUAL | 8 | script_hash |
anything | 11 | double_sha256(anything) |
NB: this may require the output script to be duplicated in the witness to
spend legacy UTXOs. This will happen in a very small number of cases
and is on purpose. Witness data can be put in cold storage, while UTXO
data needs to be kept hot because normal nodes query it
frequently when operating. Because of this, it is desirable to shift as
much data as possible from the UTXO set to the witness data.
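The conversion table can be sketched as a function over raw legacy output scripts. This is an illustration under assumptions: hashes and public keys are assumed to be pushed with a direct push opcode (one length byte), public keys are assumed to be 33 or 65 bytes long, and the function name `convert_legacy_output` is made up for this example.

```python
import hashlib

OP_DUP, OP_HASH160, OP_HASH256 = 0x76, 0xA9, 0xAA
OP_EQUAL, OP_EQUALVERIFY, OP_CHECKSIG = 0x87, 0x88, 0xAC


def double_sha256(data):
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def convert_legacy_output(script):
    """Map a legacy output script to a (version, hash) pair."""
    n = len(script)
    tail = bytes([OP_EQUALVERIFY, OP_CHECKSIG])
    # OP_DUP OP_HASH160 <pk_hash> OP_EQUALVERIFY OP_CHECKSIG
    if n == 25 and script[:3] == bytes([OP_DUP, OP_HASH160, 20]) and script[23:] == tail:
        return 0, script[3:23]
    # OP_DUP OP_HASH256 <pk_hash> OP_EQUALVERIFY OP_CHECKSIG
    if n == 37 and script[:3] == bytes([OP_DUP, OP_HASH256, 32]) and script[35:] == tail:
        return 3, script[3:35]
    # <pubkey> OP_CHECKSIG, with a 33 or 65 byte key (assumption)
    if n in (35, 67) and script[0] == n - 2 and script[-1] == OP_CHECKSIG:
        return 3, double_sha256(script[1:-1])
    # OP_HASH160 <script_hash> OP_EQUAL
    if n == 23 and script[:2] == bytes([OP_HASH160, 20]) and script[22] == OP_EQUAL:
        return 8, script[2:22]
    # Anything else: the whole script is hashed and must be supplied
    # in the witness when spending.
    return 11, double_sha256(script)
```

Only the last case stores a hash of the full script, which is the situation where the output script has to be repeated in the witness.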
- Signature malleability is prohibited as per BIP146.
- The OP_CHECKMULTISIG and OP_CHECKMULTISIGVERIFY dummy argument must be a
null vector as per BIP147.
This BUIP keeps things close to SegWit to minimize sunk cost for actors
supporting it. It also reuses the tag system from FlexTrans to ensure the
format is extensible, but limits this to the metadata and option fields in
order to enable BUIP039.
In conclusion, this combines the best parts of SegWit (extensibility and
privacy using a version/hash pair as output) and FlexTrans
(extensibility, compatible UTXO) and allows for BUIP039. In addition,
this limits the sunk cost for actors who prepared for SegWit.
EDIT: Revision, removing the lock_time field and reintroducing the
sequence one. More revisions to come as this converges toward FlexTrans.
EDIT2: Revision.
Output scripts are now just a hash. This is also an idea from SegWit and
allows shifting data from the UTXO set to the witness data. Because this
is a hard fork, it can be done in a way that is backward compatible.
Inputs now have a metadata field containing optional data identified via
a tag. This is a natural extension point for future extensions and an
ideal place to store optional data such as the sequence field.
EDIT3: Use BUIP039 to extend this transaction format.
EDIT4: Add a global option field which respects the metadata. Leverage
it to implement lock_time.
EDIT5: Rework the Out structure to be more compact. Add a rationale
section to compare to FlexTrans and SegWit.
EDIT6: Remove comparison to FT and SW as post length is limited.
EDIT7: Use variable-size encoding for integers all over the place. Define the
format.