Evolution plan: transaction summaries

(Mike Hearn) #1

Evolution plans are writeups of how we could add support, in a backwards compatible way, for features that aren’t going to be ready in time for the 1.0 release. The goal is to explore possible evolutions and make sure that we have everything we need in place to do this.

These are not formal design documents and do not imply that these designs will be used. The goal is to surface cases where we need to make compatibility tweaks to ensure we could do such a design.

Backwards compatibility assumptions

  • We can add new fields to existing data structures. Old nodes will ignore them.
  • Flows know the version of the flow running on the other side, and can adapt what messages they send and receive as appropriate.
  • We can add new methods and types to the API, but not make breaking changes like renames or deletes.


Transaction summaries are described in the technical white paper, section 10.3. Briefly, they are short human-readable pieces of text that describe what a transaction actually does in business language. They can be rendered via a secure display on a signing device (as found in devices like the Trezor or Ledger Blue) to ensure the user understands what they’re about to authorise.

This is useful to avoid confusion attacks of the type described by the BBC at the start of this news article:

Confusion attacks are simple to carry out with the right malware, and can yield millions of dollars in stolen money.

Transaction summaries are verified to correspond to what the machine-readable version of the transaction does because the smart contract logic knows how to build the descriptions.


  1. Scalability: signing devices must not have to load the entire transaction contents into memory.
  2. Multi-lingual messages: it must be possible for transaction messages to appear in non-English languages.
  3. Developer usability: it must be easy to add support for this feature to applications.
  4. Hardware independence.


Contracts that support this feature should expose a function that takes a LedgerTransaction and a language code and returns a string. The verify() function should iterate over every summary message found in the LedgerTransaction and for each one, call its own function that generates the summary message from the machine readable data. Then it must do an equality check to ensure that the message that’s embedded in the transaction does in fact match the generated message.
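The equality check described above can be sketched roughly as follows. This is only an illustration under stated assumptions: SimpleLedgerTransaction, summarize and verifySummaries are hypothetical stand-ins invented for this example, not the real platform types, and a real contract would derive the text from its states and commands.

```kotlin
// Simplified stand-ins for illustration only; these are NOT the real platform types.
data class TransactionSummaryMessage(val languageCode: String, val message: String)

data class SimpleLedgerTransaction(
    val payee: String,
    val amount: String,
    val summaries: List<TransactionSummaryMessage>
)

// Hypothetical generator: builds the summary text from the machine-readable data.
fun summarize(ltx: SimpleLedgerTransaction, languageCode: String): String = when (languageCode) {
    "en" -> "Pay ${ltx.amount} to ${ltx.payee}."
    "de" -> "Zahle ${ltx.amount} an ${ltx.payee}."
    else -> throw IllegalArgumentException("Unsupported language: $languageCode")
}

// The check verify() would perform: regenerate each embedded message and compare for equality.
fun verifySummaries(ltx: SimpleLedgerTransaction) {
    for (embedded in ltx.summaries) {
        val expected = summarize(ltx, embedded.languageCode)
        require(embedded.message == expected) {
            "Summary mismatch for '${embedded.languageCode}': embedded text does not match the transaction"
        }
    }
}
```

A transaction whose embedded text says "Pay USD 10k to BarCorp." while the machine-readable payee is FooCorp would fail this check, because the regenerated string differs from the embedded one.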

Note that we use a design where the contract code generates text based on the states+commands, rather than where it attempts to parse the message and check the contents, because it’s simpler and more robust to do it that way. The generation logic must be in a separate method so that when a transaction is being built, the messages can be filled out.

The transaction contains the messages as part of itself so that they are hashed into the Merkle tree. The messages must be covered by the tree so that the message data can be passed to the secure hardware without having to provide the entire transaction data structure, and so that the secure hardware does not have to execute any custom code to figure out what the message should be. This design is thus somewhat indirect, but it minimises the amount of code required on the secure hardware device.
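To illustrate why covering the messages with the Merkle tree keeps the device simple, here is a minimal sketch of checking a Merkle branch. It assumes a generic binary tree hashed with SHA-256; the actual tree layout and hash construction used by the platform may well differ, and BranchStep and verifyBranch are hypothetical names.

```kotlin
import java.security.MessageDigest

fun sha256(data: ByteArray): ByteArray = MessageDigest.getInstance("SHA-256").digest(data)

// One step of a Merkle branch: the sibling hash and whether it sits on the left.
class BranchStep(val sibling: ByteArray, val siblingIsLeft: Boolean)

// Recomputes the root from a leaf and its branch, then compares it with the expected root.
fun verifyBranch(leaf: ByteArray, branch: List<BranchStep>, expectedRoot: ByteArray): Boolean {
    var current = sha256(leaf)
    for (step in branch) {
        current = if (step.siblingIsLeft) sha256(step.sibling + current) else sha256(current + step.sibling)
    }
    return current.contentEquals(expectedRoot)
}
```

The device only needs the message bytes, a handful of sibling hashes and the root. It never has to parse the rest of the transaction.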

The secure hardware does the following steps when a request to sign the transaction data is made:

  1. Accepts an upload that contains the Merkle root of the transaction, a list of summary messages, and a Merkle branch that links the messages to the Merkle root. The upload also contains signatures over the transaction, and X.509 certificate chains that link those signatures to long term identity certificates.
  2. It verifies the signatures, then the Merkle branch, then the X.509 certificate chains. These verifications tie all the pieces together: the signatures are for a transaction that specifies the provided messages within itself, and the identities of the signers are now known.
  3. It performs simple token substitution of the identities into the transaction messages (to avoid breaking the confidential identities feature).
  4. Having calculated the final strings to display, it goes ahead and shows them to the user, then asks the user to confirm or cancel signing.
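The token substitution in step 3 could look something like the sketch below. The placeholder syntax "{signer:N}" is purely an assumption made for this example, since no token format is specified here, and substituteIdentities is a hypothetical helper.

```kotlin
// Hypothetical placeholder syntax: "{signer:N}" refers to the N-th verified signer identity.
val placeholder = Regex("""\{signer:(\d+)\}""")

fun substituteIdentities(template: String, verifiedNames: List<String>): String =
    placeholder.replace(template) { match ->
        val index = match.groupValues[1].toInt()
        // Fail closed: never display a message that references an unknown identity.
        verifiedNames.getOrNull(index)
            ?: throw IllegalArgumentException("No verified identity for index $index")
    }
```

The names passed in would be the ones taken from the certificate chains verified in step 2, so the text shown to the user only ever contains identities the device has checked itself.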

There might be alternative designs possible here that don’t involve identity substitution, by ensuring that the part of the transaction that contains messages is always torn off (the messages are irrelevant to anyone who isn’t signing after all). Again, this post isn’t a comprehensive design doc for the feature.

Possible introduction plan

A new type is added:

/** Contains a language code, message pair that summarises what the transaction does. */
data class TransactionSummaryMessage(val languageCode: String, val message: String)

A new field is added to WireTransaction and LedgerTransaction. It contains a List<TransactionSummaryMessage>.

The message objects are passed in to the verify function (the smart contract) by virtue of being on LedgerTransaction.

The contract object implements a new interface:

interface SummarizableContract : Contract {
    fun summarizeTransaction(ltx: LedgerTransaction, forLanguageCode: String): TransactionSummaryMessage
}

which, if implemented, is used by the platform to calculate the initial summaries and insert them into the transaction once software has built the transaction from e.g. GUI input or an inbound RPC.
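As a rough sketch of that build-time step, the platform could ask every contract that opts in for one message per language. The Contract and LedgerTransaction declarations below are minimal stand-ins so the example is self-contained, and buildSummaries is a hypothetical helper rather than an existing platform API.

```kotlin
// Stand-ins for illustration only; the real Contract and LedgerTransaction types are richer.
data class TransactionSummaryMessage(val languageCode: String, val message: String)
class LedgerTransaction(val description: String)
interface Contract
interface SummarizableContract : Contract {
    fun summarizeTransaction(ltx: LedgerTransaction, forLanguageCode: String): TransactionSummaryMessage
}

// Hypothetical platform helper: collects one message per language from each contract that opts in.
fun buildSummaries(
    ltx: LedgerTransaction,
    contracts: List<Contract>,
    languageCodes: List<String>
): List<TransactionSummaryMessage> =
    contracts.filterIsInstance<SummarizableContract>().flatMap { contract ->
        languageCodes.map { code -> contract.summarizeTransaction(ltx, code) }
    }
```

Contracts that do not implement the interface are simply skipped, which is what allows old and new contracts to coexist in one transaction.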

Merkle tree evolution

The introduction of a new field onto the transaction structure entails a change to how Merkle roots are calculated. Old nodes that can’t parse the new data will ignore it, and thus will calculate the transaction ID without taking it into account. Agreement has been reached with Kostas and Rick about how to handle this.

Partial upgrades

With the design outlined so far there is a version skew problem that can lead to a security attack. How does the platform or secure hardware know whether all the contracts in the transaction actually support the new feature? If a contract has not been upgraded, an attacker could construct a transaction whose message doesn’t match what the transaction actually does, and ask you to sign it.

This problem is especially complicated by the fact that a transaction may involve multiple apps at once. Consider a transaction that moves central bank money in order to clear an obligation. This would involve both the cash and obligation contracts. A good summary for this might be:

“Pay $10k to FooCorp. Clear obligation #1234.”

This is two messages, one for each contract. But if cash hasn’t been upgraded and obligation has, we might be presented with a transaction like this:

“Pay $10k to BarCorp. Clear obligation #1234.”

where in fact the payment goes to FooCorp … this is the sort of attack described by the BBC. The data the machines use to route payments does not match what the user believes they’re doing.

It is possible that the platform should simply ignore any string found in the summaries of a transaction that its local copy of the application is not able to provide, to exclude potentially misleading or wrong summaries.
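One way to picture that filtering rule is the sketch below. It assumes the platform has already run the local summarisation functions and holds their output in a list; trustedSummaries is a hypothetical name, not an existing API.

```kotlin
data class TransactionSummaryMessage(val languageCode: String, val message: String)

// Keeps only the embedded summaries that the locally installed contracts can reproduce.
// `locallyGenerated` is assumed to hold the output of the local summarisation functions.
fun trustedSummaries(
    embedded: List<TransactionSummaryMessage>,
    locallyGenerated: List<TransactionSummaryMessage>
): List<TransactionSummaryMessage> {
    val known = locallyGenerated.toSet()
    return embedded.filter { it in known }
}
```

In the cash and obligation example, a local cash contract that has not been upgraded generates nothing, so the misleading "Pay ... to BarCorp." string would be dropped and only the obligation summary would remain for display.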