Notary composite key and changes


(Quiark) #1

Hi,

Cluster notaries (Raft or BFT based) have a composite key as their service identity, made up of the public keys of all members of the cluster, and this composite key is included in each transaction. This raises two problems and questions:

  1. Changing the set of cluster nodes (such as adding a new node) would change the notary identity and thus require a Notary Change Flow for all existing states. If this is correct, is there, or are you planning, any way around that?

  2. It also increases the size of the state (around 500 bytes for a cluster of 9). This becomes a problem at larger scale.

Thanks


(Mike Hearn) #2
  1. That’s correct. We think this makes sense: when the operators of a compatibility zone decide to trust a notary, even if it’s distributed via a BFT algorithm, that trust decision ultimately rests on who makes up the notary. A BFT notary in which all the participants are malicious is obviously still malicious. So, a change in the makeup of a notary is something that should require explicit consent of the users.
  2. It increases the size of the signed transaction, yes. Scaling is best thought of as a graph. When you say it becomes a problem, how much transaction traffic are you anticipating?

There are cryptographic techniques that let you collapse a composite signature down to a single signature when all keys use a compatible scheme. The reason we don’t use these in Corda out of the box is that we want nodes to be able to migrate between different schemes, e.g. from RSA to ECDSA to SPHINCS (a post-quantum algorithm). However, if we find that storage costs are becoming a real obstacle to deployment, we can look at supporting a cryptographic threshold signature scheme that would yield a single signature despite multiple independent parties making it up. This would be an optimisation that only applies whilst all members of a notary use the same algorithm. It’d cease to apply during a transition to alternative schemes.


(Quiark) #3

Hi, as we discussed on Slack, we are looking to run hundreds of Cash transactions per second.

I noticed that WireTransaction has a notary field, as does the TransactionState object. Since a Cash transaction will likely have 2 outputs (one primary and one ‘change’ output), the notary composite key will be serialized 3 times inside the transaction, unless I missed something and some sort of internal reference is supported.

In my testing, with 3 notary nodes, a typical Cash payment transaction size is 3.5 kB. If we aim for 100 tx/s, it comes to 350 kB of data being created per second. I think this is a bit problematic.


(Mike Hearn) #4

We’ve been working for a while on a serialisation scheme that doesn’t duplicate serialised data, even if it’s referenced multiple times on the wire. @parkri can talk about it a bit more, but we’re very close to turning the new scheme on. The current scheme uses generic Kryo serialisation, which isn’t that well optimised. I don’t think we’ve done space comparisons yet, but optimising the new format is something that’s going to be a focus soon.

Although we’re heading towards a Corda 1.0 release quite quickly, this initial release won’t stabilise the wire protocol. Stabilising it is the next target, to follow soon after 1.0. We’ll look at tuning the transaction size as part of that. I think there’s going to be a lot of low-hanging fruit: simply gzipping can remove a lot of unnecessary redundancy at minimal CPU cost.
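To illustrate the gzip point, here is a rough sketch rather than a measurement of real Corda transactions. It assumes a 500-byte stand-in for the serialised composite key (random bytes, since real key material is also close to incompressible on its own) repeated three times, once for the transaction's notary field and once per output state, with 200 bytes of unrelated filler in between. The sizes and layout are made up for illustration.

```python
import os
import zlib

# Assumption: random bytes standing in for a serialised composite key.
composite_key = os.urandom(500)

# Assumption: the same key appears three times, each followed by 200 bytes
# of unrelated (here: random) payload.
payload = b"".join(composite_key + os.urandom(200) for _ in range(3))

# zlib uses DEFLATE, the same algorithm as gzip, minus the gzip file header.
compressed = zlib.compress(payload, 6)

raw_size = len(payload)
compressed_size = len(compressed)
print("raw bytes:", raw_size)
print("compressed bytes:", compressed_size)
```

The second and third copies of the key fall inside DEFLATE's sliding window, so they are encoded as back-references to the first copy. The unique bytes do not shrink, only the repeats do.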

Let’s reason about storage costs in dollar terms. 350 kB of data per second would result in about 30 gigabytes of data per day, or about 11 terabytes of transaction data per year. According to this helpful article from Backblaze, this would cost about $480 / year in hard disk cost, if we ignore replication and backup. Let’s say we store the data three times over for backup and redundancy. That comes to about $1500 / year in hard disk costs to keep up with this data rate. Of course there’ll be some indexing overhead and the like, so $1500 is something of a lower bound.
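For anyone who wants to rerun the arithmetic with their own numbers, a small sketch follows. The transaction size and rate come from the earlier post; the $480 figure is taken as given from the estimate above, and the per-terabyte price is simply derived from it, not quoted from the article.

```python
TX_SIZE_BYTES = 3_500      # typical Cash payment with 3 notary nodes (from the thread)
TX_PER_SECOND = 100
SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365
REPLICATION_FACTOR = 3     # copies kept for backup and redundancy

bytes_per_second = TX_SIZE_BYTES * TX_PER_SECOND
gb_per_day = bytes_per_second * SECONDS_PER_DAY / 1e9
tb_per_year = gb_per_day * DAYS_PER_YEAR / 1e3

# Assumption: $480 / year for a single copy, as estimated above.
single_copy_cost = 480
implied_cost_per_tb = single_copy_cost / tb_per_year
replicated_cost = single_copy_cost * REPLICATION_FACTOR

print(f"{bytes_per_second / 1e3:.0f} kB/s")
print(f"{gb_per_day:.2f} GB/day")
print(f"{tb_per_year:.2f} TB/year")
print(f"${implied_cost_per_tb:.2f} per TB (implied)")
print(f"${replicated_cost} / year with {REPLICATION_FACTOR} copies")
```

With these inputs the replicated figure is $1440, which is where the rounded $1500 comes from.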

Is this too expensive? I’m not sure - that depends on the context. Too expensive for an individual hobbyist, yes, probably. Too expensive for a bank? I suspect it is not a big cost in the context of a bank.

This is before we consider non-HDD storage, such as commercial cloud BluRay storage services like Amazon Glacier. Those are supposedly much cheaper than HDDs.

None of this is an excuse not to optimise the wire protocol, of course. There are many simple tricks we can use, like the one Bitcoin uses where keys are represented as short hashes. If we want to remove the overhead of the composite key being part of every transaction, it’d be easy to resolve an unknown hash to the full composite key on demand using an extra flow round trip.
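A minimal sketch of the hash-reference idea, not Corda code: `KeyDirectory`, `register`, `resolve` and `fetch_remote` are hypothetical names, the callback stands in for the extra flow round trip, and SHA-256 is used here for simplicity (Bitcoin itself uses a shorter 20-byte hash).

```python
import hashlib


class KeyDirectory:
    """Hypothetical local cache mapping key hashes to full composite keys."""

    def __init__(self, fetch_remote):
        self._cache = {}
        # Callback that asks a counterparty for the key behind a hash.
        self._fetch_remote = fetch_remote

    @staticmethod
    def key_id(key_bytes):
        return hashlib.sha256(key_bytes).digest()

    def register(self, key_bytes):
        kid = self.key_id(key_bytes)
        self._cache[kid] = key_bytes
        return kid

    def resolve(self, kid):
        key = self._cache.get(kid)
        if key is None:
            key = self._fetch_remote(kid)
            # Never trust the peer: the returned key must hash to the id.
            if self.key_id(key) != kid:
                raise ValueError("returned key does not match the requested hash")
            self._cache[kid] = key
        return key


# Usage: the peer already knows the notary key, we only see its hash.
notary_key = b"\x01" * 500  # placeholder bytes, not a real composite key
peer = KeyDirectory(fetch_remote=lambda kid: b"")
kid = peer.register(notary_key)

remote_calls = []

def ask_peer(requested):
    remote_calls.append(requested)
    return peer.resolve(requested)

me = KeyDirectory(fetch_remote=ask_peer)
first = me.resolve(kid)    # triggers one round trip
second = me.resolve(kid)   # served from the local cache
print(len(kid), len(notary_key), len(remote_calls))
```

Each transaction would then carry the 32-byte id instead of the full key, and the round trip is only paid the first time a node meets a given notary.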

I’ll file some tasks in JIRA to study the size of a SignedTransaction and ensure we optimise it before we commit to the wire protocol. Hope that helps!

https://r3-cev.atlassian.net/browse/CORDA-514


(Quiark) #5

Thanks.

What is your thinking on reusing the same key for multiple nodes in a Raft cluster? In this case the nodes are assumed to be operated by the same company and the only protection we expect is against random crashes. I think having multiple nodes with the same private key could make sense in that case. And we would save on composite key size ^.^

A side question: is there any support or plan for key revocation?


(Mike Hearn) #6

Yes, that would make a lot of sense.

We have a plan for revocation; I already have a to-do on my list to write a blog post about it this week. Briefly, our current thinking is to use CRLs with a later migration to OCSP Stapling. Stapling is built into the Java SSL stack from v9 onwards. At the start we won’t need that complexity for small networks, so my inclination is to delay the deployment of OCSP until later. Certificates would not expire.