What happens if a node's database gets corrupted or deleted?

(Kevin Smets) #1

Hi, I am just wondering: what happens if a node loses its database? As a test, I deleted a node’s database in a sample setup, and the node simply lost its states. This was after a restart, since the states are held in memory once the node has started, if I am not mistaken? I found nothing on the subject here or in the technical white paper (or I missed it).

It seems (I have not dug into the code far enough yet) that there is no recovery mechanism built into the system, as there is with Ethereum, where a node syncs with other nodes at startup?

Say there is a network with only one notary: it should be possible to restore the database, given that other nodes or the notary still have all the data, no? Obviously, with multiple networks / shared ledgers, whether all states can be restored will also depend on the databases of the other nodes and notaries, and it will become complex, if not impossible, real fast.

So, to recap: is there a recovery mechanism for the nodes’ databases, or is one on the roadmap? (Maybe it’s not what is intended with Corda, and then that is an answer in itself :slight_smile:). Or is the intention that backup or restore mechanisms are optionally built in by the devs per application? Obviously, backing up and restoring databases at the infrastructure level is the way to go in both scenarios (as far as I can tell).

(Roger Willis) #2

Good question. Perhaps this is a bit of a facetious response but what happens if you delete all the databases within your organisation?

… you lose all the data! Corda is no different. As such, you need to back up your data.

Corda could provide the possibility of restoring shared data from peers on the network but it’s not something we have implemented yet.

If Corda nodes are presented with a valid transaction hash, they will respond with the transaction data corresponding to that hash. As such, if you record a list of tx hashes and the identities of the peers which you know to possess the corresponding transaction data, then, in theory, you can recover all of your shared data. To me, it seems this should be seen as a “last resort” strategy for recovering your data.
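To make the idea concrete, here is a minimal sketch of that “last resort” approach in Python. It is only an illustration under loud assumptions: the `Peer` class, `tx_id`, and `recover` are hypothetical names, peers are simulated as in-memory stores, and a plain SHA-256 of the payload stands in for however a real node computes a transaction hash. It does not use any Corda API.

```python
import hashlib
from typing import Dict, List, Optional


def tx_id(data: bytes) -> str:
    # Assumption: a plain SHA-256 of the payload stands in for the real tx hash.
    return hashlib.sha256(data).hexdigest()


class Peer:
    """Hypothetical stand-in for a counterparty that serves transactions by hash."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.store: Dict[str, bytes] = {}

    def record(self, data: bytes) -> str:
        h = tx_id(data)
        self.store[h] = data
        return h

    def fetch(self, h: str) -> Optional[bytes]:
        return self.store.get(h)


def recover(index: Dict[str, List[Peer]]) -> Dict[str, bytes]:
    """Rebuild a local store from a backed-up index: tx hash -> peers known to hold it."""
    restored: Dict[str, bytes] = {}
    for h, peers in index.items():
        for peer in peers:
            data = peer.fetch(h)
            # Peers are untrusted: accept data only if it hashes to the id we asked for.
            if data is not None and tx_id(data) == h:
                restored[h] = data
                break
    return restored


alice, bob = Peer("alice"), Peer("bob")
h1 = alice.record(b"tx-1: issue 100 to us")
h2 = bob.record(b"tx-2: move 40 to bob")
bob.store[h1] = b"tampered"  # a peer that returns the wrong data for h1

# The index is the only thing that survived the loss of our database.
index = {h1: [bob, alice], h2: [bob], "0" * 64: [alice]}
restored = recover(index)
```

Any hash in the index that does not end up in `restored` is one that none of the listed peers could (or would) supply, which is why this only works for shared data and only as long as the peers still hold it.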

As an aside, the value of DLT is in reducing/eliminating inconsistencies and thus the need to reconcile ledgers. Some have erroneously identified another “benefit”: that they now don’t have to worry about data recovery or backups because the data is “on the ledger”. Crucially, this is only a valid benefit if your business depends only on shared data and has no private data of its own. Having said that, as Corda is designed to bring untrusting parties into consensus, you can never trust that your counter-party won’t delete a state object; as such, you should always keep your own backups.

If you do have private data, then of course you’ll back it up as per usual, and you may as well back up the shared data whilst you’re at it!

(Kevin Smets) #3

Thanks for the quick feedback. Indeed, backup is a given, especially since there is no single ledger, at least not in 99% of the use cases, most likely. Not that you could rely on automatic restoration even if the ledger were shared with all parties. So my question is then: is an integrated restoration system something that is on the roadmap? Since you say “not implemented yet” :slight_smile:

I’m also wondering how this reflects on the assumed setup; the technical paper states, for instance, that nodes are supposed to be long-lived. This does not fit well with something like a cloud native environment, where nodes will be replicated, load balanced and, most likely, shot down just to test the resilience of the system (chaos monkey style). Obviously a cloud native environment implies persistent database storage and backup, but I’m just trying to cover all bases here. If a (replicated) node can restore itself to the latest true state of affairs with the help of its (untrusted) peers, it’s just good to know which scenarios you have at your disposal. Even if you would only use it for comparison against the last backup you had before the system went down.

(Mike Hearn) #4

At the moment such a scheme is not on the roadmap. We think that in practice there will always be private data.

With respect to the cloud, yes, it is assumed that you have stable relational database storage. If you do, then the node can be killed and restarted more or less at will, thanks to the flow checkpointing mechanism.
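As a toy illustration of why stable database storage makes kill-and-restart safe, the Python sketch below persists a checkpoint after each step of a multi-step process and resumes from it after a simulated crash. Everything here is an assumption made for illustration: the step names, the `checkpoints` table, and `run_flow` are hypothetical, SQLite stands in for the relational database, and this is not how Corda itself implements flow checkpointing.

```python
import os
import sqlite3
import tempfile
from typing import List, Optional

# Hypothetical steps of a long-running process.
STEPS = ["collect signatures", "notarise", "record transaction"]


def open_store(path: str) -> sqlite3.Connection:
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS checkpoints "
        "(flow_id TEXT PRIMARY KEY, next_step INTEGER)"
    )
    db.commit()
    return db


def run_flow(db: sqlite3.Connection, flow_id: str,
             crash_before: Optional[int] = None) -> List[str]:
    """Run the remaining steps of a flow, checkpointing after each one."""
    row = db.execute(
        "SELECT next_step FROM checkpoints WHERE flow_id = ?", (flow_id,)
    ).fetchone()
    start = row[0] if row else 0  # resume from the last durable checkpoint
    done: List[str] = []
    for i in range(start, len(STEPS)):
        if i == crash_before:
            raise RuntimeError("node killed")  # simulated crash
        done.append(STEPS[i])
        db.execute(
            "INSERT OR REPLACE INTO checkpoints VALUES (?, ?)", (flow_id, i + 1)
        )
        db.commit()  # the checkpoint is durable before the next step starts
    return done


path = os.path.join(tempfile.mkdtemp(), "node.db")

db = open_store(path)
try:
    run_flow(db, "flow-1", crash_before=1)  # dies after the first step
except RuntimeError:
    pass
db.close()

db = open_store(path)  # "restart" against the same database file
resumed = run_flow(db, "flow-1")
db.close()
```

After the restart, `resumed` contains only the steps that had not been checkpointed yet. If the database file itself is lost, the checkpoints go with it, which is the case the rest of this thread is about.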

(Kevin Smets) #5

Alright, thanks a lot for the clarifying answers!