Private Data and Transient Data in Hyperledger Fabric

Overview

This article focuses on Private Data, and along the way it introduces Transient Data, a related concept that comes up whenever Private Data is used. Technically the two are different things. Private Data keeps data within a subgroup of organizations defined in a collection definition, whereas Transient Data is a way of passing input to chaincode. Interestingly, neither depends on the other, but in practice, as the demonstration shows, Transient Data should be used to supply Private Data whenever a certain level of confidentiality is required.

Concept Review

A review of some key concepts will help us navigate the demonstration.

Ledger: In Hyperledger Fabric, each peer maintains a copy of the ledger after joining a channel. The ledger has two parts. The first is a blockchain data structure holding the blocks of transactions. The second is a world state database, which keeps the latest state after each block is committed. When a peer receives a new block from the ordering service, it validates the block and then commits it to the ledger. Committing means appending the block to the blockchain and updating the world state according to the RWSet (read-write set) inside each valid transaction.

Most of the ledger is identical across the peers in a channel, thanks to the overall consensus mechanism. The exception is Private Data, which only the peers of the specified organizations store in their world state.
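The commit step described for the ledger can be pictured with a small Go sketch. It is a simplified model and not Fabric's implementation: the types Ledger, Block, Tx and KVWrite are hypothetical names, only the write half of the RWSet is modeled, and the validation result is assumed to have been decided before the block is committed.

```go
package main

// KVWrite is one entry in a transaction's write set (hypothetical, simplified).
type KVWrite struct {
	Key      string
	Value    []byte
	IsDelete bool
}

// Tx is a simplified transaction. Only the write half of its RWSet is modeled.
type Tx struct {
	ID     string
	Valid  bool // validation outcome, assumed to be known before commit
	Writes []KVWrite
}

// Block is an ordered batch of transactions received from the ordering service.
type Block struct {
	Number uint64
	Txs    []Tx
}

// Ledger holds the two parts described above: the chain and the world state.
type Ledger struct {
	Chain      []Block
	WorldState map[string][]byte
}

// NewLedger returns an empty ledger.
func NewLedger() *Ledger {
	return &Ledger{WorldState: make(map[string][]byte)}
}

// Commit appends the block to the chain, then applies the write set of
// every valid transaction to the world state. Invalid transactions stay
// in the block but leave the world state untouched.
func (l *Ledger) Commit(b Block) {
	l.Chain = append(l.Chain, b)
	for _, tx := range b.Txs {
		if !tx.Valid {
			continue
		}
		for _, w := range tx.Writes {
			if w.IsDelete {
				delete(l.WorldState, w.Key)
			} else {
				l.WorldState[w.Key] = w.Value
			}
		}
	}
}
```

The point of the sketch is the split itself: the chain keeps every transaction in every block, while the world state only reflects the latest value written for each key.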

Private Data: Within a channel, there are scenarios in which only a subgroup of organizations should hold certain data while those outside the subgroup should not. This is typically driven by data privacy requirements between organizations. Hyperledger Fabric introduces Private Data to address this need. A collection definition specifies each subgroup as a collection, and private data is stored per collection. As proof that the data exists, and for audit purposes, all peers, both inside and outside the subgroup, keep a hash of the private data.
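To make the split between members and non-members concrete, here is a minimal Go sketch. It is a toy model rather than Fabric code: Peer, Collection, WritePrivate and Verify are hypothetical names, membership is reduced to a set of organization names, and SHA-256 is assumed as the hash function.

```go
package main

import "crypto/sha256"

// Peer is a toy model of a peer's storage (hypothetical type).
type Peer struct {
	Org         string
	PrivateData map[string][]byte     // actual values, collection members only
	Hashes      map[[32]byte][32]byte // key hash -> value hash, on every peer
}

// NewPeer returns a peer with empty stores.
func NewPeer(org string) *Peer {
	return &Peer{
		Org:         org,
		PrivateData: map[string][]byte{},
		Hashes:      map[[32]byte][32]byte{},
	}
}

// Collection names the organizations allowed to hold the private data.
type Collection struct {
	Name    string
	Members map[string]bool
}

// WritePrivate stores the value on member peers only, while every peer in
// the channel records the hashes of the key and the value.
func WritePrivate(peers []*Peer, c Collection, key string, value []byte) {
	keyHash := sha256.Sum256([]byte(key))
	valueHash := sha256.Sum256(value)
	for _, p := range peers {
		p.Hashes[keyHash] = valueHash
		if c.Members[p.Org] {
			p.PrivateData[key] = value
		}
	}
}

// Verify lets any peer, member or not, check a disclosed value against the
// hash it holds. This is the proof-of-existence and audit use described above.
func Verify(p *Peer, key string, value []byte) bool {
	valueHash, ok := p.Hashes[sha256.Sum256([]byte(key))]
	return ok && valueHash == sha256.Sum256(value)
}
```

A peer outside the collection never sees the value, yet if a member later discloses it, that peer can still confirm it matches what was committed.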

Private data is accessed through the chaincode API. Our demonstration uses PutPrivateData and GetPrivateData. By comparison, PutState and GetState are used to write and read data in the public state.
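The contrast between the two pairs of calls can be sketched in Go as follows. To keep the example free of external dependencies, it declares a local interface, StateStub, listing only these four calls with the signatures they have in the Go chaincode shim. Everything else (CreateAsset, ReadPrice, MemStub and the collection name used with them) is a hypothetical illustration, not the demonstration's chaincode.

```go
package main

import "errors"

// StateStub lists the four calls named above. It is declared locally so the
// sketch compiles on its own; real chaincode receives a stub from the shim.
type StateStub interface {
	PutState(key string, value []byte) error
	GetState(key string) ([]byte, error)
	PutPrivateData(collection string, key string, value []byte) error
	GetPrivateData(collection string, key string) ([]byte, error)
}

// CreateAsset writes the owner to the public state and the price to a
// private data collection (hypothetical function).
func CreateAsset(stub StateStub, collection, id string, owner, price []byte) error {
	if err := stub.PutState(id, owner); err != nil {
		return err
	}
	return stub.PutPrivateData(collection, id, price)
}

// ReadPrice reads the private part back from the collection.
func ReadPrice(stub StateStub, collection, id string) ([]byte, error) {
	price, err := stub.GetPrivateData(collection, id)
	if err != nil {
		return nil, err
	}
	if price == nil {
		return nil, errors.New("no private data for " + id + " in " + collection)
	}
	return price, nil
}

// MemStub is an in-memory stand-in for trying the functions out.
type MemStub struct {
	Public  map[string][]byte
	Private map[string]map[string][]byte // collection -> key -> value
}

// NewMemStub returns an empty in-memory stub.
func NewMemStub() *MemStub {
	return &MemStub{
		Public:  map[string][]byte{},
		Private: map[string]map[string][]byte{},
	}
}

func (m *MemStub) PutState(key string, value []byte) error {
	m.Public[key] = value
	return nil
}

func (m *MemStub) GetState(key string) ([]byte, error) {
	return m.Public[key], nil
}

func (m *MemStub) PutPrivateData(collection string, key string, value []byte) error {
	if m.Private[collection] == nil {
		m.Private[collection] = map[string][]byte{}
	}
	m.Private[collection][key] = value
	return nil
}

func (m *MemStub) GetPrivateData(collection string, key string) ([]byte, error) {
	return m.Private[collection][key], nil
}
```

The only difference at the call site is the extra collection argument, which selects the subgroup whose peers will hold the value.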

Transient Data: Many chaincode functions require external input when they are invoked. In most cases we supply a list of arguments with the invocation, and the chaincode is written to process those arguments. The chaincode arguments, including the function name and the arguments associated with that function, are kept as part of a valid transaction inside a block, and therefore stay in the ledger forever. If for some reason we do not want the argument list kept permanently in the blockchain, we can use Transient Data. Transient Data is an input method in which the data reaches the chaincode but is not recorded in the transaction. When Transient Data is used, the chaincode reads it through a dedicated API, GetTransient, and the input must be supplied in the expected format. We will see this in our demonstration.
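A minimal Go sketch of the chaincode side might look like this. As before, a local interface stands in for the shim so the example compiles on its own. GetTransient and PutPrivateData carry the signatures they have in the Go chaincode shim, while StoreFromTransient, FakeInvocation and the transient field name "asset_properties" are assumptions made for illustration.

```go
package main

import "errors"

// TransientStub lists the two calls used here. It is declared locally so the
// sketch has no external dependencies.
type TransientStub interface {
	GetTransient() (map[string][]byte, error)
	PutPrivateData(collection string, key string, value []byte) error
}

// StoreFromTransient takes the sensitive value from the transient map, not
// from the argument list, and writes it to a private data collection.
func StoreFromTransient(stub TransientStub, collection, id string) error {
	transient, err := stub.GetTransient()
	if err != nil {
		return err
	}
	// "asset_properties" is a hypothetical field name chosen for this sketch.
	value, ok := transient["asset_properties"]
	if !ok || len(value) == 0 {
		return errors.New(`transient field "asset_properties" is missing or empty`)
	}
	return stub.PutPrivateData(collection, id, value)
}

// FakeInvocation is an in-memory stand-in for one chaincode invocation.
// Args models what would be recorded in the transaction; Transient models
// the input that reaches the chaincode without being recorded.
type FakeInvocation struct {
	Args      []string
	Transient map[string][]byte
	Private   map[string][]byte // "collection/key" -> value
}

func (f *FakeInvocation) GetTransient() (map[string][]byte, error) {
	return f.Transient, nil
}

func (f *FakeInvocation) PutPrivateData(collection string, key string, value []byte) error {
	if f.Private == nil {
		f.Private = map[string][]byte{}
	}
	f.Private[collection+"/"+key] = value
	return nil
}
```

In this arrangement only the asset ID travels in the argument list, so the sensitive value never becomes part of the transaction stored in the block. On the client side, the peer CLI typically accepts the transient map through its --transient flag as JSON with base64-encoded values.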
