Concurrent smart contracts in Hyperledger Fabric blockchain (part 1)

Jakub Dzikowski
SoftwareMill Tech Blog
Dec 10, 2019


Working with distributed systems is not easy. You need to face a separate category of problems: lost or duplicated messages, inconsistent state, performance bottlenecks and many others. Even private blockchains, which have strong consistency guarantees, are not easy to handle.

I will start with a simple operation on the blockchain state: incrementing a value. I will show that it is not simple at all.

Key collisions in Hyperledger Fabric (Photo by Ming Jun Tan on Unsplash)

The simple chaincode

Writing chaincodes (i.e. programs that consist of smart contracts) in Hyperledger Fabric is easy. With the Node.js SDK you can extend the Contract class and write an async function that has access to the state. The simplest example of a chaincode may look as follows (all examples are written in TypeScript):
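A minimal sketch of the chaincode logic: in a real chaincode the class would extend Contract from fabric-contract-api and receive a Context; here a tiny in-memory stand-in for ctx.stub keeps the sketch self-contained (the getState/putState names mirror the fabric-shim API):

```typescript
// Tiny in-memory stand-in for ctx.stub, so the sketch is self-contained.
class MockStub {
  private state = new Map<string, Buffer>();

  async getState(key: string): Promise<Buffer> {
    return this.state.get(key) ?? Buffer.from('');
  }

  async putState(key: string, value: Buffer): Promise<void> {
    this.state.set(key, value);
  }
}

const VALUE_KEY = 'incrementer_value';

// getValue smart contract: read the current value (0 when nothing is saved)
async function getValue(stub: MockStub): Promise<number> {
  const raw = await stub.getState(VALUE_KEY);
  return raw.length > 0 ? parseInt(raw.toString(), 10) : 0;
}

// increment smart contract: read, add one, overwrite
async function increment(stub: MockStub): Promise<number> {
  const next = (await getValue(stub)) + 1;
  await stub.putState(VALUE_KEY, Buffer.from(next.toString()));
  return next;
}
```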

The world state in Hyperledger Fabric is kept in the form of a key-value store. In the example above the current incrementer value is stored under the key incrementer_value. The getValue smart contract retrieves the current value. The increment smart contract increments and overwrites it.

So far so good. This example works perfectly fine… unless you call it in parallel. It simply doesn’t work in parallel. Why? Because of key collisions, a mechanism that provides state consistency in Hyperledger Fabric. A single key incrementer_value cannot be modified in different smart contract invocations within the same block of the chain.

Hyperledger Fabric guarantees

You probably know the CAP theorem. It states that in a distributed system only two out of the three guarantees (consistency, availability and partition tolerance) can be achieved at the same time. Even if Wikipedia is not the best source of information about the CAP theorem (you should definitely read this great article by Martin Kleppmann), it may give you a good intuition of the concepts:

Consistency: Every read receives the most recent write or an error.

Availability: Every request receives a (non-error) response — without the guarantee that it contains the most recent write.

Partition tolerance: The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes.

In terms of the CAP theorem, a blockchain as a distributed system achieves partition tolerance, but the second guarantee depends on how the blockchain nodes agree on the state.

Blockchains like Ethereum and Bitcoin rely on probabilistic consensus algorithms, which guarantee ledger consistency to a high degree of probability. The probability that a transaction is “final” increases as the block containing the transaction gets deeper into the chain. For instance, it is recommended to consider a Bitcoin transaction final when it is at least 6 blocks deep in the chain. In this case we have probabilistic finality, and the blockchains that support it favor availability over consistency.

There is a second group of blockchains that provide absolute finality. Once a block is added to the chain, it is considered final. The blockchains that support this kind of finality favor consistency over availability. Hyperledger Fabric belongs to this group, along with many other private blockchains.

Hyperledger Fabric, before applying the changes made by smart contracts, checks which keys in the state were read and which keys were updated. If, for instance, three smart contracts try to update the same key, there is a conflict and two of them fail with no state change.

Have a look at this quite long excerpt from the documentation:

Hyperledger Fabric has concurrency control whereby transactions execute in parallel (by endorsers) to increase throughput, and upon commit (by all peers) each transaction is verified to ensure that no other transaction has modified data it has read. In other words, it ensures that the data that was read during chaincode execution has not changed since execution (endorsement) time, and therefore the execution results are still valid and can be committed to the ledger state database. If the data that was read has been changed by another transaction, then the transaction in the block is marked as invalid and is not applied to the ledger state database. The client application is alerted, and can handle the error or retry as appropriate.

And this is the way Hyperledger Fabric provides consistency in terms of the CAP theorem. Or linearizability if you want to use a better word (see this post again). And this is the reason why each block appended to the chain is considered final.
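The commit-time check described in the excerpt can be sketched as a toy model. This is a deliberate simplification for intuition only: real Fabric tracks per-key versions as block and transaction numbers and validates against the state database, but the shape of the check is the same:

```typescript
// Toy model of Fabric's read-write set validation at commit time.
// Each committed key has a version; a transaction is valid only if every
// key it read still has the version it saw at endorsement time.
type ReadSet = Map<string, number>;
type WriteSet = Map<string, string>;

const versions = new Map<string, number>();

function validateAndCommit(reads: ReadSet, writes: WriteSet): boolean {
  for (const [key, seenVersion] of reads) {
    if ((versions.get(key) ?? 0) !== seenVersion) {
      return false; // stale read: another transaction modified this key
    }
  }
  for (const key of writes.keys()) {
    versions.set(key, (versions.get(key) ?? 0) + 1); // bump version on write
  }
  return true;
}

// Three transactions endorsed against the same state (version 0 of the key):
const txs = [1, 2, 3].map(() => ({
  reads: new Map([['incrementer_value', 0]]),
  writes: new Map([['incrementer_value', 'n+1']]),
}));
const results = txs.map(tx => validateAndCommit(tx.reads, tx.writes));
// results: [true, false, false]; only the first transaction commits
```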

What does it mean for our IncrementerContract? It means that if the getValue or increment smart contract succeeds, we have access to the most recent state. If there is a possibility that this is not the most recent state, the smart contract fails.

How to increment in parallel

There is a great official repo on GitHub: fabric-samples. There you can find, among other things, some practical guidelines on how to write your chaincodes to achieve high throughput (I skipped some technical details in the quote below):

This network is used to understand how to properly design the chaincode data model when handling thousands of transactions per second which all update the same asset in the ledger. A naive implementation would use a single key to represent the data for the asset, and the chaincode would then attempt to update this key every time a transaction involving it comes in. However, when many transactions all come in at once, (…) another transaction may have already updated the same value. Thus, (…) a large number of parallel transactions will fail.

In other words: we get consistency at the cost of lower availability in the case of concurrent smart contract invocations.

Another great resource about handling this kind of situation is the article How to prevent key collisions in Hyperledger Fabric chaincode by Ivan Vankov (known as gatakka). The author proposes four approaches:

  1. No duplicated keys: You avoid using the same keys, which is easy to implement and achieves high throughput, but you cannot build business logic in the chaincodes, since the state is dispersed over unrelated keys.
  2. Put requests in a queue: The client side is responsible for managing a queue of smart contract invocations (it takes care of possible key collisions). This is a non-performant solution, and the application layer takes on a lot of responsibility that might be outsourced to the chaincodes.
  3. Running total: You have no duplicated keys, but there is also a separate smart contract that calculates the total state. This solution is performant; however, it might be difficult to handle validation of the state. For real applications you will probably end up with multiple operation statuses and handling transitions among them.
  4. Batching: A single smart contract handles a list of operations. Even if you execute batches in a single thread, the throughput may still be very high. Besides, this solution is quite easy to implement, the application layer does not need to know the chaincode business logic, and data consistency is preserved.

For the simple incrementer chaincode let’s start with the no duplicated keys approach (described in the fabric-samples as well), which seems to be the easiest one. Batching and running total will be examined in subsequent articles in the series.

No key collisions with batching (Image by rfsyzygy from Pixabay)

No duplicated keys

The no duplicated keys approach seems easy to implement. We can simply save the increment operations themselves.

In this case, inside the increment smart contract, we need to create a key that is unique for each invocation. We can use the transaction ID (ctx.stub.getTxID()), which uniquely identifies the transaction within the scope of the channel. Instead of simple string keys, we use an indexName and composite keys, which effectively enable us to find all saved operations.
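A sketch of the increment operation in this style, with a small stand-in for ctx.stub (getTxID and createCompositeKey mirror the fabric-shim API; the null-byte delimiter is an internal detail assumed here for illustration, a real chaincode never builds composite keys by hand):

```typescript
// Stand-in for the parts of ctx.stub used in this sketch.
const DELIMITER = '\u0000';

class MockStub {
  state = new Map<string, Buffer>();

  constructor(private txId: string) {}

  getTxID(): string {
    return this.txId;
  }

  createCompositeKey(indexName: string, attributes: string[]): string {
    return DELIMITER + indexName + DELIMITER + attributes.join(DELIMITER) + DELIMITER;
  }

  async putState(key: string, value: Buffer): Promise<void> {
    this.state.set(key, value);
  }
}

const indexName = 'increment_operation';

// increment smart contract, "no duplicated keys" style: instead of
// overwriting one shared key, each invocation saves an operation under
// a key that is unique per transaction.
async function increment(stub: MockStub): Promise<void> {
  const key = stub.createCompositeKey(indexName, [stub.getTxID()]);
  await stub.putState(key, Buffer.from('1'));
}
```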

Have a look at this part of the documentation of createCompositeKey from the fabric-chaincode-node source code:

Hyperledger Fabric uses a simple key/value model for saving chaincode states. In some use case scenarios, it is necessary to keep track of multiple attributes. Furthermore, it may be necessary to make the various attributes searchable. Composite keys can be used to address these requirements. Similar to using composite keys in a relational database table, here you would treat the searchable attributes as key columns that make up the composite key. Values for the attributes become part of the key, thus they are searchable with functions like getStateByRange() and getStateByPartialCompositeKey().

In order to get the current value, we just need to count all the increment operations saved in the blockchain world state. As you can see in the example below, we can fetch all operations using partial composite keys (the getStateByPartialCompositeKeyWithPagination() method):
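A sketch of counting all saved operations with paginated queries. The pagination shape below (pages of keys plus a bookmark, where an empty bookmark means no more pages) mirrors getStateByPartialCompositeKeyWithPagination from fabric-shim, mocked here over a plain array of keys so the example is self-contained:

```typescript
interface QueryPage {
  keys: string[];
  bookmark: string; // empty string when there are no more pages
}

// Stand-in for the paginated query API of ctx.stub.
class MockStub {
  constructor(private savedKeys: string[]) {}

  async getStateByPartialCompositeKeyWithPagination(
    indexName: string,
    attributes: string[],
    pageSize: number,
    bookmark: string,
  ): Promise<QueryPage> {
    const start = bookmark === '' ? 0 : parseInt(bookmark, 10);
    const keys = this.savedKeys.slice(start, start + pageSize);
    const next = start + pageSize < this.savedKeys.length ? String(start + pageSize) : '';
    return { keys, bookmark: next };
  }
}

// getValue smart contract: page through all operation keys and count them
async function getValue(stub: MockStub): Promise<number> {
  const pageSize = 100;
  let count = 0;
  let bookmark = '';
  do {
    const page = await stub.getStateByPartialCompositeKeyWithPagination(
      'increment_operation', [], pageSize, bookmark,
    );
    count += page.keys.length;
    bookmark = page.bookmark;
  } while (bookmark !== '');
  return count;
}
```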

The amount of code needed to get the current value is a good example of the complexity of handling aggregate state in the no duplicated keys approach. Because you are afraid of key collisions, you don’t keep the aggregated state at all. You need to calculate it every time.

This approach might be useful especially in cases when you don’t need aggregated state or business validation (for example: signing documents on the blockchain, storing unrelated data). If you do, you should probably have some process that aggregates the data periodically (running total), or increase the throughput by batching rather than by avoiding key collisions.

Duplicate operations and idempotency

There is another thing that should be addressed early. There is a risk that we may accidentally invoke the increment smart contract twice. This might happen because of a poorly designed client, a user error or even a network failure. We need to ensure somehow that each operation is performed only once. We should make the smart contracts idempotent.

An operation is idempotent when applying it multiple times does not change the state beyond the initial application. For example, x = x + 1 is not idempotent, but x = 7 and x = x * 1 are.
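As a quick illustration (plain functions, nothing Fabric-specific):

```typescript
const inc = (x: number): number => x + 1;      // not idempotent: inc(inc(5)) !== inc(5)
const setSeven = (_: number): number => 7;     // idempotent: applying twice changes nothing
const timesOne = (x: number): number => x * 1; // idempotent as well
```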

Idempotent operations are a convenient way to handle deduplication. When you have a distributed system with at-least-once delivery, in many cases you don’t need to care about duplicated messages as long as the operations are idempotent.

An interesting case of providing idempotency may be observed in the domain of payment providers. When you pay for some commodity, you want a guarantee that the request was delivered (i.e. you need at-least-once delivery) and, at the same time, you don’t want to pay twice in the case of a duplicated request. You can handle this situation by providing a unique key that guarantees idempotency. Have a look at the Stripe documentation:

To perform an idempotent request, provide an additional Idempotency-Key: <key> header to the request.

Stripe’s idempotency works by saving the resulting status code and body of the first request made for any given idempotency key, regardless of whether it succeeded or failed. Subsequent requests with the same key return the same result, including 500 errors.

The idempotency key is generated on the client side, because it is the client who may resend the request, and we are not sure whether the client will get any response from the service.

We can follow this approach in our incrementer chaincode example. We can add an idempotencyKey parameter to the smart contract and then, since Hyperledger Fabric preserves the state in a key-value store, make the idempotencyKey a part of the key instead of the transaction ID.
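A sketch of this check-then-put version, with a small stand-in for ctx.stub (getState, putState and createCompositeKey mirror the fabric-shim names; the delimiter is assumed for illustration):

```typescript
const DELIMITER = '\u0000';

// Stand-in for the relevant parts of ctx.stub.
class MockStub {
  state = new Map<string, Buffer>();

  createCompositeKey(indexName: string, attributes: string[]): string {
    return DELIMITER + indexName + DELIMITER + attributes.join(DELIMITER) + DELIMITER;
  }

  async getState(key: string): Promise<Buffer> {
    return this.state.get(key) ?? Buffer.from('');
  }

  async putState(key: string, value: Buffer): Promise<void> {
    this.state.set(key, value);
  }
}

// increment with a client-supplied idempotency key: check whether this
// operation was already saved and save it only once.
async function increment(stub: MockStub, idempotencyKey: string): Promise<boolean> {
  const key = stub.createCompositeKey('increment_operation', [idempotencyKey]);
  const existing = await stub.getState(key);
  if (existing.length > 0) {
    return false; // duplicate invocation, nothing to do
  }
  await stub.putState(key, Buffer.from('1'));
  return true;
}
```

Note that the getState call puts the key into the transaction’s read set, which matters for concurrent invocations.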

The idempotencyKey becomes the unique part of the key in Hyperledger’s key-value store, and when we want to check if the operation was already performed, we can simply check whether it has already been saved.

However, the code above can be simplified and improved. It turns out that when we call an idempotent operation twice at the same time, we still get a key collision. Have a look:

  1. Two smart contracts with the same idempotencyKey are called at the same time.
  2. Both smart contracts read the value under the same key.
  3. Since there is no value yet, both of them try to put a new value and BANG! Key collision. One of them passes, one of them fails.

This might be an expected behavior in some cases, but for the simple incrementer example we can do better. A lot better:
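A sketch of the improved version, with the read removed entirely (again with a small stand-in for ctx.stub):

```typescript
const DELIMITER = '\u0000';

// Stand-in for the relevant parts of ctx.stub.
class MockStub {
  state = new Map<string, Buffer>();

  createCompositeKey(indexName: string, attributes: string[]): string {
    return DELIMITER + indexName + DELIMITER + attributes.join(DELIMITER) + DELIMITER;
  }

  async putState(key: string, value: Buffer): Promise<void> {
    this.state.set(key, value);
  }
}

// Blind put: no read, so nothing lands in the transaction's read set.
// Two concurrent invocations with the same idempotencyKey write the same
// key and the same value, so validation has nothing to reject.
async function increment(stub: MockStub, idempotencyKey: string): Promise<void> {
  const key = stub.createCompositeKey('increment_operation', [idempotencyKey]);
  await stub.putState(key, Buffer.from('1'));
}
```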

And, surprise, surprise, in this case we avoid key collisions. Both smart contracts still try to put a value under the same key; however, there are no reads, and the value is the same, so Hyperledger Fabric can handle this situation.

The state is predictable, because the incrementer puts the same value (+1) every time, and in general you should expect the same value each time for idempotent operations. But there is a trap! In the case of concurrent modifications of the same key with different values, you have no control over which value will finally be saved. So the conclusion is: you can achieve idempotency in a key-value store by making the idempotency key a part of the object key, but you need to ensure that the same key and the same value are saved for the same idempotency key.

An idempotent operation (Photo by Asa Rodger on Unsplash)

Final notes

Be aware that this solution is eventually consistent. It is hard to implement business logic on top of it.

For example, consider the rule: stop incrementing when the value reaches 100. You have no guarantee that the maximal value will actually be 100. Since there are no key collisions, there is no reliable way to stop putting increment operations. When you call getValue(), you get the increment value as of the last block in the chain, not the current one. If increment() is called in the same block, getValue() returns obsolete data.

So far we have applied the no duplicated keys solution; however, it comes with expensive reads (the getValue() method that counts all operations) and eventual consistency (no key collision validation).

We can improve the read performance using a running total, i.e. we can have a single process that counts all operations older than a specific time offset (for example, older than an hour), removes them (or marks them as inactive, obsolete, etc.) and adds the calculated count to the incrementer state, kept under a separate key. In this case we improve the availability of the reads, but still, it won’t be the most recent value.

The other two approaches, put requests in a queue and batching, are somewhat similar. In both cases you can have strong consistency of the data, and it is safe to implement business logic on top of it. Furthermore, in both cases you need to be careful with the keys on the client side and invoke potentially conflicting smart contracts from a single thread. However, when you start batching, you get two major improvements:

  1. The throughput increases dramatically.
  2. Less information about the chaincode logic is required on the client side.

Still, you need to figure out how to preserve idempotency and how to track intermediate state changes. There are quite a few new challenges to be faced. You can read about them in the second part of the series, dedicated to batching.

The third article in the series is about the running total, with a more complex example of transferring assets.

Learn how to simplify your work with Hyperledger Fabric using Fablo open source project.
