Coordination between multiple microservices
We have always offered Master’s thesis opportunities to university students to support their academic and professional qualifications. More than 40 Master’s theses have been completed for Vertex over the years. The research results are implemented in our product development to provide cutting-edge software solutions for our customers. In the “My Master’s Thesis Journey” blog series, our young professionals talk about their Master’s theses and what they have learned and accomplished along the way.
I started my journey at Vertex Systems in 2020 as a summer trainee and continued working part-time while heading toward the end of my studies. Finally, at the beginning of 2022, it was time to start the thesis project I had long been waiting for. As I had been developing Vertex Sync for almost my entire time at the company, I wanted a topic related to Sync’s technologies or possible future solutions. Fortunately, I was able to shape the topic so that it was interesting to me and also beneficial to Vertex.
Together with our CTO and Sync’s architect, I started to think about a possible topic, which after a few twists and turns took the form of “Coordination between multiple microservices”. My original idea was to build a technical implementation or proof of concept related to the topic, but after talking with my university supervisor, I moved in a more theoretical direction, which fortunately also suited Vertex. In the end, the goal of my work was to conduct a systematic mapping study on managing transactions between multiple microservices. The task seemed quite overwhelming at first, as this was my first attempt at a systematic review, but thanks to the great advice from my supervisors, I got up to speed quite fast.
Microservice architecture brings multiple benefits, such as independent deployment and the possibility to choose the most suitable technologies for each service. However, as with every architectural pattern, there are also disadvantages. One of them is possible issues with data consistency, especially when the independence of services is improved by using the database-per-service pattern. When each service owns its database, data that needs to be updated in several databases in a coordinated way can no longer be protected by a traditional database transaction.
Fortunately, multiple design patterns can help with this problem. To achieve transactional guarantees similar to those of database transactions, distributed transaction protocols such as two-phase commit (2PC) can be used. In the first phase, the coordinator asks the participating services to prepare for the transaction, which includes locking the necessary resources. If all participants can take part, the coordinator sends a commit message in the second phase. If any of the services cannot take part, the coordinator sends a rollback message to the participants that had already prepared, as seen in figure 1.
Figure 1: Phases of two-phase commit *
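To make the two phases more concrete, here is a minimal in-memory sketch in Python. It is only an illustration: the `Participant` class, its `can_prepare` flag and the service names are hypothetical, and a real implementation would also have to handle network failures, timeouts and coordinator crashes.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    """Hypothetical service taking part in the transaction."""
    name: str
    can_prepare: bool = True  # assumption: stands in for "resources could be locked"
    state: str = "idle"

    def prepare(self) -> bool:
        # Phase 1: lock the necessary resources and vote yes or no.
        self.state = "prepared" if self.can_prepare else "refused"
        return self.can_prepare

    def commit(self) -> None:
        self.state = "committed"

    def rollback(self) -> None:
        self.state = "rolled_back"


def two_phase_commit(participants: list[Participant]) -> bool:
    # Phase 1: the coordinator asks every participant to prepare.
    votes = [(p, p.prepare()) for p in participants]

    # Phase 2: commit only if everyone voted yes.
    if all(vote for _, vote in votes):
        for p in participants:
            p.commit()
        return True

    # Otherwise roll back the participants that had already prepared.
    for p, vote in votes:
        if vote:
            p.rollback()
    return False
```

For example, calling `two_phase_commit` with an `orders` participant that can prepare and a `payments` participant created with `can_prepare=False` returns `False` and leaves `orders` in the `rolled_back` state.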
However, this is usually not the preferred solution in microservice architecture, because of performance issues and a trade-off in which the availability of the system decreases when strict consistency is sought. My mapping study showed that developers are willing to relax consistency to achieve higher availability and better performance. For this reason, the saga pattern appeared to be the most prominent solution for managing coordination.
In the saga pattern, coordination is divided into smaller, microservice-level sub-transactions that run in sequence to reach the desired outcome. The objective of the saga pattern is either to finish all sub-transactions successfully or to return to the initial state with reverse operations, as seen in figure 2. Since the participating services do not all need to be available at the same time for the coordination to succeed, availability and performance are better than with strict consistency protocols. Of course, this pattern has trade-offs as well: it offers only eventual consistency, and it lacks isolation, which makes intermediate states visible during the process and creates a risk of lost updates.
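The idea of sub-transactions and reverse operations can be sketched in a few lines of Python. This is a simplified, single-process illustration rather than a production design: the `run_saga` function, the step names and the failing shipping step are all hypothetical, and real sagas exchange messages between services instead of calling local functions.

```python
from typing import Callable

# Each step pairs a sub-transaction with the reverse operation that undoes it.
Step = tuple[Callable[[], None], Callable[[], None]]


def run_saga(steps: list[Step]) -> bool:
    completed: list[Callable[[], None]] = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            # A sub-transaction failed: undo the finished ones in reverse order.
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True


log: list[str] = []


def fail_shipping() -> None:
    # Hypothetical failure in the last sub-transaction.
    raise RuntimeError("shipping service unavailable")


ok = run_saga([
    (lambda: log.append("reserve stock"), lambda: log.append("release stock")),
    (lambda: log.append("charge card"), lambda: log.append("refund card")),
    (fail_shipping, lambda: log.append("cancel shipping")),
])
# ok is False and log is now:
# ["reserve stock", "charge card", "refund card", "release stock"]
```

Note that between "charge card" and "refund card" other services could already see the charged state, which is exactly the missing isolation mentioned above.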
There are multiple ways to manage the problematic parts of the saga pattern. For example, the lack of isolation can be managed with semantic locking or by rereading values before updating them, but as always, the implementation context should be taken into account.
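As one possible reading of "rereading values before updates", the sketch below uses a version number to detect whether a record changed since the saga last read it. The `Record` class, `StaleReadError` and `update_if_unchanged` are hypothetical names for this illustration, and the in-memory check is not atomic; a real datastore would need to perform the comparison and the write as a single atomic operation.

```python
from dataclasses import dataclass


@dataclass
class Record:
    value: int
    version: int = 0  # incremented on every successful write


class StaleReadError(Exception):
    """Raised when the record changed after the saga read it."""


def update_if_unchanged(record: Record, expected_version: int, new_value: int) -> None:
    # Reread the version right before writing; refuse the write if another
    # saga modified the record in the meantime (this prevents a lost update).
    if record.version != expected_version:
        raise StaleReadError(
            f"expected version {expected_version}, found {record.version}"
        )
    record.value = new_value
    record.version += 1
```

A saga step that catches `StaleReadError` can then decide whether to retry with fresh data or to start compensating.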
To summarize, the saga pattern might be the way to go if strict consistency is not required. When it is required, developers should turn to patterns such as 2PC or Try/Cancel-Confirm, which sits somewhere between the saga pattern and 2PC in terms of consistency. Another possibility is to eliminate the need for coordination altogether, for example by using a shared database. As can be seen, there is no “silver bullet” for the problem: the most suitable solution depends on the requirements, which is why it is important to know which trade-offs are worth making in different situations.
Even though the thesis was purely theoretical, it should help our software architects find suitable patterns whenever a situation calls for coordination between microservices.
In my thesis, I go into much more detail about each design pattern and how it can be implemented, to give a more comprehensive overview of each solution. I also review the current state of the research and its future directions to find out whether any promising new solutions have emerged in the field.
* Figures are modified from HERE