Using Software Transactional Memory in Quarkus

Software Transactional Memory (STM) has been around in research environments since the late 1990s and has only relatively recently started to appear in products and various programming languages. We won’t go into all the details behind STM, but the interested reader can look at this paper. Suffice it to say that STM offers an approach to developing transactional applications in a highly concurrent environment with some of the same characteristics as ACID transactions, which you’ve probably already used through JTA. Importantly, though, the Durability property is relaxed (removed) within STM implementations, or at least made optional. This is not the situation with JTA, where state changes are made durable to a relational database which supports the X/Open XA standard.

Note that the STM implementation provided by Quarkus is based on the Narayana STM implementation. This document isn’t meant to be a replacement for that project’s documentation, so you may want to look at it for more detail. However, we will try to focus more on how you can combine some key capabilities into Quarkus when developing Kubernetes native applications and microservices.

Why use STM with Quarkus?

Now you may still be asking yourself "Why STM instead of JTA?" or "What are the benefits of STM that I don’t get from JTA?" Let’s try to answer those and similar questions, with a particular focus on why we think STM is a great fit for Quarkus, microservices and Kubernetes native applications. So, in no specific order:

  • The goal of STM is to simplify object reads and writes from multiple threads and to protect state from concurrent updates. The Quarkus STM implementation will safely manage any conflicts between these threads using whatever isolation model has been chosen to protect that specific state instance (an object, in the case of Quarkus). Quarkus STM has two isolation implementations. The pessimistic approach (the default) causes conflicting threads to block until the original thread has completed its updates (committed or aborted the transaction). The optimistic approach allows all the threads to proceed and checks for conflicts at commit time, where one or more of the threads may be forced to abort if there have been conflicting updates.

  • STM objects have state, but it doesn’t need to be persistent (durable). In fact the default behaviour is for objects managed within transactional memory to be volatile, such that if the service or microservice within which they are being used crashes or is spawned elsewhere, e.g., by a scheduler, all state in memory is lost and the objects start from scratch. But surely you get this and more with JTA (and a suitable transactional datastore) and don’t need to worry about restarting your application? Not quite. There’s a trade-off here: we’re doing away with persistent state and the overhead of reading from and then writing (and sync-ing) to the datastore during each transaction. This makes updates to (volatile) state very fast, but you still get the benefits of atomic updates across multiple STM objects (e.g., objects your team wrote then calling objects you inherited from another team and requiring them to make all-or-nothing updates), as well as consistency and isolation in the presence of concurrent threads/users (common in distributed microservices architectures). Furthermore, not all stateful applications need to be durable - even when JTA transactions are used, it tends to be the exception and not the rule. And as you’ll see later, because applications can optionally start and control transactions, it’s possible to build microservices which can undo state changes and try alternative paths.

  • Another benefit of STM is composability and modularity. You can write concurrent Quarkus objects/services that can be easily composed with any other services built using STM, without exposing the details of how the objects/services are implemented. As we discussed earlier, the ability to compose objects you wrote with objects other teams may have written weeks, months or years earlier, while retaining the A, C and I properties, can be hugely beneficial. Furthermore, some STM implementations, including the one Quarkus uses, support nested transactions, which allow changes made within the context of a nested (sub) transaction to later be rolled back by the parent transaction.

  • Although the default for STM object state is volatile, it is possible to configure the STM implementation such that an object’s state is durable. Although it’s possible to configure Narayana such that different backend datastores can be used, including relational databases, the default is the local operating system file system, which means you don’t need to configure anything else with Quarkus such as a database.

  • Many STM implementations allow "plain old language objects" to be made STM-aware with little or no change to the application code. You can build, test and deploy applications without making them STM-aware, and then add those capabilities later if they become necessary, without much development overhead at all.
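
To make the optimistic model from the first bullet more concrete, here is a minimal sketch of commit-time conflict detection, written with the Java standard library only. It is a toy model and not Narayana's implementation or API: `VersionedCell`, `Snapshot`, `read` and `tryCommit` are hypothetical names invented for this illustration. Each "transaction" remembers the snapshot it read, and its commit succeeds only if nobody else has committed in the meantime.

```java
import java.util.concurrent.atomic.AtomicReference;

// Toy model only: one versioned cell with commit-time validation.
// This is NOT Narayana's implementation or API.
final class VersionedCell {
    // Immutable pairing of a value with the version that produced it.
    static final class Snapshot {
        final int value;
        final long version;
        Snapshot(int value, long version) { this.value = value; this.version = version; }
    }

    private final AtomicReference<Snapshot> current =
            new AtomicReference<>(new Snapshot(0, 0L));

    // A "transaction" starts by remembering the snapshot it read.
    Snapshot read() { return current.get(); }

    // Commit succeeds only if no other commit happened since 'seen' was read.
    boolean tryCommit(Snapshot seen, int newValue) {
        return current.compareAndSet(seen, new Snapshot(newValue, seen.version + 1));
    }
}

final class OptimisticDemo {
    public static void main(String[] args) {
        VersionedCell cell = new VersionedCell();
        VersionedCell.Snapshot t1 = cell.read(); // first transaction reads
        VersionedCell.Snapshot t2 = cell.read(); // second transaction reads the same version
        boolean first = cell.tryCommit(t1, t1.value + 1);  // commits
        boolean second = cell.tryCommit(t2, t2.value + 1); // conflict detected: must abort or retry
        System.out.println(first + " " + second + " " + cell.read().value); // prints "true false 1"
    }
}
```

A pessimistic scheme would instead make the second transaction wait on a lock until the first one finished, so the conflict would never be observed at commit time.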

Building STM applications

There is also a fully worked example in the quickstarts which you may access by cloning the Git repository: git clone {quickstarts-clone-url}, or by downloading an {quickstarts-archive-url}[archive]. Look for the software-transactional-memory-quickstart example. This will help you understand how you can build STM-aware applications with Quarkus. Before looking at the example, however, there are a few basic concepts which we need to cover.

Note, as you will see, STM in Quarkus relies on a number of annotations to define behaviours. The lack of these annotations causes sensible defaults to be assumed, but it is important for the developer to understand what these may be. Please refer to the Narayana STM manual and the STM annotations guide for more details on all the annotations Narayana STM provides.

Setting it up

To use the extension include it as a dependency in your build file:

pom.xml
<dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-narayana-stm</artifactId>
</dependency>

build.gradle
implementation("io.quarkus:quarkus-narayana-stm")

Defining STM-aware classes

In order for the STM subsystem to know which classes are to be managed within the context of transactional memory, it is necessary to provide a minimal level of instrumentation. This is done by categorising STM-aware and STM-unaware classes through an interface boundary; specifically, all STM-aware objects must be instances of classes which inherit from interfaces that themselves have been annotated to identify them as STM-aware. Any other objects (and their classes) which do not follow this rule will not be managed by the STM subsystem and hence their state changes will not, for example, be rolled back.

The specific annotation that STM-aware application interfaces must use is org.jboss.stm.annotations.Transactional. For example:

import org.jboss.stm.annotations.Transactional;

@Transactional
public interface FlightService {
    int getNumberOfBookings();
    void makeBooking(String details);
}

Classes which implement this interface are able to use additional annotations from Narayana to tell the STM subsystem about things such as whether a method will modify the state of the object, or what state variables within the class should be managed transactionally, e.g., some instance variables may not need to be rolled back if a transaction aborts. As mentioned earlier, if those annotations are not present then defaults are chosen to guarantee safety, such as assuming all methods will modify state.

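import org.jboss.stm.annotations.NotState;
import org.jboss.stm.annotations.ReadLock;

public class FlightServiceImpl implements FlightService {
    @ReadLock // read-only: does not modify the object's state
    public int getNumberOfBookings() { ... }

    // no annotation: assumed to modify state (the safe default)
    public void makeBooking(String details) { ... }

    @NotState // not managed by STM: never rolled back
    private int timesCalled;
}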

For example, by using the @ReadLock annotation on the getNumberOfBookings method, we are able to tell the STM subsystem that no state modifications will occur in this object when it is used in the transactional memory. Also, the @NotState annotation tells the system to ignore timesCalled when transactions commit or abort, so this value only changes due to application code.
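
The effect of excluding a field from transactional state can be illustrated with a small hand-written model. The sketch below uses only the Java standard library and does by hand what the text says the STM subsystem does for you; `ToyFlightService` and its `begin`, `commit` and `abort` methods are hypothetical names for this illustration and are not part of Narayana. The `bookings` field is restored on abort, while `timesCalled` plays the role of a `@NotState` field and is left alone.

```java
// Toy model only: hand-written snapshot and restore.
// It imitates the behaviour described in the text; it is NOT the Narayana API.
final class ToyFlightService {
    private int bookings;      // transactional state: restored on abort
    private int timesCalled;   // plays the role of a @NotState field: never restored
    private int savedBookings; // snapshot taken when the toy transaction begins

    void begin()  { savedBookings = bookings; }   // remember the state to roll back to
    void commit() { /* nothing to do: keep the current state */ }
    void abort()  { bookings = savedBookings; }   // undo changes to transactional state only

    void makeBooking(String details) {
        timesCalled++; // survives an abort
        bookings++;    // rolled back by an abort
    }

    int getNumberOfBookings() { return bookings; } // read-only, like a @ReadLock method
    int getTimesCalled() { return timesCalled; }
}

final class NotStateDemo {
    public static void main(String[] args) {
        ToyFlightService service = new ToyFlightService();
        service.begin();
        service.makeBooking("BA123");
        service.makeBooking("BA456");
        service.abort();
        // Bookings are rolled back, the call counter is not.
        System.out.println(service.getNumberOfBookings() + " " + service.getTimesCalled()); // prints "0 2"
    }
}
```

This model has no locking or isolation at all; it only shows which fields an abort touches.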

Please refer to the Narayana guide for details of how to exert finer grained control over the transactional behaviour of objects that implement interfaces marked with the @Transactional annotation.

Creating STM objects

The STM subsystem needs to be told about which objects it should be managing. The Quarkus (aka Narayana) STM implementation does this by providing containers of transactional memory within which these object instances reside. Until an object is placed within one of these STM containers it cannot be managed within transactions and any state changes will not possess the A, C, I (or even D) properties.

Note, the term "container" was defined within the STM implementation years before Linux containers came along. It may be confusing to use especially in a Kubernetes native environment such as Quarkus, but hopefully the reader can do the mental mapping.

The default STM container (org.jboss.stm.Container) provides support for volatile objects that can only be shared between threads in the same microservice/JVM instance. When an STM-aware object is placed into the container it returns a handle through which that object should then be used in the future. It is important to use this handle as continuing to access the object through the original reference will not allow the STM subsystem to track access and manage state and concurrency control.

    import org.jboss.stm.Container;

    ...

    Container<FlightService> container = new Container<>(); // (1)
    FlightServiceImpl instance = new FlightServiceImpl(); // (2)
    FlightService flightServiceProxy = container.create(instance); // (3)

(1) You need to tell each Container about the type of objects for which it will be responsible. In this example it will be instances that implement the FlightService interface.
(2) Then you create an instance that implements FlightService. You should not use it directly at this stage because access to it is not being managed by the STM subsystem.
(3) To obtain a managed instance, pass the original object to the STM container which then returns a reference through which you will be able to perform transactional operations. This reference can be used safely from multiple threads.
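
Why the handle matters can be shown with a plain Java dynamic proxy. The following is a sketch under stated assumptions, not a description of Narayana's internals: `ToyContainer`, `Tally` and `PlainTally` are hypothetical names, and the handler merely counts calls where a real STM would acquire locks and record state. Calls made through the handle are seen by the container; calls made through the original reference bypass it entirely.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.concurrent.atomic.AtomicInteger;

interface Tally {
    void increment();
    int get();
}

final class PlainTally implements Tally {
    private int value;
    public void increment() { value++; }
    public int get() { return value; }
}

// Toy container: wraps an object in a proxy so that every call can be observed.
final class ToyContainer {
    final AtomicInteger intercepted = new AtomicInteger();

    @SuppressWarnings("unchecked")
    <T> T create(Class<T> iface, T target) {
        InvocationHandler handler = (proxy, method, args) -> {
            intercepted.incrementAndGet();      // a real STM would lock and snapshot here
            return method.invoke(target, args); // then delegate to the real object
        };
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] { iface }, handler);
    }
}

final class HandleDemo {
    public static void main(String[] args) {
        ToyContainer container = new ToyContainer();
        PlainTally raw = new PlainTally();
        Tally handle = container.create(Tally.class, raw);

        handle.increment(); // observed by the container
        raw.increment();    // invisible to the container
        int value = handle.get(); // observed by the container
        System.out.println(value + " " + container.intercepted.get()); // prints "2 2"
    }
}
```

The object was modified twice, but the container only saw one of the two updates (plus the read), which is exactly the situation the text warns about when the original reference keeps being used.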

Defining transaction boundaries

Once an object is placed within an STM container the application developer can manage the scope of transactions within which it is used. There are some annotations which can be applied to the STM-aware class to have the container automatically create a transaction whenever a specific method is invoked.

Declarative approach

If the @NestedTopLevel or @Nested annotation is placed on a method signature then the STM container will start a new transaction when that method is invoked and attempt to commit it when the method returns. If there is a transaction already associated with the calling thread then the two annotations behave slightly differently. The former will always create a new top-level transaction within which the method will execute, so the enclosing transaction does not behave as a parent, i.e., the nested top-level transaction will commit or abort independently. The latter will create a transaction which is properly nested within the calling transaction, i.e., the calling transaction acts as the parent of the newly created transaction.
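
The difference between the two behaviours can be sketched with a toy undo log, using only the Java standard library. This is an illustration of the semantics described above and not Narayana code: `ToyTx`, `topLevel`, `nested` and `put` are hypothetical names, and the model has no locking or isolation. A nested commit is provisional because its undo records are handed to the parent, whereas an independent top-level commit is final.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Toy model only: transactions over a shared map, with an undo log per transaction.
final class ToyTx {
    private final ToyTx parent; // null for a top-level transaction
    private final Map<String, Integer> store;
    private final Deque<Runnable> undoLog = new ArrayDeque<>(); // most recent undo first

    private ToyTx(ToyTx parent, Map<String, Integer> store) {
        this.parent = parent;
        this.store = store;
    }

    static ToyTx topLevel(Map<String, Integer> store) { return new ToyTx(null, store); }

    ToyTx nested() { return new ToyTx(this, store); }

    void put(String key, int value) {
        final boolean existed = store.containsKey(key);
        final Integer old = store.get(key);
        // Record how to restore the previous state before changing it.
        undoLog.push(() -> { if (existed) store.put(key, old); else store.remove(key); });
        store.put(key, value);
    }

    void commit() {
        if (parent != null) {
            // Nested commit is provisional: the parent inherits the undo records,
            // oldest first, so the most recent change ends up on top of its log.
            undoLog.descendingIterator().forEachRemaining(parent.undoLog::push);
        }
        undoLog.clear(); // for a top-level transaction the changes are now final
    }

    void abort() {
        while (!undoLog.isEmpty()) undoLog.pop().run(); // undo in reverse order
    }
}

final class NestingDemo {
    public static void main(String[] args) {
        Map<String, Integer> store = new HashMap<>();
        ToyTx outer = ToyTx.topLevel(store);
        outer.put("flight", 1);

        ToyTx inner = outer.nested();              // behaves like the @Nested case
        inner.put("taxi", 1);
        inner.commit();                            // provisional on the outer transaction

        ToyTx independent = ToyTx.topLevel(store); // behaves like the @NestedTopLevel case
        independent.put("audit", 1);
        independent.commit();                      // final, regardless of the outer transaction

        outer.abort();                             // undoes "flight" and "taxi", not "audit"
        System.out.println(store);                 // prints "{audit=1}"
    }
}
```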

Programmatic approach

The application can programmatically start a transaction before accessing the methods of STM objects:

import com.arjuna.ats.arjuna.AtomicAction;

...

AtomicAction aa = new AtomicAction(); // (1)

aa.begin(); // (2)
try {
    flightService.makeBooking("BA123 ...");
    taxiService.makeBooking("East Coast Taxis ..."); // (3)
    // (4)
    aa.commit();
    // (5)
} catch (Exception e) {
    aa.abort(); // (6)
}

(1) An object for manually controlling transaction boundaries (AtomicAction and many other useful classes are included in the extension). Refer to the javadoc for more detail.
(2) Programmatically begin a transaction.
(3) Notice that object updates can be composed, which means that updates to multiple objects can be committed together as a single action. (Note that it is also possible to begin nested transactions so that you can perform speculative work which may then be abandoned without abandoning other work performed by the outer transaction.)
(4) Since the transaction has not yet been committed, the changes made by the flight and taxi services are not visible outside the transaction.
(5) Since the commit was successful, the changes made by the flight and taxi services are now visible to other threads. Note that other transactions that relied on the old state may or may not now incur conflicts when they commit (the STM library provides a number of features for managing conflicting behaviour and these are covered in the Narayana STM manual).
(6) Programmatically decide to abort the transaction, which means that the changes made by the flight and taxi services are discarded.

Distributed transactions

Sharing a transaction between multiple services is possible but is currently an advanced use case only and the Narayana documentation should be consulted if this behaviour is required. In particular, STM does not yet support the features described in the Context Propagation guide.