Native Reference Guide
This guide is a companion to the Building a Native Executable, Using SSL With Native Images, and Writing Native Applications guides. It explores advanced topics that help users diagnose issues, increase reliability, and improve the runtime performance of native executables. These are the high-level sections found in this guide:
Native Memory Management
Memory management for Quarkus native executables is enabled by GraalVM’s SubstrateVM runtime system.
For detailed explanations about the memory management component in GraalVM, see the GraalVM Memory Management guide.
This guide complements the information available on the GraalVM website with further observations that are particularly relevant to Quarkus applications.
Garbage Collectors
The garbage collectors available for Quarkus users are currently Serial GC and Epsilon GC.
Serial GC
Serial GC, the default option in GraalVM and Quarkus, is a single-threaded, non-concurrent GC, just like HotSpot’s Serial GC. The implementation in GraalVM, however, is different from the HotSpot one, and there can be significant differences in runtime behavior.
One of the key differences between HotSpot’s Serial GC and GraalVM’s Serial GC is the way they perform full GC cycles. In HotSpot the algorithm used is mark-sweep-compact, whereas in GraalVM it is mark-copy. Both need to traverse all live objects, but in mark-copy this traversal is also used to copy live objects to a secondary space or semi-space. As objects are copied from one semi-space to another, they are also compacted. In mark-sweep-compact, the compacting requires a second pass over the live objects. This makes full GCs in mark-copy more time-efficient (in terms of time spent in each GC cycle) than in mark-sweep-compact. The tradeoff mark-copy makes in order to keep individual full GC cycles shorter is space. The use of semi-spaces means that for an application to maintain the same GC performance that mark-sweep-compact achieves (in terms of allocated MB per second), it requires double the amount of memory.
GC Collection Policy
GraalVM’s Serial GC implementation offers a choice between two different collection policies: the default, called "adaptive", and an alternative called "space/time".
The "adaptive" collection policy is based on HotSpot’s ParallelGC adaptive size policy. The main difference from HotSpot is GraalVM’s focus on memory footprint: GraalVM’s adaptive GC policy tries to trigger GCs aggressively in order to keep memory consumption down.
Up to version 2.13, Quarkus used the "space/time" GC collection policy by default, but starting with version 2.14 it switched to the "adaptive" policy. The reason Quarkus initially chose "space/time" is that at the time it offered considerable performance improvements over "adaptive". Recent performance experiments, however, indicate that "space/time" can result in a worse out-of-the-box experience, while the benefits it used to offer have diminished considerably after improvements made to the "adaptive" policy. As a result, "adaptive" appears to be the best option for most, if not all, Quarkus applications. Full details on this switch can be read in this issue.
It is still possible to change the GC collection policy using GraalVM’s -H:InitialCollectionPolicy flag. Switching to the "space/time" policy can be done by passing the following via the command line:
-Dquarkus.native.additional-build-args=-H:InitialCollectionPolicy=com.oracle.svm.core.genscavenge.CollectionPolicy\$BySpaceAndTime
Or by adding this to the application.properties file:
quarkus.native.additional-build-args=-H:InitialCollectionPolicy=com.oracle.svm.core.genscavenge.CollectionPolicy$BySpaceAndTime
Note that escaping the $ character (\$) is required when the policy is passed via the command line in Bash, as shown above.
Epsilon GC
Epsilon GC is a no-op garbage collector that does not do any memory reclamation. From a Quarkus perspective, some of the most relevant use cases for this garbage collector are extremely short-lived jobs, e.g. serverless functions. To build a Quarkus native executable with Epsilon GC, pass the following argument at build time:
-Dquarkus.native.additional-build-args=--gc=epsilon
Memory Management Options
For information about options to control maximum heap size, young space, and other typical use cases found in the JVM, see the GraalVM Memory Management guide. Setting the maximum heap size, either as a percentage or an explicit value, is generally recommended.
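For illustration, this is what setting the maximum heap size might look like at run time with GraalVM’s Serial GC runtime options (the exact set of supported flags may vary across GraalVM/Mandrel versions):

# Explicit maximum heap size, same syntax as HotSpot's -Xmx
$ quarkus-project-0.1-SNAPSHOT-runner -Xmx64m
# Maximum heap size as a percentage of physical memory
$ quarkus-project-0.1-SNAPSHOT-runner -XX:MaximumHeapSizePercent=50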
GC Logging
Multiple options exist to print information about garbage collection cycles, depending on the level of detail required. The minimum detail is provided by -XX:+PrintGC, which prints a message for each GC cycle that occurs:
$ quarkus-project-0.1-SNAPSHOT-runner -XX:+PrintGC -Xmx64m
...
[Incremental GC (CollectOnAllocation) 20480K->11264K, 0.0003223 secs]
[Full GC (CollectOnAllocation) 19456K->5120K, 0.0031067 secs]
When you combine this option with -XX:+VerboseGC, you still get a message per GC cycle, but it contains extra information. Also, adding this option shows the sizing decisions made by the GC algorithm at startup:
$ quarkus-project-0.1-SNAPSHOT-runner -XX:+PrintGC -XX:+VerboseGC -Xmx64m
[Heap policy parameters:
YoungGenerationSize: 25165824
MaximumHeapSize: 67108864
MinimumHeapSize: 33554432
AlignedChunkSize: 1048576
LargeArrayThreshold: 131072]
...
[[5378479783321 GC: before epoch: 8 cause: CollectOnAllocation]
[Incremental GC (CollectOnAllocation) 16384K->9216K, 0.0003847 secs]
[5378480179046 GC: after epoch: 8 cause: CollectOnAllocation policy: adaptive type: incremental
collection time: 384755 nanoSeconds]]
[[5379294042918 GC: before epoch: 9 cause: CollectOnAllocation]
[Full GC (CollectOnAllocation) 17408K->5120K, 0.0030556 secs]
[5379297109195 GC: after epoch: 9 cause: CollectOnAllocation policy: adaptive type: complete
collection time: 3055697 nanoSeconds]]
Beyond these two options, -XX:+PrintHeapShape and -XX:+TraceHeapChunks provide even lower-level details about the memory chunks on top of which the different memory regions are constructed.
The most up-to-date information on GC logging flags can be obtained by printing the list of flags that can be passed to native executables:
$ quarkus-project-0.1-SNAPSHOT-runner -XX:PrintFlags=
...
-XX:±PrintGC Print summary GC information after each collection. Default: - (disabled).
-XX:±PrintGCSummary Print summary GC information after application main method returns. Default: - (disabled).
-XX:±PrintGCTimeStamps Print a time stamp at each collection, if +PrintGC or +VerboseGC. Default: - (disabled).
-XX:±PrintGCTimes Print the time for each of the phases of each collection, if +VerboseGC. Default: - (disabled).
-XX:±PrintHeapShape Print the shape of the heap before and after each collection, if +VerboseGC. Default: - (disabled).
...
-XX:±TraceHeapChunks Trace heap chunks during collections, if +VerboseGC and +PrintHeapShape. Default: - (disabled).
-XX:±VerboseGC Print more information about the heap before and after each collection. Default: - (disabled).
Resident Set Size (RSS)
As described in the Measuring Performance guide, the footprint of Quarkus applications is measured using the resident set size (RSS). This is also applicable to native applications, but in this case the runtime engine that manages the footprint is built into the native executable itself rather than being the JVM.
The reporting techniques specified in the Measuring Performance guide are applicable to native applications too, but what causes the RSS to be higher or lower is specific to how the generated native executables work.
When the RSS is higher in one native version of the application versus another, the following checks should be carried out first:
- Check the native reports and see if there are big discrepancies in the number of used packages, used classes or used methods. A bigger universe will result in a bigger memory footprint.
- Check the size of the binary for differences. Using readelf you can observe the size of different sections and compare them. The .text section, where code lives, and the .svm_heap section, where the heap produced at build time lives, are particularly interesting.
- Generate heap dumps and inspect them with tools such as VisualVM or Eclipse MAT (see the sketch after this list).
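As an example of the last check, a heap dump can be obtained from a native executable at run time, assuming the executable was built with heap dump support; the --enable-monitoring=heapdump build flag and the SIGUSR1 trigger shown here are available in recent Mandrel/GraalVM versions:

$ mvn package -DskipTests -Dnative \
    -Dquarkus.native.additional-build-args=--enable-monitoring=heapdump

# While the application is running, request a heap dump; an .hprof file
# is written to the working directory and can be opened with VisualVM
# or Eclipse MAT
$ kill -SIGUSR1 $(pidof code-with-quarkus-1.0.0-SNAPSHOT-runner)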
Often, profiling, instrumenting, or tracing applications is the best way to figure out how things work. In the case of RSS and native applications, the techniques that Brendan Gregg explains in the "Memory Leak (and Growth) Flame Graphs" guide are particularly useful. This section will apply the information in that article to show how to use perf and bcc/eBPF to understand what causes Quarkus native executables to consume memory on startup.
Perf
perf works on older Linux systems, whereas eBPF requires a newer Linux kernel. The overhead of perf is higher than that of eBPF, but it can understand stack traces generated with DWARF debug symbols, which eBPF can’t.
In the context of GraalVM, DWARF stack traces contain more detail and are easier to understand than those generated with frame pointers. As a first step, build a Quarkus native executable with debug info enabled and a couple of extra flags: one to disable optimizations, and another to avoid inlined methods being omitted from the stack traces. These two flags are added to obtain stack traces that contain as much information as possible.
$ mvn package -DskipTests -Dnative \
-Dquarkus.native.debug.enabled \
-Dquarkus.native.additional-build-args=-O0,-H:-OmitInlinedMethodDebugLineInfo
Disabling optimizations makes it easier to learn how to use perf and to get stack traces that are as detailed as possible, because it shows more information about what gets called where. However, doing so might lead to allocations that would not happen if optimizations were applied. In other words, passing in -O0 will change the allocation patterns of many applications, because it disables optimizations such as escape analysis and dead code elimination. To properly assess the allocations made by an application deployed in production, run with the default optimizations (-O2). With default optimizations, the stack traces obtained with perf may be harder to decipher.
Let’s measure how much RSS a Quarkus native executable takes on startup in this particular environment:
$ ps -o pid,rss,command -p $(pidof code-with-quarkus-1.0.0-SNAPSHOT-runner)
PID RSS COMMAND
1915 35472 ./target/code-with-quarkus-1.0.0-SNAPSHOT-runner -Xmx128m
How come this Quarkus native executable consumes ~35MB RSS on startup? To get an understanding of this number, this section will use perf to trace calls to syscalls:sys_enter_mmap. Assuming the default GraalVM Serial Garbage Collector is in use, this system call is particularly interesting for native executables generated by GraalVM’s native-image because of how they allocate heap. In native executables generated by GraalVM’s native-image, the heap is allocated using either aligned or unaligned heap chunks. All non-array objects get allocated in thread-local aligned chunks. Each of these is 1MB in size by default. For arrays, if they are bigger than 1/8 of the aligned chunk size, they will be allocated in unaligned heap chunks, whose size depends on the object itself. The very first time a thread allocates an object or small array, it requests an aligned heap chunk that it will use exclusively until it has run out of space in that chunk, in which case it requests another aligned heap chunk. So by tracing these system calls, the code paths that end up requesting new aligned or unaligned heap chunks will be recorded. Next, run the Quarkus native executable through perf record, tracing the mmap system call:
$ sudo perf record -e syscalls:sys_enter_mmap --call-graph dwarf -a -- target/code-with-quarkus-1.0.0-SNAPSHOT-runner -Xmx128m
The size of the aligned heap chunks can be changed during native build time; a custom value (in number of bytes) can be passed via a dedicated native-image flag.
Once the startup completes, stop the process and generate the stacks:
$ perf script > out.stacks
As a final step, use the generated stacks to produce a flamegraph:
$ export FG_HOME=...
$ ${FG_HOME}/stackcollapse-perf.pl < out.stacks | ${FG_HOME}/flamegraph.pl \
--color=mem --title="mmap Flame Graph" --countname="calls" > out.svg
The flamegraph should look similar to this:
There are several things of interest to notice there:
First, the stack traces that contain method calls to com.oracle.svm.core.genscavenge.ThreadLocalAllocation are related to the aligned or unaligned heap chunk allocations explained above. As noted earlier, for the majority of allocations these chunks will be 1MB by default, so they are interesting because each allocated chunk has a considerable effect on RSS consumption.
Second, of the thread allocation stacks, the ones under start_thread are particularly revealing. In this environment, taking into account the -Xmx value passed in, Quarkus creates 12 event loop threads. Aside from those, there are 6 extra threads. The names of all 18 of these threads exceed 16 characters. This can be observed via the ps command:
$ ps -e -T | grep $(pidof code-with-quarkus-1.0.0-SNAPSHOT-runner)
2320 2320 pts/0 00:00:00 code-with-quark
2320 2321 pts/0 00:00:00 ference Handler
2320 2322 pts/0 00:00:00 gnal Dispatcher
2320 2324 pts/0 00:00:00 ecutor-thread-0
2320 2325 pts/0 00:00:00 -thread-checker
2320 2326 pts/0 00:00:00 ntloop-thread-0
2320 2327 pts/0 00:00:00 ntloop-thread-1
2320 2328 pts/0 00:00:00 ntloop-thread-2
2320 2329 pts/0 00:00:00 ntloop-thread-3
2320 2330 pts/0 00:00:00 ntloop-thread-4
2320 2331 pts/0 00:00:00 ntloop-thread-5
2320 2332 pts/0 00:00:00 ntloop-thread-6
2320 2333 pts/0 00:00:00 ntloop-thread-7
2320 2334 pts/0 00:00:00 ntloop-thread-8
2320 2335 pts/0 00:00:00 ntloop-thread-9
2320 2336 pts/0 00:00:00 tloop-thread-10
2320 2337 pts/0 00:00:00 tloop-thread-11
2320 2338 pts/0 00:00:00 ceptor-thread-0
The very first allocation that each of these threads makes is to take the thread name and trim it so that it falls within the character limit enforced by the kernel. For each of those allocations, there are 2 mmap calls: one to reserve the memory and the other to commit it. When recording the syscalls:sys_enter_mmap system call, the perf implementation tracks calls to __GI___mmap64. But this glibc __GI___mmap64 implementation makes another call into __GI___mmap64:
(gdb) break __GI___mmap64
(gdb) set scheduler-locking step
...
Thread 2 "code-with-quark" hit Breakpoint 1, __GI___mmap64 (offset=0, fd=-1, flags=16418, prot=0, len=2097152, addr=0x0) at ../sysdeps/unix/sysv/linux/mmap64.c:58
58 return (void *) MMAP_CALL (mmap, addr, len, prot, flags, fd, offset);
(gdb) bt
#0 __GI___mmap64 (offset=0, fd=-1, flags=16418, prot=0, len=2097152, addr=0x0) at ../sysdeps/unix/sysv/linux/mmap64.c:58
#1 __GI___mmap64 (addr=0x0, len=2097152, prot=0, flags=16418, fd=-1, offset=0) at ../sysdeps/unix/sysv/linux/mmap64.c:46
#2 0x00000000004f4033 in com.oracle.svm.core.posix.headers.Mman$NoTransitions::mmap (__0=<optimized out>, __1=<optimized out>, __2=<optimized out>, __3=<optimized out>, __4=<optimized out>, __5=<optimized out>)
#3 0x00000000004f194e in com.oracle.svm.core.posix.PosixVirtualMemoryProvider::reserve (this=0x7ffff7691220, nbytes=0x100000, alignment=0x100000, executable=false) at com/oracle/svm/core/posix/PosixVirtualMemoryProvider.java:126
#4 0x00000000004ef3b3 in com.oracle.svm.core.os.AbstractCommittedMemoryProvider::allocate (this=0x7ffff7658cb0, size=0x100000, alignment=0x100000, executable=false) at com/oracle/svm/core/os/AbstractCommittedMemoryProvider.java:124
#5 0x0000000000482f40 in com.oracle.svm.core.os.AbstractCommittedMemoryProvider::allocateAlignedChunk (this=0x7ffff7658cb0, nbytes=0x100000, alignment=0x100000) at com/oracle/svm/core/os/AbstractCommittedMemoryProvider.java:107
#6 com.oracle.svm.core.genscavenge.HeapChunkProvider::produceAlignedChunk (this=0x7ffff7444398) at com/oracle/svm/core/genscavenge/HeapChunkProvider.java:112
#7 0x0000000000489485 in com.oracle.svm.core.genscavenge.ThreadLocalAllocation::slowPathNewArrayLikeObject0 (hub=0x7ffff6ff6110, length=15, size=0x20, podReferenceMap=0x7ffff6700000) at com/oracle/svm/core/genscavenge/ThreadLocalAllocation.java:306
#8 0x0000000000489165 in com.oracle.svm.core.genscavenge.ThreadLocalAllocation::slowPathNewArrayLikeObject (objectHeader=0x8f6110 <io.smallrye.common.expression.ExpressionNode::toString+160>, length=15, podReferenceMap=0x7ffff6700000) at com/oracle/svm/core/genscavenge/ThreadLocalAllocation.java:279
#9 0x0000000000489066 in com.oracle.svm.core.genscavenge.ThreadLocalAllocation::slowPathNewArray (objectHeader=0x8f6110 <io.smallrye.common.expression.ExpressionNode::toString+160>, length=15) at com/oracle/svm/core/genscavenge/ThreadLocalAllocation.java:242
#10 0x0000000000d202a1 in java.util.Arrays::copyOfRange (original=0x7ffff6a33410, from=2, to=17) at java/util/Arrays.java:3819
#11 0x0000000000acf8e6 in java.lang.StringLatin1::newString (val=0x7ffff6a33410, index=2, len=15) at java/lang/StringLatin1.java:769
#12 0x0000000000acac59 in java.lang.String::substring (this=0x7ffff6dc0d48, beginIndex=2, endIndex=17) at java/lang/String.java:2712
#13 0x0000000000acaba2 in java.lang.String::substring (this=0x7ffff6dc0d48, beginIndex=2) at java/lang/String.java:2680
#14 0x00000000004f96cd in com.oracle.svm.core.posix.thread.PosixPlatformThreads::setNativeName (this=0x7ffff7658d10, thread=0x7ffff723fb30, name=0x7ffff6dc0d48) at com/oracle/svm/core/posix/thread/PosixPlatformThreads.java:163
#15 0x00000000004f9285 in com.oracle.svm.core.posix.thread.PosixPlatformThreads::beforeThreadRun (this=0x7ffff7658d10, thread=0x7ffff723fb30) at com/oracle/svm/core/posix/thread/PosixPlatformThreads.java:212
#16 0x00000000005237a2 in com.oracle.svm.core.thread.PlatformThreads::threadStartRoutine (threadHandle=0x1) at com/oracle/svm/core/thread/PlatformThreads.java:760
#17 0x00000000004f9627 in com.oracle.svm.core.posix.thread.PosixPlatformThreads::pthreadStartRoutine (data=0x2a06e20) at com/oracle/svm/core/posix/thread/PosixPlatformThreads.java:203
#18 0x0000000000462ab0 in com.oracle.svm.core.code.IsolateEnterStub::PosixPlatformThreads_pthreadStartRoutine_38d96cbc1a188a6051c29be1299afe681d67942e (__0=<optimized out>) at com/oracle/svm/core/code/IsolateEnterStub.java:1
#19 0x00007ffff7e4714d in start_thread (arg=<optimized out>) at pthread_create.c:442
#20 0x00007ffff7ec8950 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
This is how the above flamegraph shows a total of 72 calls to __GI___mmap64 for the thread-name abbreviation stack trace, given that the Quarkus native executable runs 18 threads (18 threads × 2 mmap calls each, with each call counted twice due to the nested __GI___mmap64 frames).
A third and final observation is that if you capture the syscalls:sys_enter_munmap event, you might observe that some allocations also result in calls to munmap. When calculating the size to reserve, the requested allocation size can be rounded up to the page size. To maintain alignment, 1MB in the case of aligned chunks or 1 byte for unaligned chunks, some of the reserved memory might be unreserved. That is where these munmap calls come from.
Just by looking at the flamegraph and counting the number of mmap calls that originate from thread-local allocations, you can get a rough idea of how much memory the aligned heap chunks contribute to RSS consumption.
bcc/eBPF
A version of bcc/eBPF that can capture stack traces is only available from Linux kernel 4.8 onwards. It can do in-kernel summaries, which makes it more efficient and gives it lower overhead. Unfortunately, it doesn’t understand DWARF debug symbols, so the information obtained might be harder to read and contain less detail.
bcc/eBPF is very extensible, so it is easier to tailor scripts to track specific metrics. The bcc project contains a stackcount program that can be used to count stack traces in a similar way to what perf did in the previous section. But in some cases, metrics other than the number of calls to a system call might be more useful. malloc is one such example: the number of malloc calls is not so important, but rather the size of the allocations. So rather than a flamegraph showing sample counts, a flamegraph can be generated that shows the bytes allocated.
Aside from mmap, malloc calls are also present in native executables generated by GraalVM. Let’s put bcc/eBPF into action and generate a flamegraph of the bytes allocated via malloc.
To do this, first re-generate the Quarkus native executable removing the debug info, which bcc/eBPF does not understand, and instead use frame pointers with local symbols to get the stack traces:
$ mvn package -DskipTests -Dnative \
-Dquarkus.native.additional-build-args=-H:-DeleteLocalSymbols,-H:+PreserveFramePointer
The mallocstacks.py bcc/eBPF script will be used to capture the malloc stack traces with their allocated sizes. This script, and other typical bcc/eBPF scripts (e.g. stackcount), need to be given a process ID (PID). This makes it a bit tricky when you want to trace startup, but you can use gdb (even if you haven’t enabled debug info) to get around this obstacle, because it allows you to stop the application at the first instruction. Let’s start by running the native executable via gdb:
$ gdb --args ./target/code-with-quarkus-1.0.0-SNAPSHOT-runner -Xmx128m
...
(No debugging symbols found in ./target/code-with-quarkus-1.0.0-SNAPSHOT-runner)
starti is a gdb command that sets a temporary breakpoint at the very first instruction of the program’s execution.
(gdb) starti
Starting program: <..>/code-with-quarkus/target/code-with-quarkus-1.0.0-SNAPSHOT-runner -Xmx128m
Program stopped.
0x00007ffff7fe4790 in _start () from /lib64/ld-linux-x86-64.so.2
Next, invoke the bcc/eBPF script giving it the PID of the Quarkus process, so that it can track malloc calls, capture stack traces, and dump them to a file for post-processing:
$ sudo ./mallocstacks.py -p $(pidof code-with-quarkus-1.0.0-SNAPSHOT-runner) -f > out.stacks
Then go back to the gdb shell and instruct it to continue the startup procedure after hitting the first instruction:
(gdb) continue
Continuing.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x7ffff65ff6c0 (LWP 3342)]
...
[New Thread 0x7fffc6ffd6c0 (LWP 3359)]
__ ____ __ _____ ___ __ ____ ______
--/ __ \/ / / / _ | / _ \/ //_/ / / / __/
-/ /_/ / /_/ / __ |/ , _/ ,< / /_/ /\ \
--\___\_\____/_/ |_/_/|_/_/|_|\____/___/
2023-02-09 18:02:32,794 INFO [io.quarkus] (main) code-with-quarkus 1.0.0-SNAPSHOT native (powered by Quarkus 2.16.1.Final) started in 0.011s. Listening on: http://0.0.0.0:8080
2023-02-09 18:02:32,794 INFO [io.quarkus] (main) Profile prod activated.
2023-02-09 18:02:32,794 INFO [io.quarkus] (main) Installed features: [cdi, rest, smallrye-context-propagation, vertx]
Once the startup has completed, press Ctrl-C in the shell running the bcc/eBPF script.
Then process the stacks file into a flamegraph. Note that the stacks generated by this script are already collapsed, so the flamegraph can be generated like this:
$ cat out.stacks | ${FG_HOME}/flamegraph.pl --color=mem --title="malloc bytes Flame Graph" --countname="bytes" > out.svg
The flamegraph produced should look something like this:
This shows that most of the memory requested using malloc comes from epoll in Java NIO, but the overall amount allocated via malloc is barely 268KB. This amount, 274,269 bytes, can be observed by hovering over all at the bottom of the flamegraph (you might need to ask the browser to open the flamegraph in a separate tab or window to see this). This is very small compared with the amount allocated for the heap with mmap.
Finally, a brief mention of other useful bcc/eBPF commands, and how to transform their output into flamegraphs:
$ sudo /usr/share/bcc/tools/stackcount -P -p $(pidof code-with-quarkus-1.0.0-SNAPSHOT-runner) \
-U "t:syscalls:sys_enter_m*" # count stacks for mmap and munmap
$ sudo /usr/share/bcc/tools/stackcount -P -p $(pidof code-with-quarkus-1.0.0-SNAPSHOT-runner) \
-U "c:*alloc" # count stacks for malloc, calloc and realloc
$ sudo /usr/share/bcc/tools/stackcount -P -p $(pidof code-with-quarkus-1.0.0-SNAPSHOT-runner) \
-U "c:free" # count stacks for free
$ sudo /usr/share/bcc/tools/stackcount -P -p $(pidof code-with-quarkus-1.0.0-SNAPSHOT-runner) \
-U "t:exceptions:page_fault_*" # count stacks for page faults
Stacks produced by stackcount need to be collapsed before they can be transformed into flamegraphs. For example:
${FG_HOME}/stackcollapse.pl < out.stacks | ${FG_HOME}/flamegraph.pl \
--color=mem --title="mmap munmap Flame Graph" --countname="calls" > out.svg
Native Image Tracing Agent Integration
Quarkus users who want to integrate new libraries/components into the native image process (e.g. smbj), or want to use JDK APIs that require extensive native image configuration to work (e.g. graphical user interfaces), face a considerable challenge coming up with the native image configuration that makes their use cases work. These users can tweak their applications to run in JVM mode with the native image agent in order to auto-generate native image configuration that will give them a head start in getting their applications to work as native executables.
The native image tracing agent is a JVM tool interface (JVMTI) agent, available in both GraalVM and Mandrel, that tracks all usages of dynamic features such as reflection, JNI, dynamic proxies, classpath resource access, etc., during an application’s regular JVM execution. When the JVM stops, it dumps the information on the dynamic features used during the run into a collection of native image configuration files that can be used in subsequent native image builds.
Using the agent and applying the generated data can be difficult for Quarkus users. First, the agent can be cumbersome because it requires the JVM arguments to be modified, and the generated configuration needs to be placed in a specific location so that subsequent native image builds pick it up. Second, the native image configuration produced contains a lot of superfluous configuration that the Quarkus integration already takes care of.
Native image tracing agent integration is included in Quarkus to make the agent easier to consume. In this section you will learn about the integration and how to apply it to your Quarkus application.
The integration is currently only available for Maven applications. Gradle integration will follow.
Integration Testing with the Tracing Agent
Quarkus users can now run JVM mode integration tests on Quarkus Maven applications transparently with the native image tracing agent. To do this, make sure a container runtime is available, because the JVM mode integration tests will run using the JVM within the default Mandrel builder container image. This image contains the agent libraries required to produce native image configuration, hence avoiding the need for a local Mandrel or GraalVM installation.
It is highly recommended to align the Mandrel version used in integration testing with the Mandrel version used to build native executables. Doing in-container native builds with the default Mandrel builder image is the safest way to keep both versions aligned.
Additionally, make sure that the native-image-agent goal is present in the quarkus-maven-plugin configuration:
<plugin>
<groupId>${quarkus.platform.group-id}</groupId>
<artifactId>quarkus-maven-plugin</artifactId>
...
<executions>
<execution>
<goals>
...
<goal>native-image-agent</goal>
</goals>
</execution>
</executions>
</plugin>
With a container runtime running, invoke Maven’s verify goal with -DskipITs=false -Dquarkus.test.integration-test-profile=test-with-native-agent to run the JVM mode integration tests and generate the native image configuration. For example:
$ ./mvnw verify -DskipITs=false -Dquarkus.test.integration-test-profile=test-with-native-agent
...
[INFO] --- failsafe:3.3.1:integration-test (default) @ new-project ---
...
[INFO] -------------------------------------------------------
[INFO] T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.acme.GreetingResourceIT
2024-05-14 16:29:53,941 INFO [io.qua.tes.com.DefaultDockerContainerLauncher] (main) Executing "podman run --name quarkus-integration-test-PodgW -i --rm --user 501:20 -p 8081:8081 -p 8444:8444 --entrypoint java -v /tmp/new-project/target:/project --env QUARKUS_LOG_CATEGORY__IO_QUARKUS__LEVEL=INFO --env QUARKUS_HTTP_PORT=8081 --env QUARKUS_HTTP_SSL_PORT=8444 --env TEST_URL=http://localhost:8081 --env QUARKUS_PROFILE=test-with-native-agent --env QUARKUS_TEST_INTEGRATION_TEST_PROFILE=test-with-native-agent quay.io/quarkus/ubi-quarkus-mandrel-builder-image:jdk-21 -agentlib:native-image-agent=access-filter-file=quarkus-access-filter.json,caller-filter-file=quarkus-caller-filter.json,config-output-dir=native-image-agent-base-config, -jar quarkus-app/quarkus-run.jar"
...
[INFO]
[INFO] --- quarkus:{quarkus-version}:native-image-agent (default) @ new-project ---
[INFO] Discovered native image agent generated files in /tmp/new-project/target/native-image-agent-final-config
[INFO]
...
When the Maven invocation completes, you can inspect the generated configuration in the target/native-image-agent-final-config folder:
$ cat ./target/native-image-agent-final-config/reflect-config.json
[
...
{
"name":"org.acme.Alice",
"methods":[{"name":"<init>","parameterTypes":[] }, {"name":"sayMyName","parameterTypes":[] }]
},
{
"name":"org.acme.Bob"
},
...
]
Informative By Default
By default, the generated native image configuration files are not used by subsequent native image builds. This precaution is taken to avoid situations where seemingly unrelated actions have unintended consequences on the native executable produced, e.g. disabling randomly failing tests.
Quarkus users are free to copy the files from the folder reported in the build, store them under source control, and evolve them as needed. Ideally, these files should be stored under the src/main/resources/META-INF/native-image/<group-id>/<artifact-id> folder, in which case the native image process will automatically pick them up.
If managing native image agent configuration files manually, it is highly recommended to regenerate them each time the Mandrel version is updated, because the configuration necessary to make the application work might have changed due to internal Mandrel changes.
It is possible to instruct Quarkus to apply the generated native image configuration files to subsequent native image processes by setting the -Dquarkus.native.agent-configuration-apply property. This can be useful to verify that the native integration tests work as expected, assuming that the JVM unit tests have generated the correct native image configuration. The typical workflow here is to first run the integration tests with the native image agent, as shown in the previous section:
$ ./mvnw verify -DskipITs=false -Dquarkus.test.integration-test-profile=test-with-native-agent
...
[INFO] --- quarkus:{quarkus-version}:native-image-agent (default) @ new-project ---
[INFO] Discovered native image agent generated files in /tmp/new-project/target/native-image-agent-final-config
And then request a native build, passing in the configuration-apply flag. A message during the native build process will indicate that the native image agent generated configuration files are being applied:
$ ./mvnw verify -Dnative -Dquarkus.native.agent-configuration-apply
...
[INFO] --- quarkus:{quarkus-version}:build (default) @ new-project ---
[INFO] [io.quarkus.deployment.pkg.steps.JarResultBuildStep] Building native image source jar: /tmp/new-project/target/new-project-1.0.0-SNAPSHOT-native-image-source-jar/new-project-1.0.0-SNAPSHOT-runner.jar
[INFO] [io.quarkus.deployment.steps.ApplyNativeImageAgentConfigStep] Applying native image agent generated files to current native executable build
Debugging the Tracing Agent Integration
If the generated native image agent configuration is not satisfactory, more information can be obtained using any of the following techniques:
Debugging Filters
Quarkus generates native image tracing agent configuration filters. These filters exclude commonly used packages for which Quarkus already applies the necessary configuration.
If the native image agent generates configuration that does not work as expected, check that the configuration files include the expected information. For example, if some method is accessed via reflection at runtime and you get an error, verify that the configuration files contain a reflection entry for that method.
If the entry is missing, it could be that some call path is being filtered out that maybe shouldn’t have been. To verify that, inspect the contents of the target/quarkus-caller-filter.json and target/quarkus-access-filter.json files, and confirm that the class and/or package making the call or being accessed is not being filtered out.
If the missing entry is related to some resource, inspect the Quarkus build debug output and verify which resource patterns are being discarded, e.g.:
$ ./mvnw -X verify -DskipITs=false -Dquarkus.test.integration-test-profile=test-with-native-agent
...
[INFO] --- quarkus:{quarkus-version}:native-image-agent (default) @ new-project ---
...
[DEBUG] Discarding resources from native image configuration that match the following regular expression: .*(application.properties|jakarta|jboss|logging.properties|microprofile|quarkus|slf4j|smallrye|vertx).*
[DEBUG] Discarded included resource with pattern: \\QMETA-INF/microprofile-config.properties\\E
[DEBUG] Discarded included resource with pattern: \\QMETA-INF/services/io.quarkus.arc.ComponentsProvider\\E
...
Tracing Agent Logging
The native image tracing agent can log the method invocations that result in the generated configuration to a JSON file. This can help in understanding why a configuration entry was generated. To enable this logging, add the -Dquarkus.test.native.agent.output.property.name=trace-output and -Dquarkus.test.native.agent.output.property.value=native-image-agent-trace-file.json system properties. For example:
$ ./mvnw verify -DskipITs=false \
-Dquarkus.test.integration-test-profile=test-with-native-agent \
-Dquarkus.test.native.agent.output.property.name=trace-output \
-Dquarkus.test.native.agent.output.property.value=native-image-agent-trace-file.json
When trace output is configured, no native image configuration is generated; instead, a target/native-image-agent-trace-file.json file is generated that contains the trace information. For example:
[
{"tracer":"meta", "event":"initialization", "version":"1"},
{"tracer":"meta", "event":"phase_change", "phase":"start"},
{"tracer":"jni", "function":"FindClass", "caller_class":"java.io.ObjectStreamClass", "result":true, "args":["java/lang/NoSuchMethodError"]},
...
{"tracer":"reflect", "function":"findConstructorHandle", "class":"io.vertx.core.impl.VertxImpl$1$1$$Lambda/0x000000f80125f4e8", "caller_class":"java.lang.invoke.InnerClassLambdaMetafactory", "result":true, "args":[["io.vertx.core.Handler"]]},
{"tracer":"meta", "event":"phase_change", "phase":"dead"},
{"tracer":"meta", "event":"phase_change", "phase":"unload"}
]
Unfortunately the trace output does not take into account the applied configuration filters, so the output contains all configuration decisions made by the agent. This is unlikely to change in the near future (see oracle/graal#7635).
Configuration With Origins (Experimental)
As an alternative to the trace output, it is possible to configure the native image agent with an experimental flag that shows the origins of the configuration entries. You can enable it with the following additional system property:
$ ./mvnw verify -DskipITs=false \
-Dquarkus.test.integration-test-profile=test-with-native-agent \
-Dquarkus.test.native.agent.additional.args=experimental-configuration-with-origins
The origins of the configuration entries can be found in text files inside the target/native-image-agent-base-config folder. For example:
$ cat target/native-image-agent-base-config/reflect-origins.txt
root
├── java.lang.Thread#run()
│ └── java.lang.Thread#runWith(java.lang.Object,java.lang.Runnable)
│ └── io.netty.util.concurrent.FastThreadLocalRunnable#run()
│ └── org.jboss.threads.ThreadLocalResettingRunnable#run()
│ └── org.jboss.threads.DelegatingRunnable#run()
│ └── org.jboss.threads.EnhancedQueueExecutor$ThreadBody#run()
│ └── org.jboss.threads.EnhancedQueueExecutor$Task#run()
│ └── org.jboss.threads.EnhancedQueueExecutor$Task#doRunWith(java.lang.Runnable,java.lang.Object)
│ └── io.quarkus.vertx.core.runtime.VertxCoreRecorder$14#runWith(java.lang.Runnable,java.lang.Object)
│ └── org.jboss.resteasy.reactive.common.core.AbstractResteasyReactiveContext#run()
│ └── io.quarkus.resteasy.reactive.server.runtime.QuarkusResteasyReactiveRequestContext#invokeHandler(int)
│ └── org.jboss.resteasy.reactive.server.handlers.InvocationHandler#handle(org.jboss.resteasy.reactive.server.core.ResteasyReactiveRequestContext)
│ └── org.acme.GreetingResource$quarkusrestinvoker$greeting_709ef95cd764548a2bbac83843a7f4cdd8077016#invoke(java.lang.Object,java.lang.Object[])
│ └── org.acme.GreetingResource#greeting(java.lang.String)
│ └── org.acme.GreetingService_ClientProxy#greeting(java.lang.String)
│ └── org.acme.GreetingService#greeting(java.lang.String)
│ ├── java.lang.Class#forName(java.lang.String) - [ { "name":"org.acme.Alice" }, { "name":"org.acme.Bob" } ]
│ ├── java.lang.Class#getDeclaredConstructor(java.lang.Class[]) - [ { "name":"org.acme.Alice", "methods":[{"name":"<init>","parameterTypes":[] }] } ]
│ ├── java.lang.reflect.Constructor#newInstance(java.lang.Object[]) - [ { "name":"org.acme.Alice", "methods":[{"name":"<init>","parameterTypes":[] }] } ]
│ ├── java.lang.reflect.Method#invoke(java.lang.Object,java.lang.Object[]) - [ { "name":"org.acme.Alice", "methods":[{"name":"sayMyName","parameterTypes":[] }] } ]
│ └── java.lang.Class#getMethod(java.lang.String,java.lang.Class[]) - [ { "name":"org.acme.Alice", "methods":[{"name":"sayMyName","parameterTypes":[] }] } ]
...
Debugging With GDB
The native image agent itself is a native executable produced with GraalVM that uses JVMTI to intercept the calls that require native image configuration. As a last resort, it is possible to debug the native image agent with GDB; see here for instructions on how to do that.
Inspecting and Debugging Native Executables
This debugging guide provides further details on debugging issues in Quarkus native executables that might arise during development or production.
It takes as input the application developed in the Getting Started Guide. You can find instructions on how to quickly set up this application in this guide.
Requirements and Assumptions
This debugging guide has the following requirements:
- JDK 17 installed, with JAVA_HOME configured appropriately
- Apache Maven {maven-version}
- A working container runtime (Docker, podman)
This guide builds and executes Quarkus native executables within a Linux environment. To offer a homogeneous experience across all environments, the guide relies on a container runtime environment to build and run the native executables. The instructions below use Docker as an example, but very similar commands should work on alternative container runtimes, e.g. podman.
Building native executables is an expensive process, so make sure the container runtime has enough CPU and memory to do this. A minimum of 4 CPUs and 4GB of memory is required.
Finally, this guide assumes the use of the Mandrel distribution of GraalVM for building native executables, and these are built within a container so there is no need for installing Mandrel on the host.
Bootstrapping the project
Start by creating a new Quarkus project. Open a terminal and run the following command:
For Linux & macOS users:
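A typical create invocation looks along these lines (the artifact id matches the one used throughout this guide; the group id is illustrative):

$ mvn io.quarkus.platform:quarkus-maven-plugin:{quarkus-version}:create \
    -DprojectGroupId=org.acme \
    -DprojectArtifactId=debugging-native
$ cd debugging-native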
For Windows users:
- If using cmd, don’t use the backslash (\) and put everything on the same line.
- If using Powershell, wrap -D parameters in double quotes, e.g. "-DprojectArtifactId=debugging-native".
Configure Quarkus properties
Some Quarkus configuration options will be used constantly throughout this debugging guide, so to help declutter command line invocations, it’s recommended to add these options to the application.properties file. Go ahead and add the following options to that file:
quarkus.native.container-build=true
quarkus.native.builder-image=quay.io/quarkus/ubi-quarkus-mandrel-builder-image:{mandrel-flavor}
quarkus.container-image.build=true
quarkus.container-image.group=test
First Debugging Steps
As a first step, change to the project directory and build the native executable for the application:
./mvnw package -DskipTests -Dnative
Run the application to verify it works as expected. In one terminal:
docker run -i --rm -p 8080:8080 test/debugging-native:1.0.0-SNAPSHOT
In another:
curl -w '\n' http://localhost:8080/hello
The rest of this section explores ways to build the native executable with extra information, but first, stop the running application. We can obtain this information while building the native executable by adding additional native-image build options using -Dquarkus.native.additional-build-args, e.g.:
./mvnw package -DskipTests -Dnative \
-Dquarkus.native.additional-build-args=--native-image-info
Executing that will produce additional output lines like this:
...
# Printing compilation-target information to: /project/reports/target_info_20220223_100915.txt
…
# Printing native-library information to: /project/reports/native_library_info_20220223_100925.txt
The target info file contains information such as the target platform, the toolchain used to compile the executable, and the C library in use:
$ cat target/*/reports/target_info_*.txt
Building image for target platform: org.graalvm.nativeimage.Platform$LINUX_AMD64
Using native toolchain:
Name: GNU project C and C++ compiler (gcc)
Vendor: redhat
Version: 8.5.0
Target architecture: x86_64
Path: /usr/bin/gcc
Using CLibrary: com.oracle.svm.core.posix.linux.libc.GLib
The native library info file contains information on the static libraries added to the binary and the other libraries dynamically linked to the executable:
$ cat target/*/reports/native_library_info_*.txt
Static libraries:
../opt/mandrel/lib/svm/clibraries/linux-amd64/liblibchelper.a
../opt/mandrel/lib/static/linux-amd64/glibc/libnet.a
../opt/mandrel/lib/static/linux-amd64/glibc/libextnet.a
../opt/mandrel/lib/static/linux-amd64/glibc/libnio.a
../opt/mandrel/lib/static/linux-amd64/glibc/libjava.a
../opt/mandrel/lib/static/linux-amd64/glibc/libfdlibm.a
../opt/mandrel/lib/static/linux-amd64/glibc/libsunec.a
../opt/mandrel/lib/static/linux-amd64/glibc/libzip.a
../opt/mandrel/lib/svm/clibraries/linux-amd64/libjvm.a
Other libraries: stdc++,pthread,dl,z,rt
Even more detail can be obtained by passing --verbose as an additional native-image build argument. This option can be very useful for detecting whether the options you pass at a high level via Quarkus are being passed down to the native executable production, or whether some third-party jar has native-image configuration embedded in it that is reaching the native-image invocation:
./mvnw package -DskipTests -Dnative \
-Dquarkus.native.additional-build-args=--verbose
Running with --verbose demonstrates how the native-image building process consists of two sequential Java processes:
- The first is a very short Java process that does some basic validation and builds the arguments for the second process (in a stock GraalVM distribution, this is executed as native code).
- The second Java process is where the main part of the native executable production happens. The --verbose option shows the actual Java process executed. You could take the output and run it yourself.
One may also combine multiple native build options by separating with a comma, e.g.:
./mvnw package -DskipTests -Dnative \
-Dquarkus.native.additional-build-args=--native-image-info,--verbose
Remember that if an argument for -Dquarkus.native.additional-build-args itself includes the , symbol, it needs to be escaped to be processed correctly, e.g. \\,.
Inspecting Native Executables
Given a native executable, various Linux tools can be used to inspect it. To support a variety of environments, inspections will be done from within a Linux container. Let’s create a Linux container image with all the tools required for this guide:
FROM fedora:35
RUN dnf install -y \
binutils \
gdb \
git \
perf \
perl-open
ENV FG_HOME /opt/FlameGraph
RUN git clone https://github.com/brendangregg/FlameGraph $FG_HOME
WORKDIR /data
ENTRYPOINT /bin/bash
Using docker in a non-Linux environment, you can create an image from this Dockerfile via:
docker build -t fedora-tools:v1 .
Then, go to the root of the project and run the Docker container we have just created:
docker run -t -i --rm -v ${PWD}:/data -p 8080:8080 fedora-tools:v1
ldd shows the shared library dependencies of an executable:
ldd ./target/debugging-native-1.0.0-SNAPSHOT-runner
strings can be used to look for text messages inside the binary:
strings ./target/debugging-native-1.0.0-SNAPSHOT-runner | grep Hello
Using strings, you can also get Mandrel information from the binary:
strings ./target/debugging-native-1.0.0-SNAPSHOT-runner | grep core.VM
Finally, using readelf, we can inspect the different sections of the binary. For example, we can see how the heap and text sections take up most of the binary:
readelf -SW ./target/debugging-native-1.0.0-SNAPSHOT-runner
Runtime containers produced by Quarkus to run native executables will not include the tools mentioned above. To explore a native executable within a runtime container, it’s best to run the container itself and then docker cp the executable locally. From there, you can either inspect the executable directly or use a tools container like the one above.
Native Reports
Optionally, the native build process can generate reports that show what goes into the binary:
./mvnw package -DskipTests -Dnative \
-Dquarkus.native.enable-reports
The reports will be created under target/debugging-native-1.0.0-SNAPSHOT-native-image-source-jar/reports/. These reports are some of the most useful resources when encountering issues with missing methods/classes, or with methods forbidden by Mandrel.
Call Tree Reports
The call_tree csv file reports are some of the default reports generated when the -Dquarkus.native.enable-reports option is passed in. These csv files can be imported into a graph database, such as Neo4j, to inspect them more easily and run queries against the call tree. This is useful for getting an approximation of why a method/class is included in the binary.
Let’s see this in action.
First, start a Neo4j instance:
export NEO_PASS=...
docker run \
--detach \
--rm \
--name testneo4j \
-p7474:7474 -p7687:7687 \
--env NEO4J_AUTH=neo4j/${NEO_PASS} \
neo4j:latest
Once the container is running, you can access the Neo4j browser. Use neo4j as the username and the value of NEO_PASS as the password to log in.
To import the CSV files, we need the following cypher script, which will import the data within the CSV files and create graph database nodes and edges:
CREATE CONSTRAINT unique_vm_id FOR (v:VM) REQUIRE v.vmId IS UNIQUE;
CREATE CONSTRAINT unique_method_id FOR (m:Method) REQUIRE m.methodId IS UNIQUE;
LOAD CSV WITH HEADERS FROM 'file:///reports/call_tree_vm.csv' AS row
MERGE (v:VM {vmId: row.Id, name: row.Name})
RETURN count(v);
LOAD CSV WITH HEADERS FROM 'file:///reports/call_tree_methods.csv' AS row
MERGE (m:Method {methodId: row.Id, name: row.Name, type: row.Type, parameters: row.Parameters, return: row.Return, display: row.Display})
RETURN count(m);
LOAD CSV WITH HEADERS FROM 'file:///reports/call_tree_virtual_methods.csv' AS row
MERGE (m:Method {methodId: row.Id, name: row.Name, type: row.Type, parameters: row.Parameters, return: row.Return, display: row.Display})
RETURN count(m);
LOAD CSV WITH HEADERS FROM 'file:///reports/call_tree_entry_points.csv' AS row
MATCH (m:Method {methodId: row.Id})
MATCH (v:VM {vmId: '0'})
MERGE (v)-[:ENTRY]->(m)
RETURN count(*);
LOAD CSV WITH HEADERS FROM 'file:///reports/call_tree_direct_edges.csv' AS row
MATCH (m1:Method {methodId: row.StartId})
MATCH (m2:Method {methodId: row.EndId})
MERGE (m1)-[:DIRECT {bci: row.BytecodeIndexes}]->(m2)
RETURN count(*);
LOAD CSV WITH HEADERS FROM 'file:///reports/call_tree_override_by_edges.csv' AS row
MATCH (m1:Method {methodId: row.StartId})
MATCH (m2:Method {methodId: row.EndId})
MERGE (m1)-[:OVERRIDEN_BY]->(m2)
RETURN count(*);
LOAD CSV WITH HEADERS FROM 'file:///reports/call_tree_virtual_edges.csv' AS row
MATCH (m1:Method {methodId: row.StartId})
MATCH (m2:Method {methodId: row.EndId})
MERGE (m1)-[:VIRTUAL {bci: row.BytecodeIndexes}]->(m2)
RETURN count(*);
Copy and paste the contents of the script into a file called import.cypher.
Mandrel 22.0.0 contains a bug where the symbolic links used by the import cypher file are not correctly set when generating reports within a container (for more details see here). This can be worked around by copying the following script into a file and executing it:
#!/usr/bin/env bash
set -e
project="debugging-native"
pushd target/*-native-image-source-jar/reports
rm -f call_tree_vm.csv
ln -s call_tree_vm_${project}-* call_tree_vm.csv
rm -f call_tree_direct_edges.csv
ln -s call_tree_direct_edges_${project}-* call_tree_direct_edges.csv
rm -f call_tree_entry_points.csv
ln -s call_tree_entry_points_${project}-* call_tree_entry_points.csv
rm -f call_tree_methods.csv
ln -s call_tree_methods_${project}-* call_tree_methods.csv
rm -f call_tree_virtual_edges.csv
ln -s call_tree_virtual_edges_${project}-* call_tree_virtual_edges.csv
rm -f call_tree_virtual_methods.csv
ln -s call_tree_virtual_methods_${project}-* call_tree_virtual_methods.csv
rm -f call_tree_override_by_edges.csv
ln -s call_tree_override_by_edges_${project}-* call_tree_override_by_edges.csv
popd
Next, copy the import cypher script and CSV files into Neo4j's import folder:
docker cp \
target/*-native-image-source-jar/reports \
testneo4j:/var/lib/neo4j/import
docker cp import.cypher testneo4j:/var/lib/neo4j
After copying all the files, invoke the import script:
docker exec testneo4j bin/cypher-shell -u neo4j -p ${NEO_PASS} -f import.cypher
Once the import completes (it shouldn't take more than a couple of minutes), go to the Neo4j browser, and you'll be able to observe a small summary of the data in the graph:
The data above shows that there are ~60000 methods, and just over 200000 edges between them. The Quarkus application demonstrated here is very basic, so there's not a lot we can explore, but here are some example queries you can run to explore the graph in more detail. Typically, you'd start by looking for a given method:
match (m:Method) where m.name = "hello" return *
From there, you can narrow down to a given method on a specific type:
match (m:Method) where m.name = "hello" and m.type =~ ".*GreetingResource" return *
Once you've located the node for the specific method you're after, a typical question you'd want answered is: why does this method get included in the call tree? To answer that, start from the target method and look for incoming connections at a given depth. For example, methods that directly call the target method can be located via:
match (m:Method) <- [*1..1] - (o) where m.name = "hello" and m.type =~ ".*GreetingResource" return *
Then you can look for direct calls at a depth of 2, i.e. you'd search for methods that call methods that call into the target method:
match (m:Method) <- [*1..2] - (o) where m.name = "hello" and m.type =~ ".*GreetingResource" return *
You can continue going up layers, but unfortunately if you reach a depth with too many nodes, the Neo4j browser will be unable to visualize them all. When that happens, you can alternatively run the queries directly against the cypher shell:
docker exec testneo4j bin/cypher-shell -u neo4j -p ${NEO_PASS} \
"match (m:Method) <- [*1..10] - (o) where m.name = 'hello' and m.type =~ '.*GreetingResource' return *"
For further information, check out this blog post, which explores the Quarkus Hibernate ORM quickstart using the techniques explained above.
Used Packages/Classes/Methods Reports
The used_packages, used_classes and used_methods text file reports come in handy when comparing different versions of the application, e.g. why does the image take longer to build? Or why is the image bigger now?
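For example, a minimal sketch of such a comparison, assuming you keep a copy of the used_classes report from a previous build (the exact report file names include the project name and a timestamp, so the globs below are illustrative):
# keep the report produced by the first build
cp target/*-native-image-source-jar/reports/used_classes_*.txt used_classes_before.txt
# change the application, rebuild with -Dquarkus.native.enable-reports, then compare
diff used_classes_before.txt target/*-native-image-source-jar/reports/used_classes_*.txt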
Further Reports
Mandrel can produce further reports beyond the ones enabled with the -Dquarkus.native.enable-reports option. These are called expert options, and you can learn more about them by running:
docker run quay.io/quarkus/ubi-quarkus-mandrel-builder-image:{mandrel-flavor} --expert-options-all
These expert options are not considered part of the GraalVM native image API, so they might change anytime.
To use these expert options, add them comma-separated to the -Dquarkus.native.additional-build-args parameter.
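For example, a hedged sketch passing two expert options that appear elsewhere in this guide (whether you need one or both depends on what you are investigating):
./mvnw package -DskipTests -Dnative \
    -Dquarkus.native.additional-build-args="-H:+PrintClassInitialization,-H:-DeleteLocalSymbols"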
Build-time vs Run-time Initialization
Quarkus instructs Mandrel to initialize as much as possible at build time, so that runtime startup can be as fast as possible. This is important in containerized environments where the startup speed has a big impact on how quickly an application is ready to do work. Build time initialization also minimizes the risk of runtime failures due to unsupported features becoming reachable through runtime initialization, thus making Quarkus more reliable.
The most common examples of build-time initialized code are static variables and blocks. Although Mandrel executes those at run time by default, Quarkus instructs Mandrel to run them at build time, for the reasons given above.
This means that any static variables initialized inline, or initialized in a static block, will keep the same value even if the application is restarted. This is different from the behaviour you would see when the application is executed as Java.
To see this in action with a very basic example, add a new TimestampResource to the application that looks like this:
package org.acme;

import jakarta.ws.rs.GET;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.Produces;
import jakarta.ws.rs.core.MediaType;

@Path("/timestamp")
public class TimestampResource {

    static long firstAccess = System.currentTimeMillis();

    @GET
    @Produces(MediaType.TEXT_PLAIN)
    public String timestamp() {
        return "First access " + firstAccess;
    }
}
Rebuild the binary using:
./mvnw package -DskipTests -Dnative
Run the application in one terminal (make sure you stop any other native executable container runs before executing this):
docker run -i --rm -p 8080:8080 test/debugging-native:1.0.0-SNAPSHOT
Send a GET request multiple times from another terminal:
curl -w '\n' http://localhost:8080/timestamp # run this multiple times
to see how the current time has been baked into the binary. This time was calculated when the binary was built, hence application restarts have no effect.
In some situations, build-time initialization can lead to errors when building native executables. One example is when a value computed at build time is forbidden from residing in the JVM heap that gets baked into the binary. To see this in action, add this REST resource:
package org.acme;

import javax.crypto.Cipher;
import javax.crypto.NoSuchPaddingException;
import jakarta.ws.rs.GET;
import jakarta.ws.rs.Path;
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.NoSuchAlgorithmException;

@Path("/encrypt-decrypt")
public class EncryptDecryptResource {

    static final KeyPairGenerator KEY_PAIR_GEN;
    static final Cipher CIPHER;

    static {
        try {
            KEY_PAIR_GEN = KeyPairGenerator.getInstance("RSA");
            KEY_PAIR_GEN.initialize(1024);
            CIPHER = Cipher.getInstance("RSA");
        } catch (NoSuchAlgorithmException | NoSuchPaddingException e) {
            throw new RuntimeException(e);
        }
    }

    @GET
    @Path("/{message}")
    public String encryptDecrypt(String message) throws Exception {
        KeyPair keyPair = KEY_PAIR_GEN.generateKeyPair();
        byte[] text = message.getBytes(StandardCharsets.UTF_8);

        // Encrypt with private key
        CIPHER.init(Cipher.ENCRYPT_MODE, keyPair.getPrivate());
        byte[] encrypted = CIPHER.doFinal(text);

        // Decrypt with public key
        CIPHER.init(Cipher.DECRYPT_MODE, keyPair.getPublic());
        byte[] unencrypted = CIPHER.doFinal(encrypted);

        return new String(unencrypted, StandardCharsets.UTF_8);
    }
}
When trying to rebuild the application, you'll encounter an error:
./mvnw package -DskipTests -Dnative
...
Error: Unsupported features in 2 methods
Detailed message:
Error: Detected an instance of Random/SplittableRandom class in the image heap. Instances created during image generation have cached seed values and don't behave as expected. To see how this object got instantiated use --trace-object-instantiation=java.security.SecureRandom. The object was probably created by a class initializer and is reachable from a static field. You can request class initialization at image runtime by using the option --initialize-at-run-time=<class-name>. Or you can write your own initialization methods and call them explicitly from your main entry point.
Trace: Object was reached by
reading field java.security.KeyPairGenerator$Delegate.initRandom of
constant java.security.KeyPairGenerator$Delegate@58b0fe1b reached by
reading field org.acme.EncryptDecryptResource.KEY_PAIR_GEN
Error: Detected an instance of Random/SplittableRandom class in the image heap. Instances created during image generation have cached seed values and don't behave as expected. To see how this object got instantiated use --trace-object-instantiation=java.security.SecureRandom. The object was probably created by a class initializer and is reachable from a static field. You can request class initialization at image runtime by using the option --initialize-at-run-time=<class-name>. Or you can write your own initialization methods and call them explicitly from your main entry point.
Trace: Object was reached by
reading field sun.security.rsa.RSAKeyPairGenerator.random of
constant sun.security.rsa.RSAKeyPairGenerator$Legacy@3248a092 reached by
reading field java.security.KeyPairGenerator$Delegate.spi of
constant java.security.KeyPairGenerator$Delegate@58b0fe1b reached by
reading field org.acme.EncryptDecryptResource.KEY_PAIR_GEN
So, what the message above is telling us is that our application caches a value that is supposed to be random as a constant. This is not desirable because something that's supposed to be random is no longer so, since the seed is baked into the image. The message above makes it quite clear what is causing this, but in other situations the cause might be more obscure. As a next step, we'll add some extra flags to the native executable generation to get more information.
As suggested by the message, let's start by adding an option to trace object instantiation:
./mvnw package -DskipTests -Dnative \
-Dquarkus.native.additional-build-args="--trace-object-instantiation=java.security.SecureRandom"
...
Error: Unsupported features in 2 methods
Detailed message:
Error: Detected an instance of Random/SplittableRandom class in the image heap. Instances created during image generation have cached seed values and don't behave as expected. Object has been initialized by the com.sun.jndi.dns.DnsClient class initializer with a trace:
at java.security.SecureRandom.<init>(SecureRandom.java:218)
at sun.security.jca.JCAUtil$CachedSecureRandomHolder.<clinit>(JCAUtil.java:59)
at sun.security.jca.JCAUtil.getSecureRandom(JCAUtil.java:69)
at com.sun.jndi.dns.DnsClient.<clinit>(DnsClient.java:82)
. Try avoiding to initialize the class that caused initialization of the object. The object was probably created by a class initializer and is reachable from a static field. You can request class initialization at image runtime by using the option --initialize-at-run-time=<class-name>. Or you can write your own initialization methods and call them explicitly from your main entry point.
Trace: Object was reached by
reading field java.security.KeyPairGenerator$Delegate.initRandom of
constant java.security.KeyPairGenerator$Delegate@4a5058f9 reached by
reading field org.acme.EncryptDecryptResource.KEY_PAIR_GEN
Error: Detected an instance of Random/SplittableRandom class in the image heap. Instances created during image generation have cached seed values and don't behave as expected. Object has been initialized by the com.sun.jndi.dns.DnsClient class initializer with a trace:
at java.security.SecureRandom.<init>(SecureRandom.java:218)
at sun.security.jca.JCAUtil$CachedSecureRandomHolder.<clinit>(JCAUtil.java:59)
at sun.security.jca.JCAUtil.getSecureRandom(JCAUtil.java:69)
at com.sun.jndi.dns.DnsClient.<clinit>(DnsClient.java:82)
. Try avoiding to initialize the class that caused initialization of the object. The object was probably created by a class initializer and is reachable from a static field. You can request class initialization at image runtime by using the option --initialize-at-run-time=<class-name>. Or you can write your own initialization methods and call them explicitly from your main entry point.
Trace: Object was reached by
reading field sun.security.rsa.RSAKeyPairGenerator.random of
constant sun.security.rsa.RSAKeyPairGenerator$Legacy@71880cf1 reached by
reading field java.security.KeyPairGenerator$Delegate.spi of
constant java.security.KeyPairGenerator$Delegate@4a5058f9 reached by
reading field org.acme.EncryptDecryptResource.KEY_PAIR_GEN
The error messages point to the code in the example, but it can be surprising that a reference to DnsClient appears. Why is that? The key is in what happens inside the KeyPairGenerator.initialize() method call. It uses JCAUtil.getSecureRandom(), which is why this is problematic, but sometimes the tracing options can show stack traces that do not represent what happens in reality. The best option is to dig through the source code, using the tracing output as guidance but not as the full truth.
Moving the KEY_PAIR_GEN.initialize(1024); call to the run-time executed method encryptDecrypt is enough to solve this particular issue. Rebuild the application and verify that the encrypt/decrypt endpoint works as expected by sending any message and checking that the reply is the same as the incoming message:
$ ./mvnw package -DskipTests -Dnative
...
$ docker run -i --rm -p 8080:8080 test/debugging-native:1.0.0-SNAPSHOT
...
$ curl -w '\n' http://localhost:8080/encrypt-decrypt/hellomandrel
hellomandrel
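For reference, here is a sketch of the resource with the fix applied; it is identical to the earlier listing except that the initialize() call has moved out of the static block into the request method:
package org.acme;

import javax.crypto.Cipher;
import javax.crypto.NoSuchPaddingException;
import jakarta.ws.rs.GET;
import jakarta.ws.rs.Path;
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.NoSuchAlgorithmException;

@Path("/encrypt-decrypt")
public class EncryptDecryptResource {

    static final KeyPairGenerator KEY_PAIR_GEN;
    static final Cipher CIPHER;

    static {
        try {
            KEY_PAIR_GEN = KeyPairGenerator.getInstance("RSA");
            // initialize(1024) used to be called here; doing so at build time
            // pulled a SecureRandom instance into the image heap.
            CIPHER = Cipher.getInstance("RSA");
        } catch (NoSuchAlgorithmException | NoSuchPaddingException e) {
            throw new RuntimeException(e);
        }
    }

    @GET
    @Path("/{message}")
    public String encryptDecrypt(String message) throws Exception {
        // Moved here so that it runs at run time instead of build time.
        KEY_PAIR_GEN.initialize(1024);
        KeyPair keyPair = KEY_PAIR_GEN.generateKeyPair();
        byte[] text = message.getBytes(StandardCharsets.UTF_8);

        // Encrypt with private key
        CIPHER.init(Cipher.ENCRYPT_MODE, keyPair.getPrivate());
        byte[] encrypted = CIPHER.doFinal(text);

        // Decrypt with public key
        CIPHER.init(Cipher.DECRYPT_MODE, keyPair.getPublic());
        byte[] unencrypted = CIPHER.doFinal(encrypted);

        return new String(unencrypted, StandardCharsets.UTF_8);
    }
}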
Additional information on which classes are initialized, and why, can be obtained by passing the -H:+PrintClassInitialization flag via -Dquarkus.native.additional-build-args, as shown below.
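A minimal example invocation; depending on the Mandrel version, the class initialization details end up in the build output or under the reports directory:
./mvnw package -DskipTests -Dnative \
    -Dquarkus.native.additional-build-args=-H:+PrintClassInitialization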
Profile Runtime Behaviour
Single Thread
In this exercise, we profile the runtime behaviour of a Quarkus application that was compiled to a native executable to determine where the bottleneck is. Assume that you're in a scenario where profiling the pure Java version is not possible, maybe because the issue only occurs with the native version of the application.
Add a REST resource with the following code (example courtesy of Andrei Pangin's Java Profiling presentation):
package org.acme;

import jakarta.ws.rs.GET;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.Produces;
import jakarta.ws.rs.core.MediaType;

@Path("/string-builder")
public class StringBuilderResource {

    @GET
    @Produces(MediaType.TEXT_PLAIN)
    public String appendDelete() {
        StringBuilder sb = new StringBuilder();
        sb.append(new char[1_000_000]);
        do {
            sb.append(12345);
            sb.delete(0, 5);
        } while (Thread.currentThread().isAlive());
        return "Never happens";
    }
}
Recompile the application, rebuild the binary and run it. Attempting a simple curl will never complete, as expected:
$ ./mvnw package -DskipTests -Dnative
...
$ docker run -i --rm -p 8080:8080 test/debugging-native:1.0.0-SNAPSHOT
...
$ curl http://localhost:8080/string-builder # this will never complete
However, the question we're trying to answer here is: what is the bottleneck of such code? Is it appending the characters? Is it deleting them? Is it checking whether the thread is alive?
Since we're dealing with a Linux native executable, we can use tools like perf directly. To use perf, go to the root of the project and start the tools container created earlier as a privileged user:
docker run --privileged -t -i --rm -v ${PWD}:/data -p 8080:8080 fedora-tools:v1
Once the container is running, you need to ensure that the kernel is ready for the profiling exercises:
echo -1 | sudo tee /proc/sys/kernel/perf_event_paranoid
echo 0 | sudo tee /proc/sys/kernel/kptr_restrict
The kernel modifications above also apply to Linux virtual machines. If running on a bare metal Linux machine, tweaking only perf_event_paranoid is enough.
Then, from inside the tools container, execute:
perf record -F 1009 -g -a ./target/debugging-native-1.0.0-SNAPSHOT-runner
While perf record is running, open another window and access the endpoint:
curl http://localhost:8080/string-builder # this will never complete
After a few seconds, halt the perf record process. This will generate a perf.data file. We could use perf report to inspect the perf data, but you can often get a better picture by showing that data as a flame graph. To generate flame graphs, we will use the FlameGraph GitHub repository, which has already been installed inside the tools container.
Next, generate a flame graph using the data captured via perf record:
perf script -i perf.data | ${FG_HOME}/stackcollapse-perf.pl | ${FG_HOME}/flamegraph.pl > flamegraph.svg
The flame graph is an svg file that a web browser, such as Firefox, can easily display. After the above two commands complete, you can open flamegraph.svg in your browser:
We see a big majority of time spent in what is supposed to be our main, but we see no trace of the StringBuilderResource class, nor of the StringBuilder class we're calling. We should look at the symbol table of the binary: can we find symbols for our class and StringBuilder? We need those in order to get meaningful data. From within the tools container, query the symbol table:
objdump -t ./target/debugging-native-1.0.0-SNAPSHOT-runner | grep StringBuilder
[no output]
No output appears when querying the symbol table. This is why we don't see any call graphs in the flame graphs. This is a deliberate decision made by native-image: by default, it removes symbols from the binary.
To regain the symbols, we need to rebuild the binary, instructing GraalVM not to delete them. On top of that, enable DWARF debug info so that the stack traces can be populated with that information. From outside the tools container, execute:
./mvnw package -DskipTests -Dnative \
-Dquarkus.native.debug.enabled \
-Dquarkus.native.additional-build-args=-H:-DeleteLocalSymbols
Next, re-enter the tools container if you exited it, and inspect the native executable with objdump to see that the symbols are now present:
$ objdump -t ./target/debugging-native-1.0.0-SNAPSHOT-runner | grep StringBuilder
000000000050a940 l F .text 0000000000000091 .hidden ReflectionAccessorHolder_StringBuilderResource_appendDelete_9e06d4817d0208a0cce97ebcc0952534cac45a19_e22addf7d3eaa3ad14013ce01941dc25beba7621
000000000050a9e0 l F .text 00000000000000bb .hidden ReflectionAccessorHolder_StringBuilderResource_constructor_0f8140ea801718b80c05b979a515d8a67b8f3208_12baae06bcd6a1ef9432189004ae4e4e176dd5a4
...
You should see a long list of symbols that match that pattern.
Then, run the executable through perf, indicating that the call graph is dwarf:
perf record -F 1009 --call-graph dwarf -a ./target/debugging-native-1.0.0-SNAPSHOT-runner
Run the curl command once again, stop the binary, generate the flame graph and open it:
perf script -i perf.data | ${FG_HOME}/stackcollapse-perf.pl | ${FG_HOME}/flamegraph.pl > flamegraph.svg
The flame graph now shows where the bottleneck is. It's when StringBuilder.delete() is called, which calls System.arraycopy(). The issue is that 1 million characters need to be shifted in very small increments:
Multi-Thread
Multithreaded programs might require special attention when trying to understand their runtime behaviour. To demonstrate this, add this MulticastResource code to your project (example courtesy of Andrei Pangin's Java Profiling presentation):
package org.acme;

import jakarta.ws.rs.GET;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.Produces;
import jakarta.ws.rs.core.MediaType;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.DatagramChannel;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

@Path("/multicast")
public class MulticastResource {

    @GET
    @Produces(MediaType.TEXT_PLAIN)
    public String send() throws Exception {
        sendMulticasts();
        return "Multicast packets sent";
    }

    static void sendMulticasts() throws Exception {
        DatagramChannel ch = DatagramChannel.open();
        ch.bind(new InetSocketAddress(5555));
        ch.configureBlocking(false);

        ExecutorService pool =
            Executors.newCachedThreadPool(new ShortNameThreadFactory());
        for (int i = 0; i < 10; i++) {
            pool.submit(() -> {
                final ByteBuffer buf = ByteBuffer.allocateDirect(1000);
                final InetSocketAddress remoteAddr =
                    new InetSocketAddress("127.0.0.1", 5556);

                while (true) {
                    buf.clear();
                    ch.send(buf, remoteAddr);
                }
            });
        }

        System.out.println("Warming up...");
        Thread.sleep(3000);

        System.out.println("Benchmarking...");
        Thread.sleep(5000);
    }

    private static final class ShortNameThreadFactory implements ThreadFactory {

        private final AtomicInteger threadNumber = new AtomicInteger(1);
        private final String namePrefix = "thread-";

        public Thread newThread(Runnable r) {
            return new Thread(r, namePrefix + threadNumber.getAndIncrement());
        }
    }
}
Build the native executable with debug info:
./mvnw package -DskipTests -Dnative \
-Dquarkus.native.debug.enabled \
-Dquarkus.native.additional-build-args=-H:-DeleteLocalSymbols
From inside the tools container (as a privileged user), run the native executable through perf:
perf record -F 1009 --call-graph dwarf -a ./target/debugging-native-1.0.0-SNAPSHOT-runner
Invoke the endpoint to send the multicast packets:
curl -w '\n' http://localhost:8080/multicast
Make and open a flamegraph:
perf script -i perf.data | ${FG_HOME}/stackcollapse-perf.pl | ${FG_HOME}/flamegraph.pl > flamegraph.svg
The flame graph produced looks odd. Each thread is treated independently, even though they all do the same work. This makes it difficult to get a clear picture of the bottlenecks in the program.
This is happening because, from a perf perspective, each thread is a different command. We can see that if we inspect perf report:
perf report --stdio
# Children Self Command Shared Object Symbol
# ........ ........ ............... ...................................... ......................................................................................
...
6.95% 0.03% thread-2 debugging-native-1.0.0-SNAPSHOT-runner [.] MulticastResource_lambda$sendMulticasts$0_cb1f7b5dcaed7dd4e3f90d18bad517d67eae4d88
...
4.60% 0.02% thread-10 debugging-native-1.0.0-SNAPSHOT-runner [.] MulticastResource_lambda$sendMulticasts$0_cb1f7b5dcaed7dd4e3f90d18bad517d67eae4d88
...
This can be worked around by applying some modifications to the perf output, in order to make all threads have the same name, e.g.:
perf script | sed -E "s/thread-[0-9]*/thread/" | ${FG_HOME}/stackcollapse-perf.pl | ${FG_HOME}/flamegraph.pl > flamegraph.svg
When you open the flame graph, you will see all threads' work collapsed into a single area. Then, you can clearly see that there's some locking that could affect performance.
Debugging Native Crashes
One of the drawbacks of using native executables is that they cannot be debugged using the standard Java debuggers; instead we need to debug them using gdb, the GNU Project debugger. To demonstrate how to do this, we are going to generate a native Quarkus application that crashes due to a Segmentation Fault when accessing http://localhost:8080/crash. To achieve this, add the following REST resource to the project:
package org.acme;

import sun.misc.Unsafe;
import jakarta.ws.rs.GET;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.Produces;
import jakarta.ws.rs.core.MediaType;
import java.lang.reflect.Field;

@Path("/crash")
public class CrashResource {

    @GET
    @Produces(MediaType.TEXT_PLAIN)
    public String hello() {
        Field theUnsafe = null;
        try {
            theUnsafe = Unsafe.class.getDeclaredField("theUnsafe");
            theUnsafe.setAccessible(true);
            Unsafe unsafe = (Unsafe) theUnsafe.get(null);
            unsafe.copyMemory(0, 128, 256);
        } catch (NoSuchFieldException | IllegalAccessException e) {
            e.printStackTrace();
        }
        return "Never happens";
    }
}
This code will try to copy 256 bytes from address 0x0 to 0x80, resulting in a Segmentation Fault. To verify this, compile and run the example application:
$ ./mvnw package -DskipTests -Dnative
...
$ docker run -i --rm -p 8080:8080 test/debugging-native:1.0.0-SNAPSHOT
...
$ curl http://localhost:8080/crash
This will result in the following output:
$ docker run -i --rm -p 8080:8080 test/debugging-native:1.0.0-SNAPSHOT
...
Segfault detected, aborting process. Use runtime option -R:-InstallSegfaultHandler if you don't want to use SubstrateSegfaultHandler.
...
The omitted output above contains clues to what caused the issue, but in this exercise we are going to assume that no information was provided. Let's try to debug the segmentation fault using gdb. To do that, go to the root of the project and enter the tools container:
docker run -t -i --rm -v ${PWD}:/data -p 8080:8080 fedora-tools:v1 /bin/bash
Then start the application in gdb and execute run:
gdb ./target/debugging-native-1.0.0-SNAPSHOT-runner
...
Reading symbols from ./target/debugging-native-1.0.0-SNAPSHOT-runner...
(No debugging symbols found in ./target/debugging-native-1.0.0-SNAPSHOT-runner)
(gdb) run
Starting program: /data/target/debugging-native-1.0.0-SNAPSHOT-runner
Next, try to access http://localhost:8080/crash:
curl http://localhost:8080/crash
This will result in the following message in gdb:
Thread 4 "ecutor-thread-0" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fe103dff640 (LWP 190)]
0x0000000000461f6e in ?? ()
If we try to get more info about the backtrace that led to this crash, we will see that there is not enough information available.
(gdb) bt
#0 0x0000000000418b5e in ?? ()
#1 0x00007ffff6f2d328 in ?? ()
#2 0x0000000000418a04 in ?? ()
#3 0x00007ffff44062a0 in ?? ()
#4 0x00000000010c3dd3 in ?? ()
#5 0x0000000000000100 in ?? ()
#6 0x0000000000000000 in ?? ()
This is because we didn't compile the Quarkus application with -Dquarkus.native.debug.enabled, so gdb cannot find debugging symbols for our native executable, as indicated by the "No debugging symbols found in ./target/debugging-native-1.0.0-SNAPSHOT-runner" message at the beginning of gdb.
Recompiling the Quarkus application with -Dquarkus.native.debug.enabled and rerunning it through gdb, we are now able to get a backtrace making clear what caused the crash. On top of that, add the -H:-OmitInlinedMethodDebugLineInfo option to avoid inlined methods being omitted from the backtrace:
./mvnw package -DskipTests -Dnative \
-Dquarkus.native.debug.enabled \
-Dquarkus.native.additional-build-args=-H:-OmitInlinedMethodDebugLineInfo
...
$ gdb ./target/debugging-native-1.0.0-SNAPSHOT-runner
Reading symbols from ./target/debugging-native-1.0.0-SNAPSHOT-runner...
(gdb) run
Starting program: /data/target/debugging-native-1.0.0-SNAPSHOT-runner
...
$ curl http://localhost:8080/crash
This will result in the following message in gdb:
Thread 4 "ecutor-thread-0" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffeffff640 (LWP 362984)]
com.oracle.svm.core.UnmanagedMemoryUtil::copyLongsBackward(org.graalvm.word.Pointer *, org.graalvm.word.Pointer *, org.graalvm.word.UnsignedWord *) ()
at com/oracle/svm/core/UnmanagedMemoryUtil.java:169
169 com/oracle/svm/core/UnmanagedMemoryUtil.java: No such file or directory.
We already see that gdb is able to tell us which method caused the crash and where it's located in the source code. We can also get a backtrace of the call graph that led us to this state:
(gdb) bt
#0 com.oracle.svm.core.UnmanagedMemoryUtil::copyLongsBackward(org.graalvm.word.Pointer *, org.graalvm.word.Pointer *, org.graalvm.word.UnsignedWord *) () at com/oracle/svm/core/UnmanagedMemoryUtil.java:169
#1 0x0000000000461e14 in com.oracle.svm.core.UnmanagedMemoryUtil::copyBackward(org.graalvm.word.Pointer *, org.graalvm.word.Pointer *, org.graalvm.word.UnsignedWord *) () at com/oracle/svm/core/UnmanagedMemoryUtil.java:110
#2 0x0000000000461dc8 in com.oracle.svm.core.UnmanagedMemoryUtil::copy(org.graalvm.word.Pointer *, org.graalvm.word.Pointer *, org.graalvm.word.UnsignedWord *) () at com/oracle/svm/core/UnmanagedMemoryUtil.java:67
#3 0x000000000045d3c0 in com.oracle.svm.core.JavaMemoryUtil::unsafeCopyMemory(java.lang.Object *, long, java.lang.Object *, long, long) () at com/oracle/svm/core/JavaMemoryUtil.java:276
#4 0x00000000013277de in jdk.internal.misc.Unsafe::copyMemory0 () at com/oracle/svm/core/jdk/SunMiscSubstitutions.java:125
#5 jdk.internal.misc.Unsafe::copyMemory(java.lang.Object *, long, java.lang.Object *, long, long) () at jdk/internal/misc/Unsafe.java:788
#6 0x00000000013b1a3f in jdk.internal.misc.Unsafe::copyMemory () at jdk/internal/misc/Unsafe.java:799
#7 sun.misc.Unsafe::copyMemory () at sun/misc/Unsafe.java:585
#8 org.acme.CrashResource::hello(void) () at org/acme/CrashResource.java:22
Similarly, we can get a backtrace of the call graph of other threads.
-
First, we can list the available threads with:
(gdb) info threads
  Id   Target Id                                         Frame
  1    Thread 0x7fcc62a07d00 (LWP 322) "debugging-nativ" 0x00007fcc62b8b77a in __futex_abstimed_wait_common () from /lib64/libc.so.6
  2    Thread 0x7fcc60eff640 (LWP 326) "gnal Dispatcher" 0x00007fcc62b8b77a in __futex_abstimed_wait_common () from /lib64/libc.so.6
* 4    Thread 0x7fcc5b7fe640 (LWP 328) "ecutor-thread-0" com.oracle.svm.core.UnmanagedMemoryUtil::copyLongsBackward(org.graalvm.word.Pointer *, org.graalvm.word.Pointer *, org.graalvm.word.UnsignedWord *) () at com/oracle/svm/core/UnmanagedMemoryUtil.java:169
  5    Thread 0x7fcc5abff640 (LWP 329) "-thread-checker" 0x00007fcc62b8b77a in __futex_abstimed_wait_common () from /lib64/libc.so.6
  6    Thread 0x7fcc59dff640 (LWP 330) "ntloop-thread-0" 0x00007fcc62c12c9e in epoll_wait () from /lib64/libc.so.6
...
-
select the thread we want to inspect, e.g. thread 1:
(gdb) thread 1
[Switching to thread 1 (Thread 0x7ffff7a58d00 (LWP 1028851))]
#0  __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x2cd7adc) at futex-internal.c:57
57        return INTERNAL_SYSCALL_CANCEL (futex_time64, futex_word, op, expected,
-
and, finally, print the stack trace:
(gdb) bt
#0  __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x2cd7adc) at futex-internal.c:57
#1  __futex_abstimed_wait_common (futex_word=futex_word@entry=0x2cd7adc, expected=expected@entry=0, clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=private@entry=0, cancel=cancel@entry=true) at futex-internal.c:87
#2  0x00007ffff7bdd79f in __GI___futex_abstimed_wait_cancelable64 (futex_word=futex_word@entry=0x2cd7adc, expected=expected@entry=0, clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=private@entry=0) at futex-internal.c:139
#3  0x00007ffff7bdfeb0 in __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x2ca07b0, cond=0x2cd7ab0) at pthread_cond_wait.c:504
#4  ___pthread_cond_wait (cond=0x2cd7ab0, mutex=0x2ca07b0) at pthread_cond_wait.c:619
#5  0x00000000004e2014 in com.oracle.svm.core.posix.headers.Pthread::pthread_cond_wait () at com/oracle/svm/core/posix/thread/PosixJavaThreads.java:252
#6  com.oracle.svm.core.posix.thread.PosixParkEvent::condWait(void) () at com/oracle/svm/core/posix/thread/PosixJavaThreads.java:252
#7  0x0000000000547070 in com.oracle.svm.core.thread.JavaThreads::park(void) () at com/oracle/svm/core/thread/JavaThreads.java:764
#8  0x0000000000fc5f44 in jdk.internal.misc.Unsafe::park(boolean, long) () at com/oracle/svm/core/thread/Target_jdk_internal_misc_Unsafe_JavaThreads.java:49
#9  0x0000000000eac1ad in java.util.concurrent.locks.LockSupport::park(java.lang.Object *) () at java/util/concurrent/locks/LockSupport.java:194
#10 0x0000000000ea5d68 in java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject::awaitUninterruptibly(void) () at java/util/concurrent/locks/AbstractQueuedSynchronizer.java:2018
#11 0x00000000008b6b30 in io.quarkus.runtime.ApplicationLifecycleManager::run(io.quarkus.runtime.Application *, java.lang.Class *, java.util.function.BiConsumer *, java.lang.String[] *) () at io/quarkus/runtime/ApplicationLifecycleManager.java:144
#12 0x00000000008bc055 in io.quarkus.runtime.Quarkus::run(java.lang.Class *, java.util.function.BiConsumer *, java.lang.String[] *) () at io/quarkus/runtime/Quarkus.java:67
#13 0x000000000045c88b in io.quarkus.runtime.Quarkus::run () at io/quarkus/runtime/Quarkus.java:41
#14 io.quarkus.runtime.Quarkus::run () at io/quarkus/runtime/Quarkus.java:120
#15 0x000000000045c88b in io.quarkus.runner.GeneratedMain::main ()
#16 com.oracle.svm.core.JavaMainWrapper::runCore () at com/oracle/svm/core/JavaMainWrapper.java:150
#17 com.oracle.svm.core.JavaMainWrapper::run(int, org.graalvm.nativeimage.c.type.CCharPointerPointer *) () at com/oracle/svm/core/JavaMainWrapper.java:186
#18 0x000000000048084d in com.oracle.svm.core.code.IsolateEnterStub::JavaMainWrapper_run_5087f5482cc9a6abc971913ece43acb471d2631b(int, org.graalvm.nativeimage.c.type.CCharPointerPointer *) () at com/oracle/svm/core/JavaMainWrapper.java:280
Alternatively, we can list the backtraces of all threads with a single command:
(gdb) thread apply all backtrace
Thread 22 (Thread 0x7fffc8dff640 (LWP 1028872) "tloop-thread-15"):
#0 0x00007ffff7c64c2e in epoll_wait (epfd=8, events=0x2ca3880, maxevents=1024, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
#1 0x000000000166e01c in Java_sun_nio_ch_EPoll_wait ()
#2 0x00000000011bfece in sun.nio.ch.EPoll::wait(int, long, int, int) () at com/oracle/svm/core/stack/JavaFrameAnchors.java:42
#3 0x00000000011c08d2 in sun.nio.ch.EPollSelectorImpl::doSelect(java.util.function.Consumer *, long) () at sun/nio/ch/EPollSelectorImpl.java:120
#4 0x00000000011d8977 in sun.nio.ch.SelectorImpl::lockAndDoSelect(java.util.function.Consumer *, long) () at sun/nio/ch/SelectorImpl.java:124
#5 0x0000000000705720 in sun.nio.ch.SelectorImpl::select () at sun/nio/ch/SelectorImpl.java:141
#6 io.netty.channel.nio.SelectedSelectionKeySetSelector::select(void) () at io/netty/channel/nio/SelectedSelectionKeySetSelector.java:68
#7 0x0000000000703c2e in io.netty.channel.nio.NioEventLoop::select(long) () at io/netty/channel/nio/NioEventLoop.java:813
#8 0x0000000000701a5f in io.netty.channel.nio.NioEventLoop::run(void) () at io/netty/channel/nio/NioEventLoop.java:460
#9 0x00000000008496df in io.netty.util.concurrent.SingleThreadEventExecutor$4::run(void) () at io/netty/util/concurrent/SingleThreadEventExecutor.java:986
#10 0x0000000000860762 in io.netty.util.internal.ThreadExecutorMap$2::run(void) () at io/netty/util/internal/ThreadExecutorMap.java:74
#11 0x0000000000840da4 in io.netty.util.concurrent.FastThreadLocalRunnable::run(void) () at io/netty/util/concurrent/FastThreadLocalRunnable.java:30
#12 0x0000000000b7dd04 in java.lang.Thread::run(void) () at java/lang/Thread.java:829
#13 0x0000000000547dcc in com.oracle.svm.core.thread.JavaThreads::threadStartRoutine(org.graalvm.nativeimage.ObjectHandle *) () at com/oracle/svm/core/thread/JavaThreads.java:597
#14 0x00000000004e15b1 in com.oracle.svm.core.posix.thread.PosixJavaThreads::pthreadStartRoutine(com.oracle.svm.core.thread.JavaThreads$ThreadStartData *) () at com/oracle/svm/core/posix/thread/PosixJavaThreads.java:194
#15 0x0000000000480984 in com.oracle.svm.core.code.IsolateEnterStub::PosixJavaThreads_pthreadStartRoutine_e1f4a8c0039f8337338252cd8734f63a79b5e3df(com.oracle.svm.core.thread.JavaThreads$ThreadStartData *) () at com/oracle/svm/core/posix/thread/PosixJavaThreads.java:182
#16 0x00007ffff7be0b1a in start_thread (arg=<optimized out>) at pthread_create.c:443
#17 0x00007ffff7c65650 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
Thread 21 (Thread 0x7fffc97fa640 (LWP 1028871) "tloop-thread-14"):
#0 0x00007ffff7c64c2e in epoll_wait (epfd=53, events=0x2cd0970, maxevents=1024, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
#1 0x000000000166e01c in Java_sun_nio_ch_EPoll_wait ()
#2 0x00000000011bfece in sun.nio.ch.EPoll::wait(int, long, int, int) () at com/oracle/svm/core/stack/JavaFrameAnchors.java:42
#3 0x00000000011c08d2 in sun.nio.ch.EPollSelectorImpl::doSelect(java.util.function.Consumer *, long) () at sun/nio/ch/EPollSelectorImpl.java:120
#4 0x00000000011d8977 in sun.nio.ch.SelectorImpl::lockAndDoSelect(java.util.function.Consumer *, long) () at sun/nio/ch/SelectorImpl.java:124
#5 0x0000000000705720 in sun.nio.ch.SelectorImpl::select () at sun/nio/ch/SelectorImpl.java:141
#6 io.netty.channel.nio.SelectedSelectionKeySetSelector::select(void) () at io/netty/channel/nio/SelectedSelectionKeySetSelector.java:68
#7 0x0000000000703c2e in io.netty.channel.nio.NioEventLoop::select(long) () at io/netty/channel/nio/NioEventLoop.java:813
#8 0x0000000000701a5f in io.netty.channel.nio.NioEventLoop::run(void) () at io/netty/channel/nio/NioEventLoop.java:460
#9 0x00000000008496df in io.netty.util.concurrent.SingleThreadEventExecutor$4::run(void) () at io/netty/util/concurrent/SingleThreadEventExecutor.java:986
#10 0x0000000000860762 in io.netty.util.internal.ThreadExecutorMap$2::run(void) () at io/netty/util/internal/ThreadExecutorMap.java:74
#11 0x0000000000840da4 in io.netty.util.concurrent.FastThreadLocalRunnable::run(void) () at io/netty/util/concurrent/FastThreadLocalRunnable.java:30
#12 0x0000000000b7dd04 in java.lang.Thread::run(void) () at java/lang/Thread.java:829
#13 0x0000000000547dcc in com.oracle.svm.core.thread.JavaThreads::threadStartRoutine(org.graalvm.nativeimage.ObjectHandle *) () at com/oracle/svm/core/thread/JavaThreads.java:597
#14 0x00000000004e15b1 in com.oracle.svm.core.posix.thread.PosixJavaThreads::pthreadStartRoutine(com.oracle.svm.core.thread.JavaThreads$ThreadStartData *) () at com/oracle/svm/core/posix/thread/PosixJavaThreads.java:194
#15 0x0000000000480984 in com.oracle.svm.core.code.IsolateEnterStub::PosixJavaThreads_pthreadStartRoutine_e1f4a8c0039f8337338252cd8734f63a79b5e3df(com.oracle.svm.core.thread.JavaThreads$ThreadStartData *) () at com/oracle/svm/core/posix/thread/PosixJavaThreads.java:182
#16 0x00007ffff7be0b1a in start_thread (arg=<optimized out>) at pthread_create.c:443
#17 0x00007ffff7c65650 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
Thread 20 (Thread 0x7fffc9ffb640 (LWP 1028870) "tloop-thread-13"):
...
Note, however, that despite being able to get a backtrace, we still cannot list the source code at this point with the list command.
(gdb) list
164 in com/oracle/svm/core/UnmanagedMemoryUtil.java
This is because gdb is not aware of the location of the source files; we are running the executable outside the target directory. To fix this, we can either rerun gdb from the target directory, or run directory target/debugging-native-1.0.0-SNAPSHOT-native-image-source-jar/sources, e.g.:
(gdb) directory target/debugging-native-1.0.0-SNAPSHOT-native-image-source-jar/sources
Source directories searched: /data/target/debugging-native-1.0.0-SNAPSHOT-native-image-source-jar/sources:$cdir:$cwd
(gdb) list
164 UnsignedWord offset = size;
165 while (offset.aboveOrEqual(32)) {
166 offset = offset.subtract(32);
167 Pointer src = from.add(offset);
168 Pointer dst = to.add(offset);
169 long l24 = src.readLong(24);
170 long l16 = src.readLong(16);
171 long l8 = src.readLong(8);
172 long l0 = src.readLong(0);
173 dst.writeLong(24, l24);
We can now examine line 169 and get a first hint of what might be wrong (in this case we see that it fails at the first read from src, which contains the address 0x0000), or walk up the stack using gdb's up command to see what part of our code led to this situation. For more information about using gdb to debug native executables, see the GraalVM Debug Info Feature guide.
Frequently Asked Questions
Why is the process of generating a native executable slow?
Native executable generation is a multi-step process. The analysis and compile steps are the most expensive of all and hence the ones that dominate the time spent generating the native executable.
In the analysis phase, a static points-to analysis starts from the main method of the program to find out what is reachable. As new classes are discovered, some of them will be initialized during this process depending on the configuration. In the next step, the heap is snapshotted and checks are made to see which types need to be available at runtime. The initialization and heap snapshotting can cause new types to be discovered, in which case the process is repeated. The process stops when a fixed point is reached, that is when the reachable program grows no more.
The compilation step is pretty straightforward: it simply compiles all the reachable code.
The time spent in the analysis and compilation phases depends on how big the application is. The bigger the application, the longer it takes to compile. However, certain features can have an exponential effect. For example, when registering types and methods for reflective access, the analysis can't easily see what's behind those types or methods, so it has to do more work to complete the analysis step.
I get a warning about using experimental options, what can I do?
Starting with Mandrel 23.1 and GraalVM for JDK 21, the native executable generation process will warn about the use of experimental options with a message like this:
Warning: The option '-H:ReflectionConfigurationResources=META-INF/native-image/io.micrometer/micrometer-core/reflect-config.json' is experimental and must be enabled via '-H:+UnlockExperimentalVMOptions' in the future.
If the mentioned option is added by a third-party library, like in the example above, you should consider opening an issue in the library's repository to ask for the option to be removed. If the option is added by your application, you should consider either removing it (if it's not necessary) or wrapping it between -H:+UnlockExperimentalVMOptions and -H:-UnlockExperimentalVMOptions, as shown below.
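A sketch of the wrapping, using the option from the warning above as the example (adjust the option string to whatever your build reports):
./mvnw package -DskipTests -Dnative \
    -Dquarkus.native.additional-build-args="-H:+UnlockExperimentalVMOptions,-H:ReflectionConfigurationResources=META-INF/native-image/io.micrometer/micrometer-core/reflect-config.json,-H:-UnlockExperimentalVMOptions"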
I get an AnalysisError$ParsingError when building a native executable due to an UnresolvedElementException, what can I do?
When building a native executable, Quarkus requires all classes referenced by the code, no matter if they are build-time or run-time initialized, to be present in the classpath. This way it ensures that there will be no crashes at runtime due to potential NoClassDefFoundError exceptions. To achieve this, it makes use of GraalVM's --link-at-build-time parameter:
--link-at-build-time require types to be fully defined at image build-time. If used
without args, all classes in scope of the option are required to
be fully defined.
This, however, may result in an AnalysisError$ParsingError due to an UnresolvedElementException at build time. This is often caused by the application referencing a class from an optional dependency.
If you have access to the source code responsible for the reference to the missing dependency and can alter it, you should consider one of the following:
-
Remove the reference if it’s not actually necessary.
-
Move the affected code into a sub-module and make the dependency non-optional (as is the best practice).
-
Make the dependency non-optional.
In the unfortunate case where the reference causing the issue is made by a third-party library that you cannot modify, you should consider one of the following:
-
Use a class/method substitution to remove the said reference (see the sketch below).
-
Add the optional dependency as a non-optional dependency of your project.
Note that although option (1) is the best choice performance-wise, as it minimizes the application's footprint, it might not be trivial to implement. To make matters worse, it's also not easy to maintain, as it is tightly coupled to the third-party library implementation. Option (2) is a straightforward alternative to work around the issue, but comes at the cost of including possibly never-invoked code in the resulting native executable.
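For illustration, a minimal sketch of option (1) using GraalVM's substitution API (the annotations live in the com.oracle.svm.core.annotate package and need to be on the compile classpath); the target class and method names are hypothetical placeholders, not taken from a real library:
package org.acme;

import com.oracle.svm.core.annotate.Substitute;
import com.oracle.svm.core.annotate.TargetClass;

// Replaces the body of a method that references a class from a missing
// optional dependency, so the analysis never sees the unresolved reference.
@TargetClass(className = "com.example.ThirdPartyClass")
final class Target_com_example_ThirdPartyClass {

    @Substitute
    private boolean isOptionalFeatureAvailable() {
        // The original implementation referenced the optional dependency;
        // always report the feature as unavailable instead.
        return false;
    }
}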
I get an OutOfMemoryError (OOME) building native executables, what can I do?
Building native executables is not only time consuming, but it also takes a fair amount of memory. For example, building a sample native Quarkus Jakarta Persistence application such as the Hibernate ORM quickstart may use 6GB to 8GB of resident set size in memory. A big chunk of this memory is Java heap, but extra memory is required for other aspects of the JVM that runs the native building process. It is still possible to build such applications in environments that have total memory close to the limits, but to do that it is necessary to shrink the maximum heap size of the GraalVM native image process. To do that, set a maximum heap size using the quarkus.native.native-image-xmx property. For example, we can instruct GraalVM to use a maximum heap size of 5GB by passing -Dquarkus.native.native-image-xmx=5g on the command line, as shown below.
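A minimal example invocation (the 5g value is just the figure used above; pick a limit that fits your environment):
./mvnw package -DskipTests -Dnative \
    -Dquarkus.native.native-image-xmx=5g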
Building native executables this way might have the side effect of requiring more time to complete. This is due to garbage collection having to work harder for native image generation to have free space to do its job.
Note that typical applications are likely bigger than quickstarts, so the memory requirements will also likely be higher.
Why is runtime performance of a native executable inferior compared to JVM mode?
As with most things in life, there are some trade-offs involved when choosing native compilation over JVM mode. So depending on the application, the runtime performance of a native application might be slower compared to JVM mode, though that's not always the case.
JVM execution of an application includes runtime optimization of the code that profits from profile information built up during execution. That includes the opportunities to inline a lot more of the code, locate hot code on direct paths (i.e. ensure better instruction cache locality) and cut out a lot of the code on cold paths (on the JVM a lot of code does not get compiled until something tries to execute it; it is replaced with a trap that causes deoptimization and recompilation). Removal of cold paths provides many more optimization opportunities than are available for ahead of time compilation because it significantly reduces the branch complexity and combinatorial logic of the smaller amount of hot code that is compiled.
By contrast, native executable compilation has to cater for all possible execution paths when it compiles code offline since it does not know which are the hot or cold paths and cannot use the trick of planting a trap and recompiling if it is hit. For the same reason it cannot load the dice to ensure that code cache conflicts are minimized by co-locating hot paths adjacent. Native executable generation is able to remove some code because of the closed world hypothesis but that is often not enough to make up for all the benefits that profiling and runtime deopt & recompile provides to the JVM JIT compiler.
Note, however, that there is a price you pay for that potentially higher JVM speed, and that price is increased resource usage (both CPU and memory) and startup time, because:
-
it takes some time before the JIT kicks in and fully optimizes the code.
-
the JIT compiler consumes resources that could be utilized by the application.
-
the JVM has to retain a lot more metadata and compiler/profiler data to support the better optimizations that it can offer.
The reason for 1) is that code needs to be run interpreted for some time and, possibly, to be compiled several times before all potential optimizations are realized to ensure that:
-
it’s worth compiling that code path, i.e. it’s being executed enough times, and that
-
we have enough profiling data to perform meaningful optimizations.
An implication of 1) is that, for small, short-lived applications, a native executable may well be a better bet. Although the compiled code is not as well optimized, it is available straight away.
The reason for 2) is that the JVM is essentially running the compiler at runtime in parallel with the application itself. In the case of native executables the compiler is run ahead of time removing the need to run the compiler in parallel with the application.
There are several reasons for 3). The JVM does not have a closed world assumption. So, it has to be able to recompile code if loading of new classes implies that it needs to revise optimistic assumptions made at compile time. For example, if an interface has only one implementation it can make a call jump directly to that code. However, in the case where a second implementation class is loaded the call site needs to be patched to test the type of the receiver instance and jump to the code that belongs to its class. Supporting optimizations like this one requires keeping track of a lot more details of the class base than a native executable, including recording the full class and interface hierarchy, details of which methods override other methods, all method bytecode etc. In a native executable most of the details of class structure and bytecode can be ignored at run time.
The JVM also has to cope with changes to the class base or execution profiles that result in a thread going down a previously cold path. At that point the JVM has to jump out of the compiled code into the interpreter and recompile the code to cater for a new execution profile that includes the previously cold path. That requires keeping runtime info that allows a compiled stack frame to be replaced with one or more interpreter frames. It also requires runtime-extensible profile counters to be allocated and updated to track what has or has not been executed.
Why are native executables “big”?
This can be attributed to a number of different reasons:
- Native executables include not only the application code but also library code and JDK code. As a result, a fairer comparison would be the native executable’s size against the size of the application, plus the size of the libraries it uses, plus the size of the JDK. The JDK part in particular is not negligible, even in simple applications like HelloWorld. To get a glance at what is being pulled into the image, you can pass -H:+PrintUniverse when building the native executable (see the example after this list).
- Some features are always included in a native executable even though they might never actually be used at run time. An example of such a feature is garbage collection. At compile time we can’t be sure whether an application will need to run garbage collection at run time, so garbage collection is always included in native executables, increasing their size even when it is not necessary. Native executable generation relies on static code analysis to identify which code paths are reachable, and static code analysis can be imprecise, leading to more code getting into the image than is actually needed.
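A rough sketch of passing that flag through Quarkus, using the additional build args property shown elsewhere in this guide (the flag’s output is verbose, so redirecting the build output to a file can help):
./mvnw package -DskipTests -Dnative \
    -Dquarkus.native.additional-build-args=-H:+PrintUniverse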
There is a GraalVM upstream issue with some interesting discussions about that topic.
What version of Mandrel was used to generate a binary?
One can see which Mandrel version was used to generate a binary by inspecting the binary as follows:
$ strings target/debugging-native-1.0.0-SNAPSHOT-runner | grep GraalVM
com.oracle.svm.core.VM=GraalVM 22.0.0.2-Final Java 11 Mandrel Distribution
How do I enable GC logging in native executables?
See the GC Logging section of the Native Memory Management guide for details.
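As a quick reminder of what that section covers, the SubstrateVM runtime accepts HotSpot-style GC logging flags at run time; a minimal sketch (flag availability may vary across GraalVM/Mandrel versions):
./target/debugging-native-1.0.0-SNAPSHOT-runner -XX:+PrintGC -XX:+VerboseGC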
Can I get a heap dump of a native executable? e.g. if it runs out of memory
Starting with GraalVM 22.2.0 it is possible to create heap dumps upon request, e.g. kill -SIGUSR1 <pid>. Support for dumping the heap upon an out-of-memory error will follow in a later release.
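Note that heap dump support has to be baked into the binary at build time; a sketch, assuming your Mandrel version supports the heapdump value of the quarkus.native.monitoring property:
./mvnw package -DskipTests -Dnative -Dquarkus.native.monitoring=heapdump
./target/debugging-native-1.0.0-SNAPSHOT-runner &
kill -SIGUSR1 $!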
Can I build and run these examples outside a container in Linux?
Yes, you can. In fact, debugging native executables on a Linux bare metal box offers the best possible experience. In this kind of environment, root access is not needed, except to install packages required to run some debug steps, or to enable perf to gather events at the kernel level.
These are the packages you’ll need on your Linux environment to run through the different debugging sections:
# dnf (rpm-based)
sudo dnf install binutils gdb perf perl-open
# Debian-based distributions:
sudo apt install binutils gdb perf
Generating flame graphs is slow, or produces errors, what can I do?
There are multiple ways in which a native executable produced by Mandrel can be profiled. All the methods require you to pass in the -H:-DeleteLocalSymbols option.
The method shown in this reference guide generates a binary with DWARF debug information, runs it via perf record, and then uses perf script and flame graph tooling to generate the flame graphs. However, the perf script post-processing step done on this binary can appear to be slow or can show some DWARF errors.
An alternative method to generate flame graphs is to pass in -H:+PreserveFramePointer when generating the native executable, instead of generating the DWARF debug information. It instructs the binary to use an extra register for the frame pointer. This enables perf to do stack walking to profile the runtime behaviour. To generate the native executable using these flags, do the following:
./mvnw package -DskipTests -Dnative
-Dquarkus.native.additional-build-args=-H:+PreserveFramePointer,-H:-DeleteLocalSymbols
To get runtime profiling information out of the native executable, simply do:
perf record -F 1009 -g -a ./target/debugging-native-1.0.0-SNAPSHOT-runner
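With either method, the recorded data can then be turned into a flame graph. A minimal sketch, assuming Brendan Gregg’s FlameGraph scripts (stackcollapse-perf.pl and flamegraph.pl) are available on the path:
perf script > out.stacks
stackcollapse-perf.pl out.stacks | flamegraph.pl > flamegraph.svg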
The recommended method for generating runtime profiling information is to use the debug information rather than generating a binary that preserves the frame pointer. This is because adding debug information to the native executable build process has no negative impact on runtime performance, whereas preserving the frame pointer does.
DWARF debug info is generated in a separate file, and can even be omitted from the default deployment and only transferred and used on demand, for profiling or debugging purposes, so it does not bloat the native executable itself. Furthermore, the presence of debug info enables perf to show us the relevant source code lines as well. To do that, simply call perf report with an extra parameter to show source code lines:
perf report --stdio -F+srcline
...
83.69% 0.00% GreetingResource.java:20 ...
...
83.69% 0.00% AbstractStringBuilder.java:1025 ...
...
83.69% 0.00% ArraycopySnippets.java:95 ...
The performance penalty of preserving the frame pointer is due to using the extra register for stack walking, particularly on x86_64, which has fewer registers available than aarch64. Using this extra register reduces the number of registers that are available for other work, which can lead to performance penalties.
I think I’ve found a bug in native-image, how can I debug it with the IDE?
Although it is possible to remote debug processes within containers, it might be easier to step-by-step debug native-image by installing Mandrel locally and adding it to the path of the shell process.
Native executable generation is the result of two Java processes that are executed sequentially. The first process is very short and its main job is to set things up for the second process. The second process is the one that takes care of most of the work. The steps to debug one process or the other vary slightly.
Let’s discuss first how to debug the second process, which is the one you are most likely to want to debug. The starting point for the second process is the com.oracle.svm.hosted.NativeImageGeneratorRunner class. To debug this process, simply add --debug-attach=*:8000 as an additional build time argument:
./mvnw package -DskipTests -Dnative \
-Dquarkus.native.additional-build-args=--debug-attach=*:8000
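The build will then wait for a debugger to attach on port 8000. Attach from your IDE as a remote JVM debug session; as a quick command line sanity check, jdb can attach too:
jdb -attach localhost:8000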
The starting point for the first process is the com.oracle.svm.driver.NativeImages class. In GraalVM CE distributions, this first process is a binary, so debugging it in the traditional way with a Java IDE is not possible. However, Mandrel distributions (or locally built GraalVM CE instances) keep this as a normal Java process, so you can remote debug it by adding --vm.agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=*:8000 as an additional build argument, e.g.:
$ ./mvnw package -DskipTests -Dnative \
-Dquarkus.native.additional-build-args=--vm.agentlib:jdwp=transport=dt_socket\\,server=y\\,suspend=y\\,address=*:8000
Can I use JFR/JMC to debug or profile native binaries?
Java Flight Recorder (JFR) and JDK Mission Control (JMC) can be used to profile native binaries since GraalVM CE 21.2.0. However, JFR in GraalVM is currently limited in capabilities compared to HotSpot. The custom event API is fully supported, but some VM level features are unavailable. More events and JFR features will continue to be added in later releases. The following table outlines Native Image JFR support and limitations by version.
GraalVM Version | Supports | Limitations
---|---|---
GraalVM CE 21.3 and Mandrel 21.3 | |
GraalVM CE 22.3 and Mandrel 22.3 | |
GraalVM CE for JDK 17/20 and Mandrel 23.0 | |
To add JFR support to your Quarkus executable, add the application property -Dquarkus.native.monitoring=jfr. E.g.
./mvnw package -DskipTests -Dnative -Dquarkus.native.container-build=true \
-Dquarkus.native.builder-image=quay.io/quarkus/ubi-quarkus-mandrel-builder-image:{mandrel-flavor} \
-Dquarkus.native.monitoring=jfr
Once the image is compiled, enable and start JFR via the runtime flags -XX:+FlightRecorder and -XX:StartFlightRecording. For example:
./target/debugging-native-1.0.0-SNAPSHOT-runner \
-XX:+FlightRecorder \
-XX:StartFlightRecording="filename=recording.jfr"
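The resulting recording can be opened in JMC or, as a sketch assuming a recent JDK with the jfr tool on the path, inspected from the command line:
jfr print --events jdk.GarbageCollection recording.jfr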
For more information about using JFR, see the GraalVM JDK Flight Recorder (JFR) with Native Image guide.
How can we troubleshoot performance problems only reproducible in production?
In this situation, switching to JVM mode would be the best thing to try first. If the performance issues continue after switching to JVM mode, you can use more established and mature tooling to figure out the root cause. If the performance issue is limited to native mode only, you might not be able to use perf, so JFR is the only way to gather any information in this situation. As JFR support for native expands, the ability to detect root causes of performance issues directly in production will improve.
What information helps most when debugging issues that happen either at build time or run time?
To fix classpath, class initialization, or forbidden API errors at build time, it’s best to use the build time reports to understand the closed world universe. A complete picture of the universe, along with the relationships between the different classes and methods, will help uncover and fix most of the issues.
To fix native-specific errors at run time, it’s best to have debug info builds of the native executables around, so that gdb can be hooked up quickly to debug the issue. If you also add local symbols to the debug info builds, you will obtain precise profiling information as well.
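A sketch of such a build, combining the Quarkus debug info property with the flag that keeps local symbols (property names as used elsewhere in this guide):
./mvnw package -DskipTests -Dnative \
    -Dquarkus.native.debug.enabled=true \
    -Dquarkus.native.additional-build-args=-H:-DeleteLocalSymbols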
Build stalled for minutes, barely using any CPU
It might happen that the build stalls and even ends up with:
Image generator watchdog detected no activity.
One possible explanation is a lack of entropy, e.g. on an entropy-constrained VM, if such a source is needed, as is the case with Bouncycastle at build time.
One can check the available entropy on a Linux system with:
$ cat /proc/sys/kernel/random/entropy_avail
If the amount is not in the hundreds, it could be a problem. A possible workaround, acceptable for testing, is to compromise and set:
export JAVA_OPTS=-Djava.security.egd=/dev/urandom
The proper solution is to increase the entropy available to the system, though that is specific to each OS vendor and virtualization solution.
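On many Linux distributions, one option is to run an entropy-gathering daemon such as rngd from rng-tools; a sketch for rpm-based systems (package and service names vary by distribution):
sudo dnf install rng-tools
sudo systemctl enable --now rngd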
Work around missing CPU features
When building on recent machines and running your native executable on older machines, you may see the following failure when starting the application:
The current machine does not support all of the following CPU features that are required by the image: [CX8, CMOV, FXSR, MMX, SSE, SSE2, SSE3, SSSE3, SSE4_1, SSE4_2, POPCNT, LZCNT, AVX, AVX2, BMI1, BMI2, FMA].
Please rebuild the executable with an appropriate setting of the -march option.
This error message means that the native compilation used more advanced instruction sets that are not supported by the CPU running the application. To work around that issue, add the following line to application.properties:
quarkus.native.march=compatibility
Then, rebuild your native executable. This setting forces the native compilation to use an older instruction set, increasing the chance of compatibility.
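To see which of the required features the target CPU actually supports, you can inspect its flags on Linux, e.g.:
# show the CPU feature flags of the machine running the executable (Linux)
grep -m1 '^flags' /proc/cpuinfo | tr ' ' '\n' | grep -i -E 'sse4|avx|fma|bmi'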
To explicitly define the target architecture, run native-image -march=list to get the supported configurations, and then set -march to one of them, e.g. quarkus.native.additional-build-args=-march=x86-64-v4. If you are targeting an AMD64 host, -march=x86-64-v2 would work in most cases.
Note that the -march parameter is only available in GraalVM 23+.