Replace JCTools queues with VarHandle-based implementations for Java 25+ #9896
Debugger benchmarks
Summary
Found 0 performance improvements and 0 performance regressions! Performance is the same for 10 metrics, 5 unstable metrics.
Request duration reports for reports
gantt
title reports - request duration [CI 0.99] : candidate=None, baseline=None
dateFormat X
axisFormat %s
section baseline
noprobe (317.508 µs) : 291, 344
. : milestone, 318,
basic (294.053 µs) : 287, 301
. : milestone, 294,
loop (8.959 ms) : 8956, 8963
. : milestone, 8959,
section candidate
noprobe (319.585 µs) : 290, 349
. : milestone, 320,
basic (293.196 µs) : 286, 300
. : milestone, 293,
loop (8.955 ms) : 8952, 8958
. : milestone, 8955,
Benchmarks
Startup
Summary
Found 0 performance improvements and 0 performance regressions! Performance is the same for 55 metrics, 10 unstable metrics.
Startup time reports for petclinic
gantt
title petclinic - global startup overhead: candidate=1.57.0-SNAPSHOT~f8d77fd44e, baseline=1.59.0-SNAPSHOT~c91ab36874
dateFormat X
axisFormat %s
section tracing
Agent [baseline] (1.092 s) : 0, 1092483
Total [baseline] (10.853 s) : 0, 10852937
Agent [candidate] (1.088 s) : 0, 1088289
Total [candidate] (10.87 s) : 0, 10869797
section appsec
Agent [baseline] (1.264 s) : 0, 1263897
Total [baseline] (11.021 s) : 0, 11020649
Agent [candidate] (1.269 s) : 0, 1269351
Total [candidate] (11.086 s) : 0, 11086130
section iast
Agent [baseline] (1.228 s) : 0, 1227596
Total [baseline] (11.193 s) : 0, 11192880
Agent [candidate] (1.229 s) : 0, 1228831
Total [candidate] (11.199 s) : 0, 11198627
section profiling
Agent [baseline] (1.216 s) : 0, 1216203
Total [baseline] (11.121 s) : 0, 11120523
Agent [candidate] (1.21 s) : 0, 1209945
Total [candidate] (10.891 s) : 0, 10891468
gantt
title petclinic - break down per module: candidate=1.57.0-SNAPSHOT~f8d77fd44e, baseline=1.59.0-SNAPSHOT~c91ab36874
dateFormat X
axisFormat %s
section tracing
crashtracking [baseline] (1.189 ms) : 0, 1189
crashtracking [candidate] (1.18 ms) : 0, 1180
BytebuddyAgent [baseline] (656.487 ms) : 0, 656487
BytebuddyAgent [candidate] (652.877 ms) : 0, 652877
GlobalTracer [baseline] (284.913 ms) : 0, 284913
GlobalTracer [candidate] (284.36 ms) : 0, 284360
AppSec [baseline] (32.817 ms) : 0, 32817
AppSec [candidate] (32.601 ms) : 0, 32601
Debugger [baseline] (67.753 ms) : 0, 67753
Debugger [candidate] (68.364 ms) : 0, 68364
Remote Config [baseline] (679.037 µs) : 0, 679
Remote Config [candidate] (633.926 µs) : 0, 634
Telemetry [baseline] (9.071 ms) : 0, 9071
Telemetry [candidate] (8.893 ms) : 0, 8893
Flare Poller [baseline] (3.816 ms) : 0, 3816
Flare Poller [candidate] (3.848 ms) : 0, 3848
section appsec
crashtracking [baseline] (1.177 ms) : 0, 1177
crashtracking [candidate] (1.182 ms) : 0, 1182
BytebuddyAgent [baseline] (690.587 ms) : 0, 690587
BytebuddyAgent [candidate] (692.568 ms) : 0, 692568
GlobalTracer [baseline] (258.794 ms) : 0, 258794
GlobalTracer [candidate] (260.182 ms) : 0, 260182
AppSec [baseline] (173.427 ms) : 0, 173427
AppSec [candidate] (172.464 ms) : 0, 172464
Debugger [baseline] (66.111 ms) : 0, 66111
Debugger [candidate] (69.031 ms) : 0, 69031
Remote Config [baseline] (744.647 µs) : 0, 745
Remote Config [candidate] (805.037 µs) : 0, 805
Telemetry [baseline] (9.316 ms) : 0, 9316
Telemetry [candidate] (9.308 ms) : 0, 9308
Flare Poller [baseline] (3.74 ms) : 0, 3740
Flare Poller [candidate] (3.721 ms) : 0, 3721
IAST [baseline] (24.516 ms) : 0, 24516
IAST [candidate] (24.481 ms) : 0, 24481
section iast
crashtracking [baseline] (1.182 ms) : 0, 1182
crashtracking [candidate] (1.185 ms) : 0, 1185
BytebuddyAgent [baseline] (793.882 ms) : 0, 793882
BytebuddyAgent [candidate] (795.595 ms) : 0, 795595
GlobalTracer [baseline] (256.818 ms) : 0, 256818
GlobalTracer [candidate] (257.065 ms) : 0, 257065
AppSec [baseline] (34.58 ms) : 0, 34580
AppSec [candidate] (35.132 ms) : 0, 35132
Debugger [baseline] (65.844 ms) : 0, 65844
Debugger [candidate] (64.733 ms) : 0, 64733
Remote Config [baseline] (574.051 µs) : 0, 574
Remote Config [candidate] (605.438 µs) : 0, 605
Telemetry [baseline] (8.505 ms) : 0, 8505
Telemetry [candidate] (8.488 ms) : 0, 8488
Flare Poller [baseline] (3.651 ms) : 0, 3651
Flare Poller [candidate] (3.524 ms) : 0, 3524
IAST [baseline] (27.005 ms) : 0, 27005
IAST [candidate] (26.948 ms) : 0, 26948
section profiling
ProfilingAgent [baseline] (98.621 ms) : 0, 98621
ProfilingAgent [candidate] (96.64 ms) : 0, 96640
crashtracking [baseline] (1.23 ms) : 0, 1230
crashtracking [candidate] (1.222 ms) : 0, 1222
BytebuddyAgent [baseline] (707.12 ms) : 0, 707120
BytebuddyAgent [candidate] (705.664 ms) : 0, 705664
GlobalTracer [baseline] (222.67 ms) : 0, 222670
GlobalTracer [candidate] (222.21 ms) : 0, 222210
AppSec [baseline] (32.681 ms) : 0, 32681
AppSec [candidate] (32.483 ms) : 0, 32483
Debugger [baseline] (69.297 ms) : 0, 69297
Debugger [candidate] (68.005 ms) : 0, 68005
Remote Config [baseline] (674.672 µs) : 0, 675
Remote Config [candidate] (681.335 µs) : 0, 681
Telemetry [baseline] (9.092 ms) : 0, 9092
Telemetry [candidate] (8.983 ms) : 0, 8983
Flare Poller [baseline] (4.541 ms) : 0, 4541
Flare Poller [candidate] (3.731 ms) : 0, 3731
Profiling [baseline] (99.21 ms) : 0, 99210
Profiling [candidate] (97.214 ms) : 0, 97214
Startup time reports for insecure-bank
gantt
title insecure-bank - global startup overhead: candidate=1.57.0-SNAPSHOT~f8d77fd44e, baseline=1.59.0-SNAPSHOT~c91ab36874
dateFormat X
axisFormat %s
section tracing
Agent [baseline] (1.085 s) : 0, 1084824
Total [baseline] (8.768 s) : 0, 8768170
Agent [candidate] (1.094 s) : 0, 1093647
Total [candidate] (8.751 s) : 0, 8751103
section iast
Agent [baseline] (1.232 s) : 0, 1231768
Total [baseline] (9.345 s) : 0, 9344899
Agent [candidate] (1.225 s) : 0, 1225437
Total [candidate] (9.431 s) : 0, 9431375
gantt
title insecure-bank - break down per module: candidate=1.57.0-SNAPSHOT~f8d77fd44e, baseline=1.59.0-SNAPSHOT~c91ab36874
dateFormat X
axisFormat %s
section tracing
crashtracking [baseline] (1.182 ms) : 0, 1182
crashtracking [candidate] (1.199 ms) : 0, 1199
BytebuddyAgent [baseline] (652.385 ms) : 0, 652385
BytebuddyAgent [candidate] (656.767 ms) : 0, 656767
GlobalTracer [baseline] (283.017 ms) : 0, 283017
GlobalTracer [candidate] (285.451 ms) : 0, 285451
AppSec [baseline] (32.458 ms) : 0, 32458
AppSec [candidate] (33.041 ms) : 0, 33041
Debugger [baseline] (66.836 ms) : 0, 66836
Debugger [candidate] (66.497 ms) : 0, 66497
Remote Config [baseline] (636.435 µs) : 0, 636
Remote Config [candidate] (632.56 µs) : 0, 633
Telemetry [baseline] (8.914 ms) : 0, 8914
Telemetry [candidate] (8.912 ms) : 0, 8912
Flare Poller [baseline] (3.768 ms) : 0, 3768
Flare Poller [candidate] (5.457 ms) : 0, 5457
section iast
crashtracking [baseline] (1.199 ms) : 0, 1199
crashtracking [candidate] (1.176 ms) : 0, 1176
BytebuddyAgent [baseline] (799.222 ms) : 0, 799222
BytebuddyAgent [candidate] (792.35 ms) : 0, 792350
GlobalTracer [baseline] (256.566 ms) : 0, 256566
GlobalTracer [candidate] (256.855 ms) : 0, 256855
IAST [baseline] (27.016 ms) : 0, 27016
IAST [candidate] (26.931 ms) : 0, 26931
AppSec [baseline] (34.592 ms) : 0, 34592
AppSec [candidate] (34.286 ms) : 0, 34286
Debugger [baseline] (64.685 ms) : 0, 64685
Debugger [candidate] (65.634 ms) : 0, 65634
Remote Config [baseline] (583.27 µs) : 0, 583
Remote Config [candidate] (587.938 µs) : 0, 588
Telemetry [baseline] (8.533 ms) : 0, 8533
Telemetry [candidate] (8.592 ms) : 0, 8592
Flare Poller [baseline] (3.66 ms) : 0, 3660
Flare Poller [candidate] (3.645 ms) : 0, 3645
Load
Summary
Found 2 performance improvements and 1 performance regressions! Performance is the same for 13 metrics, 20 unstable metrics.
Request duration reports for insecure-bank
gantt
title insecure-bank - request duration [CI 0.99] : candidate=1.57.0-SNAPSHOT~f8d77fd44e, baseline=1.59.0-SNAPSHOT~c91ab36874
dateFormat X
axisFormat %s
section baseline
no_agent (1.184 ms) : 1173, 1196
. : milestone, 1184,
iast (3.158 ms) : 3117, 3198
. : milestone, 3158,
iast_FULL (6.011 ms) : 5950, 6073
. : milestone, 6011,
iast_GLOBAL (3.458 ms) : 3411, 3506
. : milestone, 3458,
profiling (1.965 ms) : 1946, 1984
. : milestone, 1965,
tracing (1.873 ms) : 1857, 1890
. : milestone, 1873,
section candidate
no_agent (1.213 ms) : 1201, 1225
. : milestone, 1213,
iast (3.091 ms) : 3051, 3130
. : milestone, 3091,
iast_FULL (5.676 ms) : 5620, 5733
. : milestone, 5676,
iast_GLOBAL (3.485 ms) : 3433, 3537
. : milestone, 3485,
profiling (1.926 ms) : 1910, 1942
. : milestone, 1926,
tracing (1.757 ms) : 1743, 1771
. : milestone, 1757,
Request duration reports for petclinic
gantt
title petclinic - request duration [CI 0.99] : candidate=1.57.0-SNAPSHOT~f8d77fd44e, baseline=1.59.0-SNAPSHOT~c91ab36874
dateFormat X
axisFormat %s
section baseline
no_agent (18.182 ms) : 17996, 18369
. : milestone, 18182,
appsec (18.752 ms) : 18567, 18937
. : milestone, 18752,
code_origins (17.911 ms) : 17730, 18092
. : milestone, 17911,
iast (18.607 ms) : 18421, 18793
. : milestone, 18607,
profiling (18.735 ms) : 18551, 18920
. : milestone, 18735,
tracing (17.63 ms) : 17454, 17806
. : milestone, 17630,
section candidate
no_agent (17.979 ms) : 17794, 18165
. : milestone, 17979,
appsec (19.486 ms) : 19286, 19687
. : milestone, 19486,
code_origins (17.608 ms) : 17433, 17782
. : milestone, 17608,
iast (17.625 ms) : 17453, 17796
. : milestone, 17625,
profiling (18.519 ms) : 18334, 18704
. : milestone, 18519,
tracing (17.491 ms) : 17318, 17663
. : milestone, 17491,
Dacapo
Summary
Found 0 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 1 unstable metrics.
Execution time for biojava
gantt
title biojava - execution time [CI 0.99] : candidate=1.57.0-SNAPSHOT~f8d77fd44e, baseline=1.59.0-SNAPSHOT~c91ab36874
dateFormat X
axisFormat %s
section baseline
no_agent (14.983 s) : 14983000, 14983000
. : milestone, 14983000,
appsec (14.487 s) : 14487000, 14487000
. : milestone, 14487000,
iast (18.058 s) : 18058000, 18058000
. : milestone, 18058000,
iast_GLOBAL (17.707 s) : 17707000, 17707000
. : milestone, 17707000,
profiling (15.012 s) : 15012000, 15012000
. : milestone, 15012000,
tracing (14.642 s) : 14642000, 14642000
. : milestone, 14642000,
section candidate
no_agent (15.319 s) : 15319000, 15319000
. : milestone, 15319000,
appsec (14.513 s) : 14513000, 14513000
. : milestone, 14513000,
iast (18.046 s) : 18046000, 18046000
. : milestone, 18046000,
iast_GLOBAL (17.953 s) : 17953000, 17953000
. : milestone, 17953000,
profiling (15.342 s) : 15342000, 15342000
. : milestone, 15342000,
tracing (14.669 s) : 14669000, 14669000
. : milestone, 14669000,
Execution time for tomcat
gantt
title tomcat - execution time [CI 0.99] : candidate=1.57.0-SNAPSHOT~f8d77fd44e, baseline=1.59.0-SNAPSHOT~c91ab36874
dateFormat X
axisFormat %s
section baseline
no_agent (1.477 ms) : 1466, 1488
. : milestone, 1477,
appsec (3.678 ms) : 3462, 3894
. : milestone, 3678,
iast (2.229 ms) : 2164, 2295
. : milestone, 2229,
iast_GLOBAL (2.259 ms) : 2194, 2325
. : milestone, 2259,
profiling (2.081 ms) : 2027, 2136
. : milestone, 2081,
tracing (2.046 ms) : 1995, 2098
. : milestone, 2046,
section candidate
no_agent (1.478 ms) : 1467, 1490
. : milestone, 1478,
appsec (3.725 ms) : 3504, 3947
. : milestone, 3725,
iast (2.231 ms) : 2165, 2296
. : milestone, 2231,
iast_GLOBAL (2.267 ms) : 2201, 2332
. : milestone, 2267,
profiling (2.084 ms) : 2029, 2139
. : milestone, 2084,
tracing (2.06 ms) : 2008, 2112
. : milestone, 2060,
Hi @amarziali, I am one of the developers of JCTools, and we would be super happy to bring a VarHandle generation variant into our lib as well. Note: JCTools is at the very core of other frameworks, including Netty, which will soon hit the "no Unsafe world" JVM barrier.
Co-authored-by: Stuart McCulloch <stuart.mcculloch@datadoghq.com>
Bravo @amarziali
What Does This Do
This PR introduces a set of queue implementations that replace the JCTools-based queues, eliminating direct use of sun.misc.Unsafe and providing full compatibility with Java 9+ runtimes through the VarHandle API.
The goal is to match the high-performance concurrent queue behavior of JCTools while using only supported, standard Java mechanisms.
A new Queues factory class is introduced to dynamically select the optimal queue implementation based on the Java runtime environment.
Introduced Classes Summary
SpscArrayQueueVarHandle
SpmcArrayQueueVarHandle
MpscArrayQueueVarHandle<E>
MpscBlockingConsumerArrayQueueVarHandle<E>
Memory Padding
All queue state fields (head, tail, cached limits, etc.) are cache-line padded to prevent false sharing between producers and consumers.
This ensures that frequently accessed hot fields used by different threads do not reside on the same cache line, minimizing cache invalidations and improving throughput under contention.
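As a rough illustration of the padding idea, the sketch below separates two hot indices with filler fields declared in a class hierarchy. Every class and field name here is hypothetical and not taken from this PR. It assumes 64-byte cache lines and relies on HotSpot placing superclass fields before subclass fields, which is implementation behavior rather than a language guarantee.

```java
// Hypothetical sketch: padding by class hierarchy, assuming 64-byte cache lines.
// None of these names come from the PR.

// Seven longs (56 bytes) ahead of the first hot field.
abstract class LeadingPad {
    long p00, p01, p02, p03, p04, p05, p06;
}

// Hot field updated by the consumer.
abstract class HeadIndex extends LeadingPad {
    long head;
}

// Seven longs between head and tail, so the two fields sit 64 bytes apart.
abstract class MiddlePad extends HeadIndex {
    long p10, p11, p12, p13, p14, p15, p16;
}

// Hot field updated by the producer.
abstract class TailIndex extends MiddlePad {
    long tail;
}

// Trailing padding keeps whatever is allocated next off the tail's cache line.
final class PaddedIndices extends TailIndex {
    long p20, p21, p22, p23, p24, p25, p26;

    // Number of elements between the two indices.
    long size() {
        return tail - head;
    }

    public static void main(String[] args) {
        PaddedIndices idx = new PaddedIndices();
        idx.tail = 5;
        idx.head = 2;
        System.out.println(idx.size());
    }
}
```

The filler fields are never read. Their only purpose is to keep head and tail far enough apart that a write to one does not invalidate the cache line holding the other.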
Memory Ordering Semantics
VarHandle access modes are carefully chosen to balance performance and correctness, using the weakest ordering that maintains visibility guarantees:
setRelease / getAcquire: Element publication and consumption. Release stores guarantee all preceding writes are visible before the element, while acquire loads ensure all subsequent reads see the published data. Provides efficient producer-consumer synchronization without full memory barriers.
getOpaque: Hot-path reads of cached limits and blocked state. Ensures atomic access and eventual visibility without memory fence overhead. Safe when stale reads are benign (e.g., a stale cached producerLimit triggers recalculation on mismatch, or a subsequent CAS provides full synchronization).
getVolatile: Synchronization points requiring immediate visibility. Used when refreshing producer/consumer limits or checking queue state where correctness depends on seeing the latest value from other threads. Provides sequential consistency with full memory barriers.
Plain (get / set): Single-threaded paths where no inter-thread coordination is needed (e.g., single consumer reading its own index).
Queue Benchmark Results
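To make the release/acquire and plain access modes from the Memory Ordering Semantics section more concrete, here is a minimal single-producer/single-consumer sketch. It is not the implementation from this PR: TinySpscQueue and all of its members are hypothetical, and it uses the simpler scheme where a null slot means "free". Elements are published with setRelease and observed with getAcquire, while each index is a plain field touched by only one thread. The getOpaque and getVolatile modes are not shown.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.util.Objects;

// Hypothetical minimal SPSC queue; a sketch, not the PR's implementation.
final class TinySpscQueue<E> {
    // VarHandle giving ordered access to the elements of an Object[].
    private static final VarHandle ELEMENTS =
            MethodHandles.arrayElementVarHandle(Object[].class);

    private final Object[] buffer;
    private final int mask;
    private long head; // plain field: only the consumer thread touches it
    private long tail; // plain field: only the producer thread touches it

    TinySpscQueue(int capacity) {
        if (capacity < 1 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        buffer = new Object[capacity];
        mask = capacity - 1;
    }

    // Producer side. Returns false when the queue is full.
    boolean offer(E e) {
        Objects.requireNonNull(e);
        long t = tail;
        int idx = (int) (t & mask);
        // Acquire load pairs with the consumer's release store of null.
        if (ELEMENTS.getAcquire(buffer, idx) != null) {
            return false; // slot not yet consumed
        }
        // Release store publishes the element and every write made before it.
        ELEMENTS.setRelease(buffer, idx, e);
        tail = t + 1;
        return true;
    }

    // Consumer side. Returns null when the queue is empty.
    @SuppressWarnings("unchecked")
    E poll() {
        long h = head;
        int idx = (int) (h & mask);
        // Acquire load pairs with the producer's release store of the element.
        Object e = ELEMENTS.getAcquire(buffer, idx);
        if (e == null) {
            return null;
        }
        // Release store of null hands the slot back to the producer.
        ELEMENTS.setRelease(buffer, idx, (Object) null);
        head = h + 1;
        return (E) e;
    }

    public static void main(String[] args) throws InterruptedException {
        TinySpscQueue<Integer> q = new TinySpscQueue<>(8);
        Thread producer = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                while (!q.offer(i)) {
                    Thread.onSpinWait(); // queue full, wait for the consumer
                }
            }
        });
        producer.start();
        long sum = 0;
        int received = 0;
        while (received < 1000) {
            Integer v = q.poll();
            if (v == null) {
                Thread.onSpinWait(); // queue empty, wait for the producer
                continue;
            }
            sum += v;
            received++;
        }
        producer.join();
        System.out.println(sum); // prints 499500
    }
}
```

Using the slot contents as the synchronization point keeps the sketch short. An index-based design with cached limits, as the PR text describes, would read those limits with getOpaque or getVolatile instead.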
SPSC (Single-Producer / Single-Consumer)
Capacity = 1024
Capacity = 65536
MPSC (Multi-Producer / Single-Consumer)
Capacity = 1024
Capacity = 65536
MPSC (Blocking Consumer)
Capacity = 1024
Capacity = 65536
Takeaways:
Motivation
Additional Notes
Contributor Checklist
type: and (comp: or inst:) labels in addition to any useful labels
close, fix or any linking keywords when referencing an issue. Use solves instead, and assign the PR milestone to the issue
Jira ticket: [PROJ-IDENT]