Optimize metric name validation to fix 2-3x performance regression #1662

Copilot · 2025-11-04T09:30:09Z

Regex validation in isValidLegacyLabelName(), isValidLegacyMetricName(), and validateUnitName() was being called on every metric name during text format export, causing significant overhead.

Changes

Replace regex pattern matching with character-by-character validation in isValidLegacyLabelName(), isValidLegacyMetricName(), and validateUnitName()
Deprecate unused METRIC_NAME_PATTERN, LEGACY_LABEL_NAME_PATTERN, and UNIT_NAME_PATTERN fields (kept for API compatibility)
Update JavaDoc to reflect validation approach

Implementation

Before:

public static boolean isValidLegacyLabelName(String name) {
  return LEGACY_LABEL_NAME_PATTERN.matcher(name).matches();
}

After:

public static boolean isValidLegacyLabelName(String name) {
  if (name.isEmpty()) return false;
  char first = name.charAt(0);
  if (!((first >= 'a' && first <= 'z') || (first >= 'A' && first <= 'Z') || first == '_')) {
    return false;
  }
  for (int i = 1; i < name.length(); i++) {
    char c = name.charAt(i);
    if (!((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9') || c == '_')) {
      return false;
    }
  }
  return true;
}

Benchmark results show recovery to near-baseline performance (532k ops/s vs 534k ops/s on main).

[!WARNING]

Original prompt

This section details on the original issue you should resolve

<filter_complete></filter_complete>

<issue_title>Performance regression in text-format export on 1.4.0+</issue_title>
<issue_description>It seems like the adding of support for UTF-8 characters has considerably increased processing time in the io.prometheus.metrics.expositionformats.PrometheusTextFormatWriter.writeNameAndLabels method due to it now calling io.prometheus.metrics.model.snapshots.PrometheusNaming.isValidLegacyMetricName.

I ran a local test with 1000 metrics exported. Average time per export on version 1.3.10 was 0.9-1.1 ms, while on version 1.4.1 I got an average of 2.7-2.9 ms.

I'm not sure which avenues make sense to avoid this regression and I'm not sure how relevant it is. I just found this while evaluating a move from the old simpleclient to the new client-java and that drew my attention, as the text format export is now slower than in simpleclient.

I have attached the flamegraph I captured on both versions.

flamegraph-v141.html
flamegraph-v1310.html</issue_description>

Comments on the Issue (you are @copilot in this section)

@zeitlinger Thanks for the report

I can confirm that the existing benchmark shows the same - we just have not integrated that into our release process (yet).

## Report details
- **Date generated:**                              2025-11-04 08:16:07

## Hardware Information:
- **Hardware Model:**                              Micro-Star International Co., Ltd. MS-7D76
- **Memory:**                                      96.0 GiB
- **Processor:**                                   AMD Ryzen™ 9 7900 × 24

## Software Information:
- **Firmware Version:**                            A.N1
- **OS Name:**                                     Ubuntu 24.04.3 LTS
- **OS Build:**                                    (null)
- **OS Type:**                                     64-bit
- **GNOME Version:**                               46
- **Windowing System:**                            X11
- **Kernel Version:**                              Linux 6.14.0-114034-tuxedo

tooling

temurin-25.0.1+8.0.LTS

main

Benchmark                                             Mode  Cnt       Score       Error  Units
TextFormatUtilBenchmark.openMetricsWriteToByteArray  thrpt   25  489698.973 ± 16399.308  ops/s
TextFormatUtilBenchmark.openMetricsWriteToNull       thrpt   25  507779.365 ±  2619.768  ops/s
TextFormatUtilBenchmark.prometheusWriteToByteArray   thrpt   25  534028.708 ±  5998.689  ops/s
TextFormatUtilBenchmark.prometheusWriteToNull        thrpt   25  522323.579 ± 18123.729  ops/s

1.3.10
Benchmark                                             Mode  Cnt       Score       Error  Units
TextFormatUtilBenchmark.openMetricsWriteToByteArray  thrpt   25  934183.222 ± 10919.023  ops/s
TextFormatUtilBenchmark.openMetricsWriteToNull       thrpt   25  936023.986 ± 10402.193  ops/s
TextFormatUtilBenchmark.prometheusWriteToByteArray   thrpt   25  958813.578 ± 17958.230  ops/s
TextFormatUtilBenchmark.prometheusWriteToNull        thrpt   25  965133.616 ± 10907.457  ops/s

</comment_new>

Fixes Performance regression in text-format export on 1.4.0+ #1660

Original prompt

This section details on the original issue you should resolve

<issue_title>Performance regression in text-format export on 1.4.0+</issue_title>
<issue_description>It seems like the adding of support for UTF-8 characters has considerably increased processing time in the io.prometheus.metrics.expositionformats.PrometheusTextFormatWriter.writeNameAndLabels method due to it now calling io.prometheus.metrics.model.snapshots.PrometheusNaming.isValidLegacyMetricName.

I ran a local test with 1000 metrics exported. Average time per export on version 1.3.10 was 0.9-1.1 ms, while on version 1.4.1 I got an average of 2.7-2.9 ms.

I'm not sure which avenues make sense to avoid this regression and I'm not sure how relevant it is. I just found this while evaluating a move from the old simpleclient to the new client-java and that drew my attention, as the text format export is now slower than in simpleclient.

I have attached the flamegraph I captured on both versions.

flamegraph-v141.html
flamegraph-v1310.html</issue_description>

Comments on the Issue (you are @copilot in this section)
@zeitlinger Thanks for the report
I can confirm that the existing benchmark shows the same - we just have not integrated that into our release process (yet).
## Report details
- **Date generated:**                              2025-11-04 08:16:07

## Hardware Information:
- **Hardware Model:**                              Micro-Star International Co., Ltd. MS-7D76
- **Memory:**                                      96.0 GiB
- **Processor:**                                   AMD Ryzen™ 9 7900 × 24

## Software Information:
- **Firmware Version:**                            A.N1
- **OS Name:**                                     Ubuntu 24.04.3 LTS
- **OS Build:**                                    (null)
- **OS Type:**                                     64-bit
- **GNOME Version:**                               46
- **Windowing System:**                            X11
- **Kernel Version:**                              Linux 6.14.0-114034-tuxedo

tooling

temurin-25.0.1+8.0.LTS

main

Benchmark                                             Mode  Cnt       Score       Error  Units
TextFormatUtilBenchmark.openMetricsWriteToByteArray  thrpt   25  489698.973 ± 16399.308  ops/s
TextFormatUtilBenchmark.openMetricsWriteToNull       thrpt   25  507779.365 ±  2619.768  ops/s
TextFormatUtilBenchmark.prometheusWriteToByteArray   thrpt   25  534028.708 ±  5998.689  ops/s
TextFormatUtilBenchmark.prometheusWriteToNull        thrpt   25  522323.579 ± 18123.729  ops/s

1.3.10
Benchmark                                             Mode  Cnt       Score       Error  Units
TextFormatUtilBenchmark.openMetricsWriteToByteArray  thrpt   25  934183.222 ± 10919.023  ops/s
TextFormatUtilBenchmark.openMetricsWriteToNull       thrpt   25  936023.986 ± 10402.193  ops/s
TextFormatUtilBenchmark.prometheusWriteToByteArray   thrpt   25  958813.578 ± 17958.230  ops/s
TextFormatUtilBenchmark.prometheusWriteToNull        thrpt   25  965133.616 ± 10907.457  ops/s

Fixes Performance regression in text-format export on 1.4.0+ #1660

💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.

zeitlinger · 2025-11-04T09:38:08Z

@copilot the flamegraph suggests that calls to io.prometheus.metrics.model.snapshots.PrometheusNaming#METRIC_NAME_PATTERN are slow - either make the regex faster or cache the result (if the regex matches)

…aracter-by-character validation instead of regex Co-authored-by: zeitlinger <2832627+zeitlinger@users.noreply.github.com>

Co-authored-by: zeitlinger <2832627+zeitlinger@users.noreply.github.com>

Copilot · 2025-11-04T10:07:16Z

@copilot the flamegraph suggests that calls to io.prometheus.metrics.model.snapshots.PrometheusNaming#METRIC_NAME_PATTERN are slow - either make the regex faster or cache the result (if the regex matc...

I've already addressed this in commit 885dcab. Instead of making the regex faster or caching results, I eliminated the regex entirely by replacing METRIC_NAME_PATTERN.matcher(name).matches() with direct character-by-character validation. This approach is faster than both regex and caching since it's O(n) with minimal overhead and no memory footprint for caching.

Signed-off-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com>

zeitlinger · 2025-11-04T11:49:55Z

@copilot also replace UNIT_NAME_PATTERN with manual check for consistency

Co-authored-by: zeitlinger <2832627+zeitlinger@users.noreply.github.com>

Copilot · 2025-11-04T11:57:21Z

@copilot also replace UNIT_NAME_PATTERN with manual check for consistency

Done in commit a48d200. Replaced UNIT_NAME_PATTERN.matcher(name).matches() with character-by-character validation in validateUnitName() for consistency with the other optimizations.

Signed-off-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com>

zeitlinger · 2025-11-04T12:26:41Z

Blocked by #1663

@zeitlinger

…rometheus#1662) Regex validation in `isValidLegacyLabelName()`, `isValidLegacyMetricName()`, and `validateUnitName()` was being called on every metric name during text format export, causing significant overhead. ## Changes - Replace regex pattern matching with character-by-character validation in `isValidLegacyLabelName()`, `isValidLegacyMetricName()`, and `validateUnitName()` - Deprecate unused `METRIC_NAME_PATTERN`, `LEGACY_LABEL_NAME_PATTERN`, and `UNIT_NAME_PATTERN` fields (kept for API compatibility) - Update JavaDoc to reflect validation approach ## Implementation Before: ```java public static boolean isValidLegacyLabelName(String name) { return LEGACY_LABEL_NAME_PATTERN.matcher(name).matches(); } ``` After: ```java public static boolean isValidLegacyLabelName(String name) { if (name.isEmpty()) return false; char first = name.charAt(0); if (!((first >= 'a' && first <= 'z') || (first >= 'A' && first <= 'Z') || first == '_')) { return false; } for (int i = 1; i < name.length(); i++) { char c = name.charAt(i); if (!((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9') || c == '_')) { return false; } } return true; } ``` Benchmark results show recovery to near-baseline performance (532k ops/s vs 534k ops/s on main). > [!WARNING] > <details> <summary>Original prompt</summary> > > ---- > > *This section details on the original issue you should resolve* > > <__filter_complete__></__filter_complete__></details><issue_title>Performance regression in text-format export on 1.4.0+</issue_title> ><issue_description>It seems like the adding of support for UTF-8 characters has considerably increased processing time in the `io.prometheus.metrics.expositionformats.PrometheusTextFormatWriter.writeNameAndLabels` method due to it now calling `io.prometheus.metrics.model.snapshots.PrometheusNaming.isValidLegacyMetricName`. > > I ran a local test with 1000 metrics exported. Average time per export on version 1.3.10 was 0.9-1.1 ms, while on version 1.4.1 I got an average of 2.7-2.9 ms. > > I'm not sure which avenues make sense to avoid this regression and I'm not sure how relevant it is. I just found this while evaluating a move from the old `simpleclient` to the new `client-java` and that drew my attention, as the text format export is now slower than in `simpleclient`. > > I have attached the flamegraph I captured on both versions. > ><a href="https://github.com/user-attachments/files/23315564/flamegraph-v141.html">flamegraph-v141.html</a> > <a href="https://github.com/user-attachments/files/23315565/flamegraph-v1310.html">flamegraph-v1310.html</a></issue_description> > > ## Comments on the Issue (you are @copilot in this section) > ><comments> ><comment_new><author>@zeitlinger</author> > Thanks for the report > > I can confirm that the existing benchmark shows the same - we just have not integrated that into our release process (yet). > > ``` > ## Report details > - **Date generated:** 2025-11-04 08:16:07 > > ## Hardware Information: > - **Hardware Model:** Micro-Star International Co., Ltd. MS-7D76 > - **Memory:** 96.0 GiB > - **Processor:** AMD Ryzen™ 9 7900 × 24 > > ## Software Information: > - **Firmware Version:** A.N1 > - **OS Name:** Ubuntu 24.04.3 LTS > - **OS Build:** (null) > - **OS Type:** 64-bit > - **GNOME Version:** 46 > - **Windowing System:** X11 > - **Kernel Version:** Linux 6.14.0-114034-tuxedo > > tooling > > temurin-25.0.1+8.0.LTS > > main > > Benchmark Mode Cnt Score Error Units > TextFormatUtilBenchmark.openMetricsWriteToByteArray thrpt 25 489698.973 ± 16399.308 ops/s > TextFormatUtilBenchmark.openMetricsWriteToNull thrpt 25 507779.365 ± 2619.768 ops/s > TextFormatUtilBenchmark.prometheusWriteToByteArray thrpt 25 534028.708 ± 5998.689 ops/s > TextFormatUtilBenchmark.prometheusWriteToNull thrpt 25 522323.579 ± 18123.729 ops/s > > 1.3.10 > Benchmark Mode Cnt Score Error Units > TextFormatUtilBenchmark.openMetricsWriteToByteArray thrpt 25 934183.222 ± 10919.023 ops/s > TextFormatUtilBenchmark.openMetricsWriteToNull thrpt 25 936023.986 ± 10402.193 ops/s > TextFormatUtilBenchmark.prometheusWriteToByteArray thrpt 25 958813.578 ± 17958.230 ops/s > TextFormatUtilBenchmark.prometheusWriteToNull thrpt 25 965133.616 ± 10907.457 ops/s > ``` > </comment_new> ></comments> > - Fixes prometheus#1660  <details> <summary>Original prompt</summary> > > ---- > > *This section details on the original issue you should resolve* > > <issue_title>Performance regression in text-format export on 1.4.0+</issue_title> > <issue_description>It seems like the adding of support for UTF-8 characters has considerably increased processing time in the `io.prometheus.metrics.expositionformats.PrometheusTextFormatWriter.writeNameAndLabels` method due to it now calling `io.prometheus.metrics.model.snapshots.PrometheusNaming.isValidLegacyMetricName`. > > I ran a local test with 1000 metrics exported. Average time per export on version 1.3.10 was 0.9-1.1 ms, while on version 1.4.1 I got an average of 2.7-2.9 ms. > > I'm not sure which avenues make sense to avoid this regression and I'm not sure how relevant it is. I just found this while evaluating a move from the old `simpleclient` to the new `client-java` and that drew my attention, as the text format export is now slower than in `simpleclient`. > > I have attached the flamegraph I captured on both versions. > > [flamegraph-v141.html](https://github.com/user-attachments/files/23315564/flamegraph-v141.html) > [flamegraph-v1310.html](https://github.com/user-attachments/files/23315565/flamegraph-v1310.html)</issue_description> > > ## Comments on the Issue (you are @copilot in this section) > > <comments> > <comment_new><author>@zeitlinger</author><body> > Thanks for the report > > I can confirm that the existing benchmark shows the same - we just have not integrated that into our release process (yet). > > ``` > ## Report details > - **Date generated:** 2025-11-04 08:16:07 > > ## Hardware Information: > - **Hardware Model:** Micro-Star International Co., Ltd. MS-7D76 > - **Memory:** 96.0 GiB > - **Processor:** AMD Ryzen™ 9 7900 × 24 > > ## Software Information: > - **Firmware Version:** A.N1 > - **OS Name:** Ubuntu 24.04.3 LTS > - **OS Build:** (null) > - **OS Type:** 64-bit > - **GNOME Version:** 46 > - **Windowing System:** X11 > - **Kernel Version:** Linux 6.14.0-114034-tuxedo > > tooling > > temurin-25.0.1+8.0.LTS > > main > > Benchmark Mode Cnt Score Error Units > TextFormatUtilBenchmark.openMetricsWriteToByteArray thrpt 25 489698.973 ± 16399.308 ops/s > TextFormatUtilBenchmark.openMetricsWriteToNull thrpt 25 507779.365 ± 2619.768 ops/s > TextFormatUtilBenchmark.prometheusWriteToByteArray thrpt 25 534028.708 ± 5998.689 ops/s > TextFormatUtilBenchmark.prometheusWriteToNull thrpt 25 522323.579 ± 18123.729 ops/s > > 1.3.10 > Benchmark Mode Cnt Score Error Units > TextFormatUtilBenchmark.openMetricsWriteToByteArray thrpt 25 934183.222 ± 10919.023 ops/s > TextFormatUtilBenchmark.openMetricsWriteToNull thrpt 25 936023.986 ± 10402.193 ops/s > TextFormatUtilBenchmark.prometheusWriteToByteArray thrpt 25 958813.578 ± 17958.230 ops/s > TextFormatUtilBenchmark.prometheusWriteToNull thrpt 25 965133.616 ± 10907.457 ops/s > ``` > </body></comment_new> > </comments> > </details> - Fixes prometheus#1660  --- 💬 We'd love your input! Share your thoughts on Copilot coding agent in our [2 minute survey](https://gh.io/copilot-coding-agent-survey). --------- Signed-off-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com> Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: zeitlinger <2832627+zeitlinger@users.noreply.github.com> Co-authored-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com> Signed-off-by: huan.huynh <1612858@student.hcmus.edu.vn>

Initial plan

c0482d2

Copilot AI assigned Copilot and zeitlinger Nov 4, 2025

Copilot started work on behalf of zeitlinger November 4, 2025 09:30 View session

Copilot AI and others added 3 commits November 4, 2025 09:49

Optimize isValidLegacyLabelName and isValidLegacyMetricName to use ch…

885dcab

…aracter-by-character validation instead of regex Co-authored-by: zeitlinger <2832627+zeitlinger@users.noreply.github.com>

Address code review: improve formatting consistency

2f8aa42

Co-authored-by: zeitlinger <2832627+zeitlinger@users.noreply.github.com>

Update JavaDoc to reflect current validation approach

5266417

Co-authored-by: zeitlinger <2832627+zeitlinger@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Fix performance regression in text-format export for 1.4.0+~~ Optimize legacy metric name validation to fix 2-3x performance regression Nov 4, 2025

Copilot AI requested a review from zeitlinger November 4, 2025 10:04

Copilot finished work on behalf of zeitlinger November 4, 2025 10:04

Copilot started work on behalf of zeitlinger November 4, 2025 10:04 View session

Copilot finished work on behalf of zeitlinger November 4, 2025 10:08

format

490aa6d

Signed-off-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com>

zeitlinger mentioned this pull request Nov 4, 2025

Performance regression in text-format export on 1.4.0+ #1660

Closed

Copilot started work on behalf of zeitlinger November 4, 2025 11:50 View session

Replace UNIT_NAME_PATTERN with manual check for consistency

a48d200

Co-authored-by: zeitlinger <2832627+zeitlinger@users.noreply.github.com>

Copilot AI changed the title ~~Optimize legacy metric name validation to fix 2-3x performance regression~~ Optimize metric name validation to fix 2-3x performance regression Nov 4, 2025

Copilot finished work on behalf of zeitlinger November 4, 2025 11:58

format

5d52ebf

Signed-off-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com>

zeitlinger requested a review from jaydeluca November 4, 2025 12:26

zeitlinger approved these changes Nov 4, 2025

View reviewed changes

Merge branch 'main' into copilot/fix-text-format-export-performance

012eb8c

jaydeluca approved these changes Nov 4, 2025

View reviewed changes

zeitlinger marked this pull request as ready for review November 4, 2025 13:15

zeitlinger merged commit c9cfa60 into main Nov 4, 2025
35 checks passed

zeitlinger deleted the copilot/fix-text-format-export-performance branch November 4, 2025 13:15

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize metric name validation to fix 2-3x performance regression #1662

Optimize metric name validation to fix 2-3x performance regression #1662

Uh oh!

Copilot AI commented Nov 4, 2025 •

edited

Loading

Uh oh!

zeitlinger commented Nov 4, 2025

Uh oh!

Copilot AI commented Nov 4, 2025

Uh oh!

zeitlinger commented Nov 4, 2025

Uh oh!

Copilot AI commented Nov 4, 2025

Uh oh!

zeitlinger commented Nov 4, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Optimize metric name validation to fix 2-3x performance regression #1662

Optimize metric name validation to fix 2-3x performance regression #1662

Uh oh!

Conversation

Copilot AI commented Nov 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changes

Implementation

Comments on the Issue (you are @copilot in this section)

Comments on the Issue (you are @copilot in this section)

Uh oh!

zeitlinger commented Nov 4, 2025

Uh oh!

Copilot AI commented Nov 4, 2025

Uh oh!

zeitlinger commented Nov 4, 2025

Uh oh!

Copilot AI commented Nov 4, 2025

Uh oh!

zeitlinger commented Nov 4, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Copilot AI commented Nov 4, 2025 •

edited

Loading