expr: Escape anchor characters within pattern #7842
Merged
sylvestre merged 16 commits into uutils:main on Apr 27, 2025
The anchor characters `^` and `$` are not treated as special characters by `expr` unless they appear where an anchor is expected, at the start or end of the pattern.
It is not escaped by GNU `expr` either
sylvestre reviewed Apr 25, 2025
The `rust-onig` bug still exists, but caret parsing is now fixed within uutils' `expr`.
Pull Request Overview
This PR improves the character escaping logic in regexp pattern definitions and reactivates/upgrades tests to verify the new behavior.
- Core pattern extraction now removes any extra leading '^' and escapes internal '^' characters.
- Tests have been updated to cover various scenarios including patterns starting with '^', '*' and multiple '^' occurrences.
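
The escaping rule summarized above can be illustrated with a minimal sketch. This is not the code from `syntax_tree.rs`: the function name `build_anchored_pattern` is hypothetical, and the sketch assumes only what the summary states, namely that a leading `^` is folded into the implicit start anchor and that any later unescaped `^` is treated as a literal. It deliberately ignores `$`, bracket expressions such as `[^a]`, anchors that follow a group opener, and escaped backslashes.

```rust
/// Hypothetical helper (not the function in `syntax_tree.rs`): builds an
/// anchored regex string from a user-supplied `expr` pattern.
fn build_anchored_pattern(input: &str) -> String {
    // `expr` matches are implicitly anchored at the start of the string.
    let mut out = String::with_capacity(input.len() + 1);
    out.push('^');

    let mut chars = input.chars();
    let first = chars.next();
    match first {
        // A leading caret is the anchor we already emitted, so drop it.
        Some('^') | None => {}
        Some(c) => out.push(c),
    }

    // Any later caret is a literal unless the user already escaped it.
    let mut prev = first.unwrap_or_default();
    for curr in chars {
        if curr == '^' && prev != '\\' {
            out.push_str(r"\^");
        } else {
            out.push(curr);
        }
        prev = curr;
    }
    out
}
```

Under these assumptions, the test pattern `^^abc` would become `^\^abc`: the first caret is the anchor and the second is matched literally.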
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| tests/by-util/test_expr.rs | Updated tests to reflect the new escaping behavior for patterns. |
| src/uu/expr/src/syntax_tree.rs | Reworked regex string building logic to correctly escape internal '^' characters. |
Comments suppressed due to low confidence (2)
src/uu/expr/src/syntax_tree.rs:170
- [nitpick] Consider renaming the variable `prev` to a more descriptive name like `previous_char` to improve code clarity.
  `let mut prev = first.unwrap_or_default();`
tests/by-util/test_expr.rs:293
- [nitpick] Consider adding an inline comment that explains the expected output for patterns starting with `^` to clarify the transformation rules for future maintainers.
  `new_ucmd!().args(&["^abc", ":", "^^abc"])`
and it passes:
nickorlow pushed a commit to nickorlow/coreutils that referenced this pull request Jul 17, 2025
* expr: Escape anchor characters within the core pattern
  The anchor characters `^` and `$` are not treated as special characters by `expr` unless they appear where an anchor is expected, at the start or end of the pattern.
Escape the regexp start-of-string anchor character `^` within pattern definitions. Keep the start anchor unescaped only when it is at the beginning of the user's input pattern.
Changes
Testing
fixes #7663