py: Implement PEP 750 t-strings using existing f-string parser (WIP) #18650
Draft
dpgeorge wants to merge 20 commits into micropython:master
Codecov Report

✅ All modified and coverable lines are covered by tests.

| Coverage diff | master | #18650 | +/- |
|---|---|---|---|
| Coverage | 98.42% | 98.45% | +0.03% |
| Files | 174 | 175 | +1 |
| Lines | 22333 | 22639 | +306 |
| Hits | 21982 | 22290 | +308 |
| Misses | 351 | 349 | -2 |
Signed-off-by: Koudai Aono <koxudaxi@gmail.com>
Signed-off-by: Damien George <damien@micropython.org>
This now works in MicroPython. Signed-off-by: Damien George <damien@micropython.org>
Now OK in MicroPython. Signed-off-by: Damien George <damien@micropython.org>
Signed-off-by: Damien George <damien@micropython.org>
Not worth supporting. Signed-off-by: Damien George <damien@micropython.org>
Not worth supporting. Signed-off-by: Damien George <damien@micropython.org>
Reusing the existing f-string parser in the lexer. Signed-off-by: Damien George <damien@micropython.org>
Should be enough for full coverage testing of `__template__()`. Signed-off-by: Damien George <damien@micropython.org>
All the limitation tests no longer apply to the new parser. All that's left here are whitespace and unicode tests for the lexer, which match CPython. They should go in tests/basics/... Signed-off-by: Damien George <damien@micropython.org>
Signed-off-by: Damien George <damien@micropython.org>
Signed-off-by: Koudai Aono <koxudaxi@gmail.com>
Summary
This is an alternative to #17557 that aims to implement t-strings more efficiently (with smaller code size) by leveraging the existing f-string parser in the lexer. It includes:
- changes to `py/lexer.c`
- a `__template__()` function to construct t-string objects
- `Template` and `Interpolation` classes which implement all the functionality from PEP 750
- a `string` module with a `templatelib` sub-module, which contains the classes `Template` and `Interpolation`

This PR is built upon #18588.
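To make the shape of these objects concrete, here is a minimal pure-Python sketch of the data model PEP 750 describes. It is not the code in this PR: `ToyTemplate`, `ToyInterpolation` and `render` are hypothetical stand-ins written for illustration, and they cover only the `strings`, `interpolations` and `values` attributes plus f-string-like rendering.

```python
from collections import namedtuple

# Hypothetical stand-in for PEP 750's Interpolation: the evaluated value,
# the source text of the expression, an optional conversion ("r", "s", "a"
# or None) and the format spec.
ToyInterpolation = namedtuple(
    "ToyInterpolation", "value expression conversion format_spec"
)


class ToyTemplate:
    """Hypothetical stand-in for PEP 750's Template (illustration only)."""

    def __init__(self, *parts):
        strings = [""]
        interpolations = []
        for part in parts:
            if isinstance(part, str):
                # Adjacent static strings are merged into one segment.
                strings[-1] += part
            else:
                interpolations.append(part)
                strings.append("")
        # Invariant from PEP 750: len(strings) == len(interpolations) + 1.
        self.strings = tuple(strings)
        self.interpolations = tuple(interpolations)

    @property
    def values(self):
        return tuple(i.value for i in self.interpolations)


def render(template):
    """Render a template the way the equivalent f-string would."""
    out = []
    for index, static in enumerate(template.strings):
        out.append(static)
        if index < len(template.interpolations):
            interp = template.interpolations[index]
            value = interp.value
            if interp.conversion == "r":
                value = repr(value)
            elif interp.conversion == "s":
                value = str(value)
            elif interp.conversion == "a":
                value = ascii(value)
            out.append(format(value, interp.format_spec))
    return "".join(out)


name = "ab"
t = ToyTemplate("hello ", ToyInterpolation(name, "name", None, "5"))
print(t.strings, t.values, repr(render(t)))
```

Unlike an f-string, the template keeps the static parts and the interpolated values separate, so the consumer decides how (or whether) to combine them; `render` is just one possible consumer.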
The way it works is that an input t-string like `t"hello {name:5}"` is converted character-by-character by the lexer/tokenizer into a call to `__template__()`, which constructs the t-string object. (For reference, if it were an f-string it would be converted to `"hello {:5}".format(name)`.)

Compared to #17557, which costs about +7400 bytes on stm32, this implementation costs +2844 bytes.
This is still a work-in-progress. It implements most of the t-string functionality including nested t-strings and f-strings, but there are a few corner cases yet to tidy up. I don't see any show stoppers though, and code size should hopefully not grow much more either.
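The character-by-character conversion mentioned above can be modelled in a few lines of Python. This is only a toy sketch of the f-string case quoted in the summary, not the logic in `py/lexer.c`: the function names are hypothetical, and it assumes no nested strings, no nested replacement fields and no `!r`-style conversions.

```python
def rewrite_fstring_body(body):
    """Rewrite an f-string body into a (format_string, expressions) pair.

    Toy model: a single left-to-right pass over the characters.
    """
    out = []    # characters of the resulting format string
    args = []   # source text of each replacement-field expression
    i = 0
    n = len(body)
    while i < n:
        ch = body[i]
        if body[i:i + 2] == "{{":
            # Escaped brace: copy through unchanged.
            out.append("{{")
            i += 2
        elif ch == "{":
            # Collect the expression text up to ':' or '}'.
            j = i + 1
            while j < n and body[j] not in ":}":
                j += 1
            args.append(body[i + 1:j])
            out.append("{")
            # Copy the optional format spec (including ':') up to '}'.
            while j < n and body[j] != "}":
                out.append(body[j])
                j += 1
            out.append("}")
            i = j + 1
        else:
            out.append(ch)
            i += 1
    return "".join(out), args


def to_format_call(body):
    """Return source text of the equivalent str.format() call."""
    fmt, args = rewrite_fstring_body(body)
    return '"%s".format(%s)' % (fmt, ", ".join(args))


print(to_format_call("hello {name:5}"))
```

A t-string rewrite would follow the same single pass but emit a `__template__()` call instead of a `.format()` call; the exact argument layout of that call is not reproduced here.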
Testing
All 16 tests from #17557 have been added here. So far 11 of them pass, and 1 is no longer relevant (it tested a runtime overflow limit which is no longer there). That leaves 4 which do not pass yet.
Trade-offs and Alternatives
Being an alternative to #17557, it shows a different way to achieve the same end result. #17557 starts up a new parser instance each time a t-string is encountered and recursively parses the t-string, whereas the implementation here just transforms the input characters. After all, t-strings (and f-strings) are really just syntactic sugar.
This adds code size, but if t-strings are not used then there is very little execution overhead, all of which is confined to the lexer.
The changes to `py/lexer.c` are mildly complex, but not really much more complex than the existing f-string logic. It's just a different way of transforming the input stream.