classification
Title: re.sub does NOT substitute all the matching patterns when re.IGNORECASE is used
Type: behavior
Stage: resolved
Components: Regular Expressions
Versions: Python 3.8

process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: anitrajpurohit28, ezio.melotti, mrabarnett
Priority: normal
Keywords:

Created on 2020-08-29 21:47 by anitrajpurohit28, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (2)
msg376083 - Author: Anit Rajpurohit (anitrajpurohit28) Date: 2020-08-29 21:47
Passing re flags gives inconsistent results between these two cases:
1. The pattern is passed directly to re.sub
2. The pattern is compiled with re.compile and then used

Note 1: The input string is all lowercase: 'all is fair in love and war'
Note 2: Results are always consistent with the compiled pattern
=======================================
1. The pattern is passed directly to re.sub
=======================================
>>> import re
>>> re.sub(r'[aeiou]', '#', 'all is fair in love and war')
'#ll #s f##r #n l#v# #nd w#r'
>>> 
>>> re.sub(r'[aeiou]', '#', 'all is fair in love and war', re.IGNORECASE)
'#ll #s fair in love and war'
>>> 
>>> re.sub(r'[aeiou]', '#', 'all is fair in love and war', re.IGNORECASE|re.DOTALL)
'#ll #s f##r #n l#v# #nd w#r'
>>> 
>>> 
=======================================
2. The pattern is compiled with re.compile and then used
=======================================
>>> pattern = re.compile(r'[aeiou]', re.IGNORECASE)
>>> re.sub(pattern, '#', 'all is fair in love and war')
'#ll #s f##r #n l#v# #nd w#r'
>>> 
>>> pattern = re.compile(r'[aeiou]')
>>> re.sub(pattern, '#', 'all is fair in love and war')
'#ll #s f##r #n l#v# #nd w#r'
>>> 
>>> pattern = re.compile(r'[aeiou]', re.IGNORECASE | re.DOTALL)
>>> re.sub(pattern, '#', 'all is fair in love and war')
'#ll #s f##r #n l#v# #nd w#r'
msg376086 - Author: Matthew Barnett (mrabarnett) (Python triager) Date: 2020-08-29 22:32
The 4th argument of re.sub is 'count', not 'flags'.

re.IGNORECASE has the numeric value of 2, so:

    re.sub(r'[aeiou]', '#', 'all is fair in love and war', re.IGNORECASE)

is equivalent to:

    re.sub(r'[aeiou]', '#', 'all is fair in love and war', count=2)
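
To apply the flag as a flag, it can be passed by keyword. The following is a minimal sketch of the difference, not part of the original report: the variable names are illustrative only, and the mixed-case string 'All Is Fair' is an added example input. It also suggests why the re.IGNORECASE|re.DOTALL call in the report appeared to work: the combined value is 18, which as a count is larger than the number of vowels in the string.

```python
import re

text = 'all is fair in love and war'

# re.IGNORECASE is an integer-valued flag, so re.sub accepts it as a count.
flag_value = int(re.IGNORECASE)                      # 2
combined_value = int(re.IGNORECASE | re.DOTALL)      # 2 | 16 == 18

# What the positional calls in the report effectively do.
limited = re.sub(r'[aeiou]', '#', text, count=flag_value)
unlimited_by_luck = re.sub(r'[aeiou]', '#', text, count=combined_value)

# Passing the flag by keyword applies it as a flag, with no substitution limit.
insensitive = re.sub(r'[aeiou]', '#', text, flags=re.IGNORECASE)

# With mixed-case input the flag makes a visible difference.
mixed = re.sub(r'[aeiou]', '#', 'All Is Fair', flags=re.IGNORECASE)

print(limited)            # '#ll #s fair in love and war'
print(unlimited_by_luck)  # '#ll #s f##r #n l#v# #nd w#r'
print(insensitive)        # '#ll #s f##r #n l#v# #nd w#r'
print(mixed)              # '#ll #s F##r'
```

Compiling the pattern first, as in the second half of the report, avoids the mix-up because the 2nd argument of re.compile is 'flags'.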
History
Date                 User              Action  Args
2022-04-11 14:59:35  admin             set     github: 85830
2020-08-29 22:32:24  mrabarnett        set     status: open -> closed; resolution: not a bug; messages: + msg376086; stage: resolved
2020-08-29 21:47:33  anitrajpurohit28  create