{article Dive into Python}{title} {text} {/article}

Checking for Thousands

Python 2.7.6 (default, Nov 10 2013, 19:24:24) [MSC v.1500 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> import re
>>> pattern = '^M?M?M?$'
>>> re.search(pattern, 'M')
<_sre.SRE_Match object at 0x0000000002B7C168>
>>> print pattern
^M?M?M?$
>>> re.search(pattern, 'MM')
<_sre.SRE_Match object at 0x0000000002B7C238>
>>> print pattern
^M?M?M?$
>>> re.search(pattern, 'MMM')
<_sre.SRE_Match object at 0x0000000002B7C168>
>>> re.search(pattern, 'MMMM')
>>> re.search(pattern, '')
<_sre.SRE_Match object at 0x0000000002B7C238>
>>>

This pattern has three parts:

  • ^ to match what follows only at the beginning of the string. If this were not specified, the pattern would match no matter where the M characters were, which is not what you want. You want to make sure that the M characters, if they're there, are at the beginning of the string.
  • M? to optionally match a single M character. Since this is repeated three times, you're matching anywhere from zero to three M characters in a row.
  • $ to match what precedes only at the end of the string. When combined with the ^ character at the beginning, this means that the pattern must match the entire string, with no other characters before or after the M characters.
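To see what each anchor contributes, here is a minimal sketch that compares the pattern from the session with two variants that each drop one anchor. The variable names and the two variant patterns are illustrative assumptions, not part of the original example.

```python
import re

anchored = '^M?M?M?$'   # the pattern from the session above
no_start = 'M?M?M?$'    # hypothetical variant without ^
no_end = '^M?M?M?'      # hypothetical variant without $

# With both anchors, the whole string must be zero to three M characters,
# so a leading X prevents a match.
anchored_hit = re.search(anchored, 'XMMM') is not None

# Without ^, the M characters may start anywhere, as long as they run to the end.
no_start_hit = re.search(no_start, 'XMMM') is not None

# Without $, anything may follow the first three M characters.
no_end_hit = re.search(no_end, 'MMMM') is not None
```

Under these assumptions only the fully anchored pattern rejects its input; each of the other two finds a match.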

The essence of the re module is the search function, which takes a regular expression (pattern) and a string ('M') to try to match against the regular expression. If a match is found, search returns an object which has various methods to describe the match; if no match is found, search returns None, the Python null value. All you care about at the moment is whether the pattern matches, which you can tell by just looking at the return value of search.

'M' matches this regular expression, because the first optional M matches and the second and third optional M characters are ignored.
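The check on the return value of search can be sketched as follows. This is a minimal illustration rather than part of the original session; the names match, matched_text, and no_match are assumptions made for the example, and the code is written to run under both Python 2 and Python 3.

```python
import re

pattern = '^M?M?M?$'

# search returns a match object on success and None on failure,
# so comparing against None tells you whether the pattern matched.
match = re.search(pattern, 'M')
if match is not None:
    matched_text = match.group()   # the text the pattern matched
else:
    matched_text = None

# A fourth M makes the match fail, so search returns None here.
no_match = re.search(pattern, 'MMMM')
```

In the interactive shell a None result prints nothing, which is why the 'MMMM' call in the session shows no output line.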

'MM' matches because the first and second optional M characters match and the third M is ignored.

'MMM' matches because all three M characters match.

'MMMM' does not match. All three M characters match, but then the regular expression insists on the string ending (because of the $ character), and the string doesn't end yet (because of the fourth M). So search returns None.

Interestingly, an empty string also matches this regular expression, since all the M characters are optional.
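The five cases walked through above can be collected into one small check. The helper name is_valid_thousands and the results dictionary are hypothetical, introduced only for this sketch.

```python
import re

pattern = '^M?M?M?$'

def is_valid_thousands(s):
    """Hypothetical helper: True if s consists of zero to three M characters."""
    return re.search(pattern, s) is not None

# Run the same strings that were tried in the interactive session.
candidates = ['', 'M', 'MM', 'MMM', 'MMMM']
results = dict((s, is_valid_thousands(s)) for s in candidates)
```

Only 'MMMM' maps to False. The empty string maps to True, so a caller that must reject empty input would need a separate check.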
