Python - Regex Lookbehind

In regex, a lookbehind lets you match a string only if it is immediately preceded by another pattern. Lookbehinds are zero-width assertions: they check the text to the left of the current position without consuming any characters, so the preceding text is not included in the match.

In Python's re module, lookbehinds are defined using the following syntax:

(?<=...)

Here's a breakdown:

  • ?<=: Indicates the beginning of a positive lookbehind.
  • ...: The pattern you want to assert as a lookbehind.
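As a minimal sketch of the syntax above, the snippet below pulls out a number only when a dollar sign comes right before it. The sample text and variable names are made up for illustration. The span shows that the "$" is checked but not included in the match.

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = "Total: $42, item code 17"

# (?<=\$) asserts that a literal "$" sits immediately before the digits,
# but the "$" itself is not part of the match.
match = re.search(r'(?<=\$)\d+', text)
amount = match.group()   # the matched digits only
span = match.span()      # starts right after the "$"

print(amount, span)
```

The number 17 is skipped because it is preceded by a space rather than "$".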

Here are some examples to illustrate how lookbehinds work:

Example 1: Positive Lookbehind

Suppose you want to match the word "apple" only if it's preceded by the word "red" followed by a single whitespace character.

import re

text = "red apple is better than green apple."
matches = re.findall(r'(?<=red\s)apple', text)
print(matches)  # ['apple'] (only the "apple" that follows "red ")

Example 2: Negative Lookbehind

You can also use negative lookbehinds to match a string only if it is NOT preceded by another string. The syntax for negative lookbehind is:

(?<!...)

Let's modify the previous example to match the word "apple" only if it's NOT preceded by the word "red":

import re

text = "red apple is better than green apple."
matches = re.findall(r'(?<!red\s)apple', text)
print(matches)  # ['apple'] (only the "apple" that follows "green ")
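Both examples print the same list, so it can be hard to see which "apple" each pattern actually found. One way to check, sketched here with re.finditer on the same text, is to compare the start offsets of the matches:

```python
import re

text = "red apple is better than green apple."

# Start offsets of "apple" when it IS preceded by "red" plus whitespace.
positive_starts = [m.start() for m in re.finditer(r'(?<=red\s)apple', text)]

# Start offsets of "apple" when it is NOT preceded by "red" plus whitespace.
negative_starts = [m.start() for m in re.finditer(r'(?<!red\s)apple', text)]

print(positive_starts)  # offset of the first "apple"
print(negative_starts)  # offset of the second "apple"
```

The two lists do not overlap: the positive lookbehind selects the first "apple" and the negative lookbehind selects the second.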

Limitation

Python's re module has a limitation when it comes to lookbehinds: the lookbehind pattern must match a fixed number of characters. This rules out quantifiers such as *, +, ?, or {m,n} with m different from n, as well as alternations whose branches differ in length; a fixed count such as {3} is allowed. Compiling a variable-width lookbehind raises re.error. If you need variable-length lookbehinds, consider the third-party regex module, which supports them, or reformulate the pattern, for example by matching the prefix normally and capturing only the part you need.
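The following sketch illustrates the limitation and two possible ways around it while staying within the re module. The extended sample sentence and the variable names are assumptions made for this example, and other reformulations may suit your pattern better.

```python
import re

# Sample sentence extended with a third fruit for illustration.
text = "red apple is better than green apple and one yellow apple."

# "red\s" is 4 characters wide and "green\s" is 6, so a single lookbehind
# containing both alternatives is rejected by re at compile time.
try:
    re.compile(r'(?<=red\s|green\s)apple')
    compiled = True
except re.error:
    compiled = False

# Workaround 1: give each alternative its own fixed-width lookbehind.
split_pattern = re.compile(r'(?:(?<=red\s)|(?<=green\s))apple')
starts = [m.start() for m in split_pattern.finditer(text)]

# Workaround 2: match the prefix normally and capture only the wanted part.
group_pattern = re.compile(r'(?:red|green)\s(apple)')
captured = group_pattern.findall(text)

print(compiled, starts, captured)
```

The first workaround keeps the match zero-width on the left, so match offsets still point at "apple". The second is simpler to read but consumes the prefix, which matters if matches could overlap.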

