How to Check if a Python String Contains a Substring

The post content below is copied from Real Python.


To check if a string contains another string in Python, use the in membership operator. This is the recommended method for confirming the presence of a substring within a string. The in operator is intuitive and readable, making it a straightforward way to evaluate substring existence.

Additionally, you can use string methods like .count() and .index() to gather more detailed information about substrings, such as their frequency and position. For more complex substring searches, use regular expressions with the re module. When you’re dealing with tabular data, pandas provides efficient tools for searching for substrings within DataFrame columns.

By the end of this tutorial, you’ll understand that:

  • The in membership operator is the recommended way to check if a Python string contains a substring.
  • Converting input text to lowercase generalizes substring checks by removing case sensitivity.
  • The .count() method counts occurrences of a substring, while .index() finds the first occurrence’s position.
  • Regular expressions in the re module allow for advanced substring searches based on complex conditions.
  • The .str.contains() method in pandas identifies which DataFrame entries contain a specific substring.

With these methods and tools, you can check for substrings in Python strings in situations that range from a quick membership check to searches across tabular data.
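As a rough preview of the string methods and the re module mentioned above, here is a minimal sketch. The sample text and variable names are made up for illustration and aren't part of the original tutorial:

```python
import re

# Hypothetical sample text, chosen only to illustrate the calls below.
text = "The secret is out. Secretly, I kept a SECRET."

# .count() tallies non-overlapping, case-sensitive occurrences.
lowercase_hits = text.count("secret")

# .index() returns the position of the first occurrence
# and raises ValueError if the substring is absent.
first_position = text.index("secret")

# A regular expression can add conditions, such as matching
# only the whole word and ignoring capitalization.
whole_words = re.findall(r"\bsecret\b", text, flags=re.IGNORECASE)

print(lowercase_hits)  # 1
print(first_position)  # 4
print(whole_words)  # ['secret', 'SECRET']
```

Note that the whole-word pattern skips "Secretly" because the word boundary \b doesn't match between "t" and "l".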

How to Confirm That a Python String Contains Another String

If you need to check whether a string contains a substring, use Python’s membership operator, in. This is the recommended way to confirm that a substring exists in a string:

Python
>>> raw_file_content = """Hi there and welcome.
... This is a special hidden file with a SECRET secret.
... I don't want to tell you The Secret,
... but I do want to secretly tell you that I have one."""

>>> "secret" in raw_file_content
True

The in membership operator gives you a quick and readable way to check whether a substring is present in a string. You may notice that the line of code almost reads like English.

When you use in, the expression returns a Boolean value:

  • True if Python found the substring
  • False if Python didn’t find the substring
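To make the Boolean result concrete, the short sketch below stores the outcome of a membership check in a variable and also shows the negated form, not in. The greeting string and variable names are illustrative assumptions:

```python
# Hypothetical sample string for illustration.
greeting = "Hi there and welcome."

found = "welcome" in greeting
missing = "goodbye" in greeting

# The result is an ordinary bool that you can store and reuse.
print(type(found).__name__, found)  # bool True
print(missing)  # False

# The negated form reads just as naturally.
absent = "goodbye" not in greeting
print(absent)  # True
```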

You can use this intuitive syntax in conditional statements to make decisions in your code:

Python
>>> if "secret" in raw_file_content:
...     print("Found!")
...
Found!

In this code snippet, you use the membership operator to check whether "secret" is a substring of raw_file_content. If it is, then you’ll print a message to the terminal. Any indented code will only execute if the Python string that you’re checking contains the substring that you provide.
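If you need the same decision in several places, one option is to wrap the conditional in a small function. The helper below, describe_secret_status(), is a hypothetical name used only for this sketch, and it adds a fallback branch for the case where the substring is missing:

```python
def describe_secret_status(text: str) -> str:
    """Return a message saying whether "secret" appears in text."""
    # Hypothetical helper: the membership check drives the branch.
    if "secret" in text:
        return "Found!"
    return "Not found."


print(describe_secret_status("I have a secret."))  # Found!
print(describe_secret_status("Nothing to see here."))  # Not found.
```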

The membership operator in is your best friend if you just need to check whether a Python string contains a substring.

However, what if you want to know more about the substring? If you read through the text stored in raw_file_content, then you’ll notice that the substring occurs more than once, and even in different variations!

Which of these occurrences did Python find? Does capitalization make a difference? How often does the substring show up in the text? And what’s the location of these substrings? If you need the answer to any of these questions, then keep on reading.

Generalize Your Check by Removing Case Sensitivity

Python strings are case sensitive. If the substring that you provide uses different capitalization than the same word in your text, then Python won’t find it. For example, if you check for the lowercase word "secret" on a title-case version of the original text, the membership operator check returns False:

Python
>>> title_cased_file_content = """Hi There And Welcome.
... This Is A Special Hidden File With A Secret Secret.
... I Don't Want To Tell You The Secret,
... But I Do Want To Secretly Tell You That I Have One."""

>>> "secret" in title_cased_file_content
False

Despite the fact that the word secret appears multiple times in the title-case text title_cased_file_content, it never shows up in all lowercase. That’s why the check that you perform with the membership operator returns False. Python can’t find the all-lowercase string "secret" in the provided text.

Humans have a different approach to language than computers do. This is why you’ll often want to disregard capitalization when you check whether a string contains a substring in Python.
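One common way to disregard capitalization, sketched here under the assumption that plain lowercasing is enough for your text, is to convert the text with .lower() before running the membership check. The variable name file_content is chosen for this example:

```python
title_cased_file_content = """Hi There And Welcome.
This Is A Special Hidden File With A Secret Secret.
I Don't Want To Tell You The Secret,
But I Do Want To Secretly Tell You That I Have One."""

# .lower() returns a new string; the original stays unchanged.
file_content = title_cased_file_content.lower()

print("secret" in title_cased_file_content)  # False
print("secret" in file_content)  # True
```

If the substring itself may contain uppercase letters, lowercase it as well so that both sides of the comparison use the same casing.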

Read the full article at https://realpython.com/python-string-contains-substring/




December 01, 2024 at 07:30PM