Python String Contains – See if String Contains a Substring

An easy way to check if a string contains a particular phrase is by using an if ... in statement. We can do this as follows:

if 'apples' in 'This string has apples':
    print('Apples in string')
else:
    print('Apples not in string')
Out:
Apples in string

Today we'll take a look at the various options you've got for checking if a string contains a substring. We'll start by exploring the use of if ... in statements, followed by using the find() method. Towards the end, there is also a section on employing regular expressions (regex) with re.search() to search strings.

Option 1: if ... in

The example above demonstrated a quick way to find a substring within another string using an if ... in statement. The in operator evaluates to True if the string does contain what we're looking for and False if not. See below for an extension of the example used previously:

strings = ['This string has apples', 'This string has oranges', 'This string has neither']

for s in strings:
    if 'apples' in s:
        print('Apples in string')
    else:
        print('Apples not in string')
Out:
Apples in string
Apples not in string
Apples not in string

The output displays that our if ... in statement looking for 'apples' only returned True for the first item in strings, which is correct.
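
Because the in test is an ordinary expression, its result can also be stored or used to filter a list, not only placed inside an if statement. The following is a minimal sketch of that idea; the variable names has_apples and matches are illustrative and not part of the original example.

```python
strings = ['This string has apples', 'This string has oranges', 'This string has neither']

# 'apples' in s evaluates to True or False for each string
has_apples = ['apples' in s for s in strings]
print(has_apples)  # [True, False, False]

# The same test can filter the list down to the matching strings
matches = [s for s in strings if 'apples' in s]
print(matches)  # ['This string has apples']
```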

It's worth mentioning that if ... in statements are case-sensitive. The line if 'apples' in string: wouldn't detect 'Apples'. One way of correcting this is by using the lower() method, which converts all string characters into lowercase.

We can utilize the lower() method with the change below:

strings = ['This string has apples', 'This string has oranges', 'This string has Apples']

for s in strings:
    if 'apples' in s.lower():
        print('Apples in string')
    else:
        print('Apples not in string')
Out:
Apples in string
Apples not in string
Apples in string

Alternatively, we could call the upper() method on each string and search for 'APPLES' instead.
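
If the search phrase itself may arrive in any case, one possible approach is to normalize both sides before comparing. The sketch below uses the built-in str.casefold() method, which is a more aggressive version of lower() intended for caseless matching. The helper name contains_ignore_case is hypothetical and only serves this illustration.

```python
strings = ['This string has apples', 'This string has oranges', 'This string has Apples']

def contains_ignore_case(text, phrase):
    # Hypothetical helper: normalize both sides so neither one's case matters
    return phrase.casefold() in text.casefold()

results = [contains_ignore_case(s, 'APPLES') for s in strings]
print(results)  # [True, False, True]
```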

The if ... in approach has the fastest performance in most cases. It also has excellent readability, making it easy for other developers to understand what a script does.
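
A performance claim like this is easy to check on your own machine. Below is a rough sketch using the standard timeit module to compare the three approaches covered in this article. The sample string and the repetition count are arbitrary assumptions, and the timings will vary with the Python version, the hardware, and the input, so no output is shown.

```python
import re
import timeit

text = 'This string has apples'

# Time each approach on the same input; number=100_000 is an arbitrary choice
timings = {
    'in': timeit.timeit(lambda: 'apples' in text, number=100_000),
    'find': timeit.timeit(lambda: text.find('apples') != -1, number=100_000),
    're.search': timeit.timeit(lambda: re.search('apples', text) is not None, number=100_000),
}

for name, seconds in timings.items():
    print(f'{name}: {seconds:.4f} s')
```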

Of the three options listed in this article, using if ... in is usually the best approach for seeing if a string contains a substring. Remember that the simplest solution is quite often the best one!

Option 2: find()

Another option you've got for searching a string is using the find() method. If the argument we provide to find() exists in the string, the method returns the index at which the first occurrence of the substring starts. If not, it returns -1. Indexes count from 0, so the first character of a string is at index 0. The image below shows how string characters are assigned indexes:

[Image: string_indexes.png]

We can apply find() to the first if ... in example as follows:

strings = ['This string has apples', 'This string has oranges', 'This string has neither']

for s in strings:
    apples_index = s.find('apples')
    if apples_index < 0:
        print('Apples not in string')
    else:
        print(f'Apples in string starting at index {apples_index}')
Out:
Apples in string starting at index 16
Apples not in string
Apples not in string

For the first list item, 'apples' started at index 16, so find('apples') returns 16. 'apples' isn't in the string for the other two items, so find('apples') returns -1.
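
To see what the returned index means in practice, here is a small sketch that uses it to slice the substring back out of the original string. The variable names start and end are illustrative only.

```python
s = 'This string has apples'

start = s.find('apples')      # index of the first character of the match
end = start + len('apples')   # one past the last character of the match

print(start, end)    # 16 22
print(s[start:end])  # apples
```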

The index() method can be used similarly and will also return the starting index of its argument. The disadvantage of using index() is that it raises ValueError: substring not found if Python can't find the argument, so the call needs to be wrapped in a try ... except block. The find() and index() methods are also both case-sensitive.
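
As a minimal sketch of handling that exception, the loop below records the index returned by index() and falls back to None when ValueError is raised. Storing None for a missing substring is a choice made for this illustration, not something index() does itself.

```python
strings = ['This string has apples', 'This string has oranges', 'This string has neither']

results = []
for s in strings:
    try:
        results.append(s.index('apples'))
    except ValueError:
        # index() raises instead of returning -1 when nothing is found
        results.append(None)

print(results)  # [16, None, None]
```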

Option 3: Regex search()

Regex is short for regular expression, which is kind of like its own small language for describing text patterns. With re.search() we can determine whether a pattern matches anywhere in a string. The re.search() function scans the string and returns a Match object for the first location where the pattern matches.

Here's an example:

import re

re.search('apples', 'This string has apples')
Out:
<re.Match object; span=(16, 22), match='apples'>

Looking at the Match object, span gives us the start and end index for 'apples'. Slicing the string using 'This string has apples'[16:22] returns the substring 'apples'. The match field shows us the part of the string that was a match, which can be helpful when searching for a range of possible substrings that meet the search conditions.

We can access the span and match attributes using the span() and group() methods, as follows:

print(re.search('apples', 'This string has apples').span())

print(re.search('apples', 'This string has apples').group())
Out:
(16, 22)
apples
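
The example above runs the same search twice. A slightly tidier sketch stores the Match object once and checks it before use, since re.search() returns None when the pattern isn't found. The variable names here are illustrative.

```python
import re

text = 'This string has apples'
match = re.search('apples', text)

if match:
    # span() returns (start, end), where end is one past the last matched character
    start, end = match.span()
    print(text[start:end])  # apples
```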

If the substring isn't a match, we get the null value None instead of getting a Match object. See the example below for how we can apply regex to the string problem we've been using:

strings = ['This string has apples', 'This string has oranges', 'This string has neither']

for s in strings:
    if re.search('apples', s):
        print('Apples in string')
    else:
        print('Apples not in string')
Out:
Apples in string
Apples not in string
Apples not in string

In this case, the if statement determines if re.search() returns anything other than None.
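
One caveat when using re.search() for plain substring checks: characters such as . or ( have a special meaning in a pattern. The sketch below shows how a literal phrase can produce a false positive, and how re.escape() from the standard library avoids it. The sample text and phrase are made up for illustration.

```python
import re

text = 'Order 3150 shipped'
phrase = '3.50'

# Unescaped, '.' matches any character, so '3150' counts as a match
unescaped = re.search(phrase, text)

# re.escape() treats every character literally, and '3.50' isn't in the text
escaped = re.search(re.escape(phrase), text)

print(unescaped is not None)  # True
print(escaped is not None)    # False
```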

We could argue that regex might be overkill for a simple functionality like this. But something like the example above is a great starting point for regex, which has plenty of other capabilities.

For instance, we could change the first argument of the search() function to 'apples|oranges', where | is the regex alternation ("OR") operator. In this context re.search() would return a Match object for any strings with the substring 'apples' or 'oranges'.

The following demonstrates an example of this:

strings = ['This string has apples', 'This string has oranges', 'This string has neither']

for s in strings:
    if re.search('apples|oranges', s):
        print('Apples or oranges in string')
    else:
        print('Neither fruit is in string')
Out:
Apples or oranges in string
Apples or oranges in string
Neither fruit is in string
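
Like the other two options, re.search() is case-sensitive by default. As a brief sketch of another regex capability, passing the standard re.IGNORECASE flag makes the match ignore case without calling lower() on each string first.

```python
import re

strings = ['This string has apples', 'This string has oranges', 'This string has Apples']

# flags=re.IGNORECASE lets 'apples' match 'Apples', 'APPLES', and so on
results = [bool(re.search('apples', s, flags=re.IGNORECASE)) for s in strings]
print(results)  # [True, False, True]
```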

Summary

The easiest and most effective way to see if a string contains a substring is by using if ... in statements, where the in test evaluates to True if the substring is detected. Alternatively, by using the find() method, it's possible to get the index that a substring starts at, or -1 if Python can't find the substring. Regex is also an option, with re.search() returning a Match object if Python finds the first argument within the second one, and None otherwise.
