Author: Cansin Guler
Software Engineer

Python ValueError: substring not found

ValueError: substring not found

Why does this happen?

Python raises ValueError: substring not found when the string method index() cannot find the given substring in the string it is called on.

Let's look at an example case. The code below runs a for loop over a word list, trying to locate each word inside the given quote:

quote = "I knew exactly what to do, but in a much more real sense I had no idea what to do."
word_list = ["real", "idea", "jamrock"]

for word in word_list:
    i = quote.index(word)
    print(f"{i} - {word}")
Out:
46 - real
66 - idea
Out:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[2], line 5
      2 word_list = ["real", "idea", "jamrock"]
      4 for word in word_list:
----> 5     i = quote.index(word)
      6     print(f"{i} - {word}")
ValueError: substring not found

The for loop fails on its third iteration because index() cannot find 'jamrock' in quote.

When we use index() to locate substrings that may not be present, an unhandled ValueError can terminate the program. We should either guard the index() call against the error or replace it completely.

This article discusses several solutions and how they compare.

Solution 1: find()

On strings, find() works almost identically to index(), with one crucial difference: when find() cannot locate the substring, it returns -1 instead of raising an error.

Let's bring back the example from the intro and swap index() for find():

quote = "I knew exactly what to do, but in a much more real sense I had no idea what to do."
word_list = ["real", "idea", "jamrock"]

for word in word_list:
    i = quote.find(word)
    print(f"{i} - {word}")
Out:
46 - real
66 - idea
-1 - jamrock

Like index(), find() accepts the optional start and end parameters, so we can still restrict the search to a range of indexes.

Below, the second find() call starts searching at the index where the first 'word' ends and stops at the quote's length:

quote = "Me think, why waste time say lot word, when few word do trick."

first_occurrence = quote.find("word")
second_occurrence = quote.find("word", first_occurrence + 4, len(quote))

print(first_occurrence, second_occurrence)
Out:
33 48
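
The same start parameter can be used in a loop to collect every occurrence rather than just the first two. The sketch below is one possible way to do it; find_all is a hypothetical helper name, not a built-in, and it only reports non-overlapping matches.

```python
def find_all(text, sub):
    """Return the start index of every non-overlapping occurrence of sub in text."""
    if not sub:
        # An empty substring matches everywhere, so the loop would never advance.
        raise ValueError("sub must be a non-empty string")
    positions = []
    start = text.find(sub)
    while start != -1:
        positions.append(start)
        # Resume the search right after the match that was just found.
        start = text.find(sub, start + len(sub))
    return positions


quote = "Me think, why waste time say lot word, when few word do trick."

print(find_all(quote, "word"))     # [33, 48]
print(find_all(quote, "jamrock"))  # []
```

Because find() returns -1 instead of raising, the loop ends cleanly once no further match exists, and a missing substring simply yields an empty list.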

Though find() seems like the perfect substitute for index(), there is a subtle difference: index() is also available on lists and tuples, whereas find() exists only on strings (and on the bytes-like types bytes and bytearray).
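
To illustrate the difference, here is a minimal sketch with a list. Since lists have no find(), a missing element has to be handled with one of the other techniques in this article. The helper name position_or_none is hypothetical and exists only for this example.

```python
words = ["real", "idea", "sense"]

print(words.index("idea"))     # 1
print(hasattr(words, "find"))  # False: lists have no find() method


def position_or_none(sequence, value):
    """Return the index of value in a list or tuple, or None if it is absent."""
    try:
        return sequence.index(value)
    except ValueError:
        # list.index() and tuple.index() also raise ValueError on a miss.
        return None


print(position_or_none(words, "sense"))    # 2
print(position_or_none(words, "jamrock"))  # None
```

Returning None rather than -1 is a design choice made here so the result cannot be mistaken for a valid index.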

It's also important to note that -1 is a valid index marking the last element of a sequence. This can cause problems when using the fetched indexes in further processing.

You can see a simple demonstration of the pitfall below:

s = "abcdef"
i = s.find("k")  # returns -1, since there is no 'k'

print(s[i])
Out:
f

This issue, however, can easily be resolved by a conditional statement:

s = "abcdef"
i = s.find("k")  # returns -1, since there is no 'k'

print(s[i] if i > -1 else "###")
Out:
###
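
The same pitfall applies to slicing, where a stray -1 silently drops the last character instead of failing. The sketch below shows the problem and one way to guard against it. text_before is a hypothetical helper written for this example.

```python
def text_before(text, marker):
    """Return the part of text that precedes marker, or None if marker is absent."""
    i = text.find(marker)
    if i == -1:
        return None
    return text[:i]


s = "abcdef"

# Unchecked: find() returns -1, so the slice is s[:-1].
print(s[:s.find("k")])      # abcde

# Checked: the missing marker is reported explicitly.
print(text_before(s, "k"))  # None
print(text_before(s, "d"))  # abc
```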

Solution 2: try-except

One way of working around a possible ValueError is wrapping the index() call in a try-except block, like so:

quote = "Should have burned this place down when I had the chance."
word_list = ["when", "jamrock", "burn"]

for word in word_list:
    try:
        i = quote.index(word)
    except ValueError:
        print(f"Couldn't find the word: '{word}'.")
        i = "#"
    print(f"{i} - {word}")
Out:
35 - when
Couldn't find the word: 'jamrock'.
# - jamrock
12 - burn

This option takes a bit more typing but allows us to assign a fallback index or print a custom message when index() fails.
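
A try statement can also carry an else clause, which runs only when no exception was raised. The sketch below is one way to use it to keep the success path separate from the error handling. The found and missing containers are assumptions made for this example.

```python
quote = "Should have burned this place down when I had the chance."
word_list = ["when", "jamrock", "burn"]

found = {}    # word -> index of its first occurrence
missing = []  # words that do not appear in the quote

for word in word_list:
    try:
        i = quote.index(word)
    except ValueError:
        missing.append(word)
    else:
        # Runs only if index() succeeded.
        found[word] = i

print(found)    # {'when': 35, 'burn': 12}
print(missing)  # ['jamrock']
```

Keeping only the index() call inside the try block means an unrelated ValueError raised by later code is not swallowed by accident.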

Solution 3: if...in

Another solution is to check that the substring exists using if...in before calling index(), like so:

quote = "I have to be liked. But it’s not like this compulsive need like my need to be praised."
word_list = ["praise", "like", "jamrock"]

for word in word_list:
    if word in quote:
        i = quote.index(word)
    else:
        i = "#"
    print(f"{i} - {word}")
Out:
78 - praise
13 - like
# - jamrock

One advantage this method has over the try-except option is that we can compress the if-else block into one line like this:

for word in word_list:
    i = quote.index(word) if word in quote else "#"
    print(f"{i} - {word}")
Out:
78 - praise
13 - like
# - jamrock

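The downside is that if...in searches the string twice whenever the substring is present: once for the membership check and once more for index(). Over many iterations, this extra lookup can hurt performance. If that is a concern, use the try-except approach from Solution 2 instead, which searches only once.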
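
To check whether the extra lookup matters for your data, the three approaches can be timed with the standard timeit module. This is a rough sketch, not a rigorous benchmark. The function names are hypothetical, -1 is used as the fallback in all three so the results are comparable, and the timings will vary by machine and input.

```python
import timeit

quote = "I have to be liked. But it’s not like this compulsive need like my need to be praised."


def with_if_in(text, word):
    # Searches twice when the word is present: once for `in`, once for index().
    return text.index(word) if word in text else -1


def with_try_except(text, word):
    try:
        return text.index(word)
    except ValueError:
        return -1


def with_find(text, word):
    return text.find(word)


for word in ("praise", "jamrock"):
    for fn in (with_if_in, with_try_except, with_find):
        elapsed = timeit.timeit(lambda: fn(quote, word), number=100_000)
        print(f"{fn.__name__:<16} {word!r:<10} {elapsed:.4f} s")
```

Timing both a present and an absent word is worthwhile, because the cost of each approach depends on how often the lookup misses.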

Summary

When the index() method fails to find a given substring within a string, it raises ValueError: substring not found.

To prevent the program from crashing, we can replace index() with find(), which performs the same search but returns -1 instead of raising an error. Alternatively, we can wrap the index() call in a try-except block or use if...in to check that the substring exists before calling index().


Meet the Authors

Software engineer, technical writer and trainer.

Brendan Martin
Editor: Brendan
Founder of LearnDataSci