Author: Cansin Guler
Software Engineer

Python KeyError: How to fix and avoid key errors

A KeyError occurs when Python attempts to fetch a non-existent key from a dictionary.

This error commonly occurs in dict operations and when accessing Pandas Series or DataFrame values.

In the example below, we made a dictionary with keys 1–3 mapped to different fruit. We get a KeyError: 0 because the 0 key doesn't exist.

fruit = {
    1: 'apple', 
    2: 'banana',
    3: 'orange'
}

print(fruit[1])
print(fruit[0])
Out:
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[4], line 8
      1 fruit = {
      2     1: 'apple', 
      3     2: 'banana',
      4     3: 'orange'
      5 }
      7 print(fruit[1])
----> 8 print(fruit[0])
KeyError: 0

There are a handful of other places where we might see a KeyError (the os and zipfile modules, for example), but the cause stays the same: the requested key is not there.

The quickest general-purpose fix is to wrap the value-fetching code in a try-except block, as the code below does:
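As a rough illustration of those other places, the sketch below triggers a KeyError from os.environ and from zipfile.ZipFile.getinfo(). The environment variable name and the file names are made up for the demo.

```python
import io
import os
import zipfile

# Hypothetical variable name, removed first so the lookup is guaranteed to fail.
missing_var = "EXAMPLE_UNSET_VARIABLE_FOR_KEYERROR_DEMO"
os.environ.pop(missing_var, None)

env_error = None
try:
    os.environ[missing_var]
except KeyError as err:
    env_error = err
    print("Environment variable not set:", err)

# Build a small in-memory archive that holds a single file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("present.txt", "hello")

zip_error = None
with zipfile.ZipFile(buffer) as archive:
    try:
        archive.getinfo("absent.txt")  # no such member in the archive
    except KeyError as err:
        zip_error = err
        print("Archive member not found:", err)
```

In both cases the object behaves like a mapping, and the requested key (a variable name or a member name) is missing.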

fruit = {
    1: 'apple', 
    2: 'banana',
    3: 'orange'
}

for key in range(5):
  try:
    print(fruit[key])
  except KeyError:
    print("Couldn't find a match for the key:", key)
Out:
Couldn't find a match for the key: 0
apple
banana
orange
Couldn't find a match for the key: 4

The try-except construct kept our program from terminating and let us skip the keys that have no match.

In the next section, we'll use more nuanced solutions, one of which is the _proper_ way of adding and removing dictionary elements.

Generic Solutions

Solution 1: Verifying the key using 'in'

While working with dictionaries, Series and DataFrames, we can use the in keyword to check whether a key exists.

Below you can see how we can use in with a conditional statement to check the existence of a dictionary key.

info = {
    "name": "John", 
    "surname": "Doe"
}

if "email" in info:
    print(info["email"])
else:
    print("No e-mail recorded.")
Out:
No e-mail recorded.

The same check works unchanged on a Pandas Series, as you can see below:

import pandas as pd

info_series = pd.Series(data=info) # convert the previous dict to a Series

if 'name' in info_series:
  print(info_series['name'])
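One thing worth keeping in mind is that in tests a Series against its index labels, not its values. A small sketch, rebuilding the same Series so it runs on its own:

```python
import pandas as pd

info_series = pd.Series({"name": "John", "surname": "Doe"})

# `in` looks at the index labels ...
label_found = "name" in info_series
# ... so a value is not found this way.
value_as_label = "John" in info_series
# To search the values, compare element-wise instead.
value_found = bool((info_series == "John").any())

print(label_found)     # True
print(value_as_label)  # False
print(value_found)     # True
```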

We can use the same "if key in collection" structure when verifying DataFrame column names. However, to check a row name we have to test against the DataFrame's index instead.

Let's start by building a DataFrame to work with:

d = {'col1': [1, 2], 'col2': [3, 4]}
df = pd.DataFrame(data=d, index=['row1', 'row2'])

print(df)
Out:
      col1  col2
row1     1     3
row2     2     4

Now we can check whether a column name is in df or a row name is in df.index:

if 'col1' in df:
  print('Found col1')

if 'row2' in df.index:
  print('Found row2')
Out:
Found col1
Found row2

Solution 2: Assigning a fall-back value using get()

We can use the get() method to fetch dictionary elements, Series values, and DataFrame columns (only _columns_, unfortunately).

The get() method does not raise a KeyError when it fails to find the key given. Instead, it returns None, which is more desirable since it doesn't crash your program.

Take a look at the code below, where fetching the non-existent key3 returns None:

d = {'key1': 111, 'key2': 222}

print(d.get('key3'))

get() also allows us to define our own default values by specifying a second parameter.

For example, say we have a website with a few URLs and want to fall back to a 404 page:

urls = {
    'home': '/index.html',
    'about': '/about.html',
    'contact': '/contact.html'
}

print(urls.get('blog', '404.html'))

The get() method also works on Pandas DataFrames.

Let's define one like so:

data = {
    'Name': ['John', 'Jane'],
    'Age':[34, 19],
    'Job':['Engineer','Engineer']
}

df = pd.DataFrame(data)
print(df)
Out:
   Name  Age       Job
0  John   34  Engineer
1  Jane   19  Engineer

We can try and grab two columns by name and provide a default value if one doesn't exist:

df.get(['Name', 'School'], 'Non-Existent')
Out:
'Non-Existent'

Since not all the keys match, get() returned 'Non-Existent'.
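For completeness, here is a short sketch of get() on a Series and on a single DataFrame column. The ages Series is made up for the example.

```python
import pandas as pd

ages = pd.Series({'John': 34, 'Jane': 19})

known = ages.get('Jane')          # label exists, so its value is returned
missing = ages.get('Peter')       # no such label, so None is returned
fallback = ages.get('Peter', -1)  # no such label, so the default is returned

print(known)     # 19
print(missing)   # None
print(fallback)  # -1

df = pd.DataFrame({'Name': ['John', 'Jane'], 'Age': [34, 19]})
school = df.get('School')  # missing column, no default given
print(school)              # None
```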

Accessing Items in Pandas: The loc-iloc Mishap

Programmers learning Pandas often mistake loc for iloc, and while they both fetch items, there is a slight difference in mechanics:

  • loc uses row and column names as identifiers
  • iloc uses integer location, hence the name

Let's create a Series to work with:

data = ['John', 'Peter', 'Gabriel', 'Riley', 'Roland']
index = list('abcde')

names = pd.Series(data, index)
names
Out:
a       John
b      Peter
c    Gabriel
d      Riley
e     Roland
dtype: object

How would we retrieve the name "John" from this Series?

We can see John lies in the "a" row, which we can target using loc, like so:
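A minimal sketch of that label-based lookup, rebuilding the names Series so the snippet runs on its own:

```python
import pandas as pd

data = ['John', 'Peter', 'Gabriel', 'Riley', 'Roland']
names = pd.Series(data, index=list('abcde'))

# loc selects by row label
first_by_label = names.loc['a']
print(first_by_label)  # John
```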

If we were to use iloc for the same purpose, we'd have to use the row's integer index. Since it's the first row, and Series are 0-indexed, we need to do the following:
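A minimal sketch of the position-based version, again rebuilding the Series so it is self-contained:

```python
import pandas as pd

data = ['John', 'Peter', 'Gabriel', 'Riley', 'Roland']
names = pd.Series(data, index=list('abcde'))

# iloc selects by integer position; 0 is the first row
first_by_position = names.iloc[0]
print(first_by_position)  # John
```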

If we used an integer with loc instead, as in names.loc[0], we would get a KeyError, as you can see below:

Out:
Traceback (most recent call last):
  File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\indexes\base.py:3802 in get_loc
    return self._engine.get_loc(casted_key)
  File pandas\_libs\index.pyx:138 in pandas._libs.index.IndexEngine.get_loc
  File pandas\_libs\index.pyx:165 in pandas._libs.index.IndexEngine.get_loc
  File pandas\_libs\hashtable_class_helper.pxi:5745 in pandas._libs.hashtable.PyObjectHashTable.get_item
  File pandas\_libs\hashtable_class_helper.pxi:5753 in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 0

The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  Cell In[36], line 1
    names.loc[0]
  File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\indexing.py:1073 in __getitem__
    return self._getitem_axis(maybe_callable, axis=axis)
  File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\indexing.py:1312 in _getitem_axis
    return self._get_label(key, axis=axis)
  File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\indexing.py:1260 in _get_label
    return self.obj.xs(label, axis=axis)
  File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\generic.py:4056 in xs
    loc = index.get_loc(key)
  File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\indexes\base.py:3804 in get_loc
    raise KeyError(key) from err
KeyError: 0

Note that loc[0] only fails here because the row labels are letters. With the default integer index (0, 1, 2, ...), the label 0 exists, so loc[0] and iloc[0] would return the same row.
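To make the distinction concrete, the sketch below uses the default integer index and then reverses the row order, so labels and positions no longer line up:

```python
import pandas as pd

# No index given, so the labels default to 0, 1, 2, 3, 4
names = pd.Series(['John', 'Peter', 'Gabriel', 'Riley', 'Roland'])

print(names.loc[0])   # John (label 0)
print(names.iloc[0])  # John (first position)

# Reverse the rows: the labels travel with the values (4, 3, 2, 1, 0)
reversed_names = names.iloc[::-1]

by_label = reversed_names.loc[0]      # still the row labelled 0
by_position = reversed_names.iloc[0]  # whatever sits first now

print(by_label)     # John
print(by_position)  # Roland
```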

Dictionary-specific solutions

Now we'll look closer at the operations that may cause KeyError and offer good practices to help us avoid it.

1. Avoiding KeyError when populating a dictionary

Let's give an example of how this may go wrong:

fruit_list = ['apple', 'berries', 'apple', 'pear', 'berries']
fruit_dict = {}

for item in fruit_list:
  fruit_dict[item] += 1
Out:
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[15], line 5
      2 fruit_dict = {}
      4 for item in fruit_list:
----> 5   fruit_dict[item] += 1
KeyError: 'apple'

It's clear this is a mistake: fruit_dict[item] += 1 has to read the current value before adding to it, and the dictionary starts out empty. Still, this example demonstrates the problem of wanting to use a dictionary as if it already had the keys present.

We could write another loop at the start that initializes each value to zero, but Python offers defaultdict for such situations. A defaultdict takes a factory function (such as int or list) and calls it to supply a value whenever a missing key is accessed.
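The get() method from Solution 2 already avoids this error in a plain dict. A minimal sketch, using the same list:

```python
fruit_list = ['apple', 'berries', 'apple', 'pear', 'berries']
fruit_dict = {}

for item in fruit_list:
    # Read the current count, falling back to 0 for unseen fruit
    fruit_dict[item] = fruit_dict.get(item, 0) + 1

print(fruit_dict)  # {'apple': 2, 'berries': 2, 'pear': 1}
```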

Take a look:

from collections import defaultdict

fruit_list = ['apple', 'berries', 'apple', 'pear', 'berries']
fruit_dict = defaultdict(int) # instead of {}

for item in fruit_list:
  fruit_dict[item] += 1

print(fruit_dict)
Out:
defaultdict(<class 'int'>, {'apple': 2, 'berries': 2, 'pear': 1})

The only change needed is swapping in defaultdict(int) for the empty braces. The factory here is int, so accessing any new key auto-creates that key with an initial value of int(), which is 0.

This also works for more complex scenarios, like if you want a default value to be a list. In the following example, we generate ten random numbers and store them as either even or odd:

from collections import defaultdict
import random

numbers = defaultdict(list)

for i in range(10):
    r = random.randint(1, 5)
    if r % 2 == 0:
        numbers['even'].append(r)
    else:
        numbers['odd'].append(r)
        
print(numbers)
Out:
defaultdict(<class 'list'>, {'odd': [1, 1, 1, 3, 1, 1], 'even': [4, 4, 2, 4]})

Using defaultdict(list), we're able to immediately append to the "even" or "odd" keys without needing to initialize the lists beforehand.
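One side effect to be aware of: with a defaultdict, merely reading a missing key with [] also creates it. The sketch below shows this, and shows that in and get() do not trigger the factory:

```python
from collections import defaultdict

numbers = defaultdict(list)
numbers['even'].append(2)

odd_before = 'odd' in numbers  # membership test does not create the key
_ = numbers['odd']             # a plain read with [] calls list() and stores it
odd_after = 'odd' in numbers

prime_value = numbers.get('prime')  # get() bypasses the default factory
prime_after = 'prime' in numbers

print(odd_before, odd_after)     # False True
print(prime_value, prime_after)  # None False
print(dict(numbers))             # {'even': [2], 'odd': []}
```

If stray empty entries would be a problem, checking with in or get() before reading is the safer route.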

2. Avoiding KeyError when deleting dictionary items

Deleting dictionary keys runs into the same problem as accessing keys: del looks the key up with [] first, so a missing key raises a KeyError.

We can always check whether the key exists before attempting to delete the value assigned to it, like so:

babies = {
    'cat':'kitten', 
    'dog':'pup', 
    'bear':'cub'
}

if 'bear' in babies:
  del babies['bear']

babies
Out:
{'cat': 'kitten', 'dog': 'pup'}

A quicker way, however, would be to pop() the value out of the dictionary, effectively deleting it if we don't assign it to a variable.

pop() takes the desired key as its first parameter and, similar to get(), allows us to assign a fall-back value as the second parameter.

Take a look:

babies = {'cat':'kitten', 'dog':'pup', 'bear':'cub'}
baby = babies.pop('lion', 'nope, no lion')
print(baby)
Out:
nope, no lion

Since Python couldn't find the key, pop() returned the default value we assigned.

If the key exists, Python will remove it. Let's run pop() one more time with a key we know exists:

babies.pop('cat')

print(babies)
Out:
{'dog': 'pup', 'bear': 'cub'}

The 'cat' was found and removed.
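Note that the fall-back value is what prevents the error. Without it, pop() raises a KeyError for a missing key, as this short sketch shows:

```python
babies = {'dog': 'pup', 'bear': 'cub'}

pop_error = None
try:
    babies.pop('lion')  # no default given
except KeyError as err:
    pop_error = err
    print('KeyError:', err)  # KeyError: 'lion'

# Passing None as the default makes the removal safe either way
removed = babies.pop('lion', None)
print(removed)  # None
print(babies)   # {'dog': 'pup', 'bear': 'cub'}
```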

Summary

KeyError occurs when searching for a key that does not exist. Dictionaries, Pandas Series, and DataFrames can trigger this error.

Wrapping the key-fetching code in a try-except block or simply checking whether the key exists with the in keyword before using it are common solutions to this error. One can also employ get() to access elements from a dictionary, Series or DataFrame without risking a KeyError.


Meet the Authors

Author: Cansin Guler

Software engineer, technical writer and trainer.

Brendan Martin
Editor: Brendan
Founder of LearnDataSci
