Handling Errors in Boto3 & Botocore

At Trek10 we see our fair share of errors when using Python's boto3. Here's how we handle them exceptionally well.
Ryan Scott Brown | Apr 22 2020

Imagine you have a call that may fail (known more commonly as "every call"). It's a well-documented fact that saying "hah, no need to worry about X failing" is the only way to guarantee X will fail and page you at 3 am. You might write code like this using the S3 client to create a new bucket.

>>> import boto3
>>> s3 = boto3.client('s3')
>>> s3.create_bucket(Bucket='test', ACL='private')
botocore.exceptions.ClientError: An error occurred (IllegalLocationConstraintException) when calling the CreateBucket operation: The unspecified location constraint is incompatible for the region specific endpoint this request was sent to. 

This is one of the more common exceptions: a botocore ClientError is bubbling up from the API call layer (botocore) to your higher-level call (boto3). Unfortunately, the type ClientError doesn't give us enough information to be useful. You can import botocore.exceptions and catch this error by its class (ClientError in this case), but the class alone doesn't tell us which of the many possible API errors occurred.

We have to do more work to handle errors based on their error code, like these from the S3 docs. Your boto3 client also carries its own exception factory - you can build an exception class from any error code in those docs and catch it.

import boto3
s3 = boto3.client('s3')
try:
    s3.create_bucket(Bucket='test')
except s3.exceptions.from_code('IllegalLocationConstraintException'):
    print('bad location')

Each of these error codes might need special handling. The IllegalLocationConstraintException error above isn't something we can retry our way out of. As the message says, the request went to a region-specific endpoint without a matching location constraint, so the same call will fail every time. We need to change the request (pass a CreateBucketConfiguration whose LocationConstraint matches the client's region) for the call to succeed. Bucket names are also globally unique, and a name like test has almost certainly been taken, which would surface as its own error code (BucketAlreadyExists). If we were hitting the API too frequently, we would instead get a ClientError with a SlowDown code telling us to wait a little before continuing. This would happen if we're bumping up against the S3 API rate limit.

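But catching by from_code doesn't work how we think it does. For an error code that botocore has no dedicated exception class for, from_code hands back plain ClientError, so if we try to catch two such codes this way we will always enter the first handler: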

import boto3
s3 = boto3.client('s3')
try:
    s3.create_bucket(Bucket='test')
except s3.exceptions.from_code('SlowDown'):
    print('Calm yourself')
except s3.exceptions.from_code('IllegalLocationConstraintException'):
    print('Bad location')

No matter what ClientError we hit, this code will always print "Calm yourself": both except clauses end up naming the same class, ClientError, and Python runs the first except clause whose class matches the exception. (That is also why you want to order handlers from most specific to least.) Since we can't tell these errors apart by the type of the exception, we need to look at e.response['Error']['Code'] to distinguish between error codes and take the right action.

import boto3
import botocore.exceptions
s3 = boto3.client('s3')
try:
    s3.create_bucket(Bucket='test')
except botocore.exceptions.ClientError as e:
    if e.response['Error']['Code'] == 'SlowDown':
        print('Calm yourself')
    elif e.response['Error']['Code'] == 'IllegalLocationConstraintException':
        print('Bad location')
    else: # neither of those codes were right, so we re-raise the exception
        raise
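The same code check extends to waiting and retrying when the code is SlowDown. What follows is only a minimal sketch, not botocore's own retry machinery: call_with_slowdown_retry and flaky_create_bucket are hypothetical names, the delays are arbitrary, and a small stand-in ClientError is defined if botocore isn't importable so the snippet runs anywhere.

```python
import time

try:
    from botocore.exceptions import ClientError
except ImportError:
    # Assumption: a stand-in with the same constructor arguments and
    # .response shape, used only when botocore is not installed.
    class ClientError(Exception):
        def __init__(self, error_response, operation_name):
            self.response = error_response
            self.operation_name = operation_name
            super().__init__(f"{operation_name} failed: {error_response}")


def call_with_slowdown_retry(call, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Run call(); wait and retry on SlowDown, re-raise every other error."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except ClientError as e:
            code = e.response["Error"]["Code"]
            # Anything other than SlowDown, or running out of attempts, is fatal.
            if code != "SlowDown" or attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))  # delay doubles each attempt


# Simulate an API call that is throttled twice and then succeeds.
attempts = []

def flaky_create_bucket():
    attempts.append(1)
    if len(attempts) < 3:
        error = {"Error": {"Code": "SlowDown", "Message": "Reduce your request rate."}}
        raise ClientError(error, "CreateBucket")
    return {"Location": "/test"}


delays = []  # record the requested delays instead of actually sleeping
result = call_with_slowdown_retry(flaky_create_bucket, sleep=delays.append)
print(result, delays)
```

Passing sleep as a parameter keeps the sketch testable without real waiting. In real code you would leave the default time.sleep in place.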

Now we've handled our error, but there may be a better way for us to deal with the wide variety of codes we might want to handle. One little-known fact about Python is that the expression in an except clause is evaluated while the exception is being handled, and it can be any expression that produces an exception class, not just a class name. That's why we were able to write this line in an earlier example:

except s3.exceptions.from_code('IllegalLocationConstraintException'):
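That evaluation order can be seen without AWS at all. Here is a small sketch in plain Python, where pick is a hypothetical helper that records when it is called and hands back the class it was given:

```python
evaluated = []

def pick(exc_type):
    """Record that this except expression was evaluated, then return the class."""
    evaluated.append(exc_type.__name__)
    return exc_type

try:
    raise KeyError("missing")
except pick(ValueError):   # evaluated first, does not match KeyError
    outcome = "value"
except pick(KeyError):     # evaluated second, matches
    outcome = "key"

count_after_error = len(evaluated)

try:
    pass                   # nothing raised here
except pick(KeyError):     # so this expression is never evaluated
    outcome = "unreachable"

print(outcome, evaluated)
```

The except expressions run one at a time, top to bottom, and only once an exception is actually in flight.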

The from_code function returns an exception class for Python to check and either catch or pass on. We can take advantage of this by writing our own function that examines the exception currently being handled. If its error code is the one we're trying to match, we return ClientError, which triggers the clause to handle our exception. If it isn't, we return a freshly created exception class that nothing ever raises (the NeverEverRaisedException) so that the clause will never run.

import sys
import boto3
from botocore.exceptions import ClientError

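def is_client_error(code):
    # sys.exc_info()[1] is the exception currently being handled
    e = sys.exc_info()[1]
    if isinstance(e, ClientError) and e.response["Error"]["Code"] == code:
        return ClientError
    # a brand-new class that nothing raises, so this except clause never matches
    return type("NeverEverRaisedException", (Exception,), {})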

s3 = boto3.client('s3')
try:
    s3.create_bucket(Bucket='test')
except is_client_error('SlowDown'):
    print('Calm yourself')
except is_client_error('IllegalLocationConstraintException'):
    print('Bad location')

Now when we run our snippet against the error above we get the output we expect: Bad location, because the SlowDown clause never matched. But there's still one more thing we can do. If we're outside of an except block, we still want to be able to easily check error codes in a truthy/falsy way. To do that we add a keyword argument that defaults to None so we can use our function like this too:

if is_client_error('SlowDown', exc):
    time.sleep(1)

The Final Handler

This is useful when you're passed an exception but are outside the exception handler itself, such as when bubbling the exception up to a logging library for enrichment. Our final function looks like this:

import sys
from botocore.exceptions import ClientError

def is_client_error(code, e=None):
    """Match a botocore.exceptions.ClientError to an error code.

    Returns ClientError if the error code matches, else a dummy exception.

    Based on Ansible's GPL `is_boto3_error_code` https://github.com/ansible/ansible/blob/stable-2.9/lib/ansible/module_utils/aws/core.py
    """
    if e is None:
        exc = sys.exc_info()[1]
    else:
        exc = e
    if isinstance(exc, ClientError) and exc.response["Error"]["Code"] == code:
        return ClientError
    return type("NeverEverRaisedException", (Exception,), {}) if e is None else False
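To see both calling styles side by side without touching AWS, here is a hedged usage sketch. The function is repeated so the snippet runs on its own, slow_down is a hypothetical helper that builds a ClientError by hand, and a stand-in ClientError is defined if botocore isn't importable.

```python
import sys

try:
    from botocore.exceptions import ClientError
except ImportError:
    # Assumption: a stand-in with the same constructor arguments and
    # .response shape, used only when botocore is not installed.
    class ClientError(Exception):
        def __init__(self, error_response, operation_name):
            self.response = error_response
            self.operation_name = operation_name
            super().__init__(f"{operation_name} failed: {error_response}")


def is_client_error(code, e=None):
    """Same logic as the final handler above, repeated for a runnable snippet."""
    exc = sys.exc_info()[1] if e is None else e
    if isinstance(exc, ClientError) and exc.response["Error"]["Code"] == code:
        return ClientError
    return type("NeverEverRaisedException", (Exception,), {}) if e is None else False


def slow_down():
    """Build a ClientError carrying the SlowDown code (hypothetical helper)."""
    error = {"Error": {"Code": "SlowDown", "Message": "Reduce your request rate."}}
    return ClientError(error, "CreateBucket")


# Style 1: inside a try/except, matching on the exception in flight.
try:
    raise slow_down()
except is_client_error("IllegalLocationConstraintException"):
    handled_by = "Bad location"
except is_client_error("SlowDown"):
    handled_by = "Calm yourself"

# Style 2: outside any handler, passing the exception object explicitly.
exc = slow_down()
matches_slow_down = bool(is_client_error("SlowDown", exc))
matches_access_denied = bool(is_client_error("AccessDenied", exc))

print(handled_by, matches_slow_down, matches_access_denied)
```

Note that the explicit form returns either the ClientError class or False, so it is best used only for its truthiness.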

We can handle being called as part of an except clause, or anywhere else we find a ClientError to match. This should help you write more readable code and handle errors better when working with AWS APIs.

Thanks for reading, and have a pythonic day.
