ChatKitty Powers Premier Chat Experience with AI-driven Content Moderation on Amazon Comprehend

Trek10 uses AWS machine learning to lay the foundation of a machine learning-based content moderation system for ChatKitty's chat platforms

More than ever, people expect the applications they use to be rich-featured, responsive, and nearly flawless from a UI perspective. It can be a lot for one company to deliver alone.

ChatKitty’s mission is to relieve some of that pressure by addressing a growing need for performant, affordable in-app chat solutions. ChatKitty does the work to create a chat UI with all the features end users have come to expect, which application developers can integrate into their projects in a few lines of code.

In any chat platform, moderation is key—but human-powered moderation is incredibly expensive and time-consuming. CTO Aaron Nwabuoku wanted to build out a machine learning-based content moderation system, but ChatKitty’s team was already pushing several other critical features through the pipeline. Nwabuoku decided to partner with Trek10 to lay the machine learning foundation for ChatKitty’s content moderation so that his engineering team could focus their time on building out other important features.

“The first time we had a call with Trek10, we could see how much they knew about AWS machine learning,” said Nwabuoku. “I immediately felt like I could trust their team to work alongside us.”

The Benefits of Using Amazon Comprehend for Machine Learning

There are many machine learning solutions out there, and business needs play a huge role in deciding which services to use. A key benefit of Amazon Comprehend is that it has many predefined models to choose from, meaning users don’t have to define the algorithms behind the models themselves. This would make it possible for ChatKitty to apply sophisticated machine learning algorithms to their service, and continue improving existing models, without a significant engineering time investment.

“We wanted to give ChatKitty a repeatable process for building and hosting models,” said Brenden Judson, Cloud Engineer at Trek10, “and Comprehend was a super easy way to implement this.”

In May 2021, Trek10 began working with ChatKitty to plan the architecture, which would be a serverless infrastructure powered by Amazon Comprehend and backed by AWS Lambda. “I’d been exploring the AWS machine learning ecosystem for a while,” said Nwabuoku, “and knew there were a lot of potential approaches to take. Talking to Trek10 gave me confidence that they really knew what they were doing and would help us make the right decisions.”

Lambda-backed Machine Learning Architecture

Using Amazon Comprehend, Trek10 trained a custom classifier for content moderation on a unique text corpus containing a set of profane examples and a set of non-profane examples. All interactions with Comprehend are handled in a serverless fashion through AWS SAM templates, which define the AWS Lambda functions. Separate Lambda functions call the Comprehend API to complete tasks such as training new custom classifiers or invoking trained models. Exploratory data analysis (EDA) and extract, transform, load (ETL) work runs in a SageMaker Jupyter notebook, where Python data science libraries such as pandas are used to explore and clean the data.
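As an illustration of the "invoke a trained model" path, here is a minimal sketch of a Lambda handler in Python. It assumes the custom classifier is served from a real-time Comprehend endpoint and called through boto3's classify_document; the label name PROFANE, the environment variable CLASSIFIER_ENDPOINT_ARN, the 0.5 threshold, and the event shape are all hypothetical and not taken from ChatKitty's actual code.

```python
import json
import os

# Hypothetical label name; the real labels depend on the training data.
PROFANE_LABEL = "PROFANE"


def classify_message(comprehend, endpoint_arn, text, threshold=0.5):
    """Return True when the classifier scores `text` as profane."""
    response = comprehend.classify_document(Text=text, EndpointArn=endpoint_arn)
    # Map each returned class name to its confidence score.
    scores = {c["Name"]: c["Score"] for c in response.get("Classes", [])}
    return scores.get(PROFANE_LABEL, 0.0) >= threshold


def handler(event, context, comprehend=None):
    """Lambda entry point; `comprehend` can be injected for local testing."""
    if comprehend is None:
        import boto3  # bundled with the AWS Lambda Python runtime

        comprehend = boto3.client("comprehend")
    # Assumed environment variable, e.g. set in the SAM template.
    endpoint_arn = os.environ.get("CLASSIFIER_ENDPOINT_ARN", "")
    flagged = classify_message(comprehend, endpoint_arn, event["text"])
    return {"statusCode": 200, "body": json.dumps({"flagged": flagged})}
```

Passing the client in as an argument keeps the function easy to exercise locally with a stub object, without AWS credentials or a deployed endpoint.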

Comprehend also automatically reports evaluation metrics for each trained model, including accuracy, precision, and recall, among others.
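For readers unfamiliar with these metrics, the sketch below shows how accuracy, precision, and recall are conventionally computed for a two-class moderation problem. It is a generic illustration and not Comprehend's internal implementation; the label strings and sample data are made up.

```python
def evaluate(y_true, y_pred, positive="profane"):
    """Compute accuracy, precision, and recall for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    total = len(pairs)
    return {
        # Share of all messages labeled correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Share of flagged messages that were truly profane.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of truly profane messages that were flagged.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Made-up labels for illustration only.
truth = ["profane", "profane", "clean", "clean", "clean"]
predicted = ["profane", "clean", "clean", "profane", "clean"]
metrics = evaluate(truth, predicted)
```

In a moderation setting, low recall means profane messages slip through, while low precision means harmless messages get flagged, so the two are usually tracked alongside accuracy.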

The entire solution is repeatable, produces business outcomes quickly, and took less than one month to deliver, complete with a pre-trained content moderation classifier. The new model successfully flags inappropriate language with 90% accuracy and continues to improve as ChatKitty feeds more corpus data into it, with the goal of reaching 98% accuracy soon.

“I was incredibly surprised that this entire end-to-end solution was built in basically a couple of weeks!” said Nwabuoku. “Getting this content moderation put into place, all while we had the time to work on other important features, has been game changing for us.”

Training New Machine Learning Classifiers

Machine learning models grow stale over time, so it’s important for ChatKitty to keep growing their training data set and training new models. The next step for ChatKitty is to add a refresh strategy that automatically detects when a model’s KPI falls below a set threshold and then retrains the model.
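One possible shape for such a refresh strategy is sketched below. It assumes the KPI is accuracy measured over a rolling window of human-reviewed predictions; the class name, the 0.90 threshold, the window size, and the retraining callback are hypothetical and do not describe ChatKitty's planned design.

```python
from collections import deque


class RefreshMonitor:
    """Track rolling accuracy and signal when retraining is due."""

    def __init__(self, threshold=0.90, window=100):
        self.threshold = threshold
        # Keeps only the most recent `window` outcomes.
        self.outcomes = deque(maxlen=window)

    def record(self, correct):
        """Store whether one reviewed prediction was correct."""
        self.outcomes.append(bool(correct))

    def accuracy(self):
        """Rolling accuracy, or None before any outcome is recorded."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        """True once the window is full and accuracy is under threshold."""
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.accuracy() < self.threshold


def maybe_retrain(monitor, start_training):
    """Call `start_training` (a hypothetical callback) when the KPI drops."""
    if monitor.needs_retraining():
        start_training()
        return True
    return False
```

In a serverless setup, a check like maybe_retrain could run on a schedule, with the callback starting a new Comprehend training job on the latest corpus. Waiting for a full window avoids retraining on a handful of early samples.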

ChatKitty is already excited not only to train new classifiers for text moderation but also to use Amazon Rekognition for image and video moderation.

Because the machine learning infrastructure is defined as Infrastructure-as-Code (IaC) and clearly laid out in CloudFormation templates, Nwabuoku said it’s been very easy for the team to understand the pipeline from end to end and absorb it into their own code base. By using cloud-native technologies, ChatKitty doesn’t need the computing resources typically required to train or host models, nor do they have to pay a data scientist to create an algorithm, which lowers their total cost of ownership.

Because Trek10 is part of the AWS Activate program, it was able to issue AWS credits that ChatKitty could use in the following months to test the solution, further lowering costs.

Laying the Foundation for Growth

Nwabuoku is incredibly excited to keep growing ChatKitty’s team and build out new features such as video and voice chat.

“I want to democratize chat,” says Nwabuoku. “Every startup, every company, should have access to a quality in-app communication experience.”

In a nutshell, Nwabuoku wants to give other companies the engineering freedom to work on what they feel differentiates their product in the marketplace. With a simple integration, ChatKitty can provide the rich-featured, performant chat experience end users have come to expect.

“It’s similar to what Trek10 was able to provide for us,” says Nwabuoku. “The ability to spend more of our time focusing on new features and growing our customer base.”