How boberdoo Gets Near-zero Downtime with Datadog and Trek10 CloudOps
Datadog has given boberdoo the power to proactively fix issues with client-run websites before they result in downtime.
You’ve just sat down with a cup of coffee when you notice a puddle of water on the kitchen floor. The sink is gushing water and you need the leak fixed, now. A quick search pulls up a form that promises to find top-rated plumbers near you. You give it your zip code and, within half an hour, a local plumber calls to say they’re on their way.
Behind this speed and simplicity is boberdoo’s SaaS intelligent marketing technology platform. Their infrastructure handles tens of millions of requests per day, enabling their clients to connect people to the providers they need in real-time—from home services like plumbing and repairs, to car insurance providers and legal professionals.
“What we do is sort of like matchmaking,” says Dan Cerceo, Chief Information Security Officer at boberdoo. boberdoo works with a series of clients in regions across the country, who integrate their own instances of the boberdoo platform into their business processes. The platform uses a variety of factors, such as vertical and location, to help clients efficiently match their partners with business opportunities. These partners then connect the person in need with top-quality local service providers in real-time.
“24/7 availability is table stakes for us,” says Cerceo. “It’s absolutely critical.”
To ensure the highest possible availability for their clients, boberdoo relies on their robust Datadog implementation, which is designed and instrumented by Trek10 CloudOps.
Instant, Battle-tested Monitoring Suite
“It’s incredibly powerful what we can do with Datadog,” says James Bowyer, Director of CloudOps Division at Trek10. “We wouldn’t be able to provide CloudOps 24/7 Support without it.”
Datadog has out-of-the-box integrations with hundreds of commonly-used tools and services, and a robust API that Trek10’s CloudOps team can use to create customized monitoring suites for any use case. Trek10 has over 150 Datadog monitors for specific AWS services that are code-defined in JSON and stored and updated with source control. These monitors can then be quickly deployed to customer environments using Datadog's API for instant ops support—covering everything from CPU monitoring for multiple instances to DynamoDB throttling to Lambda errors.
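To make the idea of a code-defined monitor concrete, here is a minimal sketch of one CPU monitor expressed as JSON-ready data and prepared for Datadog's create-monitor endpoint. It is not Trek10's actual monitor library: the monitor name, message, tags, thresholds, and the helper functions `build_cpu_monitor` and `build_request` are all assumptions made for illustration.

```python
import json
import urllib.request

# Datadog's v1 create-monitor endpoint (US site).
DATADOG_MONITOR_URL = "https://api.datadoghq.com/api/v1/monitor"


def build_cpu_monitor(warning=80.0, critical=90.0):
    """Return a monitor definition as plain data that can live in source control.

    The name, message, tags, and threshold values are hypothetical.
    """
    return {
        "name": "High CPU on EC2 instance",
        "type": "metric alert",
        # The threshold in the query has to match the critical threshold below.
        "query": f"avg(last_5m):avg:aws.ec2.cpuutilization{{*}} by {{host}} > {critical}",
        "message": "CPU is high on {{host.name}}. Follow the CPU runbook.",
        "tags": ["team:cloudops", "service:ec2"],
        "options": {"thresholds": {"warning": warning, "critical": critical}},
    }


def build_request(monitor, api_key, app_key):
    """Prepare (but do not send) the HTTP request that would create the monitor."""
    return urllib.request.Request(
        DATADOG_MONITOR_URL,
        data=json.dumps(monitor).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "DD-API-KEY": api_key,
            "DD-APPLICATION-KEY": app_key,
        },
        method="POST",
    )


if __name__ == "__main__":
    # Print the JSON that would be committed to source control.
    print(json.dumps(build_cpu_monitor(), indent=2))
```

Passing the prepared request to `urllib.request.urlopen` with real keys would create the monitor. Keeping the definition as plain data is what lets the same file be reviewed, versioned, and deployed to many customer environments.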
“This is five years of our intellectual property, everything we’ve honed from years of doing CloudOps work, codified as monitors and best practices into Datadog,” says Josh von Schaumburg, Director of Sales Engineering at Trek10. “And we get to share all of it, instantly, with our clients.”
The monitors include predefined thresholds for when to warn and when to send a high-priority ticket to Trek10, whose AWS-certified agents offer 24/7/365 support. These agents then begin executing the runbook actions built into the monitors, which offer best-practice advice when specific events occur.
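The two-tier threshold behavior described above can be sketched as a small classification step. This is only an illustration of the idea. Datadog evaluates thresholds itself, and the state names, routing text, and the functions `classify` and `route` are hypothetical.

```python
def classify(value, warning, critical):
    """Map a metric value to a monitor state using two thresholds."""
    if value >= critical:
        return "critical"
    if value >= warning:
        return "warning"
    return "ok"


# Hypothetical routing table: what happens for each state.
ROUTES = {
    "ok": "no action",
    "warning": "post a warning for review",
    "critical": "open a high-priority ticket and start the runbook",
}


def route(state):
    """Return the follow-up action for a monitor state."""
    return ROUTES[state]


if __name__ == "__main__":
    for cpu in (42.0, 85.0, 97.5):
        state = classify(cpu, warning=80.0, critical=90.0)
        print(cpu, state, "->", route(state))
```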
For Cerceo, this has been a huge time-saver for the team. “Like every business, we have limited resources,” he says. “Trek10’s support together with our Datadog implementation means we can focus on growing the platform.”
Harnessing Data for Peak Performance
For boberdoo, a single minute of service interruption could cost hundreds of revenue-generating transactions, so their performance bar is extremely high.
To give their platform best-in-class service, boberdoo uses current and historical Datadog metrics to build seasonality models for each vertical in their portfolio, constantly monitoring usage patterns and forecasting client needs. Healthcare clients, for instance, get flooded with requests during open enrollment. Accountants are slammed in March and April. Auto insurance agents see higher demand both in January-February and July-August when 6-month policies are due for renewal. boberdoo incorporates this data about each vertical’s seasonality and peak hours into their infrastructure design to optimize capacity and performance.
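As a rough illustration of what a seasonality model can look like, the sketch below computes a seasonal index per month: the average request volume in that month divided by the average across all months. The article does not describe boberdoo's actual models, so the input shape, the sample numbers, and the functions `seasonal_index` and `peak_months` are assumptions.

```python
from collections import defaultdict
from statistics import mean


def seasonal_index(samples):
    """Compute a per-month seasonal index from (month, requests) samples.

    An index above 1.0 means the month runs busier than the average month.
    """
    by_month = defaultdict(list)
    for month, requests in samples:
        by_month[month].append(requests)
    monthly_mean = {month: mean(values) for month, values in by_month.items()}
    overall = mean(list(monthly_mean.values()))
    return {month: monthly_mean[month] / overall for month in sorted(monthly_mean)}


def peak_months(index, threshold=1.1):
    """Return the months whose index exceeds the chosen threshold."""
    return [month for month, value in index.items() if value > threshold]


if __name__ == "__main__":
    # Made-up historical volumes for one vertical: (month number, requests).
    history = [(1, 100), (1, 140), (2, 60), (3, 120)]
    index = seasonal_index(history)
    print(index)
    print("Plan extra capacity for months:", peak_months(index))
```

Multiplying a baseline capacity figure by each month's index gives a first estimate of how much headroom a vertical needs at its peak.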
This well-defined infrastructure ensures the best possible service, and therefore maximum revenue, for their clients.
“We never want to be surprised by seasonal flux,” says boberdoo’s Dan Cerceo. “When we see heavy loads or increased latencies we need to know—is this an AWS issue, or something new the client is doing? We watch real-time indicators, but we rely on historical data, too.”
Datadog has given boberdoo the power to proactively fix issues with client-run websites before they result in downtime. Once clients deploy boberdoo's application, their website becomes tightly coupled with boberdoo's platform. Trek10 then adds boberdoo's new clients' websites to their proprietary "Pinger" HTTP request tool that sends metrics to Datadog. In turn, these metrics are monitored and dashboarded to alert on things such as response time and the expiration of domain names and TLS certificates.
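Pinger itself is proprietary, so the following is only a sketch of how such a check could be built with the Python standard library: time an HTTP request, read the TLS certificate's expiry date, and shape both values as gauges for Datadog's v1 metrics endpoint. The metric names, the tag, and every function name here are hypothetical. Domain name expiration is left out because it would need a WHOIS or RDAP lookup.

```python
import socket
import ssl
import time
import urllib.request

SECONDS_PER_DAY = 86400


def days_until_expiry(not_after, now=None):
    """Days left before a certificate's notAfter time (e.g. 'Jan 01 00:00:00 2030 GMT')."""
    expires_at = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires_at - current) / SECONDS_PER_DAY


def fetch_cert_not_after(host, port=443, timeout=5.0):
    """Open a TLS connection and return the peer certificate's notAfter string."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def measure_response_seconds(url, timeout=10.0):
    """Time a full GET request, including reading the response body."""
    start = time.monotonic()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()
    return time.monotonic() - start


def build_series(host, response_seconds, cert_days_left, timestamp):
    """Shape both measurements as a payload for POST /api/v1/series."""
    tags = [f"site:{host}"]  # hypothetical tag scheme

    def gauge(name, value):
        return {
            "metric": name,
            "type": "gauge",
            "points": [[int(timestamp), value]],
            "tags": tags,
        }

    return {
        "series": [
            gauge("pinger.response_time", response_seconds),  # hypothetical name
            gauge("pinger.tls.days_left", cert_days_left),    # hypothetical name
        ]
    }
```

A monitor on the days-left gauge, with a warning threshold weeks before expiry, is what turns a certificate lapse from an outage into a routine ticket.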
boberdoo also uses Datadog dashboards to continually optimize their infrastructure, which streamlines maintenance and AWS costs. “If you look at a list of our infrastructure,” says Cerceo, “it’s not a lot. But what it does is a lot. We keep everything lean, optimized, and efficient. We can quickly hit the dashboards for historical and current trends that give us the context to support optimization decisions.”
There’s a real data science aspect to boberdoo’s business, too: it uses machine learning and AI to route marketing data to the right partner, at the right place, at the right price. Their clients span a wide range of industries, and their platform can be customized to do everything from data distribution to phone call routing to A/B testing.
“What we provide is so much more than a marketplace. It’s a robust marketing technology engine that drives our clients’ businesses forward,” says Cerceo. “We love the flexibility Datadog gives us.”
Staying Relevant in a Rapidly-changing Cloud
The infrastructure support process boberdoo has with Trek10 and Datadog has become central to their product development and bottom line, from performance and availability enhancements to trend analysis.
“From a monitoring standpoint,” says Cerceo, “getting the Datadog monitors from Trek10 is great. Everything is automated. We just pull in what we need.”
When AWS releases new services, Trek10 adds Datadog monitors to reflect these new services and best practices—insider knowledge they can pass directly to CloudOps clients, keeping them at the edge of what’s possible in AWS.
Having quick access to best-practices monitoring and alerts saves boberdoo precious time they can spend on product development instead.
“We want to keep our entire focus on improving our platform and serving our clients,” Cerceo says. “Trek10 CloudOps and Datadog make it possible to do that.”