‘Ops, I Did it Again’ - Think FaaS Podcast
Well hello again, I’m Forrest Brazeal at Trek10, and this is ‘Think FaaS,’ where we learn about the world of serverless computing in less time than it takes to run a Lambda function. So put five minutes on the clock - it’s time to ‘Think FaaS.’
Today I want to give a shoutout to the forgotten ladies and gentlemen of the serverless revolution - the friendly neighborhood operations engineers. Yes, I’m talking to you, DBA, and you, network admin, and you, person that used to call yourself a ‘Windows sysadmin’ but now you’re a ‘DevOps engineer’ even though as far as you can tell you’re still spending most of the day force-updating group policy on Server 2008.
Well, people tend to forget that the modern serverless movement really was born on the ops side, right? AWS Lambda originally was kind of just a way to run cron jobs in the cloud. I think the first Lambda function I ever wrote was a scheduled task that cleaned up some OUs in Active Directory. There are all kinds of tedious little chores you can automate for yourself in the cloud using functions as a service. Serverless can make your life better right now, and you can have a lot of fun with it that way.
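To make the "cron job in the cloud" idea concrete, here is a minimal sketch of what a scheduled cleanup function might look like in Python. Everything specific in it is an assumption for illustration: the `resources` event field, the 30-day window, and the `find_stale` helper are hypothetical and not from the episode. It only reports what it would clean up rather than deleting anything.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention window; pick whatever fits your environment.
MAX_AGE = timedelta(days=30)


def find_stale(resources, now, max_age=MAX_AGE):
    """Return the names of resources last used more than max_age ago."""
    return [r["name"] for r in resources if now - r["last_used"] > max_age]


def handler(event, context):
    """Entry point that a schedule (a cron-style rule) would invoke."""
    now = datetime.now(timezone.utc)
    # Placeholder inventory: a real function would query a directory or
    # cloud API here. Timestamps are assumed to be ISO 8601 strings with
    # an explicit UTC offset, for example "2024-01-20T00:00:00+00:00".
    parsed = [
        {"name": r["name"], "last_used": datetime.fromisoformat(r["last_used"])}
        for r in event.get("resources", [])
    ]
    # Report only; the actual cleanup step is left out of this sketch.
    return {"stale": find_stale(parsed, now)}
```

Keeping the "what is stale" decision in a plain function separate from the handler means you can test the logic on your laptop without deploying anything.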
But in another sense, serverless can feel like an existential threat, right? If your whole job is the care and feeding of servers, what does it mean for your career when the developers in your organization start deploying their code directly on cloud services and sort of passing you by?
Well, as long as code has bugs, we will need people who can solve production problems, and that’s what an operational skill set is all about. But to remain competitive in a post-server world, you need to think differently. I’ve got four big takeaways for you.
First, you have to think at the level of the whole system, not the level of the server. Back in the old days, we ops folks were great at remoting into a server to check CPU utilization. Obviously you can’t do that with serverless. But when database connections start hanging, or a particular user complains about latency, we still need to know what is happening and why. And you can be the person to push forward this discipline in your organization.
Next, you’ll have to clean up your act when it comes to utilization. Some of us in the cloud ops world have gotten lazy about throwing hardware capacity at problems instead of fixing them. Got a memory leak? Add some more RAM to that VM. But serverless systems force you to break problems down and address them at a functional level. Not to mention, running serverless seems inexpensive at first compared to spending capital on a bunch of servers, or even reserved EC2 instances, but all those service costs can add up fast if you don’t keep a close eye on what you’re spending.
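If you want to see how those service costs add up, a back-of-the-envelope estimate is a good place to start. The sketch below assumes the usual function-as-a-service billing shape: a per-request charge plus a charge for compute time measured in GB-seconds. The function name and the prices in the example are illustrative placeholders, not quoted rates, so check your provider's current pricing. It also ignores free tiers and every other service in the stack.

```python
def monthly_function_cost(invocations, avg_duration_ms, memory_mb,
                          price_per_million_requests, price_per_gb_second):
    """Estimate one month's function bill from request count and compute time."""
    # Request charge: billed per million invocations.
    request_cost = invocations / 1_000_000 * price_per_million_requests
    # Compute charge: seconds of runtime multiplied by configured memory in GB.
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    return request_cost + gb_seconds * price_per_gb_second


# Hypothetical workload: 10 million invocations a month, 200 ms each, 512 MB.
# The two prices are placeholder values for illustration only.
estimate = monthly_function_cost(
    invocations=10_000_000,
    avg_duration_ms=200,
    memory_mb=512,
    price_per_million_requests=0.20,
    price_per_gb_second=0.0000166667,
)
print(f"Estimated monthly cost: ${estimate:.2f}")
```

Notice that compute time dominates the request charge in this example, which is why a slow or leaky function costs real money: doubling the average duration nearly doubles the bill.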
Third, the time for supporting an application you don’t understand is over. It used to be that you could have expertise in Windows or Linux, and if a problem came up that required deeper knowledge of the app, you’d kick it back over the wall to the developers. With serverless, the application is literally the only thing you control, so you have to provide business value much more directly.
Now, that doesn’t mean your life will get simpler, especially once you get a serverless system up to production scale. With serverless, your problems are going to occur not on servers, but between services, at the place where your database talks to your functions and your event stream triggers your notifications. You’re going to have hundreds of concurrent instances of functions running and at any given moment, two of them are going to do something weird. Far from serverless making your life simple, you’re going to find that there are so many fires to fight that you’ll never be able to keep up with them all by clicking around in the AWS Console.
That brings us to the last takeaway: you have to learn how to scale yourself. You will provide operational value in this new world not by clicking buttons in a GUI, but by automating everything you can get your hands on. That means having a rock-solid solution for automatic deployments and patches. It means getting serious about observability for large systems. There are a few tools that can help with that. Here at Trek10 we rely pretty heavily on Datadog, which lets us monitor our serverless stacks and troubleshoot problems at scale. If you’re using AWS Lambda you also will want to check out IOPipe, which is a service that lets you report on errors from your functions and do rudimentary distributed tracing. It even does cool things like pin specific function invocations to the underlying containers, and if you’re solving a problem that requires that level of insight you’re already having a bad day, so don’t make it worse by squinting at CloudWatch logs and guessing.
So is serverless the death knell for your ops career? Absolutely not. It’s another step on a road that we’ve been on for a long time, which is automating undifferentiated tasks and focusing attention on more interesting and important work. If you’re willing to keep learning, rethink your assumptions, and grow with technology, as far as you’re concerned, server“less” can open the door to so much more.
That’s it from me! Hope you’ll join me next time for another episode of ‘Think FaaS.’