Like Self-Driving Cars, Fully AI-Automated Sysadmins Don't Exist
The Society of Automotive Engineers has defined six levels of autonomous driving, ranging from Level 0—where the driver is responsible for everything—to Level 5—where a car performs all driving tasks under any conditions from Point A to Point B. The same spectrum can be used with system administration tasks to determine where and to what extent AI should be leveraged.
If we apply the SAE’s levels to system administration tasks, it would look something like this:
- Level 0: No Automation
- Level 1: Assistance Required
- Level 2: Partial Automation
- Level 3: Conditional Automation
- Level 4: High Automation
- Level 5: Full Automation
I thought about the sysadmin tasks ripest for AI-based automation and leveled them based on this spectrum. It’s important to note that all of this is a snapshot in time. As AI technology matures—and organizations’ comfort level with AI increases—what’s a Level 2 today may be a Level 3 or 4 tomorrow.
As you might expect, none of the tasks I focused on landed at Level 0 or Level 1. As with cars, few system administration tasks involve little to no automation. You could point to something like racking servers and untangling cables, but AI will never help detangle a Clark Griswold-level cable ball.
You’ll also notice that there are no Level 5 tasks—yet. (More on that later.)
I’m open to discussion and even argument on what I’ve come up with. I would also be really interested to hear how sysadmins would categorize these and other functions, as well as how they see the sysadmin role changing as AI matures.
System shutdown: In the traditional world, this could be a server shutdown, but in a modern, cloud-native world, it could be the shutdown of a critical application, load balanced across thousands of containers running on hundreds of worker nodes. Either way, a human needs to be involved at a high level. There are a variety of reasons for initiating a shutdown, but humans should always be the ones driving them. At most, system shutdown should be Level 2 on the autonomy scale. AI can help suss out behavioral anomalies or security threats. A "driver assistance" feature might prompt:
- “Are you really sure you want to shut that down?”
- “I noticed a couple of containers didn’t shut down correctly and were still serving traffic”
- “A critical task is hanging, and data hasn’t been flushed to disk, so shutting down now could cause database corruption”
It's sort of like lane keeping. AI has the potential to enlighten the user about the subtasks that are happening, and their status, in a completely new and transparent way. But the decision to initiate a shutdown should come only after a human has verified an issue and authorized defensive actions. Feet on the gas and brakes, if you will.
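To make that concrete, here is a minimal sketch of what such a Level 2 assist could look like for a Kubernetes-hosted application. It only gathers pod status via kubectl and prints warnings; the namespace name is hypothetical, and the script never triggers the shutdown itself.

```python
"""Minimal sketch of a Level 2 "driver assistance" check before a shutdown.
Assumes kubectl is installed and configured; the namespace below is
hypothetical. Nothing is shut down here: the human stays in the driver's seat."""
import json
import subprocess
import sys

NAMESPACE = "checkout"  # hypothetical namespace, for illustration only

def get_pods(namespace):
    """Fetch the namespace's pods as parsed JSON via kubectl."""
    out = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace, "-o", "json"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(out.stdout)["items"]

def shutdown_warnings(pods):
    """Surface the anomalies a human should see before authorizing a shutdown."""
    warnings = []
    for pod in pods:
        name = pod["metadata"]["name"]
        phase = pod["status"].get("phase", "Unknown")
        ready = any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in pod["status"].get("conditions", [])
        )
        if ready:
            warnings.append(f"{name} is still Ready and may be serving traffic")
        elif phase not in ("Succeeded", "Failed"):
            warnings.append(f"{name} is in phase {phase}; check it before proceeding")
    return warnings

if __name__ == "__main__":
    for w in shutdown_warnings(get_pods(NAMESPACE)):
        print(f"WARNING: {w}")
    answer = input("Are you really sure you want to shut this down? [y/N] ")
    sys.exit(0 if answer.strip().lower() == "y" else 1)
```

The point of the sketch is the division of labor: the tool surfaces the "are you sure?" context, and the human makes the call.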
Repairing system issues: Using AI to diagnose issues and then automatically fix those issues is a promising use case, but still a Level 2. I’ve had conversations with colleagues who used agentic AI to determine whether a set of pods in Kubernetes was healthy and recommend tools to use to fix them if they weren’t. At this point, we’re staying away from automatic fixes because the prospect is a little bit terrifying, but it’s something we may see in the future. If you basically control the inputs and the outputs—for example, saying, “Here’s a set of tools you can use, and here are the things you can do with them”—AI is really good at figuring it all out. These capabilities could eventually be used to support safe automatic repairs, but might require some modifications to existing utilities.
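As a rough illustration of that "control the inputs and the outputs" idea, the sketch below constrains a hypothetical agent to a small allow-list of read-only kubectl commands. The ask_model function is a stand-in for a real model call, so the agent can recommend the next diagnostic step but can't repair anything on its own.

```python
"""Minimal sketch of constraining an agent to an explicit tool set:
"Here's a set of tools you can use, and here are the things you can do
with them." Every tool is read-only, so the agent diagnoses but never fixes."""
import subprocess

# Allow-list: the only commands the agent may run, all read-only by design.
ALLOWED_TOOLS = {
    "get_pods": ["kubectl", "get", "pods", "-o", "json"],
    "get_events": ["kubectl", "get", "events", "--sort-by=.lastTimestamp"],
    "describe_pod": ["kubectl", "describe", "pod"],  # pod name passed as an argument
}

def run_tool(name, args=None):
    """Execute an allow-listed tool and return its output; reject anything else."""
    if name not in ALLOWED_TOOLS:
        raise ValueError(f"tool {name!r} is not on the allow-list")
    cmd = ALLOWED_TOOLS[name] + list(args or [])
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

def ask_model(prompt):
    """Stand-in for a real agentic-AI call. A real agent would return JSON
    like {"tool": "...", "args": [...]}; here it returns a canned choice."""
    return {"tool": "get_events", "args": []}

if __name__ == "__main__":
    pods = run_tool("get_pods")
    decision = ask_model(f"Given this pod list, which tool should run next?\n{pods}")
    # The output stays advisory: the agent recommends, a human does the fixing.
    print(run_tool(decision["tool"], decision["args"]))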
Powering shells: Language models are being integrated into shells and CLIs, helping with commands and flags that are very difficult to remember, much less understand.
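Here's a minimal sketch of what that might look like, assuming a generic ask_model hook rather than any particular shell integration or vendor API: the helper proposes a command for a plain-English request and runs it only after the operator confirms.

```python
"""Minimal sketch of an LLM-assisted shell helper. ask_model is a stand-in
for a real model call, not any particular vendor's API."""
import shlex
import subprocess

def ask_model(prompt):
    """Stand-in for a real LLM call; replace with your provider of choice.
    Returns a canned command here so the sketch runs end to end."""
    return "df -h"

def assisted_shell(request):
    command = ask_model(f"Return one POSIX shell command, no commentary, for: {request}")
    print(f"Proposed command: {command}")
    # The human stays in the loop: nothing runs without confirmation.
    if input("Run it? [y/N] ").strip().lower() == "y":
        subprocess.run(shlex.split(command), check=False)

if __name__ == "__main__":
    assisted_shell("show free disk space in human-readable units")
```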