System Administration - Tactical vs Strategic

dwilson on Fri May 25 2007 09:58:58 GMT+0100 (GMT)

Contrary to what my previous sysadmin blog posts might suggest, we (unfortunately) don't spend all our time in the office trying new software and evaluating new hardware - even we need a lunch break. Instead we're working behind the scenes on Zimki and our other websites to keep things running happily, teaming up with our developers to answer any support issues that you lovely customers send our way and, sometimes, even completing milestones in our longer-running projects. Honest.

Despite the often-recited "herding cats" analogy, creating an operationally efficient systems team is a pretty straightforward thing to do. Notice that I didn't say it'd be easy. It mostly requires an understanding that our workload has two main forms: tactical and strategic. I'm not including firefighting here - that's a topic for something longer than even one of my blog posts.

Tactical work is what most sysadmins spend their days doing: helping customers (a much nicer term than users), fixing problems that appear, making small tweaks and changes, and so on. These tasks are often important to other people but they rarely help us achieve our own goals or complete our project work. The projects themselves, which are the strategic part of the workload, are every bit as important as user requests - they're just not as visible.

At Fotango the systems team is currently four people (we're looking for a fifth) and the work breakdown on a typical day looks a lot like this:

One person monitors the request tracking system. We use it to track all internal requests as well as customer issues escalated upwards by our excellent (and very patient) front-end support.

We've found that people have an expected response time for tasks. By assigning a dedicated person we keep our response time low while not constantly interrupting anyone working on a more involved task. The systems support person can also help with other, less focus-demanding tasks.

The second tactical role is the on-call bunny. She's first line for issues that crop up from our systems themselves. Problems detected via Nagios, suspicious lines in logs, performance bottlenecks and load spikes are all part and parcel of this role.

The other half of the team mostly work on our longer-term projects, attend meetings that require a sysadmin to be present, or perform daily maintenance. These are often concentration-demanding tasks (apart from the meetings) that are made much easier by having the other two provide an interruption shield. Of course, a big problem will drag them back into the trenches, but there shouldn't be enough big problems to make this a real issue.

So now you know it's not all glamour in the systems team. Next time your page loads quickly and with no problems, spare a thought for the effort we've put in so you don't have to.
