Posts Tagged Accountability
The one question that every CIO should ask themselves… What are you going to do when (not if) your cloud systems fail?
Frank started the conversation with this response to my tweet about Azure:
Frank: “Exactly the type of thing that reinforces CIO fears about cloud…”
Stuart: “working on the assumption that cloud outages are inevitable… I feel it’s how vendors respond that will give CIO’s confidence”
Frank: “No, fewer outages will give confidence…”
Stuart: “I’ll meet you half way… Fewer outages and proper service management around problems when they do happen…”
Frank makes the point that some of his CIO contacts were livid following this outage. And this is where this post really starts, as I challenged Frank as to exactly who they were livid at, on the basis that overall accountability for a company’s IT systems, whether they be on premise or in the cloud, lies with the CIO.
Stuart: “as CIO you’re accountable for everything as you choose to use cloud or not!”
Alongside the Azure thread there was a parallel thread running on cloud security that had been started by Dennis Howlett in his Accman blog.
“Anything that connects to a network is vulnerable. That includes EVERY cloud player, regardless of the service they offer. What matters is the extent to which vulnerabilities exist AND are capable of exploitation.”
Let me share my belief here: these two topics are intrinsically linked. When you’re appointed as a CIO you’re trusted to deliver competitive advantage for your company through IT. Now, it doesn’t take a rocket scientist to work out that if you can’t maintain availability and adequate security of your systems then you’ll only manage to deliver disadvantage, and you probably won’t be around very long.
So, let’s get back to the title of the post… what are you going to do when your systems fail (which is inevitable)?
If you’re running in house, the apps themselves (if they are decent apps) are the least likely to fail; failures more often come from switches, disks, networks, cables and other parts of the infrastructure. You protect yourself against this by designing your datacentre(s) around redundancy with zero single points of failure.
If you’re running cloud services, you pick a reputable supplier who works with a reputable hosting partner, right? Well, yes, but as we saw with Azure yesterday (and previously with Amazon and Rackspace and most other reputable cloud vendors) the same hardware failure points exist in cloud provider datacentres as they do in your own. If you appreciate and accept this then you’ll also be mindful that you could be introducing a single point of failure in your enterprise platform and that your service availability is now at the mercy of their service availability.
When you’re running outside of your own bricks and mortar you also need a high bandwidth and high availability WAN, firewalls, proxies, etc., all of which need to be fault tolerant and designed around redundancy to ensure adequate access and security at all times. Even then you can’t mitigate against someone digging up the cable, which has happened to me twice this year and is more common than you might expect.
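To put some rough numbers on the single point of failure argument, here is a minimal sketch in Python. It assumes failures are independent, which is optimistic (two WAN links sharing one duct both go when the cable is dug up), and every availability figure in it is a made-up illustration, not a vendor SLA. The function names are my own.

```python
from math import prod

HOURS_PER_YEAR = 24 * 365


def serial_availability(components):
    """Availability of a chain where every component must be up at once."""
    return prod(components)


def redundant_availability(replicas):
    """Availability of independent replicas where any single one is enough."""
    return 1 - prod(1 - a for a in replicas)


def downtime_hours_per_year(availability):
    """Expected hours of outage per year for a given availability."""
    return (1 - availability) * HOURS_PER_YEAR


# Hypothetical figures, for illustration only.
cloud = 0.999
wan = 0.995
firewall = 0.999

# One cloud provider reached over one WAN link through one firewall.
single_path = serial_availability([cloud, wan, firewall])

# Same chain, but with two independent WAN links.
dual_wan = serial_availability(
    [cloud, redundant_availability([wan, wan]), firewall]
)

print(f"single path: {single_path:.5f}, "
      f"{downtime_hours_per_year(single_path):.1f} h/year down")
print(f"dual WAN:    {dual_wan:.5f}, "
      f"{downtime_hours_per_year(dual_wan):.1f} h/year down")
```

The point to take away is that a chain is always less available than its weakest link, so every dependency you add between your users and the service (provider, WAN, firewall, proxy) eats into the number you can promise the business, and redundancy only helps where the duplicated parts really do fail independently.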
Is this a story of cloud bashing? No it isn’t, it’s a story of how the CIO needs to take full accountability for managing risk within their platform.
- If you’re running mission critical systems and your business can’t afford any outage then you simply can’t design a single point of failure into your enterprise platform.
- If you’re running non mission critical systems, then you may choose to take a little more risk around availability and accept a single point of failure and manage any disruptions that may arise.
What you deem to be mission critical or not is your own decision and it doesn’t have to be one or the other. For my part I run a hybrid platform where some parts are mission critical and some parts less so and the platform design and location of services (in house vs. cloud) reflects this.
Of course, from a customer perspective, people outside of IT expect things to work 100% of the time, and whether you’re running either of the above or a combination, any outage, no matter what the cause, damages your credibility with users.
So as an effective CIO, you need to design an effective platform around what your business needs, you need to manage the risk, you need to pick the suppliers that you work with, and you need to take full accountability when things go wrong. Yes you can get livid with your suppliers, but just remember who picked them and remember who chose to introduce a single point of failure into your platform in the first place.
So, what are you going to do when (not if) your cloud systems fail? Make sure you know the answer today.
Footnote: This post relates to large enterprise businesses and the role of the CIO and the point I’m trying to make is you have to plan for failure to guarantee success.
Part of this cross posted here
If you follow my blog you’ll be well aware of the recent trials and tribulations I’ve been having with Dell. I suppose, like many others before me, I innocently thought that when I placed the order I was buying from Dell and that they would take accountability for my order until it was delivered. Sounds sensible, doesn’t it? Well, from the many conversations I have had with them during this debacle, this obviously wasn’t the case. For example, here are some quotes I got from their call centre that made me think otherwise:
“Our customer service team is not to blame for mismanaging your expectation; the delay to your order was a supplier issue and it was beyond our control.” – and who picks the supplier?
“Your order is delayed because of a high level of demand at this time of the year.” – I always thought supply was a vendor issue not a customer issue?
“Delivery is now with UPS and you should talk to them about the status of your order.” – erm, I thought my order was placed with Dell not UPS?
It’s quite obvious here that Dell didn’t take ‘end to end’ accountability for my order and therefore the customer experience I received was abysmally poor.
Another example I’ve had recently is with Next Directory. If you’re from the UK you’ll know of them as the catalogue outlet part of a high street chain of clothing stores, right… well I thought the same but it turns out I was wrong!!! In this scenario I ordered something for collection at one of their high street stores. They sent me an email to say that it was ready for collection, so off I go to collect. When I arrive it isn’t there; there has been a delay, and this is what I’m told by the Next shop assistant:
“Sorry your order isn’t here it’s because Next Directory isn’t part of Next and so it’s not under our control.”
Sigh! A simple “hang on a minute, I’ll check with the Next Directory people for you” would have turned this into a good experience instead of a poor one.
In both cases there was a clear lack of accountability. If someone had taken accountability instead of saying “it’s not my job”, they could have dramatically changed my experience.
Now compare this to Amazon (Amazing), and a department store called John Lewis in the UK… they both have amazing customer experience because no matter who you talk to they look after you and their people make sure your issue is resolved whether it is their primary role or not.
The lesson here is very simple… no matter what role you have in a company, you are accountable for customer experience… do everything you possibly can, and then a bit more, to make sure your customers have a good one.
Anybody could have done it, but Nobody did it.
Somebody got angry about that, because it is Everybody’s job.
Everybody thought that Anybody could do it but Nobody realised that Everybody wouldn’t do it.
It ended up that Everybody blamed Somebody when Nobody did what Anybody could have done.
… and the moral of the story is to take accountability
A few weeks back I wrote this post on How to make Agile work for Product Development. As a build on that post I’d like to share how I’ve used Agile to successfully deliver ‘Business’ challenges outside of Product Development.
Consider this scenario: there is a problem in the business and, despite many efforts to solve it, the situation continues to get worse. I’m sure you’ve come across this before. It normally happens when a lot of people are involved in solving the problem but they aren’t working for a common purpose; they are trying to resolve the symptoms rather than the cause, and very often their efforts cancel each other out or make the problem worse.
So how can Agile help?
Well the Scrum process works just as well in this situation as it does for Product Development; here is the process I’ve used:
- Establish a Framework: Break the problem down into its component parts, identify the deliverables and the priority of each deliverable. Sound familiar, yes? This is our backlog.
- Identify Clear Focus and Accountability: Chunk the backlog up into a number of time bound sprints, identify clear accountability for who’s going to lead each sprint, who’s going to work on each sprint, and what needs to be accomplished in each sprint.
- Trust the process and the people: Let the sprint team get on with it. If you’ve given accountability to the right people then you don’t need to do their job. Instead provide support and help them by removing any barriers and keep distractions away.
I know what you’re thinking… Does this mean you’re abdicating accountability? Absolutely not, you still have accountability for the entire thing and you need to be there if the sprint team need you. Does this mean that you won’t know what’s going on? Absolutely not, Scrum is an open process… anyone can drop into a sprint meeting to hear what’s going on, but remember not to interfere if you’re not part of the sprint team.
As with Product Development, it’s a good habit for the sprint team to use their Friday review meeting for a more formal progress update, i.e. if you have a short project with one week sprints, this will be the end of sprint retrospective meeting, if you have two weekly sprints then this would be the mid sprint review, and so on. Personally, I favour Friday as it’s a good way to celebrate progress at the end of the week and to retain focus on what the plan is for the next week. This is also a good technique to kill two birds with one stone, e.g. if you need to provide visibility to an external or higher level sponsor.
This really is a very simple and effective process which I’ve used countless times to very good effect. However, as with Product Development, the formula for success is all down to having a clear framework that everyone understands, a strong focus always, and good people who are willing to take accountability.
There is of course a fourth ingredient to the formula that I haven’t mentioned here, anyone like to tell me what it is?