Following O2’s second widespread outage in recent months, they have been quick to regroup and initiate damage limitation. In an attempt to deflect the blame, they have claimed that equipment supplied by the Swedish company Ericsson caused the fault.
The outage hit on the afternoon of Friday 12 October and left about three million O2 users without service for several hours. Contrary to initial reports, all services were affected, including data as well as calls and texts. The issue persisted through the busy Friday night period and, in some cases, continued for days, ruining many people’s weekend plans. The previous outage occurred just a few months earlier, in July, lasted about 24 hours and was caused by a similar issue.
As a result of the outage, Derek McManus, Chief Operating Officer, wrote that O2 would be removing the Central User Database provided by Ericsson, at a cost of £10 million. It’s not exactly clear whose equipment they will use instead, but for the time being they will roll back to the previous solution.
The outage was caused by an issue with the Central User Database, the component provided by Ericsson and introduced earlier this year. It forms part of the HLR, or Home Location Register: the part of the network that holds information about users and communicates with the central Mobile Switching Centre, which acts like the main router and controls all services across the network. Without a functioning Central User Database, the HLR breaks down and the Mobile Switching Centre can’t do its job, so the network effectively stops working for affected customers.
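For the more technically minded, the chain of dependencies described above can be sketched as a toy model. This is purely illustrative and is not how O2's or Ericsson's systems are actually built: every class and method name here is hypothetical, and the optional standby database is an assumption added only to show what a fallback would change.

```python
class DatabaseUnavailable(Exception):
    """Raised when a subscriber database cannot be reached."""


class CentralUserDatabase:
    """Toy stand-in for the subscriber store behind the HLR (hypothetical)."""

    def __init__(self, subscribers, healthy=True):
        self.subscribers = dict(subscribers)
        self.healthy = healthy

    def lookup(self, number):
        if not self.healthy:
            raise DatabaseUnavailable("central user database is down")
        return self.subscribers.get(number)


class HomeLocationRegister:
    """Answers subscriber queries by asking its database(s) in order."""

    def __init__(self, primary, standby=None):
        # The standby is an assumption for illustration only.
        self.backends = [db for db in (primary, standby) if db is not None]

    def find_subscriber(self, number):
        for db in self.backends:
            try:
                return db.lookup(number)
            except DatabaseUnavailable:
                continue  # try the next backend, if there is one
        raise DatabaseUnavailable("no subscriber database reachable")


class MobileSwitchingCentre:
    """Sets up service for a subscriber, but only if the HLR can answer."""

    def __init__(self, hlr):
        self.hlr = hlr

    def connect(self, number):
        try:
            profile = self.hlr.find_subscriber(number)
        except DatabaseUnavailable:
            return "no service"
        if profile is None:
            return "unknown subscriber"
        return "connected"


if __name__ == "__main__":
    subs = {"07700900001": {"plan": "monthly"}}
    broken = CentralUserDatabase(subs, healthy=False)
    working = CentralUserDatabase(subs)

    # A single failed database takes the subscriber offline.
    print(MobileSwitchingCentre(HomeLocationRegister(broken)).connect("07700900001"))
    # With a working standby, the same failure is absorbed.
    print(MobileSwitchingCentre(HomeLocationRegister(broken, working)).connect("07700900001"))
```

In this sketch a single unhealthy database is enough to leave a customer with no service, while the same failure goes unnoticed when a second database is available to answer the lookup.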
Despite the removal of the faulty Ericsson equipment, pressing questions remain for O2. Firstly, if the fault is with Ericsson’s hardware and software, why did they not identify the potential issues before contracting it and installing it into their network? Did they not perform their due diligence, or did they just go with the lowest bidder?
Secondly, and even more damningly, if inadequate failover systems and network redundancy were identified as being at fault for the first outage in July, why was this not addressed until now? Why was it not immediately fixed? It’s telling that, despite experiencing a major fault like this, O2 attempted to patch over the cracks and hope for the best rather than implement a complete and final fix.
Some of you may remember this tweet from O2 CEO Ronan Dunne last July:
To all our affected customers – I'm very sorry. The network is back. My focus now is restoring your confidence and trust in O2.
— Ronan Dunne (@ronandunneo2) July 12, 2012
Especially after claiming that they were trying to rebuild customers’ confidence, it seems strange that they didn’t plan for issues such as this and left faulty hardware in the system.
Even worse than this happening again is the conspicuous absence of an apology this time round. Quite how O2 think it’s reasonable to have an outage of this scale and length and not even have the decency to apologise (let alone offer compensation) really beggars belief.
What do you think? Were you affected? Was it a mistake not to fix this permanently after the first outage? Should they have offered compensation? And what do you make of the failure to say sorry to those affected?