Re: Network Outage: Provo
While almost all services have been restored as of this time, I'd like to first note that we're still working on tying up the final loose ends. These remaining issues are absolutely a priority for us. In the meantime, we want to provide some more information and answer everyone's questions with the details we have available so far.
Q: What happened?
A: We experienced a degradation of network service in one of our data centers due to a firmware bug in one of our vendor’s hardware solutions. This was an undocumented bug and we worked with our partner to diagnose the issue and deployed a firmware update to the systems to remediate the problem. Only websites that were being served by this hardware were affected.
Q: Was this related to any previous outage?
A: No, this is unrelated to any previous outages.
Q. Have you identified the problem?
A. Yes, we have isolated the problem to this firmware failure and the downstream effects that resulted from it. We have reviewed our entire network to make sure this problem will not occur elsewhere.
Q. Why did it take so long to address the problem?
A. We started to address the problem as soon as we began to see performance issues. The root cause was difficult to diagnose because it was an undocumented bug in the firmware of a vendor’s hardware solution. Full service for some customers was restored immediately, but some servers were not visible on our network. We apologize for any downtime that you experienced. The servers continued to operate during this entire period, which means that at no point was your data at risk. The problem was access to the servers, which the firmware issue prevented.
Q. What happened to any email that was sent to me while this firmware issue was affecting the network?
A. There is good and bad news. Unfortunately, any message that was sent to you while we were experiencing this issue would not have been delivered at that time. However, the sender should receive a notice that their mail wasn't delivered, and most mail servers will continue to try to re-send that email at periodic intervals for anywhere from 2 to 7 days. While we cannot guarantee that every email sent to you will be delivered, there is a very good chance that it will arrive, slightly delayed.
Q: How has Endurance's involvement with HostGator affected the situation?
A: Actually, this was not a result of Endurance's involvement. In fact, the team at our corporate headquarters was tremendously helpful in our recovery effort. They stayed with us throughout the entire incident. By committing the resources of the entire company, including technicians, customer service reps, and engineers, we were able to swarm the problem and address it as quickly as possible.
Q. Why did you leave SoftLayer?
A. We moved out of SoftLayer so that we could more fully control our server environment and provide a better customer experience. We work really hard to prevent issues like this from happening. We recognize that this transition has not been as smooth as either you or we would like, and we take the issues that have occurred very seriously. We believe that, in the long run, this is the best environment to deliver service to you.
Q. Do I have to worry about this happening again?
A. We would like to say that we will never have a network service outage again, but realistically that isn’t something we can promise. What we can assure you is that we are continually taking steps to audit and improve the performance of our infrastructure, and that we are investing significant capital and staff to do this.
Last and certainly not least, I want to thank everyone for your extreme patience throughout this. We realize the situation is hugely frustrating, but we look forward to getting this resolved for you all and hopefully moving forward stronger.
Director of Customer Service
Have support questions?
Check out our Knowledgebase: http://support.hostgator.com/
Last edited by GatorJMartin; 04-17-2014 at 11:35 AM.