|
#1
|
|||
|
|||
|
For reasons still unknown to us, the g35 server is having severe issues. Several services appear to have become corrupted, and the primary boot partition will not let us enter runlevel 3 (multi-user mode). We are working with our datacenter to resolve these issues as quickly as possible, and three of our best Level 3 System Admins are working together to find the root cause of the problem.
We apologize for the inconvenience and frustration our users are experiencing, and I'd like to reassure everyone that we are doing everything in our power to bring this server back to 100%. As soon as we have more information we'll relay it to everyone. For now, please do not submit a trouble ticket for an outage on g35; instead, please wait for us to update this forum thread with our status. Once we've declared the issue resolved, if your site is still not functioning then by all means please submit a support ticket. Again, our apologies for this downtime, but the circumstances are beyond our control. If anyone has any questions or concerns please do not hesitate to contact us! Best Regards, Robert Stone |
|
#2
|
|||
|
|||
|
Just an update...
We are still having trouble even getting this server to boot. At this point we are booting from a "Live CD", which is an entire OS (Operating System) on a single CD. The difficulty is that very few distros (distributions of the Linux operating system we use) include the 3ware RAID drivers, so we had to use a special Live CD in order to mount the hard drives into the live filesystem.
Once the server comes up via the Live CD we will "chroot" into the original environment and attempt to repair the operating system from there. If we are unable to do so we will "rsync" (a method of copying files that is designed for backups) all the data onto an emergency server and use that while we repair g35. We are all keeping our fingers crossed that we'll be able to repair the OS, and we'll let everyone know as soon as a decision is made on how to proceed. Again, our profuse apologies, and our thanks for your patience. Expect further updates as new information arrives. Best Regards, Robert Stone |
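For readers curious what the chroot-then-rsync approach described above looks like in practice, here is a rough sketch in Python. It only builds and prints the commands; nothing is executed. The device name `/dev/sda1`, the mount point `/mnt/g35`, the host `emergency.example.com`, and the choice to copy `/home` are all placeholders for illustration, not details from the post, and the actual procedure used on g35 may well have differed.

```python
import shlex


def recovery_plan(root_dev, mount_point, backup_host):
    """Return (repair_steps, fallback_steps) as argv lists; nothing is run."""
    repair_steps = [
        # Mount the damaged system's root partition from the Live CD.
        ["mount", root_dev, mount_point],
        # Bind-mount kernel filesystems so tools inside the chroot can work.
        ["mount", "--bind", "/dev", f"{mount_point}/dev"],
        ["mount", "--bind", "/proc", f"{mount_point}/proc"],
        ["mount", "--bind", "/sys", f"{mount_point}/sys"],
        # Enter the original environment to attempt the repair.
        ["chroot", mount_point, "/bin/bash"],
    ]
    fallback_steps = [
        # If the repair fails, copy data off while preserving ownership,
        # permissions and timestamps (-a) and raw UID/GID numbers.
        ["rsync", "-a", "--numeric-ids",
         f"{mount_point}/home/", f"{backup_host}:/home/"],
    ]
    return repair_steps, fallback_steps


if __name__ == "__main__":
    # All three arguments are hypothetical values.
    repair, fallback = recovery_plan(
        "/dev/sda1", "/mnt/g35", "emergency.example.com"
    )
    for argv in repair + fallback:
        print(shlex.join(argv))
```

Printing the plan instead of running it keeps the sketch safe to try on any machine; a real recovery would run these as root from the Live CD after confirming the correct device.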
|
#3
|
|||
|
|||
|
We have found the cause of the problem: several key system files have become corrupted. We are in the process of restoring these system files in order to bring the server back online as quickly as possible.
Once we've finished restoring the system files we're going to bring the server back online and begin to rsync all data off of it to an emergency server. The reason is that whenever there is system file corruption, the safest course of action is an "OS Reload"; otherwise system stability may suffer (as we've all seen). The server will be active and processing requests while we create our backup, so downtime should be minimal once we've repaired the operating system. We will update everyone once the server is online again. Again, we appreciate everyone's patience during this difficult time. Best Regards, Robert Stone |
|
#4
|
||||
|
||||
|
Quote:
And even as I write this, I can see my sites are slowly coming back on line... Thanks again.. Dwight Jenkins
__________________
=================== Dwight Jenkins Rainbow Flair Web Design =================== |
|
#5
|
|||
|
|||
|
Another update...
At this point we have decided that the operating system on the g35 server simply cannot be saved. However, we are very relieved to report that all customer data is intact and free of corruption. We are running some websites on g35 via the Live CD, but this is not a solution that will hold for long as the load on the server rises.
So where are we now? We've loaded a fresh OS on an emergency server and are currently copying files from g35 to the emergency box. As soon as this is complete we'll set the ARP information so that the emergency server will act just like g35. For those of you unaware, ARP (Address Resolution Protocol) is what maps an IP address to a MAC address, which is how individual NICs (Network Interface Cards) are identified and how switches deliver traffic to the right machine on the local network. Once we perform the adjustments, all accounts should be fully restored and everything should be back online.
From there everything else should be transparent to our users. We'll continue to make the repairs needed, but a final decision hasn't been made as to whether we'll physically swap the servers or restore g35 and migrate the data back over. As soon as we know more we'll pass that info along. There is a light at the end of the tunnel, folks, and we're almost there. Thank you again for your patience and understanding during this difficult time. Best Regards, Robert Stone |
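To make the ARP explanation above a little more concrete, here is a toy model in Python of the IP-to-MAC table that a router or neighbouring machine keeps. It is only an illustration of the idea: the post does not say how the ARP change was actually made, and the IP and MAC addresses below are placeholder values, not g35's real ones.

```python
class ArpTable:
    """Toy ARP cache: maps an IP address to the MAC address last seen for it."""

    def __init__(self):
        self._entries = {}

    def learn(self, ip, mac):
        # A newer announcement for the same IP replaces the old entry.
        self._entries[ip] = mac.lower()

    def resolve(self, ip):
        # Returns None when no MAC address is known for this IP.
        return self._entries.get(ip)


# Placeholder values for illustration only.
G35_IP = "192.0.2.35"
OLD_SERVER_MAC = "00:00:5e:00:53:01"
EMERGENCY_SERVER_MAC = "00:00:5e:00:53:02"

table = ArpTable()
table.learn(G35_IP, OLD_SERVER_MAC)
before = table.resolve(G35_IP)

# The emergency server takes over g35's IP; neighbours update their entry.
table.learn(G35_IP, EMERGENCY_SERVER_MAC)
after = table.resolve(G35_IP)

print(f"{G35_IP}: {before} -> {after}")
```

The IP address visitors use never changes; only the hardware address behind it does, which is why the switch should be invisible to users once the tables are updated.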
|
#6
|
||||
|
||||
|
Further update:
We have Apache and DNS up on the server now. Sites are working, but mail is not up yet. We are in the process of bringing everything else up and will keep posting updates here.
__________________
Shashank Wagh Systems Administrator & Level III Support, Hostgator.com LLC. Find us @ http://www.HostGator.com/help/ |
|
#7
|
|||
|
|||
|
The post saying that Apache and DNS are up and that sites are now working (minus mail) was made 2 1/2 hours ago. All of our sites are still down, including two very critical ones (bransonwindmillinn.com and gammill.net).
|
|
#8
|
|||
|
|||
|
Come on guys, DO SOMETHING. This is the 5th time this month that I have had downtime. The last one was about an hour, but today the server has been down almost all day!
|
|
#9
|
||||
|
||||
|
Quote:
Thanks, Dwight
__________________
=================== Dwight Jenkins Rainbow Flair Web Design =================== |
|
#10
|
||||
|
||||
|
The server transfer is around 80% completed and should be fully completed within the next few hours. We will keep this thread updated when we have more information.
|
|
#11
|
||||
|
||||
|
The transfer is now complete and the new server is online. Please email support@hostgator.com if you see any issues with your accounts.
|
|
#12
|
|||
|
|||
|
Hopefully our final update.
We've noticed a few routing problems affecting users who have "Private IP Addresses". These are mostly users who have SSL on their sites. If anyone is still experiencing downtime, please send a message to support@hostgator.com and let us know if you fall into this category. Best Regards, Robert Stone |