  #51  
Old 05-01-2005, 05:47 PM
faze1 faze1 is offline
Hatchling Croc
 
Join Date: Sep 2004
Posts: 17
Default Re: mustang drive failure 67.19.170.34

I'm on the mustang server and I'm still not back up from this episode. I was out of town yesterday and luckily I did not hear anything from any of my customers until this morning.

All my sites are still down and I cannot access email. I offer my customers 99.9% uptime, and at this point I have lost all my revenue for this month and it hasn't even started yet. How can my sites be down while everyone else seems to be back up?

I left a message on voice mail and no one has been online for support. I'm really hoping I'm back online soon so I can assess the data loss.

Thanks,
  #52  
Old 05-01-2005, 05:48 PM
garymchu garymchu is offline
Hatchling Croc
 
Join Date: Apr 2005
Posts: 9
Default Re: we shutdown Mustang due to drive errors....

Guess what: you guys offered us a month, which is nothing compared to what we have to refund our clients. We know that one well.

Then of course there are any clients that are lost, and then we start multiplying permanently lost monthly income, because "The Planet are worthless when it comes to hardware".

You were told an hour; just like yesterday, it's well over that now. My support system is going mad and I have nothing to tell my clients. Once is OK, but 2 days on the trot? Sheesh.

Incidentally, I talked to The Planet just now on the phone, and they tell me that a drive replacement shouldn't take long to rectify.

I'm dumbfounded. I will certainly never pay for hosting for a year again.
  #53  
Old 05-01-2005, 05:54 PM
scavok scavok is offline
Hatchling Croc
 
Join Date: Apr 2005
Posts: 14
Default Re: we shutdown Mustang due to drive errors....

hostgator needs to go shopping

I'm glad I backed up everything last night after I reuploaded all my newest clients. Will I have to do that again?
  #54  
Old 05-01-2005, 05:57 PM
GatorBrent GatorBrent is offline
HostGator Staff
 
Join Date: Oct 2002
Location: houston, texas
Posts: 2,977
Default Re: we shutdown Mustang due to drive errors....

garymchu, we're in the same boat as you... Everything you said applies to us. You have just as much power as us to get this fixed. I've been on the phone all weekend with them, calling all my contacts.


Why wasn't it completely swapped when I requested yesterday? I don't know.
Why did they swap the wrong drive yesterday when we clearly told them which one? I don't know.

Why is this taking hours when swapping should take 30 min? I don't know.

What it looks like... They have staffing problems, and everyone there is doing the best job that they can but is overworked and underpaid.


What we are going to have to do... I'm going to have to fly to Texas and hire someone to handle our hardware problems correctly when they come up.
__________________
Gators love marshmallows.
  #55  
Old 05-01-2005, 05:58 PM
faze1 faze1 is offline
Hatchling Croc
 
Join Date: Sep 2004
Posts: 17
Default Re: we shutdown Mustang due to drive errors....

Believe me, I understand that math. I am in a similar situation; the credit for this month does not do much to offset the money I need to reimburse my clients.

I'm praying there is minimal data loss, as almost every client site is still in the process of development and is being worked on constantly. Explaining downtime is one thing; explaining a week's worth of work down the drain is another.

Please give us a realistic ETA soon, so we have something to tell our clients.

Thanks,
  #56  
Old 05-01-2005, 06:02 PM
GatorBrent GatorBrent is offline
HostGator Staff
 
Join Date: Oct 2002
Location: houston, texas
Posts: 2,977
Default Re: we shutdown Mustang due to drive errors....

It is back online now. I'm waiting for news to make sure this will not happen again and to confirm the real problem has been fixed.
__________________
Gators love marshmallows.
  #57  
Old 05-01-2005, 06:04 PM
GatorBrent GatorBrent is offline
HostGator Staff
 
Join Date: Oct 2002
Location: houston, texas
Posts: 2,977
Default Re: we shutdown Mustang due to drive errors....

They took it back offline; not sure why yet.
__________________
Gators love marshmallows.
  #58  
Old 05-01-2005, 06:12 PM
123456qwerty 123456qwerty is offline
Hatchling Croc
 
Join Date: Apr 2005
Posts: 5
Default Ping

Pinging 67.19.170.34 with 32 bytes of data:

Request timed out.
Request timed out.
Request timed out.
Request timed out.

Ping statistics for 67.19.170.34:
Packets: Sent = 4, Received = 0, Lost = 4 (100% loss),
  #59  
Old 05-01-2005, 06:32 PM
scavok scavok is offline
Hatchling Croc
 
Join Date: Apr 2005
Posts: 14
Default Re: we shutdown Mustang due to drive errors....

update? (sorry just a lil anxious here)
  #60  
Old 05-01-2005, 06:42 PM
GatorBrent GatorBrent is offline
HostGator Staff
 
Join Date: Oct 2002
Location: houston, texas
Posts: 2,977
Default Re: we shutdown Mustang due to drive errors....

Up again. I was told the RAID was replaced, so it looks like they are done with whatever they did. I'm trying to find out what exactly was done so we can be sure it is fixed this time.
__________________
Gators love marshmallows.
  #61  
Old 05-01-2005, 07:27 PM
GatorBrent GatorBrent is offline
HostGator Staff
 
Join Date: Oct 2002
Location: houston, texas
Posts: 2,977
Default Re: we shutdown Mustang due to drive errors....

"(npranivong-05/01/2005 19:18:00):
I have swapped out the 3ware RAID controller card and the 200GB Drive on port 0. 3Ware's did not report the drive to be non-usable, which is usually the case when a drive is bad. However, when I booted your server prior to the hardware change, your server was reporting

scsi1: AEN: Warning: Sector Repair Occurred: Port 0. This could be due to ECC errors on the HDD."


Here's what it looks like happened...

They replaced the bad drive with a brand-new drive that also was bad. How and why, we are investigating with them.
We should not see any more issues related to this going forward.
__________________
Gators love marshmallows.
  #62  
Old 05-01-2005, 07:52 PM
VPC VPC is offline
Baby Croc
 
Join Date: Sep 2004
Posts: 58
Default Re: we shutdown Mustang due to drive errors....

Apache no workie...sigh
  #63  
Old 05-01-2005, 08:02 PM
GatorBrent GatorBrent is offline
HostGator Staff
 
Join Date: Oct 2002
Location: houston, texas
Posts: 2,977
Default Re: we shutdown Mustang due to drive errors....

We were able to figure out exactly what happened!

The server was up and running fine up until four days ago, when one of the drives in the RAID failed. We had it scheduled to be replaced over the weekend, when traffic was at its lowest. By coincidence, the working drive also completely failed the same day the other drive was scheduled to be replaced.

When they took it offline they replaced the one drive, and that's when we were waiting the entire day for the fsck. That drive was completely bad and could not be salvaged. The technician who was working on it believed they replaced the bad drive with the same drive. But in fact they did not screw up! Both drives were bad!! One drive was completely unusable; the other was in the process of failing. That is the one that went down today and was replaced. It had enough life left in it to continue on and give us a full backup on another drive. However, it was the one erroring today, which we replaced.


Nobody screwed up; it's just that no one caught on that both drives had failed.

This is extremely confusing, so I'm not sure if anyone here will understand what I wrote, but here's a quick breakdown...

Two drives were operational: 0 and 1.
Drive 0 failed and the server was running on drive 1.
Drive 1 failed, leaving the server with both 0 and 1 failed.

They replaced one bad drive during the first outage, and then the other during the next, when we shut down the server.

We have only had a couple of drive failures out of the hundreds of servers we have to date. I have no idea what the odds are of 2 drives on the same server failing within days of one another!!! Crazy!
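
The breakdown above can be sketched as a small state model. This is only an illustration, and it assumes the array was a two-drive mirror (the thread describes the server running on one drive, but never names the RAID level). The names `ArrayState` and `mirror_state` are made up for this sketch and are not from any 3ware tool.

```python
from enum import Enum


class ArrayState(Enum):
    OPTIMAL = "optimal"    # every drive healthy
    DEGRADED = "degraded"  # still serving data, but no redundancy left
    FAILED = "failed"      # no healthy drive remains


def mirror_state(healthy_drives, total_drives=2):
    """Classify a mirrored array by how many member drives are still healthy."""
    if len(healthy_drives) == total_drives:
        return ArrayState.OPTIMAL
    if healthy_drives:
        return ArrayState.DEGRADED
    return ArrayState.FAILED


# Replay the sequence from the breakdown: drive 0 fails first, then drive 1.
healthy = {0, 1}
timeline = [("both drives operational", None),
            ("drive 0 fails", 0),
            ("drive 1 fails", 1)]

history = []
for label, failed_drive in timeline:
    healthy.discard(failed_drive)  # discarding None is a harmless no-op
    history.append((label, mirror_state(healthy)))

for label, state in history:
    print(f"{label}: {state.value}")
```

The point of the model is the middle state: a degraded mirror keeps working, which is why the first failure went unnoticed by customers, but any further failure takes the whole array down.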
__________________
Gators love marshmallows.
  #64  
Old 05-01-2005, 08:04 PM
GatorBrent GatorBrent is offline
HostGator Staff
 
Join Date: Oct 2002
Location: houston, texas
Posts: 2,977
Default mustang what really happened

We were able to figure out exactly what happened!

The server was up and running fine up until four days ago, when one of the drives in the RAID failed. We had it scheduled to be replaced over the weekend, when traffic was at its lowest. By coincidence, the working drive also completely failed the same day the other drive was scheduled to be replaced.

When they took it offline they replaced the one drive, and that's when we were waiting the entire day for the fsck. That drive was completely bad and could not be salvaged. The technician who was working on it believed they replaced the bad drive with the same drive. But in fact they did not screw up! Both drives were bad!! One drive was completely unusable; the other was in the process of failing. That is the one that went down today and was replaced. It had enough life left in it to continue on and give us a full backup on another drive. However, it was the one erroring today, which we replaced.


Nobody screwed up; it's just that no one caught on that both drives had failed.

This is extremely confusing, so I'm not sure if anyone here will understand what I wrote, but here's a quick breakdown...

Two drives were operational: 0 and 1.
Drive 0 failed and the server was running on drive 1.
Drive 1 failed, leaving the server with both 0 and 1 failed.

They replaced one bad drive during the first outage, and then the other during the next, when we shut down the server.

We have only had a couple of drive failures out of the hundreds of servers we have to date. I have no idea what the odds are of 2 drives on the same server failing within days of one another!!! Crazy!

I have closed the other two threads. If you have any questions, please post in this one.
__________________
Gators love marshmallows.
  #65  
Old 05-01-2005, 08:07 PM
VPC VPC is offline
Baby Croc
 
Join Date: Sep 2004
Posts: 58
Thumbs down Re: mustang what really happened

Yeah, no MySQL DBs... can't wait to hear this one.

Anyways, I am in IT and I have never told a client that we will let their server run on 1 drive until the weekend. I cannot imagine a client that would say "No, do not replace the drive ASAP and leave us vulnerable until we are not so busy."

Nice way to gamble with someone else's livelihood. Simply the dumbest decision I have seen in IT.

Last edited by VPC; 05-01-2005 at 08:11 PM.
  #66  
Old 05-01-2005, 08:11 PM
GatorBrent GatorBrent is offline
HostGator Staff
 
Join Date: Oct 2002
Location: houston, texas
Posts: 2,977
Default Re: mustang what really happened

It's fine. Some services are being overloaded from it being down so long and everyone slamming the box. Give it a little time to catch up.
__________________
Gators love marshmallows.
  #67  
Old 05-01-2005, 08:19 PM
GatorBrent GatorBrent is offline
HostGator Staff
 
Join Date: Oct 2002
Location: houston, texas
Posts: 2,977
Default Re: mustang what really happened

VPC running on one drive is fine. 99% of the hosts out there run on one drive 24/7/365


Very few companies have a RAID setup like we do. Two drives failing within a few days of one another is extremely rare. We also had a full backup on our disksync system that was taken 7 days prior.
__________________
Gators love marshmallows.
  #68  
Old 05-01-2005, 08:24 PM
VPC VPC is offline
Baby Croc
 
Join Date: Sep 2004
Posts: 58
Default Re: mustang what really happened

"VPC running on one drive is fine. 99% of the hosts out there run on one drive 24/7/365"

When someone chooses a company like HG that advertises RAID, and that company then says 1 drive is OK because others do it, that is exactly why people should not consider HG the standard of excellence. Thanks for pointing that out, Brent.

As for saying that 1 drive is OK, this weekend's events just kind of proved you wrong. Thanks again for pointing that out. Guess you will not consider changing out a drive next time and will roll the dice with clients' data again. Nice, Brent.

To address your backups: 7-day backups are nothing to promote. 7 days is a long time for dynamic MySQL websites to go back. They do make backup solutions that can handle the amount of data on the HG servers. If you want to boast about something, get nightly backups and make HG a leader in hosting, or simply keep saying "other hosts do the same".

Last edited by VPC; 05-01-2005 at 08:29 PM.
  #69  
Old 05-01-2005, 08:39 PM
davoice davoice is offline
Hatchling Croc
 
Join Date: Sep 2004
Posts: 13
Angry Re: mustang what really happened

So what we customers who got burned on this would like to see is:

1) A free or paid nightly backup service, so we aren't at the mercy of a 7-day interval.

2) An *easy* way for us to back up all the stuff we have - all the email configs, MySQL databases, etc. - on an ad hoc basis when we feel the need. (cPanel's backup util isn't quite there.)

3) Faster remediation and notification to customers of failed drives - and failures in general - in the future. I agree with the previous posters... as an IT Director myself, I would never let a failed drive go more than 24 hours. That is simply asking for trouble. I would also take a backup snapshot of the server as soon as you find out a drive has failed. That way you aren't left with your arseparts flapping in the wind if the 2nd drive fails before a replacement can be brought online and synced. Taking a backup when the first drive failed would have saved your butt (and all of ours) here. But we didn't have the luxury of taking a backup ourselves b/c we weren't warned a drive had failed.

4) Make a suitable apology - both written and financial - to all your customers affected by this issue. We've basically lost an entire 7 days of work due to this problem. I personally lost $450 of consulting time we had paid for getting our business' web trouble ticketing system up and running from Wed-Fri before the complete failure. For many customers that is equivalent to shutting their doors. While hardware failures aren't your fault, you are responsible for them. That's part of the liability of running a hosting business. Even though you can blame ThePlanet for assisting in you having a thick layer of egg on your face, it's still solely your job to clean it off.

- Daniel P.
  #70  
Old 05-01-2005, 08:52 PM
garymchu garymchu is offline
Hatchling Croc
 
Join Date: Apr 2005
Posts: 9
Default Re: mustang what really happened

I agree totally, am researching an alternative service.

Several things throughout this incident ruined my opinions of HG.

Technical issues aside:

1. "The planet are not good with hardware issues" (or something to that effect) Nice to know.

2. Call this petty if you wish, but what really blew me away was my request for this board to have more regular updates from admin being met with a copied and pasted apology, from live chat support. That was a complete and utter insult!

My support team have spent all this downtime communicating one on one with our clients, nightmare as that was, and not with copied and pasted replies.

3. The changing stories and the allocating of blame also do not make for good support.

Our clients expect fixes from us when there are issues.

We expect fixes from HG.

HG expect fixes from The Planet.

Seems the top of that tree is broken. I certainly am not prepared to offer a service to my clients without adequate backup.
  #71  
Old 05-01-2005, 08:54 PM
VPC VPC is offline
Baby Croc
 
Join Date: Sep 2004
Posts: 58
Default Re: mustang what really happened

I would pay for a nightly backup. Hey Brent, do your servers in your internal network at HG run on nightly backups? Just wondering because I have never had a client that runs anything less than nightly backups. Why do hosting companies think anything less is "acceptable"?

Why does HG continue to use the "well the other guys do not have it" or "well the other guys have the same thing" line? Damn, what happened to striving to be the best?

Let me tell you something: I would pay for items 1-4 in Daniel P.'s post because my clients pay me to do 1-4 for them. I would not want anything less. Is this a money issue or just an acceptable issue for HG?
  #72  
Old 05-01-2005, 11:14 PM
GatorBrent GatorBrent is offline
HostGator Staff
 
Join Date: Oct 2002
Location: houston, texas
Posts: 2,977
Default Re: mustang what really happened

It takes about 48 hours to complete a full backup. You can not do something daily if it takes longer than that to do

If anybody wants daily backups, they're welcome to purchase a dedicated server, and assuming they do not have much disk space used, it will perform a daily backup.

Restoring a full backup takes at least a day. Had we not recovered the RAID, the restore alone would have taken much, much longer.

An array rebuild takes close to 24 hours writing disk to disk.


Backups are the heaviest thing you can do to a server. If I were to start a full backup on the whole server now, load would go to 20+. If we did it daily, load on the server would never go below 20.
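
The timing argument above comes down to simple arithmetic, sketched here. The 48-hour figure is the one quoted in this post; the helper names and the assumption that copy time scales linearly with the amount of data are illustrative only and do not describe how the disksync system works.

```python
FULL_BACKUP_HOURS = 48.0  # figure quoted in the post above
DAY_HOURS = 24.0
WEEK_HOURS = 7 * DAY_HOURS


def fits_in_window(backup_hours, interval_hours):
    """True if one backup run finishes before the next one is due."""
    return backup_hours <= interval_hours


def max_fraction_per_run(full_backup_hours, interval_hours):
    """Largest share of the data one run could copy within the interval.

    Assumes copy time scales linearly with data volume (an assumption).
    """
    return min(1.0, interval_hours / full_backup_hours)


daily_ok = fits_in_window(FULL_BACKUP_HOURS, DAY_HOURS)
weekly_ok = fits_in_window(FULL_BACKUP_HOURS, WEEK_HOURS)
daily_share = max_fraction_per_run(FULL_BACKUP_HOURS, DAY_HOURS)

print(daily_ok, weekly_ok, daily_share)  # False True 0.5
```

Under these assumptions a full copy cannot run nightly, but a run that copies no more than half of the data could, which is roughly the gap that faster hardware or copying only changed files would have to close.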
__________________
Gators love marshmallows.
  #73  
Old 05-01-2005, 11:38 PM
GatorBrent GatorBrent is offline
HostGator Staff
 
Join Date: Oct 2002
Location: houston, texas
Posts: 2,977
Default Re: mustang what really happened

Do a virus scan on your computer and watch how slow it will go. Backups on a server have the same effect.


Daily backups are not sufficient either. Say something happened while the site owner was sleeping and the daily backup overwrote itself.
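
The overwrite problem described above is usually handled by keeping several dated generations instead of a single copy. The sketch below only works out which dated backups to keep and which to prune; the 7-generation retention and the function names are assumptions for illustration, not anything HostGator is known to run.

```python
from datetime import date, timedelta


def backups_to_keep(backup_dates, keep=7):
    """Return the newest `keep` backup dates, newest first."""
    return sorted(set(backup_dates), reverse=True)[:keep]


def backups_to_prune(backup_dates, keep=7):
    """Return the backup dates that fall outside the retention window, oldest first."""
    kept = set(backups_to_keep(backup_dates, keep))
    return sorted(d for d in set(backup_dates) if d not in kept)


# Ten nightly backups ending on the day of this thread.
today = date(2005, 5, 1)
existing = [today - timedelta(days=n) for n in range(10)]

kept = backups_to_keep(existing, keep=7)
pruned = backups_to_prune(existing, keep=7)

print("keep: ", [d.isoformat() for d in kept])
print("prune:", [d.isoformat() for d in pruned])
```

With a week of generations on hand, a site broken overnight can still be restored from the copy taken before the damage, at the cost of roughly seven times the storage of a single full copy unless the copies are incremental.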
__________________
Gators love marshmallows.
  #74  
Old 05-01-2005, 11:52 PM
VPC VPC is offline
Baby Croc
 
Join Date: Sep 2004
Posts: 58
Default Re: mustang what really happened

"It takes about 48 hours to complete a full backup. You can not do something daily if it takes longer than that to do"

I do not know what you are using for backup, but that is horrible. You do know that TANDYs are no longer a viable solution, don't you?

Thanks for letting me know about the effect on my PC with VS. Being in IT, I am quite capable of knowing what the loads from apps are. It is not a problem if the load is throttled and the right backup hardware is used to cut your 48 hr time to a 10th of that. I do it every day.
  #75  
Old 05-02-2005, 12:12 AM
GatorBrent GatorBrent is offline
HostGator Staff
 
Join Date: Oct 2002
Location: houston, texas
Posts: 2,977
Default Re: mustang what really happened

VPC, provide a viable solution and we will consider it.
__________________
Gators love marshmallows.
All times are GMT -6.