Disk problems
I'm gonna have to suspend crunching on my server for a while.
This is the second time it has had problems with the hard drive (filesystem corruption) since I started crunching, and it never had any before.
The drive itself isn't that old, so that's a bit worrying.
-
- UBT Forum Admin
- Posts: 9710
- Joined: Mon Mar 13, 2006 12:00 am
- Location: NW Midlands
- Contact:
Re: Disk problems
James Box wrote: "I'm gonna have to suspend crunching on my server for a while.
This is the second time it has had problems with the hard drive (filesystem corruption) since I started crunching, and it never had any before.
The drive itself isn't that old, so that's a bit worrying."
Hi James,
Which server/OS/size of HDD is causing the issue?
AFAIK, all operating systems are pretty much "stable" these days, so you shouldn't have an issue unless the CPU is being VERY overstretched, in which case the FAT can get disrupted.
OTOH, HDDs are now so cheap that a failure is more likely if something "happened"...
Have you got any clues as to what happened and when?
regards
Tim
Linux FC5 with ext3 filesystem.
Drive is, I believe, a Seagate Barracuda 160GB bought last year.
You know something's up when you get a bunch of Uncorrectable Errors and then it sets the filesystem to read-only to prevent further corruption. First time this happened, I ran fsck and fixed a bunch of errors and it was okay.
I've now gone to work and left it running a Seagate diagnostic tool. I'll see how it is later.
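For anyone wanting to confirm the read-only remount described above, one rough way is to look at the mount options the kernel reports. This is only a sketch, assuming a Linux box that exposes /proc/mounts; the function name read_only_mounts is made up for the example and is not part of any tool mentioned in this thread.

```python
def read_only_mounts(mounts_text):
    """Return (device, mount_point) pairs whose mount options include 'ro'.

    mounts_text is assumed to be in /proc/mounts format:
    device mount_point fstype options dump pass
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, _fstype, options = fields[:4]
        # Options are comma-separated; 'ro' marks a read-only mount.
        if "ro" in options.split(","):
            found.append((device, mount_point))
    return found
```

On the affected machine you could feed it the live table, e.g. read_only_mounts(open("/proc/mounts").read()), and see whether the root filesystem shows up in the result.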
James Box wrote: "Linux FC5 with ext3 filesystem.
Drive is, I believe, a Seagate Barracuda 160GB bought last year."
Can't say I'm too well versed in Linux, but if it's been kept up to date, then there's no reason why the OS should be causing the problem!
Main "culprits" therefore would either be the BOINC client (assuming little else running at the same time) or the HDD.
I had a look at your stats here:
http://www.boincstats.com/stats/boinc_h ... 54e8af2868
and from what I can tell, the problem is with your Pentium III (Katmai) box?
It seems to be running Einstein, SIMAP, SETI and SETI Beta.
Are all these crunching at the same time (i.e. none are suspended)? It might be that the Katmai CPU is being overstretched.
regards
Tim
James Box wrote: "Yes, it is the P3 Katmai.
None are suspended. How can you tell if it is being over-stretched?"
Hi James,
I would say it was "over-stretched" if it was simply having to switch from one task to another too frequently. Each time it switches, it has to keep a "record" of where it had "crunched to" and hence save something to the HDD. If the CPU doesn't keep track of the HDD, maybe due to "trying to do too much", then it might have issues.
Is the PC doing anything else, other than BOINC?
regards
Tim
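To illustrate the "record of where it had crunched to" idea above: a program that checkpoints to disk usually tries to make the save crash-safe. The sketch below shows one common pattern (write to a temporary file, flush it to disk, then rename over the old file). It is an assumption for illustration only, not how the BOINC client or any of the projects actually store their state, and save_checkpoint / load_checkpoint are hypothetical names.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state to path so a crash leaves either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".ckpt-")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(state, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before renaming
        os.replace(tmp_path, path)  # atomic rename on POSIX filesystems
    except BaseException:
        # Clean up the partial temp file if anything went wrong.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def load_checkpoint(path):
    """Read back a checkpoint written by save_checkpoint."""
    with open(path) as f:
        return json.load(f)
```

Note that every save costs a disk write plus a sync, so the more often tasks get switched, the more the drive is exercised.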
-
- Posts: 3790
- Joined: Mon Mar 13, 2006 12:00 am
James Box wrote: "SeaTools found some bad sectors and marked them bad. It says I'll have to keep monitoring it. Can you believe this thing has a 6 year warranty expiring in 2011?
I'll keep an eye on the memory situation too."
In which case, invoke the warranty now, if you can!
Better to resolve now than to carry on, only for the inevitable to happen.
I assume the PC hasn't been dropped or had any "accident" while switched on.
Only time a Seagate failed on me, it gave similar "initial" issues with numerous bad sectors being reported. Mind you, it was a 40MB drive from a few years ago.
Ditched it and got some lovely, nice Quantum drives, which were brilliant, until Quantum got sold off.
regards
Tim
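On the "keep monitoring it" point: if the drive supports SMART, the bad-sector counters can be watched over time rather than waiting for the next filesystem error. The sketch below parses the attribute table printed by smartctl -A (from smartmontools) and pulls out the raw values for a couple of attributes. It assumes smartmontools is installed and the usual ten-column table layout; the function name raw_smart_values is hypothetical, and the SAMPLE text is made-up illustrative output, not readings from the drive in this thread.

```python
# Made-up illustrative output in the usual `smartctl -A` table layout.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       12
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       4711
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       3
"""


def raw_smart_values(smartctl_output, names):
    """Return {attribute_name: raw_value} for the requested attribute names."""
    values = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        if fields[1] in names:
            try:
                values[fields[1]] = int(fields[9])
            except ValueError:
                pass  # raw value was not a plain integer; skip it
    return values


watched = {"Reallocated_Sector_Ct", "Current_Pending_Sector"}
print(raw_smart_values(SAMPLE, watched))
```

If either count keeps climbing between checks, that would back up returning the drive under warranty sooner rather than later.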
Update:
I submitted an RMA request to Seagate and got it approved. I then removed the drive from my server and tried to get Linux working from a USB HDD and a small internal IDE drive. I managed that but started getting random lockups, and eventually failed to boot at all. So there's something wrong with the PSU, motherboard, memory or CPU...
I transferred the drives, including the (not yet returned) Seagate, to a borrowed Celeron 400 and all seem to be working fine!
End result is that we decided to upgrade my wife's PC to an Athlon64 3800+ and the server gets her AthlonXP 1500+. The Seagate will be on a final warning.
Looking forward to some increased RAC