My old stats machine

Temujin
Posts: 2259
Joined: Mon Mar 13, 2006 12:00 am

Post by Temujin »

I guess my old stats machine is about 5 years old now and has enjoyed its retirement from serving the stats for about a year. During that time it's been a dedicated BOINC machine running both CPU and GPU apps under Fedora 9 Linux.

Late last week it started to lock up after 4 or 5 hours, and sometimes wouldn't even boot.
I'd been thinking of upgrading its Fedora from v9 to something a bit more modern for quite a while but never got round to it. All this locking persuaded me to give it a go, and maybe it would solve the locking.
On went Fedora 14 and the locks continued :(

hmmn  :evil:

ok, I'll borrow some memory sticks from work and see if it's my memory sticks at fault.
Nope, it still locks up.
So yesterday I removed the CPU and heatsink, cleaned them both up and refitted them with new paste and so far so good, it's been up for almost 24hrs now  :thumbup:

Strange thing is that the CPU temperature monitors weren't showing anything unusual and it didn't seem to make any difference if boinc was running or not.

Maybe CPU paste just gets old
Joshrandom
Posts: 5602
Joined: Sat Jun 23, 2007 1:00 am

Post by Joshrandom »

Very odd, but I'm glad that you've fixed the problem. :)

Out of curiosity, how do you monitor your temps, John? I tend to use lm-sensors to monitor such things on my Ubuntu machine, but I've noticed that there are several different outputs for the CPU temps, and there is often quite a variation in the readings that they give, which is kind of weird.  :shock:

James.
Temujin
Posts: 2259
Joined: Mon Mar 13, 2006 12:00 am

Post by Temujin »

Joshrandom wrote:Out of curiosity, how do you monitor your temps, John? I tend to use lm-sensors to monitor such things on my Ubuntu machine
Yep, lm-sensors for me too.
I've not compared different sensor types or the same machine in different OSs but my 2 quads have always shown similar temps.
1 core has always run significantly hotter than the other 3 on both machines (like 10C hotter)
Joshrandom
Posts: 5602
Joined: Sat Jun 23, 2007 1:00 am

Post by Joshrandom »

On my quad, the variation between cores is rarely any more than around 6C (with the hottest of the cores reaching around 70C when crunching). However, on my lm-sensors, there are two check boxes for CPU temperature and then a further four boxes (one for each core). Right now the cores are reading 62C, 56C, 58C and 60C respectively, while the two CPU temps are showing as 52C and 40C, but then I guess that I probably just set the app up wrong when I installed it.  :lol:
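
For anyone wanting to compare per-core readings like the ones above without a GUI, here is a minimal Python sketch. It assumes text in the style that the lm-sensors `sensors` command prints for an Intel coretemp chip; the sample block, the `high`/`crit` limits in it, and the helper names `core_temps` and `spread` are made up for illustration and are not taken from anyone's machine in this thread.

```python
import re

# Sample text in the style of `sensors` output for a coretemp chip.
# The four core values are the readings quoted above; the chip name and the
# high/crit limits are placeholders (assumptions), and the layout on a real
# machine may differ.
SAMPLE = """\
coretemp-isa-0000
Adapter: ISA adapter
Core 0:       +62.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:       +56.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:       +58.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:       +60.0°C  (high = +80.0°C, crit = +100.0°C)
"""

# Matches lines such as "Core 0:   +62.0°C ..." and captures the core index
# and the first temperature on the line (the current reading).
CORE_LINE = re.compile(r"^Core (\d+):\s+\+?(-?\d+(?:\.\d+)?)")


def core_temps(text):
    """Return {core_index: temperature_in_C} for every 'Core N:' line."""
    temps = {}
    for line in text.splitlines():
        match = CORE_LINE.match(line)
        if match:
            temps[int(match.group(1))] = float(match.group(2))
    return temps


def spread(temps):
    """Difference between the hottest and coolest core, in degrees C."""
    return max(temps.values()) - min(temps.values())


if __name__ == "__main__":
    readings = core_temps(SAMPLE)
    print(readings)
    print("hottest-to-coolest spread:", spread(readings))
```

To try it on live data, the sample string could be replaced with the output of `subprocess.run(["sensors"], capture_output=True, text=True).stdout`, assuming lm-sensors is installed and the chip reports lines in this "Core N:" form.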
Temujin
Posts: 2259
Joined: Mon Mar 13, 2006 12:00 am

Post by Temujin »

Joshrandom wrote:On my quad, the variation between cores is rarely any more than around 6C (with the hottest of the cores reaching around 70C when crunching).
When mine get up to 70C I give the fans a damned good clean :D
Joshrandom wrote:Right now the cores are reading 62C, 56C, 58C and 60C respectively,
Mine are at 55, 56, 52 & 63C
Intel Q6600 BTW with a Zalman heatpipe cooler and honking great fan
Joshrandom wrote:while the two CPU temps are showing as 52C and 40C, but then I guess that I probably just set the app up wrong when I installed it.  :lol:
I've got 2 other sensors that report anything and they both show 35C, one of them is labelled MB temp.

It's 20C in my computer room right now, which is fairly cool :D
Joshrandom
Posts: 5602
Joined: Sat Jun 23, 2007 1:00 am

Post by Joshrandom »

Temujin wrote:When mine get up to 70C I give the fans a damned good clean :D
That was my thought back during the hot weather, cleaned out the fans etc, and even added a new 120mm chassis fan to blow cool air in, but the CPU temps on my Q6600 obstinately remained at 70C. The only thing that changed was that the CPU fan ran slower, only spinning up when the temps started to climb beyond 70C.  :?

Hmm, it looks like I'm going to need to take another look at the airflow in my system.  :(
Temujin
Posts: 2259
Joined: Mon Mar 13, 2006 12:00 am

Post by Temujin »

After being up for over 3 days, it locked up again this afternoon and now won't boot  :evil:

Ho hum
Joshrandom
Posts: 5602
Joined: Sat Jun 23, 2007 1:00 am

Post by Joshrandom »

Temujin wrote:After being up for over 3 days, it locked up again this afternoon and now won't boot  :evil:

Ho hum
When you say it won't boot, do you mean that it's totally dead or just that it crashes while loading the OS?

If it's the latter, have you tried booting from cd?

I've had a couple of systems that failed at boot due to hardware issues: the first was caused by a failing PSU, the second by a failing HD5970. In both cases the only way that I could work out where the problem lay was to bite the bullet and start swapping suspect components between systems, a process that can be quite scary.  :shock:

Whatever the problem, I'm sure that you'll track it down, good luck. ;)
Temujin
Posts: 2259
Joined: Mon Mar 13, 2006 12:00 am

Post by Temujin »

Good point there Mr Random.
It actually POSTs ok, then locks just after displaying "Verifying DMI Pool Data........"
Last thing I tried was swapping the boot order and then I got
Verifying DMI Pool Data........
Boot from cd......

I'm at home tomorrow afternoon so I'll work on it a bit then.
It could well be a faulty PSU or GPU, they'll be the things I swap out tomorrow having already tested the memory & CPU/heatsink last weekend.
The GPU is reasonably new (6 months or so) but I think the PSU may be the original Akasa I bought for it 5 years ago, so that'll be where I start.

It is nice and quiet in my computer room now though :D
Zydor
Posts: 437
Joined: Mon Nov 01, 2010 12:00 am

Post by Zydor »

Worth running ....

sfc /scannow

.... inside safe mode from a command window when you can get it that far in the boot sequence. From what you have said here, the beast has been running fine, as such, and then "randomly" locks up or crashes. That sounds like a software system fault slowly building up. SFC is a very good utility for checking system file integrity, and worth running periodically, let alone during present troubles.

Regards
Zy
Temujin
Posts: 2259
Joined: Mon Mar 13, 2006 12:00 am

Post by Temujin »

Hi Zy

It's a linux machine mate :D
It's even had a fresh install and is still locking up.

I'm gonna start pulling hardware this afternoon to try to narrow it down
Zydor
Posts: 437
Joined: Mon Nov 01, 2010 12:00 am

Post by Zydor »

opppps ....  Senior Moment  :)
Temujin
Posts: 2259
Joined: Mon Mar 13, 2006 12:00 am

Post by Temujin »

hmmn this is doing my head in :twisted:

I've now replaced the motherboard with one borrowed from a guy at work.
Another fresh install of Fedora 14.
It is at least staying up but has highlighted a possible cause, or at least a "funny"

I have 3 computers in my computer room, 2x linux & 1x winxp
I only have 1 monitor, keyboard & mouse, so use a 4 port KVM switch to swap between them as required.

This new motherboard & Fedora refuses to recognise the keyboard & mouse through the KVM, so I've plugged an old iMac USB keyboard & mouse directly into it and it then works fine.
The original keyboard & mouse still work fine with the other 2 machines through the KVM.

The KVM port I usually use for this machine has a USB connection for keyboard & mouse and I wondered if the cable was at fault. I changed to the spare one which has the old fashioned PS/2 style plugs. The machine then refused to boot, just continually rebooted. Remove the PS/2 connectors and plug in the iMac USB and it boots.

So it looks like there's something strange happening with the usb system somewhere.
But as it's still there with the new motherboard it looks like the problem is with the keyboard, mouse or cables
Joshrandom
Posts: 5602
Joined: Sat Jun 23, 2007 1:00 am

Post by Joshrandom »

Clearly you are a lot more capable with this sort of thing than I am, John, but after reading your post there were a few things that occurred to me.
Temujin wrote:I've now replaced the motherboard with one borrowed from a guy at work.
How confident are you that this borrowed motherboard is in full working order, and that the problems it seems to have with the keyboard and mouse aren't just a red herring?
Temujin wrote:I have 3 computers in my computer room, 2x linux & 1x winxp
I only have 1 monitor, keyboard & mouse, so use a 4 port KVM switch to swap between them as required.

This new motherboard & Fedora refuses to recognise the keyboard & mouse through the KVM, so I've plugged an old iMac USB keyboard & mouse directly into it and it then works fine.
The original keyboard & mouse still work fine with the other 2 machines through the KVM.
You say that the KVM switch works fine with the other 2 machines, but have you tried swapping the KVM connections from one of these working systems to the problem machine?

Oh, and what happens if you connect the keyboard and mouse from the KVM directly to the PC using the same ports that you use for the KVM leads?
Temujin wrote:So it looks like there's something strange happening with the usb system somewhere.
But as it's still there with the new motherboard it looks like the problem is with the keyboard, mouse or cables
I once tried to add a USB card to one of my older systems, with the result that the PC would regularly crash or else fail to boot properly; when I removed the card the system became stable again. After thinking it through I came to the realisation that the new USB card was drawing more power than the PSU could comfortably supply, causing the crashing and other problems. With this in mind I have to say that the evidence still seems to suggest that your PSU might be behind all of your system's current issues. Have you tried swapping it out yet?  :?
Temujin
Posts: 2259
Joined: Mon Mar 13, 2006 12:00 am

Post by Temujin »

Joshrandom wrote:
Temujin wrote:I've now replaced the motherboard with one borrowed from a guy at work.
How confident are you that this borrowed motherboard is in full working order, and that the problems it seems to have with the keyboard and mouse aren't just a red herring?
haha, not confident at all cos he didn't use it cos he couldn't get it to boot :roll:
But.. I reckon that's down to the PS/2 connections. As soon as I plug something into those the thing refuses to boot
Joshrandom wrote:You say that the KVM switch works fine with the other 2 machines, but have you tried swapping the KVM connections from one of these working systems to the problem machine?
that's next on the list, it's just that F1 qualifying and football have got in the way this afternoon :D
Joshrandom wrote:Oh, and what happens if you connect the keyboard and mouse from the KVM directly to the PC using the same ports that you use for the KVM leads?
good idea
.....................
Yep, they work, nice one :thumbup:
Joshrandom wrote:I once tried to add a USB card to one of my older systems, with the result that the PC would regularly crash or else fail to boot properly; when I removed the card the system became stable again. After thinking it through I came to the realisation that the new USB card was drawing more power than the PSU could comfortably supply, causing the crashing and other problems. With this in mind I have to say that the evidence still seems to suggest that your PSU might be behind all of your system's current issues. Have you tried swapping it out yet?  :?
Yep, it could well be down to the PSU and I still haven't got around to checking that.
I don't have a spare PSU so it'll mean shutting down another cruncher for the duration but I guess I'll just have to do it :?

good help there Josh, thanks :thumbup:
Temujin
Posts: 2259
Joined: Mon Mar 13, 2006 12:00 am

Post by Temujin »

Right then, I've reverted back to my motherboard; using the one from work just complicated things.

I've swapped PSUs, so the misbehaving machine now has the OCZ 700W PSU from the other Linux machine (which now has the possibly bad Akasa 850W)

Keyboard & mouse through the KVM still don't work but the iMac keyboard & mouse do.
I'll try to get another KVM cable tomorrow.

Having thought about events over this last week, I've come to the conclusion that the locking was caused by a failing hard disk in its 2-disk striped RAID. Any time the machine was up using the raidset it would lock. Since I split the RAID and installed onto a single disk it hasn't locked; it's just been this weird KVM behaviour.
Temujin
Posts: 2259
Joined: Mon Mar 13, 2006 12:00 am

Post by Temujin »

A shorter KVM cable seems to have done the trick :thumbup:

just gotta wait and see if it locks up again now :?
Joshrandom
Posts: 5602
Joined: Sat Jun 23, 2007 1:00 am

Post by Joshrandom »

Fingers crossed that you've fixed the problem John (it's certainly sounding a lot more stable at least), well done.  :thumbright: