I recently had a strange problem with one of my PCs. It was acting slow and sluggish, then the RAID 1 array dropped a drive, reporting it as failed (I’ve used RAID 1 on all my PCs for a while now, and I highly recommend it). I shut down, inspected the failed drive, and turned the PC back on, and it wouldn’t boot. No matter what I did, it wouldn’t come up. The next morning, it had shut itself off, and when I turned it on, it worked perfectly fine… and then shut itself off again after about 30 minutes. Clearly, I had a heat-related issue. But I wasn’t seeing any of the usual symptoms of CPU overheating, like random reboots or application errors; the unexpected shutdown was the only CPU-heat symptom, while the rest of the problems (drive errors, for example) pointed to motherboard issues. I installed the motherboard tools to monitor it, and it was clear that the CPU was indeed overheating; it hit 97 C within about 10 minutes of booting! Eventually, the PC refused to boot at all. I ordered a new motherboard, and thanks to Amazon Prime, it would be delivered less than 24 hours later for only $3.99 S/H.
Even though I was certain I knew what the fix was, I did a quick consultation with my friend Chris Ansbach via IM. He really knows his stuff, and he pinpointed the exact cause of the problem, which is going to help me prevent it. If you need to work with someone who knows their stuff, he’s your person and I’d gladly put you in touch with him. Looking at the motherboard layout, the two bridge chips are located right next to the CPU, and both are passively cooled. Inspecting the CPU and heatsink showed the cause of the overheating. The heatsink is the stock Intel model, and the plastic clips can eventually lose a little bit of tension. While the heatsink will still be on, and feel firmly attached, it will no longer make good contact with the CPU. Meanwhile, the thermal grease dries up because of the heat (mine flaked off) and becomes less effective, compounding the problem. Eventually, the CPU starts to overheat. Because of the location and passive cooling of the bridges, they were overheating too, causing that flakiness. After replacing the motherboard, the system is working like a champ; I got very lucky that the CPU was not damaged!
So, what’s the takeaway here? Two things:
1. Motherboard design matters a lot more than I thought. From here on out, I am going to be looking for motherboards where the bridges are actively cooled, and not right up against the CPU.
2. Heatsink design matters, even in a non-gaming, non-overclocked machine. Two big things that I learned to look for: a backplate that secures the heatsink to the CPU with screws or some other fastening mechanism that will not loosen with time, and a fan that blows up or sideways, not down; this will ensure that if the case air is hot, it isn’t making the CPU any hotter. I knew about some of the other stuff (heat pipes to elevate the heatsink away from the CPU, larger design, etc.), but these were two things that I just was not aware of, particularly the backplate.
Hope this helps someone avoid the same kind of meltdown I had!
J.Ja