
I'm not a big fan of using memtest86 for testing systems, because system problems are often more complex than just the memory. Recently, however, we've had quite a few memory problems, and memtest86 finds those quickly and does a good job of pinpointing them.

You have to be really careful about handling static-sensitive equipment when it's cold and dry out, and memory is some of the most sensitive. This is because heating cold air lowers its relative humidity, and static build-up tends to be more of an issue in dry air.
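To put a rough number on the humidity drop, here is a minimal sketch using the Magnus approximation for saturation vapor pressure. The coefficients are one common parameterization, the function names are made up for illustration, and the winter-day figures are hypothetical. It also assumes the air is simply heated with no moisture added.

```python
import math

def saturation_vapor_pressure_hpa(temp_c):
    """Magnus approximation of saturation vapor pressure over water, in hPa."""
    return 6.112 * math.exp(17.62 * temp_c / (243.12 + temp_c))

def indoor_relative_humidity(outdoor_rh, outdoor_temp_c, indoor_temp_c):
    """Relative humidity (%) after outdoor air is heated with no moisture added."""
    return outdoor_rh * (saturation_vapor_pressure_hpa(outdoor_temp_c)
                         / saturation_vapor_pressure_hpa(indoor_temp_c))

# Hypothetical winter day: 0 degrees C and 60% RH outside, heated to 21 degrees C inside.
rh = indoor_relative_humidity(60.0, 0.0, 21.0)
print(f"{rh:.0f}% relative humidity indoors")
```

With those made-up inputs, air that feels reasonably humid outdoors ends up at roughly 15% relative humidity indoors, which is well into the range where static charge builds up easily.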

We tend to be pretty careful, staying grounded, particularly while dealing with CPUs and memory. I also try very hard to touch only the board, not the chips or contacts. However, recently we've had a lot of memory problems. One of the boxes that had bad memory hadn't been touched for at least a year. We were in the process of migrating services off to a new machine when it started hanging after a few hours. Another machine had been fine for several months before it started having problems.

For memory problems, memtest86 is so narrowly targeted that if it reports a problem, the cause is likely to be memory. By contrast, if you're running gcc compiles, which exercise much more of the system, and one fails, you don't know exactly where the failure is.
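For comparison, a compile-based test can be as simple as running the same build over and over and flagging any run that behaves differently from the first. The sketch below is only an illustration of that idea, not a tool we use; `run_digest` and `stress` are hypothetical names, and the command you pass in is up to you.

```python
import hashlib
import subprocess

def run_digest(cmd):
    """Run cmd and return (exit code, SHA-256 hex digest of its stdout)."""
    result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)
    return result.returncode, hashlib.sha256(result.stdout).hexdigest()

def stress(cmd, rounds):
    """Repeat cmd and return the round numbers that differ from the first run."""
    baseline = run_digest(cmd)
    # A deterministic command on healthy hardware should match every time.
    return [i for i in range(1, rounds) if run_digest(cmd) != baseline]
```

Assuming a source file named `test.c` and a gcc that writes assembly to standard output with `-S -o -`, something like `stress(["gcc", "-O2", "-S", "-o", "-", "test.c"], 100)` would return an empty list on a healthy box. A non-empty list says something is flaky, but, as noted above, not whether it is the memory, the CPU, the disk, or something else.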

On Fedora Core, you can boot from the first CD and type “memtest86” at the initial boot prompt to start the memory test. It's a quick and easy way to test the memory.

Unfortunately, most of the systems we have been getting over the last several years have not been ECC compatible. Sad but true. The new machines we are using are, but for about five years, maybe even longer, ECC memory stopped being available. That's odd, since RAM sizes have kept going up and up, which means more memory to have problems with.
