Troubleshooting Mining Rig: Power Issues


Issue: Only 4/5 GPUs Reporting Temperature under Claymore V9.4

Today when I woke up and checked in on my miners, I noticed that one of my 5x GPU rigs was reporting temperatures for only 4 of the cards. It is running Windows 10 with the latest Claymore v7.4. The correct number of GPU mining instances were reporting in and delivering results, and all five GPUs were properly displayed in Device Manager. Having seen this before, I knew right away it meant one of the cards was not getting enough power, either via the riser or through the 8-pin PCIe connector.
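The symptom above (all cards mining, but one with no temperature) is easy to miss by eye. Here is a minimal sketch of how it could be flagged, assuming you have already collected the readings yourself. The function name and the dictionary layout are hypothetical and are not part of Claymore or any other miner.

```python
def gpus_missing_temperature(expected_gpus, temperatures):
    """Return the indices of GPUs that have no temperature reading.

    temperatures maps a GPU index to a reading in Celsius, or to None
    (or no entry at all) when the miner shows no value for that card.
    """
    return [i for i in range(expected_gpus) if temperatures.get(i) is None]


# Made-up readings for a 5-GPU rig where GPU 3 shows no temperature.
readings = {0: 62, 1: 64, 2: 61, 3: None, 4: 63}
missing = gpus_missing_temperature(5, readings)
if missing:
    print(f"No temperature from GPU(s) {missing}: check riser and PCIe power")
```

The index only tells you which card to look at first; it is still worth re-seating everything on the rig.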

This usually boils down to turning the miner off and re-seating all the cards and power connectors and, failing that, checking whether I have undersized wires or too many cards loaded on a shared connector coming off the power supply. As long as I was going through the exercise, I thought I might as well take a few screenshots and write up a short guide for fellow miners who may encounter this problem and be unfamiliar with the symptoms and resolution.

Closeup showing that only 4 GPUs are returning temperature readings.

As mentioned previously, I have encountered this problem before, so I knew right where to begin. The first time it happened I ignored it, figuring that as long as the rig was mining I was OK. Then one day I noticed the PCIe power cable going to one of the cards on that rig was extremely warm to the touch. I swapped it out, and the new cable ran normally (room temperature). I also noticed that all my temperatures were now reporting in correctly. I have seen it a couple of other times since. Usually just re-seating a cable or card is all that is needed, but sometimes the cable has to be swapped out.

Troubleshoot

The first thing I normally do when this crops up is to shut down the rig, take all of the cards out, and re-seat everything. By everything I mean I disconnect all power cables between the rig's Power Supply Unit (PSU) and the components, unplugging both ends if it is a modular PSU. This includes even the power cables going to the motherboard. I figure as long as I am going to do it, I might as well take the extra effort and do it correctly. I also unplug all the riser cables (again, both ends) and remove the GPUs from the riser adapter boards.

Since I have everything pretty much disassembled at this point, I also take the opportunity to clean the cards (vacuum or blow the dust out) and blow off the motherboard and PSU. I usually do this in my garage, where it is easier to blow everything out thoroughly as I take it apart.

Once everything is apart and cleaned, I begin to reassemble the rig. I will usually also run some spot checks on cables and voltages just to make sure everything is all right with the PSU too. I put the rig together with one GPU and only the power needed for the motherboard and that one graphics card. I leave the SSD disconnected at this point, since the rig can sit at the BIOS screen while I am only checking voltages. I usually plug in the unused cables and check 12V and 5V on a couple of PCIe cables and any Molex adapter cables I am using. As a rule I try to avoid using SATA power cables in my rigs, except for the SSD, but if you do use them, be sure to check those voltages too. (As a side note, on my next motherboard purchases I plan to use the M.2 slot for the SSD, which will eliminate the need for the SATA power cable completely.)
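For the voltage spot checks, a rough sketch of how multimeter readings could be compared against the plus or minus 5% tolerance commonly cited for ATX 12V and 5V rails. The function names, cable labels, and readings below are made up for illustration; check your own PSU's documentation for its stated limits.

```python
# Nominal rail voltages and the +/-5% tolerance commonly cited for ATX supplies.
ATX_TOLERANCE = 0.05
NOMINAL_RAILS = {"12V": 12.0, "5V": 5.0}


def check_rail(rail, measured, tolerance=ATX_TOLERANCE):
    """Return True if a measured voltage is within tolerance of its nominal value."""
    nominal = NOMINAL_RAILS[rail]
    low = nominal * (1 - tolerance)
    high = nominal * (1 + tolerance)
    return low <= measured <= high


def out_of_range(readings):
    """Return the labels of readings that fall outside tolerance.

    readings is a list of (label, rail, measured_volts) tuples.
    """
    return [label for label, rail, volts in readings if not check_rail(rail, volts)]


# Hypothetical multimeter readings taken on unused cables.
readings = [
    ("PCIe cable 1", "12V", 12.05),
    ("Molex adapter", "12V", 11.10),
    ("Molex adapter", "5V", 5.08),
]
for label in out_of_range(readings):
    print(f"{label}: voltage out of tolerance, inspect or replace")
```

Keep in mind that readings taken on an idle rig can look fine even when a cable sags under mining load, so an in-range result does not clear a cable that runs warm.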

Also, a good periodic cleaning never hurts. It will help lower operating temperatures slightly and prolong the life of your mining equipment. To play it safe, I would recommend getting an anti-static "electronics" vacuum and using canned air rated for electronic use.

If you use an air compressor, be sure the tank is drained first and do a few test squirts to make sure no moisture comes out. On humid days, a lot of moisture can form inside the tank and find its way into your components, potentially causing more harm than good. An add-on in-line moisture remover is recommended for your air line if you have a lot of rigs and plan to clean them this way. Also, keep the pressure fairly low. You don't need much past 10-12 psi to get at most dust and dirt. If you set it too high, there is a danger of impacting the dirt into the components (especially exposed slots), or even causing physical damage to delicate components. Finally, be sure to hold your fans in place with your finger as you blow near them, as it is easy to over-spin a bearing if you are not careful.

Once everything is clean and hooked back up, you can fire up the rig and see if the issue is resolved. I would say that in at least 50% of the cases I have experienced, a simple re-seating of cards and cables clears up this and many other intermittent symptoms.

Resolution

In this case it seems that re-seating all the components and cables did the trick, which makes sense as the rig had been running fine before. If you have a new build exhibiting these problems, the cause could be a bad cable, a loose connection, or, often, one of those 4-pin Molex-to-PCIe (or worse, SATA-to-PCIe) power adapters.
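To see why adapters and shared connectors cause trouble, here is a back-of-the-envelope sketch that adds up what hangs off one PSU connector and compares it with commonly cited connector ratings: 75 W for a 6-pin PCIe plug, 150 W for an 8-pin, and roughly 54 W on the 12 V pins of a SATA power plug. These are general figures, not measurements from my rig, and the wattage draws in the example are made up.

```python
# Commonly cited 12 V power limits per source connector, in watts.
CONNECTOR_LIMIT_W = {
    "pcie_6pin": 75,
    "pcie_8pin": 150,
    "sata": 54,  # 3 pins x 1.5 A x 12 V
}


def cable_load(draws_w):
    """Return (total watts, amps at 12 V) for everything fed by one cable."""
    total = sum(draws_w)
    return total, total / 12.0


def is_overloaded(source_connector, draws_w):
    """Return True if the combined draw exceeds the source connector's rating."""
    return sum(draws_w) > CONNECTOR_LIMIT_W[source_connector]


# Hypothetical case: a SATA-to-PCIe adapter feeding a plug that pulls 75 W.
watts, amps = cable_load([75])
print(f"{watts} W is about {amps:.2f} A at 12 V")
print("Overloaded:", is_overloaded("sata", [75]))
```

A cable that is warm to the touch is a practical sign that it is carrying more current than it should, whatever the arithmetic says.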

As you can see in the image below, I knew I had fixed the issue almost right away after I started mining again, as I could see all 5 GPUs reporting temperatures again.

Summary

While this was not meant to be a complete troubleshooting guide, it does go over one issue I have encountered on a periodic basis, and hopefully it will be of some use to some of you. Keep in mind that a power problem is only one of several possible reasons for temperatures not displaying correctly. Drivers can be another culprit, especially if none of the GPUs are reporting temperatures.

If you found this useful, let me know in the comments section. I think I might do more of these quick one-page guides covering other common issues, as information sometimes gets lost in the larger multi-page posts.

 
