Monitor GPU temperature

From finninday
  • How do I get the current temperature of my NVIDIA GPU?

the easy way

root@weasel:/usr/lib/hobbit/client/ext# nvidia-smi -q -a

Driver Version			: 260.19.06

GPU 0:
	Product Name		: GeForce 7300 SE/7200 GS
	PCI Device/Vendor ID	: 1d310de
	PCI Location ID		: 0:5:0
	Display			: Connected
	Temperature		: 58 C
	Utilization
	    GPU			: 0%
	    Memory		: 0%
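
To get just the number out of that listing for a script, something like the following might do. It is only a sketch: it assumes the temperature line looks exactly like the one above ("Temperature : 58 C"), and other driver versions lay the report out differently. The function name parse_smi_temp is made up for illustration.

```shell
#!/bin/sh
# Print the integer temperature found in `nvidia-smi -q -a` style output on stdin.
# Assumption: the report contains a line of the form "Temperature : 58 C",
# as in the listing above. parse_smi_temp is a hypothetical helper name.
parse_smi_temp() {
    awk -F: '/Temperature/ { gsub(/[^0-9]/, "", $2); print $2; exit }'
}

# Example using text captured from the listing above.
# On a real machine this would be: nvidia-smi -q -a | parse_smi_temp
printf '\tDisplay\t\t\t: Connected\n\tTemperature\t\t: 58 C\n' | parse_smi_temp
```

Newer drivers can report the value directly with nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader, which avoids the parsing altogether, but that option is not available on a driver as old as the one shown here.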

the wrong way

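The following method works for a normal user logged in to the X session, but not for root or cron jobs, because nvidia-settings gets its answer from the running X server and needs access to that display.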

rday@merkli:~$ DISPLAY=:0.0 nvidia-settings -q [gpu:0]/GPUCoreTemp | grep Attribute | sed -e "s/.*.://g" -e "s/\.//g"
 47

The result is in Celsius.

Without all the filtering, it looks like this:

rday@merkli:~$ nvidia-settings -q [gpu:0]/GPUCoreTemp

  Attribute 'GPUCoreTemp' (merkli:0[gpu:0]): 47.
    'GPUCoreTemp' is an integer attribute.
    'GPUCoreTemp' is a read-only attribute.
    'GPUCoreTemp' can use the following target types: X Screen, GPU.
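
The grep and sed filtering can be folded into a single sed call. This is a sketch that assumes the Attribute line always ends in ": <number>." as it does above; parse_settings_temp is a name I made up.

```shell
#!/bin/sh
# Print the integer value from the "Attribute ..." line of
# `nvidia-settings -q [gpu:0]/GPUCoreTemp` output read from stdin.
# Assumption: the line ends in ": <digits>." as in the sample above.
# parse_settings_temp is a hypothetical helper name.
parse_settings_temp() {
    sed -n 's/^ *Attribute .*: \([0-9][0-9]*\)\.$/\1/p'
}

# Example using the captured output shown above.
# On a real machine:
#   DISPLAY=:0.0 nvidia-settings -q [gpu:0]/GPUCoreTemp | parse_settings_temp
printf "  Attribute 'GPUCoreTemp' (merkli:0[gpu:0]): 47.\n    'GPUCoreTemp' is an integer attribute.\n" | parse_settings_temp
```

As far as I know, nvidia-settings also has a -t (terse) option that prints only the value, which would make the filtering unnecessary on versions that support it.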


Now I want to put this into a hobbit monitor task. The easiest way is to stick it into an existing custom monitor.

/usr/lib/hobbit/client/ext/cputemp looks like this:

#!/bin/sh

#/usr/bin/sensors -f | grep "CPU Temp" | awk '{print $1 $2 $3}' |
#	awk {'sub("\+", ""); sub("°F", ""); print }' > /tmp/cputemp.txt
#/usr/bin/sensors -f | grep -i temp | grep -v k8temp| awk '{ sub("°F", ""); sub("\+", ""); sub("/", ""); sub(" +", ""); sub("\(.*$", ""); print}' > /tmp/cputemp.txt
/usr/bin/sensors -f | grep -i temp | grep -v k8temp | awk '{sub(".F.*$",
""); sub(" +",""); sub("+",""); print}' > /tmp/cputemp.txt

RESULT=`grep Core0Temp /tmp/cputemp.txt | awk -F: '{ print int($2) }'`
COLOR=green
if test "$RESULT" -gt 120
then
	COLOR=red
fi
if test "$RESULT" -lt 70
then
	COLOR=red
fi

$BB $BBDISP "status $MACHINE.cputemp $COLOR `date`

`cat /tmp/cputemp.txt`
"

exit 0

With the GPU reading added, the script should probably look something like this:

#!/bin/sh

#/usr/bin/sensors -f | grep "CPU Temp" | awk '{print $1 $2 $3}' |
#	awk {'sub("\+", ""); sub("°F", ""); print }' > /tmp/cputemp.txt
#/usr/bin/sensors -f | grep -i temp | grep -v k8temp| awk '{ sub("°F", ""); sub("\+", ""); sub("/", ""); sub(" +", ""); sub("\(.*$", ""); print}' > /tmp/cputemp.txt
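# DISPLAY must be set explicitly, since the monitor does not run inside the X session.
CGPUTEMP=`DISPLAY=:0.0 nvidia-settings -q [gpu:0]/GPUCoreTemp | grep Attribute | sed -e "s/.*.://g" -e "s/\.//g"`

# Convert Celsius to Fahrenheit; without $(( )) the shell stores the expression as a string.
FGPUTEMP=$(($CGPUTEMP*9/5+32))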
echo "temp1:$FGPUTEMP" > /tmp/cputemp.txt
/usr/bin/sensors -f | grep Temp | grep -v k8temp | awk '{sub(".F.*$",
""); sub(" +",""); sub("+",""); print}' >> /tmp/cputemp.txt

RESULT=`grep Core0Temp /tmp/cputemp.txt | awk -F: '{ print int($2) }'`
COLOR=green
if test "$RESULT" -gt 120
then
	COLOR=red
fi
if test "$RESULT" -lt 70
then
	COLOR=red
fi

$BB $BBDISP "status $MACHINE.cputemp $COLOR `date`

`cat /tmp/cputemp.txt`
"

exit 0
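
The color logic in the script can be pulled out into a small function, which also makes it easy to cope with an empty reading (test complains if $RESULT is not a number). A sketch using the same limits as above, red above 120 or below 70 degrees Fahrenheit. The function name temp_color and the choice of "clear" for a missing reading are my own assumptions, not something the original script does.

```shell
#!/bin/sh
# Map a whole-number Fahrenheit reading to a status color.
# Same limits as the script above: red above 120 or below 70, green otherwise.
# Assumption: an empty or non-numeric reading is reported as "clear".
# temp_color is a hypothetical helper name.
temp_color() {
    case "$1" in
        ''|*[!0-9]*) echo clear; return ;;
    esac
    if [ "$1" -gt 120 ] || [ "$1" -lt 70 ]; then
        echo red
    else
        echo green
    fi
}

temp_color 95     # within limits
temp_color 130    # too hot
temp_color ""     # no reading
```

In the monitor script this would be used as COLOR=`temp_color "$RESULT"`.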

My working temperature script:

RAWTEMP=`DISPLAY=:0.0 nvidia-settings -q [gpu:0]/GPUCoreTemp | grep Attribute | sed -e "s/.*.://g" | sed -e "s/\.//g"`
#echo "rawtemp = $RAWTEMP"

FGPUTEMP=$(($RAWTEMP*9/5+32))
#echo "fgputemp = $FGPUTEMP"

echo "temp1:$FGPUTEMP"
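
The conversion is F = C*9/5+32, done in integer arithmetic, so the result is truncated rather than rounded. A small sketch of just that step, with the arithmetic worked out for two readings; c_to_f is a name I made up for illustration.

```shell
#!/bin/sh
# Convert whole degrees Celsius to Fahrenheit with shell integer arithmetic.
# Multiplying before dividing keeps as much precision as integers allow.
# c_to_f is a hypothetical helper name.
c_to_f() {
    echo $(( $1 * 9 / 5 + 32 ))
}

c_to_f 47   # 47*9 = 423, 423/5 = 84 (truncated), 84+32 = 116
c_to_f 58   # 58*9 = 522, 522/5 = 104 (truncated), 104+32 = 136
```

The true value for 47 C is 116.6 F, so the integer version can read up to one degree low, which is close enough for a red/green check.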