Xymon custom graph config
Revision as of 03:54, 30 October 2013
Prerequisites
Make sure the client has xymon-client installed:
root@merkli:/var/lib/hobbit# dpkg -l "*xymon*"
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name             Version          Description
+++-================-================-================================================
un  xymon            <none>           (no description available)
ii  xymon-client     4.3.0~beta2.dfsg client for the Xymon network monitor
un  xymon-plugins    <none>           (no description available)
I found that it was necessary to run
dpkg-reconfigure xymon-client
in order to tell the client how to contact the server. (/etc/default/hobbit-client)
Another important configuration point: CLIENTHOSTNAME in /etc/default/hobbit-client on the client must match the name for the client in /etc/hobbit/bb-hosts on the server.
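The name match described above can be checked mechanically. The following is a minimal sketch, assuming that /etc/default/hobbit-client uses shell-style `CLIENTHOSTNAME="..."` assignments and that bb-hosts lines have the form `IP hostname # tags`. The helper names `client_name` and `in_bb_hosts` are made up for this illustration. Since the two files live on different machines, copy one of them over first.

```sh
# client_name FILE
# Print the CLIENTHOSTNAME value from a shell-style defaults file
# (assumed format: CLIENTHOSTNAME="merkli").
client_name() {
    sed -n 's/^CLIENTHOSTNAME=//p' "$1" | tr -d "\"'" | head -n 1
}

# in_bb_hosts NAME FILE
# Succeed if NAME is the hostname (second field) of a non-comment
# line in a bb-hosts style file (assumed format: "IP hostname # tags").
in_bb_hosts() {
    awk -v name="$1" '
        $1 !~ /^#/ && $2 == name { found = 1 }
        END { exit !found }
    ' "$2"
}
```

For example, `in_bb_hosts "$(client_name hobbit-client.copy)" /etc/hobbit/bb-hosts || echo "name mismatch"` on the server, where hobbit-client.copy is a copy of the client's defaults file.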
collect the data with a script
hobbit client:/usr/lib/xymon/client/ext/cputemp:
<source lang="bash">
#!/bin/sh

# Commented-out variants:
# /usr/bin/sensors -f | grep "CPU Temp" | awk '{print $1 $2 $3}' |
# awk {'sub("\+", ""); sub("°F", ""); print }' > /tmp/cputemp.txt
# /usr/bin/sensors -f | grep -i temp | grep -v k8temp|
# awk '{ sub("°F", ""); sub("\+", ""); sub("/", ""); sub(" +", ""); sub("\(.*$", ""); print}'
# > /tmp/cputemp.txt

# GPU temperature: strip the label and the C suffix, then convert to integer Fahrenheit
RAWTEMP=`nvidia-smi -a | grep Temperature | sed -e "s/.*.://g" -e "s/C//g"`
FGPUTEMP=$((${RAWTEMP:-0}*9/5+32))
echo "temp1:$FGPUTEMP" > /tmp/cputemp.txt

# CPU temperatures from sensors, appended as name:value lines
/usr/bin/sensors -f | grep Temp | grep -v k8temp | awk '{sub(".F.*$", ""); sub(" +",""); sub("+",""); print}' >> /tmp/cputemp.txt

# Go red when Core0Temp is outside 70..120 degrees Fahrenheit.
# -F: sets the field separator before the first line is split;
# assigning FS inside the action is too late for that line.
RESULT=`grep Core0Temp /tmp/cputemp.txt | awk -F: '{ print int($2) }'`
COLOR=green
if test "$RESULT" -gt 120
then
    COLOR=red
fi
if test "$RESULT" -lt 70
then
    COLOR=red
fi

# Send the status message; $BB, $BBDISP and $MACHINE come from the hobbit environment
$BB $BBDISP "status $MACHINE.cputemp $COLOR `date`

`cat /tmp/cputemp.txt`
"

exit 0
</source>
- you can run the script from the command line to test it
This is the expected output:
root@merkli:/usr/lib/xymon/client/ext# ./cputemp
./cputemp: 24: status .cputemp green Thu Jul 21 10:23:23 PDT 2011

temp1:95.0
Core0Temp:
Core1Temp:
: not found
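The ": not found" at the end of that output most likely appears because $BB, $BBDISP and $MACHINE are only set when hobbit launches the script, so on the command line the shell tries to execute the status message itself. A small sketch of a dry run follows: it sets BB to echo so the message is printed instead of sent. The function names `send_status` and `dry_run`, the host name testhost and the address 127.0.0.1 are placeholders for this illustration, not part of hobbit.

```sh
# send_status COLOR DATAFILE
# Mimics the last step of the cputemp script: $BB $BBDISP "status ...".
send_status() {
    $BB $BBDISP "status $MACHINE.cputemp $1 $(date)

$(cat "$2")
"
}

# dry_run COLOR DATAFILE
# Runs send_status in a subshell with BB=echo, so the message that would
# go to the display server is printed to stdout. All values are placeholders.
dry_run() (
    BB=echo
    BBDISP=127.0.0.1
    MACHINE=testhost
    send_status "$@"
)
```

The same idea works for the real script without editing it: `BB=echo BBDISP=127.0.0.1 MACHINE=merkli ./cputemp`.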
If you run the script as root, make sure you leave the permissions on /tmp/cputemp.txt such that it is writable by user hobbit.
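A quick way to see who owns the temp file after a manual test run, as a sketch. It reads the owner from the third column of `ls -ld`. The helper names `file_owner` and `check_owner` are invented for this example.

```sh
# file_owner FILE
# Print the owning user of FILE (third column of ls -ld output).
file_owner() {
    ls -ld "$1" | awk '{ print $3 }'
}

# check_owner USER FILE
# Succeed if FILE is owned by USER.
check_owner() {
    [ "$(file_owner "$2")" = "$1" ]
}
```

For example, `check_owner hobbit /tmp/cputemp.txt || chown hobbit /tmp/cputemp.txt` (run as root) hands the file back to the hobbit user.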
This particular script creates a temp file, so you can look at the timestamp on that to see if it is running:
<source lang=bash>
root@merkli:/usr/lib/xymon/client/ext# ls -l /tmp/cputemp.txt
-rw-r--r-- 1 hobbit hobbit 35 2011-07-21 10:26 /tmp/cputemp.txt
</source>
root@merkli:/usr/lib/xymon/client/ext# cat /tmp/cputemp.txt
temp1:95.0
Core0Temp:
Core1Temp:
launch the script with hobbit
To run the script on the hobbit server, add this section to /usr/lib/xymon/server/etc/hobbitlaunch.cfg:
[cputemp]
    ENVFILE /usr/lib/xymon/client/etc/hobbitclient.cfg
    CMD /usr/lib/xymon/client/ext/cputemp
    INTERVAL 5m
To run the script on a hobbit client, put the section into /usr/lib/xymon/client/etc/clientlaunch.cfg
or, better, into /var/run/xymon/clientlaunch-include.cfg on the client.
- wait for five minutes and then you should see the data in the web interface, but not the graph
- if you see the timestamp on /tmp/cputemp.txt change, then this file is correctly launching the script
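The timestamp check can be scripted. This is a sketch that relies on `find -mmin`, which GNU find supports but which is not part of POSIX. The function name `is_fresh` is made up here.

```sh
# is_fresh FILE MINUTES
# Succeed if FILE exists and was modified less than MINUTES minutes ago.
# Uses find -mmin (available in GNU find; an assumption about the system).
is_fresh() {
    [ -n "$(find "$1" -mmin "-$2" 2>/dev/null)" ]
}
```

With INTERVAL 5m, something like `is_fresh /tmp/cputemp.txt 6 || echo "cputemp has not run recently"` leaves a minute of slack.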
collect the data in an RRD
From here on out, the configuration should all be on the server and not the client, since the RRD is stored on the server and the graphs are generated from that.
But what gets transferred between the client and server: the script output or the rrd file? As far as I can tell, only the script output is transferred, and the rrd is created and updated on the server.
hobbit server:/usr/lib/xymon/server/etc/hobbitserver.cfg
TEST2RRD="cpu=la,disk,inode,qtree,memory,$PINGCOLUMN=tcp,http=tcp,dns=tcp,dig=tcp,time=ntpstat,
vmstat,iostat,netstat,temperature,apache,bind,sendmail,mailq,nmailq=mailq,socks,bea,iishealth,
citrix,bbgen,bbtest,bbproxy,hobbitd,files,procs=processes,ports,clock,lines,ops,stats,cifs,
JVM,JMS,HitCache,Session,JDBCConn,ExecQueue,JTA,TblSpace,RollBack,MemReq,InvObj,snapmirr,
snaplist,snapshot,if_load=devmon,temp=devmon,paging,mdc,mdchitpct,cics,dsa,getvis,maxuser,
nparts,cputemp=ncv,heater=ncv"
NCV_cputemp="temp1:GAUGE,Core0Temp:GAUGE,Core1Temp:GAUGE"
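The NCV_cputemp line has to name every value the script reports, so it can be derived from the script's data file. A sketch follows, assuming every value should be a GAUGE as in the setting above. The function name `ncv_setting` is invented for this example.

```sh
# ncv_setting TESTNAME < name:value lines
# Build an NCV_<test> setting that lists every name as a GAUGE.
ncv_setting() {
    awk -F: -v test="$1" '
        NF >= 2 && $1 != "" {
            list = list (list == "" ? "" : ",") $1 ":GAUGE"
        }
        END { printf "NCV_%s=\"%s\"\n", test, list }
    '
}
```

Running `ncv_setting cputemp < /tmp/cputemp.txt` on a data file with temp1, Core0Temp and Core1Temp lines prints the same NCV_cputemp setting as shown above.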
- restart hobbit after making these changes
- you can verify the RRD:
hobbit server:/var/lib/xymon/rrd/localhost/cputemp.rrd
root@weasel:/var/lib/xymon/rrd/localhost# rrdtool info ./cputemp.rrd
filename = "./cputemp.rrd"
rrd_version = "0003"
step = 300
last_update = 1311266629
header_size = 2320
ds[temp1].index = 0
ds[temp1].type = "GAUGE"
ds[temp1].minimal_heartbeat = 600
ds[temp1].min = NaN
ds[temp1].max = NaN
ds[temp1].last_ds = "131"
ds[temp1].value = 2.9999000000e+04
ds[temp1].unknown_sec = 0
root@weasel:/var/lib/xymon/rrd/localhost# ls -lt cputemp.rrd
-rw-r--r-- 1 hobbit hobbit 57616 2011-07-21 09:43 cputemp.rrd
The timestamp on the file should be less than 5 minutes old and the data inside should correspond to the output of the script.
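One way to compare the RRD with the script output is to check that every name the script reports has a data source in the RRD. The sketch below parses saved `rrdtool info` output in the format shown above. The function names `ds_names` and `missing_ds` are made up for this illustration.

```sh
# ds_names < rrdtool-info-output
# Print the data source names, taken from the ds[NAME].index lines.
ds_names() {
    sed -n 's/^ds\[\([^]]*\)]\.index.*/\1/p'
}

# missing_ds INFO_FILE NCV_FILE
# Print every name from the name:value file that has no data source
# in the saved rrdtool info output.
missing_ds() {
    awk -F: -v have=" $(ds_names < "$1" | tr '\n' ' ')" '
        NF >= 2 && $1 != "" && index(have, " " $1 " ") == 0 { print $1 }
    ' "$2"
}
```

For example, save `rrdtool info cputemp.rrd > /tmp/info.txt` on the server, copy /tmp/cputemp.txt over from the client, and run `missing_ds /tmp/info.txt /tmp/cputemp.txt`. No output means every reported name has a data source.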
create a graph definition
hobbit server:/usr/lib/xymon/server/etc/hobbitgraph.cfg:
<syntaxhighlight lang=ini>
[cputemp]
    TITLE CPU Temperature
    YAXIS Degrees Fahrenheit
    DEF:temp1=cputemp.rrd:temp1:AVERAGE
    DEF:Core0Temp=cputemp.rrd:Core0Temp:AVERAGE
    DEF:Core1Temp=cputemp.rrd:Core1Temp:AVERAGE
    LINE2:temp1#@COLOR@:temp1
    LINE2:Core0Temp#@COLOR@:Core0Temp
    LINE2:Core1Temp#@COLOR@:Core1Temp\n
</syntaxhighlight>
troubleshooting
If a graph is not appearing, you can view source on the page with the missing graph and click on the IMG link that it is trying to display. But often that won't reveal anything useful.
It is difficult to manually run /usr/lib/xymon/cgi-bin/hobbitgraph.sh with the parameters from the web page to see what the error is. hobbitgraph.sh is a shell wrapper around a binary called /usr/lib/xymon/server/bin/hobbitgraph.cgi.
#!/bin/sh
# This is the Hobbit CGI script interface to hobbitgraph.cgi
#
# Install this script in your webservers' cgi-bin directory

. /usr/lib/xymon/server/etc/hobbitcgi.cfg
exec /usr/lib/xymon/server/bin/hobbitgraph.cgi $CGI_HOBBITGRAPH_OPTS
Failing URL:
"/hobbit-cgi/hobbitgraph.sh?host=merkli&service=ncv:cputemp&graph_width=576&graph_height=120&disp=merkli&nostale&color=green&graph_start=1311105181&graph_end=1311277981&graph=hourly&action=view"
Looking at the environment variables as hobbitgraph.sh runs, I see this on a working query:
REQUEST_URI=/hobbit-cgi/hobbitgraph.sh?host=localhost&service=ncv:cputemp&graph_width=576&graph_height=120&disp=localhost&nostale&color=green&graph_start=1311116740&graph_end=1311289540&graph=hourly&action=view
QUERY_STRING=host=localhost&service=ncv:cputemp&graph_width=576&graph_height=120&disp=localhost&nostale&color=green&graph_start=1311116862&graph_end=1311289662&graph=hourly&action=view
And this for a failing query:
REQUEST_URI=/hobbit-cgi/hobbitgraph.sh?host=merkli&service=ncv:cputemp&graph_width=576&graph_height=120&disp=merkli&nostale&color=green&graph_start=1311116888&graph_end=1311289688&graph=hourly&action=view
QUERY_STRING=host=merkli&service=ncv:cputemp&graph_width=576&graph_height=120&disp=merkli&nostale&color=green&graph_start=1311117131&graph_end=1311289931&graph=hourly&action=view
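Long query strings are hard to compare by eye. The following sketch prints the parameters of one query string that do not appear in another. The function name `qs_diff` is invented for this example.

```sh
# qs_diff A B
# Print each key=value parameter of query string A that is absent from B.
qs_diff() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, pa, "&")
        nb = split(b, pb, "&")
        for (i = 1; i <= nb; i++) seen[pb[i]] = 1
        for (i = 1; i <= na; i++) if (!(pa[i] in seen)) print pa[i]
    }'
}
```

Applied to the working and failing QUERY_STRING values above, it lists only host, disp and the two timestamps, so the requests differ in nothing but the host name and the time window.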
manually generating a graph
In order to force a graphing error into view, you can manually attempt to generate the graph outside of hobbit, but using the same configuration and rrd that hobbit is using.
- locate the rrd
/var/lib/xymon/rrd/merkli/cputemp.rrd
- locate the existing graph config
/usr/lib/xymon/server/etc/hobbitgraph.cfg
[cputemp]
    TITLE CPU Temperature
    YAXIS Degrees Fahrenheit
    DEF:temp1=cputemp.rrd:temp1:AVERAGE
    DEF:Core0Temp=cputemp.rrd:Core0Temp:AVERAGE
    DEF:Core1Temp=cputemp.rrd:Core1Temp:AVERAGE
    LINE2:temp1#@COLOR@:temp1
    LINE2:Core0Temp#@COLOR@:Core0Temp
    LINE2:Core1Temp#@COLOR@:Core1Temp\n
- cook up a corresponding graph command
period=$(date --date="7 days ago" +%s)
cp /var/lib/xymon/rrd/merkli/cputemp.rrd /tmp/cputemp.rrd
rrdtool graph /tmp/output.png --width 300 --start $period -v "Degrees Fahrenheit" \
    DEF:temp1=/tmp/cputemp.rrd:temp1:AVERAGE \
    DEF:Core0Temp=/tmp/cputemp.rrd:Core0Temp:AVERAGE \
    DEF:Core1Temp=/tmp/cputemp.rrd:Core1Temp:AVERAGE \
    LINE2:temp1#FF0000:temp1 \
    LINE2:Core0Temp#00FF00:Core0Temp \
    LINE2:Core1Temp#0000FF:Core1Temp
- the errors should be much more helpful now
root@weasel:/var/www/temp# rrdtool graph /tmp/output.png --width 300 --start 1311615323 -v "Degrees Fahrenheit" \
    DEF:temp1=/tmp/cputemp.rrd:temp1:AVERAGE \
    DEF:Core0Temp=/tmp/cputemp.rrd:Core0Temp:AVERAGE \
    DEF:Core1Temp=/tmp/cputemp.rrd:Core1Temp:AVERAGE \
    LINE2:temp1#FF0000:temp1 \
    LINE2:Core0Temp#00FF00:Core0Temp
ERROR: No DS called 'Core0Temp' in '/tmp/cputemp.rrd'
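Translating the graph definition into rrdtool arguments by hand is error-prone, so here is a sketch that does the two substitutions used above: the bare rrd file name becomes a full path, and each @COLOR@ placeholder becomes a fixed colour. The function name `graph_args` and the three-colour palette are choices made for this illustration. hobbit picks its own colours.

```sh
# graph_args RRD_PATH < graph-section-lines
# Print the DEF and LINE arguments of a hobbitgraph.cfg section, one per
# line, with the rrd file name replaced by RRD_PATH and @COLOR@ replaced
# by colours from a small fixed palette. TITLE, YAXIS and the section
# header are skipped.
graph_args() {
    awk -v rrd="$1" '
        BEGIN { n = split("FF0000 00FF00 0000FF", colors, " "); i = 0 }
        $1 ~ /^DEF:/ {
            sub(/=[^:]*\.rrd:/, "=" rrd ":", $1)
            print $1
        }
        $1 ~ /^LINE[0-9]*:/ {
            sub(/@COLOR@/, colors[i % n + 1], $1)
            i++
            print $1
        }
    '
}
```

With the [cputemp] section saved to a file, `rrdtool graph /tmp/output.png --start "$period" $(graph_args /tmp/cputemp.rrd < section.txt)` runs the same kind of command as above. The unquoted substitution is deliberate, so that each printed line becomes one argument.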
a graph appears, but there is no data
The script seems to be generating correct output, the rrd file is accumulating data, and the graph appears on the custom graph page, but there is no data on the graph.
server:/var/log/xymon/rrd-status.log contains this:
2011-08-01 11:09:59 RRD error updating /var/lib/xymon/rrd/merkli/cputemp.rrd from 10.0.0.14:
/var/lib/xymon/rrd/merkli/cputemp.rrd: found extra data on update argument: 89:89
rrdtool info tells me that the data from the script is unparseable:
root@weasel:/var/lib/xymon/rrd/merkli# pwd
/var/lib/hobbit/rrd/merkli
root@weasel:/var/lib/xymon/rrd/merkli# rrdtool info cputemp.rrd
filename = "cputemp.rrd"
rrd_version = "0003"
step = 300
last_update = 1312221286
header_size = 2320
ds[temp1].index = 0
ds[temp1].type = "GAUGE"
ds[temp1].minimal_heartbeat = 600
ds[temp1].min = NaN
ds[temp1].max = NaN
ds[temp1].last_ds = "U"
ds[temp1].value = 0.0000000000e+00
ds[temp1].unknown_sec = 286
The entry of interest is ds[temp1].last_ds = "U". RRDtool uses U for an unknown value; this should be a Fahrenheit temperature instead.
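Spotting a "U" by eye works for one data source, but a filter is handier once there are several. The sketch below reads `rrdtool info` output in the format shown above and prints the names of data sources whose last value is unknown. The function name `unknown_ds` is made up for this example.

```sh
# unknown_ds < rrdtool-info-output
# Print the name of every data source whose last_ds is "U" (unknown).
unknown_ds() {
    sed -n 's/^ds\[\([^]]*\)]\.last_ds = "U"$/\1/p'
}
```

For example, `rrdtool info cputemp.rrd | unknown_ds` prints temp1 for the file above and prints nothing once updates are being accepted.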
The script was generating a floating point temperature. In case that was the problem, I rewrote the script to generate an integer temperature.
Sure enough, once the script was producing an integer and hobbit had been restarted, the problem went away.
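The page does not show how the script was rewritten, so the following is only one possible way to get integer values: a filter that truncates the value of each name:value line before it is written to the data file. The function name `ncv_to_int` is invented for this sketch.

```sh
# ncv_to_int < name:value lines
# Truncate each value to a whole number (95.0 becomes 95, 89.6 becomes 89).
# Lines without a value are passed through unchanged.
ncv_to_int() {
    awk -F: '
        NF >= 2 && $2 != "" { printf "%s:%d\n", $1, int($2); next }
        { print }
    '
}
```

In the cputemp script this could sit at the end of the sensors pipeline, for example `... | ncv_to_int >> /tmp/cputemp.txt`.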