This page is written for system administrators who wish to use MRTG or similar tools to measure more than just the bandwidth usage of Cisco routers. We do that at Trolltech, and with this page we want to help other MRTG users learn from our experiences.
Bandwidth measurement
We measure the bandwidth utilization for each port of each switch. With Cisco switches this hasn't been a problem, but Intel and D-Link switches haven't been as cooperative.
We use MRTG with Intel 460T switches. We had to upgrade the firmware to get them to work. The firmware we use at the moment is version 4.something and the file is called 460_45run.tfp. With this upgrade, we're happy with the switches. We do not use any of the features in Intel's switch-specific MIB file.
We have also tried a D-Link switch, but D-Link's SNMP support is unusable in practice. D-Link writes:
"This problem is not a bug, just only a definition issue."
If that sounds stupid, that's because it is stupid. Their "definition" makes the switch count only about 0.003% of the traffic passing through it. D-Link's explanation reads:
"In the MIB-II (RFC-1213 standard), the definition for IF is not very strict. At D-Link switch, we define IF as the CPU/Controller. That is, the switch interface is the controller, not the port. The IFCounter will be counted on if the packet go through the controller. In the console program, the port statistic counts the number of packets which go through port. For this reason, the information from SNMP will differ from the information from telnet."
D-Link has twice told us that they will not fix this bug (oops, change this definition), so we will return that switch. (The console program, which counts correctly, can't really be used with MRTG. Its output is too hard to parse.)
We use MRTG with several Merlin-Gerin (MGE) UPSes, measuring load, battery lifetime, number of power failures, mains power voltage and mains power frequency.
- Output load: This is the output power as a percentage of the UPS's maximum output power. The OID is 188.8.131.52.184.108.40.206.220.127.116.11.1, and here's an example:
Title[upsname-load]: Upsname: Load
PageTop[upsname-load]: <h1>Upsname: Load</h1>
Options[upsname-load]: growright, gauge, nopercent
- Battery lifetime: This reports the number of minutes for which the UPS has battery capacity, at its current load. The OID is 18.104.22.168.22.214.171.124.2.3.0, so we use a line like this:
- Power in and out: This is the number of watts the UPS draws from the mains and delivers to its equipment. The input number is typically quite a bit higher than the output number, as the UPS itself needs power. The input OID is 126.96.36.199.188.8.131.52.184.108.40.206.1 and the output OID is 220.127.116.11.18.104.22.168.22.214.171.124.1. One old MGE we have does not support the input measurement; the new ones do.
- Battery uses: This OID is supposed to measure the number of times the UPS has decided to stop drawing current from its power source and use the battery instead. The OID is 126.96.36.199.188.8.131.52.3.1.0.
- Mains voltage and frequency: We haven't found any use for our frequency graph, but our voltage graph showed us that one particular "power failure" was not complete: the voltage dropped from 220 V to 120 V. The OID for the mains frequency is 184.108.40.206.220.127.116.11.18.104.22.168.1, and the unit is dHz (i.e. the value will be 500 for 50 Hz AC). The OID for the mains voltage is 22.214.171.124.126.96.36.199.188.8.131.52.1.
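With several UPSes and several values per UPS, the MRTG stanzas become repetitive, so it can help to generate them. The following is a minimal Python sketch of that idea, not the setup we actually run. The function names `mrtg_gauge_stanza` and `dhz_to_hz` are hypothetical, the OID `1.2.3` in the usage example is a placeholder rather than a real UPS OID, and the community and host names are assumptions. It relies on MRTG's explicit-OID target form (`oid&oid:community@host`), which reads two variables per target, and it adds a `MaxBytes` line because MRTG expects one for every target.

```python
def mrtg_gauge_stanza(name, title, oid, community, host, max_value):
    """Return MRTG config lines for a graph of one gauge-type SNMP value.

    All arguments are supplied by the caller; nothing here is specific
    to a particular UPS model.
    """
    # MRTG always fetches two variables per target, so the single OID
    # is simply given twice.
    target = f"{oid}&{oid}:{community}@{host}"
    return "\n".join([
        f"Target[{name}]: {target}",
        f"MaxBytes[{name}]: {max_value}",
        f"Title[{name}]: {title}",
        f"PageTop[{name}]: <h1>{title}</h1>",
        # Same options as the load example: the value is a gauge, not a counter.
        f"Options[{name}]: growright, gauge, nopercent",
    ])


def dhz_to_hz(raw):
    """Convert a frequency reading in dHz (tenths of a hertz) to Hz."""
    return raw / 10


if __name__ == "__main__":
    # "1.2.3" is a placeholder OID; substitute the OID for the value you want.
    print(mrtg_gauge_stanza("upsname-load", "Upsname: Load",
                            "1.2.3", "public", "upsname", 100))
    print(dhz_to_hz(500))
```

The generated text can be appended to an MRTG configuration file. The `max_value` of 100 suits a percentage such as the output load; other values need a limit that matches their own range.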