Channel: THWACK: Document List - All Communities

Nodes not responding to SNMP or WMI


There are times when a client's devices stop being polled, for whatever reason.  This could be caused by an issue with the device or by a change in credentials.  Most clients I have worked with were not aware that polling had stopped.

 

There is a simple way to notice this: look at the timestamp of the most recent CPU poll.  If it is more than 35 minutes older than the current time, the node is having an issue.
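
As a quick spot check for a single node, the same idea can be sketched as follows. This is only a minimal illustration: it assumes the Nodes and CPULoad tables used elsewhere in this post, and the @NodeID value is a hypothetical placeholder that you would replace with a real NodeID.

```sql
-- Hypothetical node ID; replace with a real NodeID from the Nodes table.
DECLARE @NodeID int = 1;

-- Most recent CPU poll for that node and how many minutes ago it happened.
-- A Minutes_Since value above 35 suggests the node has stopped polling.
SELECT MAX(c.DateTime) AS Last_CPU_Poll,
       DATEDIFF(mi, MAX(c.DateTime), GETDATE()) AS Minutes_Since
FROM CPULoad c
WHERE c.NodeID = @NodeID;
```

If both columns come back NULL, no CPU data has been recorded for that node at all.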

 

Here is the report for it:

 

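SELECT n.Caption AS Node_Name, n.ip_address AS IP_Address, n.ObjectSubType AS Poll_Type
-- Whole days elapsed, computed from seconds (DATEDIFF(day, ...) counts midnights crossed, not full days)
,CAST(DATEDIFF(second, MAX(c.datetime), GETDATE()) / 86400 AS varchar) + ' Day(s) ' + CONVERT(char(8), DATEADD(second, DATEDIFF(second, MAX(c.datetime), GETDATE()), 0), 14) AS Duration
,DATEDIFF(mi, MAX(c.datetime), GETDATE()) minutes_since
FROM Nodes n
INNER JOIN CPUload c ON c.NodeID = n.NodeID
-- Only nodes that are up (status = 1) and polled via WMI or SNMP
WHERE n.status = 1 AND (n.ObjectSubType = 'wmi' OR n.ObjectSubType = 'snmp')
GROUP BY n.Caption, n.StatusDescription, n.ip_address, n.ObjectSubType
-- Keep only nodes whose latest CPU poll is more than 35 minutes old
HAVING DATEDIFF(mi, MAX(c.datetime), GETDATE()) > 35
ORDER BY minutes_since DESC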

 

Reporting is nice, but a better way to notice this is to create an alert for it, so that it can be resolved in a timely manner.  For the alert, you need to use a custom SQL query:

 

SELECT nodes.NodeID, nodes.caption FROM Nodes
INNER JOIN CPUload c ON c.NodeID = nodes.NodeID
WHERE nodes.status = 1 AND (nodes.ObjectSubType = 'wmi' OR nodes.ObjectSubType = 'snmp')
GROUP BY nodes.Caption, nodes.nodeid
HAVING DATEDIFF(mi, MAX(c.datetime), GETDATE()) > 35
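
One thing to keep in mind is that an inner join only returns nodes that have at least one row in CPULoad, so a node that has never returned CPU data would not trigger the alert above. A possible variation, offered only as a sketch and assuming the same tables and columns as the query above, uses a LEFT JOIN and also flags nodes with no CPU rows at all:

```sql
SELECT Nodes.NodeID, Nodes.Caption FROM Nodes
-- LEFT JOIN keeps nodes even when they have no matching CPULoad rows
LEFT JOIN CPULoad c ON c.NodeID = Nodes.NodeID
WHERE Nodes.Status = 1 AND Nodes.ObjectSubType IN ('wmi', 'snmp')
GROUP BY Nodes.Caption, Nodes.NodeID
-- Flag nodes with no CPU data at all, or whose latest poll is over 35 minutes old
HAVING MAX(c.DateTime) IS NULL
    OR DATEDIFF(mi, MAX(c.DateTime), GETDATE()) > 35
```

Whether you want this depends on your environment: it may also flag newly added nodes that have not yet completed their first CPU poll.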

 

Using both the report and the alert helps make sure you are getting data from all nodes, and avoids the embarrassing situation where a server crashes due to high CPU and the boss comments, "I thought that SolarWinds was monitoring this."

 

 

Thanks

Amit Shah

Loop1 Systems



