Monitoring suggestions

The following monitoring options apply to DINA:

  • DINA PID: /usr/ssn/dina/CURRENT/bin/dina.pid

  • DINA default UI port: 8643

  • Heap memory usage/full GCs (same as other processes)

  • Monitor internal DINA processing queue depth (via logs)

  • TIBCO producers/consumers and queue depth: see DINA message queue configuration

  • Errors in the logs (to detect grid health issues)

  • Monitors to detect stuck jobs, overrun jobs, or a low job success rate. Note that AgentDownloadJobs can update very infrequently by the nature of the job itself. Consider the following options:

    • Monitor via JMX

      • Get a list of active jobs: https://dina01.<host-name>:<dina-port>/jolokia/exec/ssn:name=DinaJobStatus/getActiveJobsList/

      • For each active job, call getJobExecutionStatus and monitor changes in runningCount

    • Splunk query against logs

    • Job-specific monitors (refer to DINA monitoring options)
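
The JMX option above can be sketched in Python as follows. This is a minimal illustration, not the product's own tooling. The host name dina01.example.com and the port are placeholders. The shapes of the returned values are assumptions: getActiveJobsList is assumed to return a list of job IDs, and getJobExecutionStatus is assumed to accept a job ID and return an object with a runningCount field. Only the Jolokia "exec" URL layout and the value/status fields of the Jolokia reply are standard.

```python
import json
import urllib.request

# Placeholders: substitute the real DINA host and port.
JOLOKIA_BASE = "https://dina01.example.com:8643/jolokia"
MBEAN = "ssn:name=DinaJobStatus"


def jolokia_exec_url(base, mbean, operation, *args):
    """Build a Jolokia 'exec' URL: <base>/exec/<mbean>/<operation>/<arg>...

    Arguments are assumed not to contain characters that need escaping.
    """
    parts = [base.rstrip("/"), "exec", mbean, operation]
    parts += [str(a) for a in args]
    return "/".join(parts)


def jolokia_value(url, timeout=10):
    """GET a Jolokia URL and return the 'value' field of the JSON reply."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        reply = json.load(resp)
    if reply.get("status") != 200:
        raise RuntimeError(f"Jolokia error: {reply.get('error')}")
    return reply["value"]


def find_stalled_jobs(previous, current):
    """Return IDs of jobs whose runningCount did not change between polls.

    Both arguments map job ID -> runningCount. Jobs present in only one
    snapshot (newly started or already finished) are ignored.
    """
    return sorted(
        job_id
        for job_id, count in current.items()
        if job_id in previous and previous[job_id] == count
    )


def poll_once(previous):
    """One polling step: fetch running counts and report stalled jobs.

    ASSUMPTION: getActiveJobsList returns a list of job IDs and
    getJobExecutionStatus returns an object with a 'runningCount' field.
    """
    current = {}
    jobs_url = jolokia_exec_url(JOLOKIA_BASE, MBEAN, "getActiveJobsList")
    for job_id in jolokia_value(jobs_url):
        status_url = jolokia_exec_url(
            JOLOKIA_BASE, MBEAN, "getJobExecutionStatus", job_id
        )
        current[job_id] = jolokia_value(status_url)["runningCount"]
    return current, find_stalled_jobs(previous, current)
```

A scheduler would call poll_once at a fixed interval, passing the snapshot returned by the previous call. Because AgentDownloadJobs can update very infrequently, it is probably safer to raise an alarm only after a job appears stalled over several consecutive polls rather than after a single unchanged reading.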

Table 8 DINA monitoring options

Functional Area: System interrogations for Agent Director data

  Key Metrics:

    • Success rate – percentage-based
    • Timeliness – relative to the number of devices

  Possible Monitors:

    • JMX or log monitoring to collect job stats
    • TIBCO/RMQ queue depth – indicates only whether delivery to the customer is flowing

Functional Area: User-initiated interrogations for Agent Outcomes/Events

  Key Metrics:

    • Aggregate success rate
    • Timeliness

  Possible Monitors:

    • JMX or log monitoring to collect job stats

Functional Area: AgentDirector/AgentConfiguration/AgentDirector Jobs

  Key Metrics:

    • Success rate – percentage-based
    • Completion – percentage complete
    • Timeliness – job completion time (the alarm threshold probably needs to be computed)

  Possible Monitors:

    • JMX or log monitoring to collect job stats
    • TIBCO queue depth – indicates only whether delivery to the customer is flowing

Functional Area: Trap Processing

  Key Metrics:

    • General flow (is it working?)
    • End-to-end (E2E) latency (timeliness from NIC to customer)

  Possible Monitors:

    • TIBCO/RMQ queue depth – helps determine trap flow in general
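
The success-rate and timeliness metrics in Table 8 can be derived from collected job stats. The sketch below shows one possible way to compute a percentage-based success rate and an alarm threshold for job completion time. The threshold rule (historical mean plus k standard deviations) is an assumption chosen for illustration; the source does not prescribe how the threshold should be computed, and all function names here are hypothetical.

```python
from statistics import mean, stdev


def success_rate_percent(succeeded, total):
    """Percentage of jobs that succeeded out of all completed jobs."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * succeeded / total


def completion_time_threshold(history_seconds, k=3.0):
    """Alarm threshold: mean + k * sample standard deviation.

    ASSUMPTION: this rule is illustrative only. history_seconds holds
    completion times of earlier, comparable jobs.
    """
    if len(history_seconds) < 2:
        raise ValueError("need at least two historical samples")
    return mean(history_seconds) + k * stdev(history_seconds)


def job_overran(elapsed_seconds, history_seconds, k=3.0):
    """True if a job has run longer than the computed threshold."""
    return elapsed_seconds > completion_time_threshold(history_seconds, k)
```

Because timeliness for system interrogations is relative to the number of devices, the history passed in would likely need to come from jobs of similar size, or be normalized per device, before the threshold is meaningful.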

The following monitoring options apply to DINA Shim:

  • PID: /usr/ssn/dinashim/CURRENT/bin/dinashim.pid

  • RabbitMQ and TIBCO producers/consumers and queue depth (need to identify queues): see DINA message queue configuration

  • Heap memory usage

  • Specific errors in the logs

  • Health page: https://DINASHIM_UI_HOST:8686/health

    Note: See also Health page.
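
The PID and health page checks for DINA Shim can be sketched as a simple probe. This is a minimal example under stated assumptions: the PID file is assumed to contain a single integer process ID, and the health page is assumed to signal a healthy state with HTTP status 200 (the content of the response is not inspected, because its format is not described here). DINASHIM_UI_HOST is the placeholder from the text and must be replaced with the real host.

```python
import os
import urllib.error
import urllib.request

PID_FILE = "/usr/ssn/dinashim/CURRENT/bin/dinashim.pid"
HEALTH_URL = "https://DINASHIM_UI_HOST:8686/health"  # replace the host placeholder


def pid_file_process_alive(pid_file):
    """Return True if the PID file names a process that currently exists.

    ASSUMPTION: the file contains a single integer process ID.
    """
    try:
        with open(pid_file) as fh:
            pid = int(fh.read().strip())
    except (OSError, ValueError):
        return False  # missing, unreadable, or malformed PID file
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 checks existence without sending a signal
    except ProcessLookupError:
        return False  # no such process: stale PID file
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def health_page_ok(url, timeout=10):
    """Return True if the health page answers with HTTP 200.

    ASSUMPTION: status 200 means healthy; the body is not parsed.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False
```

The same PID check applies to the DINA process by pointing it at /usr/ssn/dina/CURRENT/bin/dina.pid.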