Monitoring and Performance

About Skimmer

Energy Logserver uses a monitoring module called Skimmer to monitor the performance of its hosts. Skimmer collects metrics from the following sources:

  • ELS Data Node API — node stats, cluster health, cluster shards, license status, indices stats

  • ELS Network Probe API — JVM, GC, process, events, CPU load metrics

  • OS statistics — CPU, filesystem, swap, network interfaces, VM stats, zombie processes, disk usage, TCP port status

  • Systemd services — status of any configured systemd service (via DBus)

  • Kafka (optional) — consumer group lag metrics

  • Process monitoring — PID tracking for configured processes

Skimmer logs to the systemd journal. Set the log level with the SKIMMER_LOG_LEVEL environment variable (DEBUG, INFO, WARN, ERROR).

Skimmer Installation

The RPM package is delivered with the system installer in the install/agents directory:

cd $install_directory/install/agents
yum install skimmer-1.0.*-x86_64.rpm -y

The package installs Skimmer to /opt/skimmer/ and adds a systemd service unit. The installed files are:

Path                                   Description
/opt/skimmer/bin/skimmer               Binary
/opt/skimmer/skimmer.conf              Configuration file
/etc/systemd/system/skimmer.service    Systemd service unit

Skimmer service configuration

The Skimmer configuration is located in the /opt/skimmer/skimmer.conf file. The file uses a flat key = value format (lines starting with # are comments).

# enable stats collection
main_enabled = true

# index name in Data Node
index_name = skimmer
index_format = %Y.%m

# credentials for Data Node API
elasticsearch_auth = logserver:logserver

# Data Node output address (comma-separated list for fault tolerance)
elasticsearch_address = 127.0.0.1:9200

# Network Probe output address (comma-separated list for fault tolerance)
logstash_address = 127.0.0.1:6110

# Data Node API address for stats retrieval
elasticsearch_api = 127.0.0.1:9200

# Network Probe API address for stats retrieval
logstash_api = 127.0.0.1:9600

# collection interval (minimum 10s)
interval = 1min

# how often to collect shards, license, and indices stats
system_health_check_interval = 4h

# index patterns to monitor (default: all)
# indices_stats_patterns = *
# indices_stats_regex = .*

# Kafka monitoring (optional)
# kafka_path = /usr/share/kafka/
# kafka_server_api = 127.0.0.1:9092
# kafka_monitored_topics = topic1,topic2
# kafka_monitored_groups = group1,group2
# kafka_outdated_version = false

# comma-separated process names to track PIDs
processes = /usr/sbin/sshd,/usr/sbin/rsyslogd

# comma-separated systemd services to monitor status
systemd_services = elasticsearch,logstash,alert,cerebro,kibana,e-doc,intelligence,intelligence-scheduler,license-service,automation

# comma-separated port numbers to check
port_numbers = 9200,9300,9600,5514,5044,443,5601,5602

# enable SSL for Data Node API connections
# elasticsearch_ssl = true

# specific paths to check disk usage
# check_disk_usage = /var/lib/logserver-probe/queue

# force node IP when Skimmer is not running on a cluster node
# elasticsearch_api_force_node_ip = 10.0.0.1

# redefine field type mappings
# mapping = {"store": "long"}

Note

The systemd_services parameter uses internal service names (e.g. elasticsearch, logstash, kibana). Skimmer automatically maps them to branded output field names (logserver, logserver-probe, logserver-gui) in collected documents. Network Probe stats fields use the logserver-probe_stats_ prefix instead of logstash_stats_. Other services such as intelligence, license-service, and automation are monitored using their actual service names.
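The renaming described in this note can be pictured as a small lookup table. The sketch below assumes that the internal and branded names pair up in the order they are listed above; the function names and the example field suffix are hypothetical and only illustrate the idea, not Skimmer's internal code.

```python
# Internal service name -> branded name, paired in the order given in the note.
BRANDED_NAMES = {
    "elasticsearch": "logserver",
    "logstash": "logserver-probe",
    "kibana": "logserver-gui",
}


def branded_name(service):
    """Return the branded name; services without a mapping keep their own name."""
    return BRANDED_NAMES.get(service, service)


def branded_stats_field(field):
    """Rewrite a 'logstash_stats_' prefix to 'logserver-probe_stats_'."""
    old, new = "logstash_stats_", "logserver-probe_stats_"
    if field.startswith(old):
        return new + field[len(old):]
    return field
```

When searching collected documents for a service status, look for the branded name (for example logserver-gui), not the name used in systemd_services.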

In the Skimmer configuration file, set the credentials to communicate with Data Node:

elasticsearch_auth = $user:$password

Kafka monitoring

To monitor the Kafka process and the number of documents queued in its topics, run Skimmer on the Kafka server and uncomment the following parameters:

kafka_path = /usr/share/kafka/
kafka_server_api = 127.0.0.1:9092
kafka_monitored_topics = topic1,topic2
kafka_monitored_groups = group1,group2
kafka_outdated_version = false

  • kafka_path — path to Kafka home directory (requires kafka-consumer-groups.sh)

  • kafka_server_api — IP address and port for Kafka server API (default: 127.0.0.1:9092)

  • kafka_monitored_groups — comma-separated list of Kafka consumer groups; if not defined, the command will be invoked with the --all-groups parameter

  • kafka_outdated_version — set to true if your Kafka version is earlier than 2.4.0
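The parameters above imply an invocation of kafka-consumer-groups.sh. The following sketch shows how such command lines could be assembled from the configuration values. It is an illustration under assumptions: the bin/ subdirectory under kafka_path, the --bootstrap-server, --describe and --group flags, and the function name consumer_groups_commands are not taken from the Skimmer documentation, which only mentions --all-groups.

```python
import os


def consumer_groups_commands(kafka_path, server_api, groups=None):
    """Build argv lists for kafka-consumer-groups.sh --describe.

    Returns one command per configured group, or a single --all-groups
    command when no groups are configured.
    """
    # Assumption: the script sits in the standard bin/ directory of the Kafka home.
    script = os.path.join(kafka_path, "bin", "kafka-consumer-groups.sh")
    base = [script, "--bootstrap-server", server_api, "--describe"]
    if groups:
        return [base + ["--group", group] for group in groups]
    return [base + ["--all-groups"]]


commands = consumer_groups_commands(
    "/usr/share/kafka/", "127.0.0.1:9092", ["group1", "group2"]
)
```

Running one of these commands by hand on the Kafka server is a quick way to confirm that the script is reachable at kafka_path before enabling Kafka monitoring. The per-group form does not depend on --all-groups being available.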

After changing the configuration file, restart the service:

systemctl restart skimmer

Skimmer GUI configuration

To view the data collected by Skimmer in the GUI, add an index pattern.

Go to the “Management” -> “Index Patterns” tab and click the “Create Index Pattern” button. In the “Index Name” field, enter the pattern skimmer-* and click the “Next step” button. In the “Time Filter” field, select @timestamp and then click “Create index pattern”.

In the “Discover” tab, select the skimmer-* index pattern from the list. A list of collected documents with statistics and statuses will be displayed.
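The same documents can also be checked outside the GUI, which helps when the index pattern shows no data. The sketch below builds a search request for the newest document in skimmer-*. It assumes that the Data Node exposes an Elasticsearch-compatible _search endpoint over plain HTTP and accepts basic authentication with the elasticsearch_auth credentials; the function names are hypothetical.

```python
import base64
import json
import urllib.request


def build_search_request(address, auth, index_pattern="skimmer-*", size=1):
    """Build (but do not send) a request for the newest documents in the index."""
    body = {"size": size, "sort": [{"@timestamp": {"order": "desc"}}]}
    token = base64.b64encode(auth.encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        # Assumption: plain HTTP; use https:// if elasticsearch_ssl is enabled.
        url=f"http://{address}/{index_pattern}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


def fetch_latest(address, auth):
    """Send the request and return the list of matching hits."""
    request = build_search_request(address, auth)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["hits"]["hits"]
```

For example, fetch_latest("127.0.0.1:9200", "logserver:logserver") uses the address and credentials from the sample configuration. An empty result suggests that Skimmer has not yet written any documents.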

The Skimmer dashboard includes the following monitoring parameters (in the GUI, each visualization title is prefixed with Skimmer -):

  • Logserver - Heap usage in percent — Java heap memory currently used by the Data Node JVM process (in percent)

  • Logserver Probe - Heap usage in percent — Java heap memory currently used by the Network Probe JVM process (in percent)

  • Logserver - Process CPU usage — CPU time used by the Data Node process (in percent)

  • Logserver - Node CPU usage — CPU time used by a specific node of the Data Node cluster (in percent)

  • Logserver - Current queries — current count of search queries to Data Node indices

  • Logserver - Current search fetch — current count of the fetch phase for search queries to Data Node indices

  • GC Old collection — duration of Java Garbage Collector Old collection (in milliseconds)

  • GC Young collection — duration of Java Garbage Collector Young collection (in milliseconds)

  • Flush — duration of the Data Node flushing process that permanently saves the transaction log to the Lucene index (in milliseconds)

  • Refresh — duration of the Data Node refresh process that prepares new data for searching (in milliseconds)

  • Indexing — duration of the Data Node document indexing process (in milliseconds)

  • Merge — duration of the Data Node merge process that periodically merges smaller segments into larger segments (in milliseconds)

  • Indexing Rate — number of documents saved to the Data Node index per second (events per second — EPS)

  • Expected DataNodes — indicator of the number of data nodes required for the current load

  • Free Space — total space and free space in bytes on the Data Node cluster

  • Index / Size in Bytes — size of individual indices in bytes

  • Index / Documents — document count per index

Expected Data Nodes

Based on the collected performance data of the Energy Logserver environment, Skimmer automatically indicates the need to run additional data nodes.