The Kernel Metrics (Prysm Local Producer) is a tool designed to collect and monitor kernel-related metrics from your nodes. This tool gathers critical metrics such as context switches, available entropy, and network connections, and can publish these metrics to a NATS server or expose them for Prometheus, providing real-time visibility into the kernel's performance and health.
- Kernel Metrics Collection: Monitors essential kernel metrics, including context switches, available entropy, and network connections.
- NATS Integration: Publishes collected metrics to a specified NATS subject, enabling seamless integration with other monitoring and observability tools.
- Prometheus Metrics: Exposes kernel metrics in Prometheus format, allowing easy integration with monitoring dashboards.
- Configurable: Offers flexibility in configuration via command-line flags or environment variables.
To run the Prysm local producer for kernel metrics, use the following command:
prysm local-producer kernel-metrics [flags]
Example Flags:
--instance-id "instance-1"
: Unique identifier for the instance being monitored.--interval 10
: Interval in seconds between metric collections (default is 10 seconds).--nats-url "nats://localhost:4222"
: NATS server URL for publishing metrics.--nats-subject "node.kernel.metrics"
: NATS subject to publish metrics (default is “node.kernel.metrics”).--node-name "node-1"
: Name of the node being monitored.--prometheus
: Enable Prometheus metrics.--prometheus-port 8080
: Port for Prometheus metrics (default is 8080).
Environment Variables
Configuration can also be set through environment variables:
NATS_URL
: NATS server URL.NATS_SUBJECT
: NATS subject to publish metrics.NODE_NAME
: Name of the node.INSTANCE_ID
: Instance ID.INTERVAL
: Interval in seconds between metric collections.PROMETHEUS_PORT
: Port for Prometheus metrics.
Metrics Collected
node_context_switches_total
: Total number of context switches on the node.node_entropy_available_bits
: Available entropy in bits, indicating the amount of randomness available.node_network_connections_total
: Total number of network connections on the node.
These metrics are crucial for understanding the low-level operations of the kernel and can be used to identify performance bottlenecks, security issues, and overall system health.
Metric Collection:
- The tool collects kernel metrics at regular intervals as specified by the --interval flag or the INTERVAL environment variable.
Publishing Metrics:
- If NATS integration is enabled, the collected metrics are published to the specified NATS subject.
- If Prometheus metrics are enabled, the metrics are exposed on the specified port for scraping by Prometheus.
Logging and Monitoring:
- Metrics can be logged locally for debugging purposes or monitored via Prometheus dashboards to provide a real-time view of kernel performance.
- Start the server with the desired configuration:
prysm local-producer kernel-metrics --nats-url "nats://localhost:4222" --prometheus --prometheus-port 8080
- Metrics such as context switches, entropy, and network connections will be collected every 10 seconds (default) and can be monitored through Prometheus or forwarded to a NATS server for further processing.
This README is a draft and will be updated as the project continues to evolve. Contributions and feedback are welcome to help refine and enhance the functionality of Prysm.