Addons Monitoring
Addon Monitoring
Addon rules
The Prometheus agent is configured to report numerous addon metrics. Below is the YAML-based rule set that Catapult uses, including alert names, rules with alert type and expression, timeframe, labels with type and severity, and annotations which contain the summary and description of the notification.
x
bash-5.0# cat sunpike.ymlgroupsnamehosts rulesalertHostNotConverging exprsum without (status_start_attempts) (sunpike_hosts_healthclusterrole!="none"hoststate!="Ok") == 1 for20m labels typepf9 severityhigh annotations summaryHost $labels.hostname not converging descriptionHost $labels.hostname is not converging, status $labels.hoststate , role $labels.clusterrole , retries $labels.status_start_attempts nameclusteraddons rulesalertAddonNotHealthy exprsunpike_clusteraddons_healthhealthy!="true" == 1 for10m labels typepf9 severityhigh annotations summaryAddon $labels.type not healthy!! descriptionAddon $labels.type of cluster $labels.cluster not healthyalertAddonNotConverging exprsunpike_clusteraddons_healthphase="" == 1 for20m labels typepf9 severityhigh annotations summaryAddon $labels.type not converging!! descriptionAddon $labels.type of cluster $labels.cluster not convergingalertAddonInstallError exprsunpike_clusteraddons_healthphase="InstallAddonError" == 1 for5m labels typepf9 severityhigh annotations summaryError installing $labels.type addon!! descriptionError installing Addon $labels.type on cluster $labels.cluster failed to installalertAddonUninstallError exprsunpike_clusteraddons_healthphase="UnInstallAddonError" == 1 for5m labels typepf9 severityhigh annotations summaryError uninstalling $labels.type addon!! descriptionError uninstalling Addon $labels.type on cluster $labels.cluster failed to uninstallbash-5.0#Was this page helpful?