Grafana settings lost when host ethernet interface goes down and back up
Reporters
- Mika Silvola (Infinera)
- Lluis Gifre (CTTC)
Description
The host running TFS had network issues, and its interface went down and came back up. As a result, Grafana restarted and came back with default settings, without the TFS-related templates. To recover from this situation, TFS must be restarted. The root cause is that the deploy/tfs.sh script configures Grafana once, but these settings are not replayed when Grafana restarts.
Deployment environment
- Ubuntu 22.04
- MicroK8s (1.2.24):
enabled:
linkerd # (community) Linkerd is a service mesh for Kubernetes and other frameworks
community # (core) The community addons repository
dns # (core) CoreDNS
ha-cluster # (core) Configure high availability on the current node
helm3 # (core) Helm 3 - Kubernetes package manager
hostpath-storage # (core) Storage class; allocates storage from host directory
ingress # (core) Ingress controller for external access
metrics-server # (core) K8s Metrics Server for API access to service metrics
prometheus # (core) Prometheus operator for monitoring and logging
registry # (core) Private image registry exposed on localhost:32000
storage # (core) Alias to hostpath-storage add-on, deprecated
- TeraFlowSDN (include release/branch-name/commit-id): commit 7335ba27 (origin/develop) Date: Thu Feb 8 10:35:40 2024 +0000
TFS deployment settings
- base/minimal settings, +ztp and +monitoring enabled
Sequence of actions that resulted in the bug
- Easy to reproduce:
- Create a setup that generates statistics events and monitor them via Grafana
- sudo ifconfig down
- Grafana restarts and, once it is back, has no TFS settings present
- sudo ifconfig up
- Lluis gave a quick analysis:
- Probably the failure causes Grafana to redeploy itself, and the Grafana settings are not persisted to disk (it uses a Deployment, not a StatefulSet)
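To confirm this analysis while reproducing, it may help to compare the Grafana pod state before and after the interface flap. The following is a minimal sketch, assuming the pod UIDs and container restart counts have already been collected into plain dictionaries (for example from the cluster's pod listing). The function name and the snapshot format are hypothetical and not part of TFS.

```python
from typing import Dict


def grafana_was_restarted(before: Dict[str, int], after: Dict[str, int]) -> bool:
    """Return True if a pod was replaced or a container restarted.

    Each snapshot maps a pod UID to its container restart count
    (hypothetical format, chosen for this sketch).
    """
    for uid, restarts in after.items():
        if uid not in before:
            # A new UID means the pod was recreated (redeployed).
            return True
        if restarts > before[uid]:
            # Same pod, but its container restarted in place.
            return True
    return False
```

A new UID would point to a redeployed pod, in which case anything stored only inside the old pod is gone, which matches the lost-settings symptom.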
Document the explicit error
- Grafana lost its TFS-related settings
Expected behaviour
- Grafana should not restart on an interface down/up transition or, if it does restart, the TFS-related settings should be replayed.
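The "replay on restart" option could be sketched as a small reconcile step that compares what Grafana currently reports against what TFS expects and re-applies only what is missing. This is an illustrative sketch, not the logic in deploy/tfs.sh: the function name, the item names, and the two callbacks are hypothetical. In a real deployment the callbacks would presumably wrap calls to Grafana's HTTP API; here they are in-memory stand-ins.

```python
from typing import Callable, Iterable, List


def reapply_missing(
    expected: Iterable[str],
    list_present: Callable[[], Iterable[str]],
    apply_item: Callable[[str], None],
) -> List[str]:
    """Re-apply every expected item that Grafana no longer reports.

    Returns the names that were re-applied, in the order given by `expected`.
    """
    # Copy into a set first so applying items cannot affect the comparison.
    present = set(list_present())
    missing = [name for name in expected if name not in present]
    for name in missing:
        apply_item(name)
    return missing


# Usage with in-memory stand-ins for Grafana (hypothetical item names).
grafana_state = {"tfs-datasource"}
replayed = reapply_missing(
    expected=["tfs-datasource", "tfs-dashboard"],
    list_present=lambda: grafana_state,
    apply_item=grafana_state.add,
)
```

Because the step only applies what is missing, it can be run periodically or after each detected restart without duplicating settings that are still present.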