etcd fails to start with "failed to read WAL, cannot be repaired"
Problem
- The master node is in the NotReady state because etcd fails to start.
Environment
- Platform9 Managed Kubernetes - v5.6.0 and higher
Cause
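- The filesystem that holds the etcd data on the node was built with a filesystem type that is not supported for etcd. etcd cannot preallocate space for a new WAL file and exits with "no space left on device".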
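
To confirm which filesystem backs the etcd data before rebuilding, a small check along these lines may help. This is a minimal sketch, not a Platform9 tool: it assumes a Linux node where /proc/mounts is readable, the helper names filesystem_for and describe are hypothetical, and the etcd data directory path must be supplied by you because this article does not state it. The sketch only reports the filesystem type and free space; it does not decide whether the type is supported.

```python
import os
import shutil


def filesystem_for(path, mounts_text):
    """Return (mount_point, fs_type) for the mount that contains `path`.

    `mounts_text` is the content of /proc/mounts. The longest matching
    mount point wins; for identical mount points the later entry wins.
    Note: mount points containing spaces (escaped as \\040) are not handled.
    """
    path = os.path.abspath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # not a well-formed mounts entry
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, fs_type)
    return best


def describe(data_dir, mounts_file="/proc/mounts"):
    """Return a one-line summary of the filesystem behind `data_dir`."""
    with open(mounts_file) as handle:
        found = filesystem_for(data_dir, handle.read())
    usage = shutil.disk_usage(data_dir)  # total, used, free in bytes
    mount_point, fs_type = found if found else ("unknown", "unknown")
    return (f"{data_dir}: mount={mount_point} type={fs_type} "
            f"free={usage.free} bytes of {usage.total}")
```

Calling describe() with the etcd data directory of the affected node (path is an assumption to fill in) prints the mount point, the filesystem type, and the free space. Compare the reported free space with the 64000000-byte preallocation size shown in the etcd log.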
Resolution
- Rebuild the cluster on a supported filesystem.
Additional Information
- The following errors appear in the etcd logs:
{"level":"warn","ts":"2023-02-26T05:48:18.728Z","caller":"wal/file_pipeline.go:79","msg":"failed to preallocate space when creating a new WAL","size":64000000,"error":"no space left on device"}
{"level":"fatal","ts":"2023-02-26T05:48:19.061Z","caller":"etcdserver/storage.go:108","msg":"failed to read WAL, cannot be repaired","error":"no space left on device","stacktrace":"go.etcd.io/etcd/etcdserver.readWAL\n\t/tmp/etcd-release-3.4.14/etcd/release/etcd/etcdserver/storage.go:108\ngo.etcd.io/etcd/etcdserver.restartNode\n\t/tmp/etcd-release-3.4.14/etcd/release/etcd/etcdserver/raft.go:533\ngo.etcd.io/etcd/etcdserver.NewServer\n\t/tmp/etcd-release-3.4.14/etcd/release/etcd/etcdserver/server.go:480\ngo.etcd.io/etcd/embed.StartEtcd\n\t/tmp/etcd-release-3.4.14/etcd/release/etcd/embed/etcd.go:214\ngo.etcd.io/etcd/etcdmain.startEtcd\n\t/tmp/etcd-release-3.4.14/etcd/release/etcd/etcdmain/etcd.go:302\ngo.etcd.io/etcd/etcdmain.startEtcdOrProxyV2\n\t/tmp/etcd-release-3.4.14/etcd/release/etcd/etcdmain/etcd.go:144\ngo.etcd.io/etcd/etcdmain.Main\n\t/tmp/etcd-release-3.4.14/etcd/release/etcd/etcdmain/main.go:46\nmain.main\n\t/tmp/etcd-release-3.4.14/etcd/release/etcd/main.go:28\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:200"}