(Jin Qing's Column, Dec., 2021)
My program uses Kubernetes (K8s) DNS SRV queries for service discovery. When it is deployed on minikube, DNS resolution fails.
I can reproduce the failure with nslookup.
Querying a nonexistent fully qualified name, which returns NXDOMAIN, does no harm. But after querying a nonexistent SRV short name, which returns SERVFAIL, ping can no longer resolve google.com.
    root@web-0:/# ping google.com
    PING google.com (142.250.66.110) 56(84) bytes of data.
    64 bytes from hkg12s28-in-f14.1e100.net (142.250.66.110): icmp_seq=1 ttl=108 time=33.7 ms
    64 bytes from hkg12s28-in-f14.1e100.net (142.250.66.110): icmp_seq=2 ttl=108 time=33.8 ms
    ^C
    --- google.com ping statistics ---
    2 packets transmitted, 2 received, 0% packet loss, time 1000ms
    rtt min/avg/max/mdev = 33.779/33.834/33.889/0.055 ms
    root@web-0:/# nslookup
    > set type=srv
    > nosuch-nosuch-nosuch-1234567890abcdefg.cn
    Server: 10.96.0.10
    Address: 10.96.0.10#53
    ** server can't find nosuch-nosuch-nosuch-1234567890abcdefg.cn: NXDOMAIN
    > exit
    root@web-0:/# ping google.com
    PING google.com (142.250.66.110) 56(84) bytes of data.
    64 bytes from hkg12s28-in-f14.1e100.net (142.250.66.110): icmp_seq=1 ttl=108 time=33.7 ms
    64 bytes from hkg12s28-in-f14.1e100.net (142.250.66.110): icmp_seq=2 ttl=108 time=33.7 ms
    ^C
    --- google.com ping statistics ---
    2 packets transmitted, 2 received, 0% packet loss, time 1001ms
    rtt min/avg/max/mdev = 33.730/33.735/33.741/0.183 ms
    root@web-0:/# nslookup
    > set type=srv
    > nginx-wrong
    Server: 10.96.0.10
    Address: 10.96.0.10#53
    ** server can't find nginx-wrong: SERVFAIL
    > exit
    root@web-0:/# ping google.com
    ping: unknown host google.com
    root@web-0:/#
Ping recovers after about 1 minute.
If I query an existing internal service name and nslookup returns the records correctly, DNS still works after I quit nslookup.
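To measure the recovery time more precisely than by retrying ping by hand, a small polling helper could be used. This is only a sketch: the function name `seconds_until_resolvable` and its parameters are made up for illustration, and it uses the system resolver through `socket.gethostbyname` rather than nslookup.

```python
import socket
import time
from typing import Callable, Optional


def seconds_until_resolvable(
    host: str,
    resolve: Callable[[str], str] = socket.gethostbyname,
    interval: float = 1.0,
    timeout: float = 120.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Poll resolve(host) until it succeeds.

    Returns the elapsed seconds until the first successful lookup,
    or None if the name is still unresolvable after `timeout` seconds.
    The resolver, clock, and sleep functions are injectable (a
    hypothetical design choice) so the helper can be tested offline.
    """
    start = clock()
    while True:
        try:
            resolve(host)
            return clock() - start
        except OSError:
            # socket.gaierror and socket.herror are subclasses of OSError.
            pass
        if clock() - start >= timeout:
            return None
        sleep(interval)
```

Run `seconds_until_resolvable("google.com")` inside the pod right after the failing SRV query. If the behaviour matches what I saw, it should return a value of roughly one minute.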
    root@web-0:/# ping google.com
    PING google.com (142.250.66.110) 56(84) bytes of data.
    64 bytes from hkg12s28-in-f14.1e100.net (142.250.66.110): icmp_seq=1 ttl=108 time=33.6 ms
    64 bytes from hkg12s28-in-f14.1e100.net (142.250.66.110): icmp_seq=2 ttl=108 time=34.8 ms
    ^C
    --- google.com ping statistics ---
    2 packets transmitted, 2 received, 0% packet loss, time 1001ms
    rtt min/avg/max/mdev = 33.648/34.260/34.872/0.612 ms
    root@web-0:/# nslookup
    > set type=srv
    > nginx
    Server: 10.96.0.10
    Address: 10.96.0.10#53
    nginx.default.svc.cluster.local service = 0 25 80 web-1.nginx.default.svc.cluster.local.
    nginx.default.svc.cluster.local service = 0 25 80 web-2.nginx.default.svc.cluster.local.
    nginx.default.svc.cluster.local service = 0 25 80 web-0.nginx.default.svc.cluster.local.
    nginx.default.svc.cluster.local service = 0 25 80 web-3.nginx.default.svc.cluster.local.
    > exit
    root@web-0:/# ping google.com
    PING google.com (142.250.66.110) 56(84) bytes of data.
    64 bytes from hkg12s28-in-f14.1e100.net (142.250.66.110): icmp_seq=1 ttl=108 time=33.5 ms
    ^C
    --- google.com ping statistics ---
    1 packets transmitted, 1 received, 0% packet loss, time 0ms
    rtt min/avg/max/mdev = 33.529/33.529/33.529/0.000 ms
    root@web-0:/#
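Each SRV answer in the session above has the form `name service = priority weight port target`. As a reading aid, here is a minimal sketch that parses such lines into structured records. The names `SrvRecord` and `parse_srv_lines` are hypothetical, and the parser assumes the nslookup output format shown in this post, not every nslookup variant.

```python
from typing import List, NamedTuple


class SrvRecord(NamedTuple):
    name: str      # the queried service name, fully qualified
    priority: int  # lower values are preferred
    weight: int    # relative weight among records of equal priority
    port: int      # port the service listens on
    target: str    # host name of the pod backing the service


def parse_srv_lines(output: str) -> List[SrvRecord]:
    """Extract SRV records from nslookup-style output.

    Lines without the 'service =' marker (prompts, server info,
    error messages) are skipped.
    """
    records = []
    for line in output.splitlines():
        left, sep, right = line.partition("service =")
        if not sep:
            continue
        fields = right.split()
        if len(fields) != 4:
            continue
        priority, weight, port, target = fields
        records.append(
            SrvRecord(
                name=left.strip(),
                priority=int(priority),
                weight=int(weight),
                port=int(port),
                # Drop the trailing dot of the fully qualified target.
                target=target.rstrip("."),
            )
        )
    return records
```

Applied to the nginx answer above, this yields four records, all with priority 0, weight 25, and port 80, one for each of web-0 through web-3.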
While DNS is failing, nothing in the cluster can resolve external domain names, but internal names still resolve.
https://github.com/kubernetes/minikube/issues/13137