ubuntu zesty / apt / dns timeout / srv records

  • Written by
    Walter Doekes
  • Published on

Ever since I updated from Ubuntu/Yakkety to Zesty, my apt-get(1) would sit and wait a while before doing actual work:

$ sudo apt-get update
0% [Working]

Madness. Let’s see what it’s doing…

$ sudo strace -f -s 512 apt-get update
...
[pid  5603] connect(3, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
...
[pid  5603] sendto(3, "\1\271\1\0\0\1\0\0\0\0\0\0\5_http\4_tcp\3ppa\tlaunchpad\3net\0\0!\0\1", 46, MSG_NOSIGNAL, NULL, 0) = 46
[pid  5603] poll([{fd=3, events=POLLIN}], 1, 5000 <unfinished ...>
...
[pid  5600] select(8, [5 6 7], [], NULL, {0, 500000}) = 0 (Timeout)
...
[pid  5600] select(8, [5 6 7], [], NULL, {0, 500000}) = 0 (Timeout)
...

That is, it does a UDP sendto(2) to 127.0.0.1:53, with a payload containing _http\4_tcp\3ppa\tlaunchpad\3net. That is a DNS lookup, of course, for _http._tcp.ppa.launchpad.net. The process then polls for up to 5000 ms for a reply before continuing.

That is an SRV lookup: the \0! near the end of the payload is query type 33. New in apt, apparently, and probably a first attempt before falling back to regular A record lookups.
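To double-check that reading of the strace output, the payload can be decoded by hand. The following is a minimal sketch, not anything apt itself does: it assumes the bytes are a plain DNS query with a single question and no name compression, and the helper name decode_question is made up for this example.

```python
import struct

# Payload copied from the strace sendto(2) line above. strace prints
# non-printable bytes as octal escapes, which Python bytes literals
# understand as well.
PAYLOAD = b"\1\271\1\0\0\1\0\0\0\0\0\0\5_http\4_tcp\3ppa\tlaunchpad\3net\0\0!\0\1"

QTYPE_NAMES = {1: "A", 28: "AAAA", 33: "SRV"}


def decode_question(packet):
    """Return (name, qtype, qclass) of the first question in a DNS query.

    Hypothetical helper: assumes an uncompressed name, which is what a
    freshly built query contains.
    """
    # The fixed DNS header is 12 bytes; the question starts right after it.
    pos = 12
    labels = []
    while True:
        length = packet[pos]
        pos += 1
        if length == 0:  # a zero-length label terminates the name
            break
        labels.append(packet[pos:pos + length].decode("ascii"))
        pos += length
    qtype, qclass = struct.unpack("!2H", packet[pos:pos + 4])
    return ".".join(labels), qtype, qclass


name, qtype, qclass = decode_question(PAYLOAD)
print(name, QTYPE_NAMES.get(qtype, qtype), qclass)
# prints: _http._tcp.ppa.launchpad.net SRV 1
```

The payload is 46 bytes long, matching the return value of sendto(2) in the trace, and class 1 is IN.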

However, it shouldn’t be timing out when no such record exists. Who is not doing its job?

$ sudo netstat -tulpen | grep 127.0.0.1:53
tcp  0  0 127.0.0.1:53  0.0.0.0:*  LISTEN  0  23600  1347/dnsmasq
udp  0  0 127.0.0.1:53  0.0.0.0:*          0  23599  1347/dnsmasq

$ dpkg -l dnsmasq | grep ^ii
ii  dnsmasq  2.76-5  all  Small caching DNS proxy and DHCP/TFTP server

Is it dnsmasq or is the problem upstream?

$ time dig -t srv _http._tcp.google.com. @ns1.google.com. | grep status:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 32887

real  0m0.023s
user  0m0.008s
sys   0m0.000s

$ time dig -t srv _http._tcp.google.com. @127.0.0.1 | grep status:

real  0m15.011s
user  0m0.004s
sys   0m0.004s

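Okay, dnsmasq is to blame. The authoritative server answers NXDOMAIN in 23 ms, while the query through 127.0.0.1 gets no answer at all: 15 seconds is dig’s default of three tries with a 5 second timeout each.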
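The same comparison can be scripted, which is handy for re-testing after a config change. This is a rough sketch under a few assumptions: it sends a single SRV query over UDP with no retries (unlike dig), it does not check the reply ID, and the function names build_srv_query and time_srv_lookup are invented for this example.

```python
import socket
import struct
import time


def build_srv_query(name, ident=0x01B9):
    """Build a minimal DNS query for an SRV record (type 33, class IN)."""
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!6H", ident, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\0"
    return header + qname + struct.pack("!2H", 33, 1)


def time_srv_lookup(name, server="127.0.0.1", port=53, timeout=5.0):
    """Send one SRV query and return (rcode, seconds elapsed).

    rcode is None if no reply arrived within the timeout.
    """
    query = build_srv_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        sock.sendto(query, (server, port))
        try:
            reply, _ = sock.recvfrom(4096)
        except socket.timeout:
            return None, time.monotonic() - start
    elapsed = time.monotonic() - start
    # The RCODE is the low 4 bits of the 16-bit flags field.
    rcode = struct.unpack("!H", reply[2:4])[0] & 0x000F
    return rcode, elapsed
```

On a machine with the problem described here, time_srv_lookup("_http._tcp.google.com") would be expected to return None after about 5 seconds, whereas a resolver that behaves should return rcode 3 (NXDOMAIN) almost immediately. With the default ident, build_srv_query("_http._tcp.ppa.launchpad.net") produces the same 46 bytes as seen in the strace output.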

Interestingly, dnsmasq does return quickly for records that exist, and even for records that don’t exist as long as the response status is NOERROR:

$ dig -t srv _http._tcp.microsoft.com. @127.0.0.1 | grep -E 'status:|^[^;].*SRV'
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 32215

$ dig -t srv _sip._udp.example-voip-provider.com @127.0.0.1 | grep -E 'status:|^[^;].*SRV'
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 27212
_sip._udp.example-voip-provider.com.  2212  IN  SRV 60 0 5060 sip01.example-voip-provider.com

Workarounds?

Other than finding out why dnsmasq misbehaves, we can quickly work around this by either removing dnsmasq altogether, or by adding a static SRV record for each affected host, as in the config file below.

With the static records, you will need to keep the list up to date whenever an apt source or its IP address changes. So if removing dnsmasq is feasible, you should consider doing that instead.

$ cat /etc/dnsmasq.d/srv-records-broken
srv-host=_http._tcp.ppa.launchpad.net,91.189.95.83,80
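Keeping that file current by hand gets tedious with more than one source, so the lines could be generated. The sketch below is only an illustration: it assumes one-line-style apt source entries, handles http:// sources only, always uses port 80, and the function names http_hosts and srv_host_lines are made up here. The resolve parameter defaults to socket.gethostbyname and exists so the lookup can be swapped out.

```python
import socket
from urllib.parse import urlsplit


def http_hosts(sources_text):
    """Return the sorted hostnames of http:// entries in apt source lines."""
    hosts = set()
    for line in sources_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop trailing comments
        if not fields or fields[0] not in ("deb", "deb-src"):
            continue
        # Skip over any [option=value] fields until the URI shows up.
        for field in fields[1:]:
            if field.startswith("http://"):
                host = urlsplit(field).hostname
                if host:
                    hosts.add(host)
                break
    return sorted(hosts)


def srv_host_lines(sources_text, resolve=socket.gethostbyname):
    """Return one dnsmasq srv-host line per http:// apt source host."""
    return [
        "srv-host=_http._tcp.{0},{1},80".format(host, resolve(host))
        for host in http_hosts(sources_text)
    ]
```

Fed the contents of /etc/apt/sources.list and the files under /etc/apt/sources.list.d/, the returned lines could be written to /etc/dnsmasq.d/srv-records-broken, followed by a dnsmasq restart. Note that the resolve step itself goes through the local resolver, so it only works if regular A record lookups are still answered.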
