From: Don Lewis
Date: Tue, 26 Jul 2016 08:59:15 -0700 (PDT)
Subject: IPv6 -> IPv4 fallback broken in serf, kernel bug?
To: freebsd-net@FreeBSD.org
Message-Id: <201607261559.u6QFxF8a081339@gw.catspoiler.org>

Serf has some code to fall back from IPv6 to IPv4 if an IPv6 connection attempt fails, and more generally to try different addresses on multi-homed servers if connection attempts fail, but it does not work properly on recent versions of FreeBSD. I've tested both recent FreeBSD 10.3-STABLE and HEAD.

The way that it is supposed to work is that serf creates a socket, sets it non-blocking, calls connect(), and then passes the fd to poll(). When the connection attempt fails, it expects to see a POLLERR event.
The POLLERR event handler will then call getsockopt(fd, SOL_SOCKET, SO_ERROR, &error, ...). If the returned error is ECONNREFUSED or one of a couple of other errors, then serf will move on to the next address.

Instead what happens is that serf also(?) sees POLLIN set, which it processes first by calling read(), which returns an ECONNREFUSED error. That is not a documented error return from read().

An easy way to test this is to truss svn and attempt to do an http checkout from a host that has both IPv6 and IPv4 addresses, but is not listening on port 80. The only connection attempt will be to the IPv6 address.

socket(PF_INET6,SOCK_STREAM|SOCK_CLOEXEC,6) = 4 (0x4)
fcntl(4,F_GETFL,) = 2 (0x2)
fcntl(4,F_SETFL,O_NONBLOCK|0x2) = 0 (0x0)
setsockopt(0x4,0x6,0x1,0x7fffffffdda4,0x4) = 0 (0x0)
gettimeofday({ 1469515046.979461 },0x0) = 0 (0x0)
connect(4,{ AF_INET6 [xxxx:xxxx:xxxx:xxxx::xxxx]:80 },28) ERR#36 'Operation now in progress'
gettimeofday({ 1469515046.979614 },0x0) = 0 (0x0)
kevent(3,{ 4,EVFILT_READ,EV_ADD,0x0,0x0,0x805491300 },1,0x0,0,0x0) = 0 (0x0)
kevent(3,{ 4,EVFILT_WRITE,EV_ADD,0x0,0x0,0x805491300 },1,0x0,0,0x0) = 0 (0x0)
kevent(3,0x0,0,{ 4,EVFILT_READ,EV_EOF,NOTE_LOWAT|0x3c,0x0,0x805491300 4,EVFILT_WRITE,EV_EOF,NOTE_LOWAT|0x3c,0x8000,0x805491300 },32,{ 0.500000000 }) = 2 (0x2)
read(4,0x80549c064,8000) ERR#61 'Connection refused'
kevent(3,{ 4,EVFILT_READ,EV_DELETE,0x0,0x0,0x0 },1,0x0,0,0x0) = 0 (0x0)
kevent(3,{ 4,EVFILT_WRITE,EV_DELETE,0x0,0x0,0x0 },1,0x0,0,0x0) = 0 (0x0)
kevent(3,{ 4,EVFILT_READ,EV_DELETE,0x0,0x0,0x0 },1,0x0,0,0x0) ERR#2 'No such file or directory'
kevent(3,{ 4,EVFILT_WRITE,EV_DELETE,0x0,0x0,0x0 },1,0x0,0,0x0) ERR#2 'No such file or directory'
close(4) = 0 (0x0)
close(3) = 0 (0x0)
svn: E170013: Unable to connect to a repository at URL ...

It looks like it should be possible to patch serf to handle this, but:

* Should POLLIN be set for this event?
* What errno value should read() return in this case? If it is ECONNREFUSED, then that should be documented.
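
For reference, the sequence that serf is expected to follow can be sketched in C roughly as below. This is not serf's code. The helper names wait_connect() and connect_any() are made up for illustration, and the sketch calls poll() directly instead of going through the kevent() path visible in the truss output. The one design point it tries to show is that SO_ERROR is fetched as soon as poll() reports anything, before any read() is attempted, so the fallback does not depend on which event bits the kernel sets.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netdb.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait for a non-blocking connect() on fd to finish.
 * Returns 0 if the connection was established, otherwise an errno value.
 */
static int
wait_connect(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLOUT };
	socklen_t len;
	int error, n;

	do {
		n = poll(&pfd, 1, timeout_ms);
	} while (n == -1 && errno == EINTR);
	if (n == -1)
		return (errno);
	if (n == 0)
		return (ETIMEDOUT);

	/* Fetch the pending socket error first, whatever revents contains. */
	error = 0;
	len = sizeof(error);
	if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &error, &len) == -1)
		return (errno);
	return (error);
}

/*
 * Hypothetical helper: try every address returned for host:port in order.
 * Returns a connected non-blocking socket, or -1 with errno set to the
 * error from the last attempt.
 */
static int
connect_any(const char *host, const char *port, int timeout_ms)
{
	struct addrinfo hints, *res, *ai;
	int error, fd, flags;

	assert(host != NULL && port != NULL);
	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_UNSPEC;		/* both IPv6 and IPv4 */
	hints.ai_socktype = SOCK_STREAM;
	if (getaddrinfo(host, port, &hints, &res) != 0) {
		errno = EHOSTUNREACH;		/* simplification for the sketch */
		return (-1);
	}

	fd = -1;
	error = EHOSTUNREACH;
	for (ai = res; ai != NULL; ai = ai->ai_next) {
		fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
		if (fd == -1) {
			error = errno;
			continue;
		}
		flags = fcntl(fd, F_GETFL, 0);
		if (flags == -1 ||
		    fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1) {
			error = errno;
		} else if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0) {
			error = 0;		/* connected immediately */
		} else if (errno == EINPROGRESS) {
			error = wait_connect(fd, timeout_ms);
		} else {
			error = errno;
		}
		if (error == 0)
			break;
		/* This address failed: drop the socket and try the next one. */
		close(fd);
		fd = -1;
	}
	freeaddrinfo(res);
	if (fd == -1)
		errno = error;
	return (fd);
}
```

A caller would use something like connect_any("svn.example.org", "80", 500), where the host name and timeout are placeholders. A real client would probably move on only for a specific set of errors such as ECONNREFUSED, rather than for every failure as this sketch does.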