From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 279951] dhclient unable to reuse recorded lease after timeout, since 12.1
Date: Mon, 24 Jun 2024 14:16:44 +0000
List-Id: Bug reports
List-Archive: https://lists.freebsd.org/archives/freebsd-bugs
Sender:
 owner-freebsd-bugs@FreeBSD.org

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=279951

            Bug ID: 279951
           Summary: dhclient unable to reuse recorded lease after timeout,
                    since 12.1
           Product: Base System
           Version: 15.0-CURRENT
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Many People
          Priority: ---
         Component: bin
          Assignee: bugs@FreeBSD.org
          Reporter: viktor.stujber+freebsd-bugs_v4CCPfay@gmail.com

At boot time, dhclient is supposed to fall back to /var/db/dhclient.leases
when it reaches 'timeout' (60s) without getting a DHCPOFFER response.

For 3 years I've been observing that my FreeBSD router no longer started its
services properly after a power outage, when previously it all worked fine and
there were no relevant configuration changes, aside from gradual OS upgrades.

The core reason was that dhclient's behavior changed in 12.1-RELEASE from
"No DHCPOFFERS received. Trying recorded lease X. bound: renewal in Y seconds."
to "No working leases in persistent database - sleeping.", leaving the WAN
interface with no IP address.

The supporting reason is that this only happens when the ISP's network is slow
to recover, and when the server has no battery backup. If there is no reboot,
the already running dhclient most likely just retains the active lease and IP
binding.

I have reproduced this issue in a VM by letting it boot once to get the lease,
then firewalling off the VM on the hypervisor level and rebooting (cannot
simply disconnect link; cannot use on-host pf because it starts too late). I
have tested 15.0-CURRENT-20240620, 14.0-RELEASE, 12.4-RELEASE, 12.1-RELEASE and
all trivially reproduce it. 12.0-RELEASE doesn't.

Via tedious buildworld bisection, I have identified the cause as b6c2f6eb
(base r344237) from 2019-02-17. In main branch these were 95f237c2 (base
r343896) followed by 3b08e0fc (base r343922) from 2019-02-08.
https://cgit.freebsd.org/src/commit/sbin/dhclient/dhclient.c?id=b6c2f6eb29f4ca183a8877ccedd97ee98d583df3
https://cgit.freebsd.org/src/commit/sbin/dhclient/dhclient.c?id=95f237c2f65130b6567e69df06c393586e3969a3
https://cgit.freebsd.org/src/commit/sbin/dhclient/dhclient.c?id=3b08e0fcf357c1a905c5e59731930528fb94a0b1

MFC r343896,r343922: dhclient: Pass through exit status from script

@@ -2353,2 +2353,3 @@ priv_script_go(void)
-	return (wstatus & 0xff);
+	return (WIFEXITED(wstatus) ?
+	    WEXITSTATUS(wstatus) : 128 + WTERMSIG(wstatus));
 }

/usr/include/sys/wait.h:#define _WSTATUS(x)     (_W_INT(x) & 0177)
/usr/include/sys/wait.h:#define WTERMSIG(x)     (_WSTATUS(x))
/usr/include/sys/wait.h:#define WIFEXITED(x)    (_WSTATUS(x) == 0)
/usr/include/sys/wait.h:#define WEXITSTATUS(x)  (_W_INT(x) >> 8)

A bunch of internal dhclient logic isn't handled directly by C code; instead,
the code sets up parameters for a 'script' and then runs it via script_go(). I
found 12 such places in dhclient.c. At least 5 places treat the return value as
a boolean 0=success, !0=error. Crucially, the lease reuse logic in
dhclient.c::state_panic() does this, invoking dhclient-script 'TIMEOUT' (add
new address) operation.

	note("Trying recorded lease %s",
	    piaddr(ip->client->active->address));
	/* Run the client script with the existing parameters. */
	script_init("TIMEOUT", ip->client->active->medium);
	script_write_params("new_", ip->client->active);
	if (ip->client->alias)
		script_write_params("alias_", ip->client->alias);

	/* If the old lease is still good and doesn't yet need renewal,
	   go into BOUND state and timeout at the renewal time. */
	if (!script_go()) {
		if (cur_time < ip->client->active->renewal) {
			ip->client->state = S_BOUND;
			note("bound: renewal in %d seconds.",

Adding debug logging code reveals that wstatus is 256, meaning that originally
the return value was 0, but after the change it became 1.
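To make the 0-versus-1 arithmetic concrete, here is a minimal sketch of the two return expressions from the diff, lifted into standalone functions. The function names are illustrative only and do not exist in dhclient.c; the sketch also assumes the usual wait status encoding implied by the sys/wait.h macros quoted above (exit code in the high byte, termination signal in the low 7 bits).

```c
#include <sys/wait.h>

/*
 * Expression used by priv_script_go() before the change: the low byte of
 * the raw wait status. It is 0 whenever the child exited on its own,
 * regardless of the exit code. (Illustrative name, not from dhclient.c.)
 */
static int
status_before_change(int wstatus)
{
	return (wstatus & 0xff);
}

/*
 * Expression used after the change: the script's own exit code, or
 * 128 + signal number if the child was killed by a signal.
 * (Illustrative name, not from dhclient.c.)
 */
static int
status_after_change(int wstatus)
{
	return (WIFEXITED(wstatus) ?
	    WEXITSTATUS(wstatus) : 128 + WTERMSIG(wstatus));
}
```

With the observed wstatus of 256 (a script that ran to completion and called exit 1), the first function yields 0 and the second yields 1, so a caller written as `if (!script_go())` takes the success branch only with the old expression.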
The value changes because priv_script_go() no longer returns just the wait()
outcome of the forked subprocess (exited / signaled / coredumped); it now
provides the script's programmatic exit code. And the way dhclient-script's
TIMEOUT is structured, it exits with 0 in some specific cases, and 1 in the
default path.

The above-mentioned change thus shows a fundamental lack of understanding of
how dhclient's code operates, resulting in a 'return type confusion'
programming mistake. The immediate fix would be to review each call to
script_go() that checks the result, and ensure that the check aligns with the
exit codes in dhclient-script.

Curiously, one of the commits notes that it's identical to OpenBSD's 1.117 from
2008 ( https://cvsweb.openbsd.org/src/sbin/dhclient/dhclient.c ). The code
structure is nearly identical, so I presume that change also introduced this
same issue to OpenBSD. From the revision log I see that it also seems to have
gone unnoticed, for 4 years, until 1.159 'nuked' the entire script integration
framework and thus unintentionally fixed it.
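For illustration only, one way a caller could recover the pre-change meaning ("the script process ran to completion") from the new return value is sketched below. The helper name is hypothetical and not part of dhclient.c, and the sketch assumes dhclient-script never exits with a code of 128 or more, because such a code cannot be told apart from 128 + signal number. It is not the committed fix, and whether the exit code should be ignored for TIMEOUT at all is a separate question.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not in dhclient.c): given the value returned by
 * script_go() after the change, report whether the script ended by
 * calling exit rather than being killed by a signal.
 *
 * Assumption: the script's own exit codes stay below 128.
 */
static bool
script_exited_normally(int rc)
{
	return (rc >= 0 && rc < 128);
}
```

Under that assumption, a check written as `if (script_exited_normally(script_go()))` would behave like the old `if (!script_go())` did for a script that exits with 1.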