From owner-freebsd-isdn Sun Jan 9 9:21:59 2000 Delivered-To: freebsd-isdn@freebsd.org Received: from mail.sparud.net (hinken.sparud.net [193.12.107.26]) by hub.freebsd.org (Postfix) with ESMTP id EB47215080 for ; Sun, 9 Jan 2000 09:21:52 -0800 (PST) (envelope-from jan@sparud.net) Received: by mail.sparud.net (Postfix, from userid 10) id EAD703E24; Sun, 9 Jan 2000 18:21:50 +0100 (CET) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <14456.50094.237265.228095@hinken.sparud.net> Date: Sun, 9 Jan 2000 18:21:50 +0100 (CET) To: freebsd-isdn@freebsd.org Subject: LCP loop problem fixed X-Mailer: VM 6.75 under Emacs 20.3.1 From: jan@sparud.net (Jan Sparud) Sender: owner-freebsd-isdn@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org (Sorry for the length of this mail) I think I have found the reason for the LCP loop bug that some people, including me, have painfully experienced. I had heard that the bug wasn't in 0.71 so I took a stab at trying to get the sppp files in 0.71 (sppp/if_sppp.h, sppp/if_spppsubr.c and driver/i4b_isppp.c) to work in 0.90. That went smoothly, and sure enough, the LCP loop bug was refreshingly non-existent. I looked at the diffs between 0.71 and 0.90 but couldn't find anything that looked very suspicious. So I started by included about half the diffs, recompiled and tried again. Didn't work. Removed more parts of the diffs, and tried again. After spending half a day doing this diff-diff binary search I finally found out the one thing that made the difference. *** if_spppsubr.c Sat Jan 8 13:28:48 2000 --- if_spppsubr.c.15 Sat Jan 8 13:20:54 2000 *************** *** 1822,1835 **** sp->lcp.protos = 0; sp->lcp.mru = sp->lcp.their_mru = PP_MTU; ! /* ! * Initialize counters and timeout values. Note that we don't ! * use the 3 seconds suggested in RFC 1661 since we are likely ! * running on a fast link. XXX We should probably implement ! * the exponential backoff option. Note that these values are ! * relevant for all control protocols, not just LCP only. ! */ ! sp->lcp.timeout = 1 * hz; sp->lcp.max_terminate = 2; sp->lcp.max_configure = 10; sp->lcp.max_failure = 10; --- 1828,1836 ---- sp->lcp.protos = 0; sp->lcp.mru = sp->lcp.their_mru = PP_MTU; ! /* Note that these values are relevant for all control protocols */ ! sp->lcp.timeout = 3 * hz; ! sp->lcp.max_terminate = 2; sp->lcp.max_configure = 10; sp->lcp.max_failure = 10; When setting the lcp timeout to 1 * hz, everything works beautifully again (both dial-out and dial-in). The comment refers to the following section in RFC 1661: 4.6. Counters and Timers Restart Timer There is one special timer used by the automaton. The Restart timer is used to time transmissions of Configure-Request and Terminate-Request packets. Expiration of the Restart timer causes a Timeout event, and retransmission of the corresponding Configure-Request or Terminate-Request packet. The Restart timer MUST be configurable, but SHOULD default to three (3) seconds. Implementation Note: The Restart timer SHOULD be based on the speed of the link. The default value is designed for low speed (2,400 to 9,600 bps), high switching latency links (typical telephone lines). Higher speed links, or links with low switching latency, SHOULD have correspondingly faster retransmission times. Instead of a constant value, the Restart timer MAY begin at an initial small value and increase to the configured final value. Each successive value less than the final value SHOULD be at least twice the previous value. The initial value SHOULD be large enough to account for the size of the packets, twice the round trip time for transmission at the link speed, and at least an additional 100 milliseconds to allow the peer to process the packets before responding. Some circuits add another 200 milliseconds of satellite delay. Round trip times for modems operating at 14,400 bps have been measured in the range of 160 to more than 600 milliseconds. I don't know why it doesn't work at all with the default timeout (3s). However, the RFC says that the timeout must be configurable, and that the default is for slow modems with high switching latency links, and that higher speed links should have correspondingly faster retransmission times. I have tried with a few different values: between 0.1 secs and around 2.1 secs it works, but with the higher values an additional pause, with the same length as the timeout, slows down the connection. With higher values than 2.1 secs it doesn't work at all (it loops). So in my setup I now use a timeout of 0.1 secs, which gives me a total connect time of about 0.8 secs. I have modified the ioctl routines in if_spppsubr.c to accept setting the lcp timeout value. I modified sppp_params to accept setting values when the interface is in state ESTABLISH too (i.e., not just in DEAD phase). I also modified spppcontrol to accept a new option "lcp-timeout" that takes an integer as argument, which is the new timeout value in milliseconds. Spppcontrol now also displays the current setting of the timeout value. Diffs are included below. When the remote site dials me and tries to connect, I can now change the timeout value instantaneously. The following tcpdump output shows a session when the timeout value is initially 3 s (the default value). After having looped for a while, I changed the timeout (using spppcontrol) to 0.1 s. Immediately after that, the connection succeeds. psoda# tcpdump -i isp1 tcpdump: listening on isp1 18:12:28.864739 LCP (c021): ID=018 Conf-REQ, Magic-Number=1070477904 18:12:28.912808 LCP (c021): ID=203 Conf-REQ, Magic-Number=1707765220 18:12:28.912821 LCP (c021): ID=203 Conf-ACK, Magic-Number=1707765220 18:12:30.913363 LCP (c021): ID=204 Conf-REQ, Magic-Number=1707765220 18:12:30.913374 LCP (c021): ID=204 Conf-ACK, Magic-Number=1707765220 18:12:32.916666 LCP (c021): ID=205 Conf-REQ, Magic-Number=1707765220 18:12:32.916676 LCP (c021): ID=205 Conf-ACK, Magic-Number=1707765220 18:12:34.916205 LCP (c021): ID=206 Conf-REQ, Magic-Number=1707765220 18:12:34.916214 LCP (c021): ID=206 Conf-ACK, Magic-Number=1707765220 18:12:36.919511 LCP (c021): ID=207 Conf-REQ, Magic-Number=1707765220 18:12:36.919527 LCP (c021): ID=207 Conf-ACK, Magic-Number=1707765220 18:12:37.010008 LCP (c021): ID=019 Conf-REQ, Magic-Number=1070477904 18:12:37.020759 LCP (c021): ID=019 Conf-ACK, Magic-Number=1070477904 18:12:37.020776 IP-CP (8021): ID=020 Conf-REQ, IP-Address: 0.0.0.0 18:12:37.023138 IP-CP (8021): ID=024 Conf-REQ, IP-Address: 193.12.107.34 18:12:37.023150 IP-CP (8021): ID=024 Conf-ACK, IP-Address: 193.12.107.34 18:12:37.024876 CISCODP-CP (8207): 0x01 0x18 0x00 0x04 18:12:37.024881 LCP (c021): ID=021 Protocol-REJ, Protocol=??? (000a): 18:12:37.031506 IP-CP (8021): ID=020 Conf-NAK, IP-Address: 193.12.107.46 18:12:37.031591 IP-CP (8021): ID=022 Conf-REQ, IP-Address: 193.12.107.46 18:12:37.042015 IP-CP (8021): ID=022 Conf-ACK, IP-Address: 193.12.107.46 18:12:37.350768 IP (0021): 193.12.107.26 > 193.12.107.46: icmp: echo request Here are the diffs: *** if_spppsubr.c.0.90 Wed Jan 5 15:34:40 2000 --- if_spppsubr.c Sun Jan 9 17:58:52 2000 *************** *** 1926,1932 **** sp->lcp.mru = sp->lcp.their_mru = PP_MTU; /* Note that these values are relevant for all control protocols */ ! sp->lcp.timeout = 3 * hz; sp->lcp.max_terminate = 2; sp->lcp.max_configure = 10; --- 1934,1940 ---- sp->lcp.mru = sp->lcp.their_mru = PP_MTU; /* Note that these values are relevant for all control protocols */ ! sp->lcp.timeout = 1 * hz; sp->lcp.max_terminate = 2; sp->lcp.max_configure = 10; *************** *** 4221,4227 **** * secrets are cleared if the authentication protocol is * reset to 0. */ ! if (sp->pp_phase != PHASE_DEAD) return EBUSY; if ((spr.defs.myauth.proto != 0 && spr.defs.myauth.proto != PPP_PAP && --- 4234,4241 ---- * secrets are cleared if the authentication protocol is * reset to 0. */ ! if (!((sp->pp_phase == PHASE_DEAD) || ! (sp->pp_phase == PHASE_ESTABLISH))) return EBUSY; if ((spr.defs.myauth.proto != 0 && spr.defs.myauth.proto != PPP_PAP && *************** *** 4252,4257 **** --- 4266,4275 ---- if (spr.defs.hisauth.secret[0] != '\0') bcopy(spr.defs.hisauth.secret, sp->hisauth.secret, AUTHKEYLEN); + } + if (spr.defs.lcp.timeout != 0) { + /* Set the LCP timeout */ + sp->lcp.timeout = spr.defs.lcp.timeout; } break; *** spppcontrol.c.090 Sun Jan 9 15:18:47 2000 --- spppcontrol.c Sun Jan 9 16:01:11 2000 *************** *** 84,89 **** --- 84,92 ---- #define PPP_PAP 0xc023 #define PPP_CHAP 0xc223 + /* Hmm */ + #define HZ 100 + int main(int argc, char **argv) { *************** *** 190,196 **** spr.defs.hisauth.flags |= AUTHFLAG_NORECHALLENGE; else if (strcmp(argv[0], "rechallenge") == 0) spr.defs.hisauth.flags &= ~AUTHFLAG_NORECHALLENGE; ! else errx(EX_DATAERR, "bad parameter: \"%s\"", argv[0]); argv++; --- 193,205 ---- spr.defs.hisauth.flags |= AUTHFLAG_NORECHALLENGE; else if (strcmp(argv[0], "rechallenge") == 0) spr.defs.hisauth.flags &= ~AUTHFLAG_NORECHALLENGE; ! else if (startswith("lcp-timeout=")) { ! int timeout_arg = atoi(argv[0]+off); ! if ((timeout_arg > 20000) || (timeout_arg <= 0)) ! errx(EX_DATAERR, "bad lcp timeout value: %s", ! argv[0]+off); ! spr.defs.lcp.timeout = timeout_arg * HZ / 1000; ! } else errx(EX_DATAERR, "bad parameter: \"%s\"", argv[0]); argv++; *************** *** 232,237 **** --- 241,248 ---- AUTHNAMELEN, sp->defs.hisauth.name, authflags(sp->defs.hisauth.flags)); } + printf("\tlcp timeout: %.3f s\n", + (double)sp->defs.lcp.timeout / HZ); } const char * Jan Sparud To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-isdn" in the body of the message