Date: Wed, 16 Jun 2004 11:49:01 -0300
From: "Sergio de Souza Prallon" <prallon@uol.com.br>
To: FreeBSD-gnats-submit@FreeBSD.org
Cc: Sergio de Souza Prallon <prallon@tmp.com.br>
Subject: kern/68011: [patch] Isochronous delays in PPPoE
Message-ID: <20040616144905.71749AA0B@scorpion1.uol.com.br>
Resent-Message-ID: <200406161450.i5GEoQFt043969@freefall.freebsd.org>

>Number:         68011
>Category:       kern
>Synopsis:       [patch] Isochronous delays in PPPoE
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Jun 16 14:50:26 GMT 2004
>Closed-Date:
>Last-Modified:
>Originator:     Sergio de Souza Prallon
>Release:        FreeBSD 4.10-STABLE i386
>Organization:
>Environment:
>Description:

I use clockspeed (ports/sysutils/clockspeed) to keep my clock in sync.
A couple of months ago I noticed it was no longer able to get the time
reliably. When run from the command line it produced error messages and
sometimes failed to set the clock.

Pinging the NTP server, I saw that the RTT was too high (~500-1000ms).
Even more annoying was the fact that most of the ICMP replies had the
same RTT (to within 1ms). Pinging other sites and servers gave the same
results, and so did pinging the PPPoE terminator. TCP connections were
normal except for a "lag" in interactive SSH sessions to remote hosts.
HTTP downloads were acceptable.

At first, I thought it was a problem with my access provider, but they
assured me everything was just fine on their side (no alarms, no
abnormal error rates, etc). Not that I really trust them, but I decided
to investigate my side.

My HW configuration had not changed for months, so the problem had to be
software related. A week or two before, I had cvsup'ed and rebuilt my
system. To check this, I cvsup'ed again, this time to 4.9-REL. The
problem vanished.

After diffing 4.9-REL against 4.10-ST, I began trying to pinpoint the
change(s) that caused the problem. Eventually I came to 3 diffs that
were committed at the same time with the same CVS comment:

----8<--------8<--------8<--------8<--------8<--------8<----
MFC: speedup stream socket recv handling by tracking the tail of
the mbuf chain instead of walking the list for each append.
This has been pretty well tested at Yahoo!

Obtained from:  netbsd (jason thorpe)
Reviewed by:    silby
----8<--------8<--------8<--------8<--------8<--------8<----

I fail to understand how such a change could slow down (or synchronize)
my traffic. I don't see any time dependency (spin loops or sleeps) in
it, but it does trigger the problem.

To document it, I produced a screen (ports/misc/screen) session where I
show:

1) The problem occurring on an up to date system.
2) That a 4.9-REL system does not have it.
3) That a patched 4.9-REL kernel has it (with both userlands).

The screenlog plus (possibly) relevant syslog and config info (including
the diff that causes the bug) are in an annex file.

I don't know if it affects other types of connections. I only have ADSL
here.

>How-To-Repeat:

Start with a 4.9-REL system. Apply the patch and make a new kernel. It
should exhibit the problem. Based on what the patch changes, I don't
think it's platform specific, but I can't prove it.

>Fix:

I'm currently running a 4.9-REL kernel with a 4.10-ST userland just
fine.

I believe that undoing the change should fix(?) the problem. I have not
tested it, because the patch fails to reverse due to other changes made
to the code after this one. Of course, the correct solution is to
understand what's going on and rewrite the change.

>Release-Note:
>Audit-Trail:
>Unformatted:
>System: FreeBSD ethshar 4.10-STABLE FreeBSD 4.10-STABLE #0: Sun Jun 13 13:05:35 BRT 2004 root@ethshar:/aux/src/sys/compile/TEST i386

Machine is an Intel Seattle II (SE440BX-2) + PIII 600E + 256MB RAM +
20GB HD. The Internet connection is ADSL (256Kbps). It uses a VIA
Rhine III ethernet + USR 9001 ADSL modem. I don't know the brand of
the DSLAM, but the tunnel terminator is probably a Cisco 6400.
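
For readers unfamiliar with the technique named in the CVS comment quoted in the description ("tracking the tail of the mbuf chain instead of walking the list for each append"), here is a minimal sketch of the general idea in plain C. It is not the FreeBSD code and it makes no claim about where the reported delay comes from. The names `struct buf`, `struct chain`, `append_walk`, `append_tail`, `pop_head` and `chain_check` are hypothetical stand-ins invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an mbuf: one buffer in a singly linked chain. */
struct buf {
    struct buf *next;
    int len;
};

/* Hypothetical stand-in for a socket buffer: chain head plus a cached tail. */
struct chain {
    struct buf *head;
    struct buf *tail;   /* last buffer in the chain, NULL when empty */
    int total;          /* sum of len over all buffers in the chain */
};

/* Walking append: finds the end by traversing the chain, O(n) per call. */
void append_walk(struct chain *c, struct buf *b)
{
    struct buf **pp = &c->head;

    while (*pp != NULL)
        pp = &(*pp)->next;
    b->next = NULL;
    *pp = b;
    c->tail = b;        /* keep the cached tail valid for append_tail() */
    c->total += b->len;
}

/* Tail-tracking append: uses the cached tail, O(1) per call. */
void append_tail(struct chain *c, struct buf *b)
{
    b->next = NULL;
    if (c->tail == NULL)
        c->head = b;
    else
        c->tail->next = b;
    c->tail = b;
    c->total += b->len;
}

/* Remove the first buffer; the cached tail must be cleared when the
 * chain becomes empty, otherwise a later append would use a stale tail. */
struct buf *pop_head(struct chain *c)
{
    struct buf *b = c->head;

    if (b == NULL)
        return NULL;
    c->head = b->next;
    if (c->head == NULL)
        c->tail = NULL;
    c->total -= b->len;
    b->next = NULL;
    return b;
}

/* Invariant check: the cached tail is the last buffer reachable from head. */
void chain_check(const struct chain *c)
{
    const struct buf *last = NULL;
    const struct buf *b;

    for (b = c->head; b != NULL; b = b->next)
        last = b;
    assert(last == c->tail);
}
```

The point of the sketch is that a cached tail is only correct if every path that modifies the chain also updates it, which is why `pop_head` resets `tail` when the chain empties and why `chain_check` exists as a debugging aid. Whether anything like this is involved in the behaviour reported here is not established by this report.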
