Date: Sat, 22 Nov 2003 03:00:53 -0500 From: Mike Tancsa <mike@sentex.net> To: freebsd-net@freebsd.org Subject: Broken ppp LQR - FreeBSD or ERX ? Message-ID: <6.0.1.1.0.20031121221443.08ed0860@209.112.4.2>
next in thread | raw e-mail | index | archive | help
I have been trying to track down a rather old mystery as to why when using PPPoE via userland PPP the daemon would not be able to detect if the DSL connection was down when the other side was a Juniper ERX. The same machines when using mpd works fine and can detect via LCP echos if the link is down. Also, both mpd and ppp work correctly when talking PPPoE to an SMS. Looking through the logs, it would appear that LQR doesnt work against the ERX (ie the ERX is broken) or FreeBSD's LQR is broken, or the two dont like each other. According to the man page, When LQR is enabled, ppp sends the QUALPROTO option (see ``set lqrperiod'' below) as part of the LCP request. If the peer agrees, both sides will exchange LQR packets at the agreed frequency, allowing detailed link quality monitoring by enabling LQM logging. If the peer doesn't agree, ppp will send ECHO LQR requests instead. These packets pass no information of interest, but they MUST be replied to by the peer. So, there is a sort of "Plan A" and "Plan B". e.g on a session against an ERX Mov 22 02:34:04 datawest3 ppp.prev[52036]: tun0: LQM: deflink: Will send LQR every 10.00 secs Nov 22 02:34:04 datawest3 ppp.prev[52036]: tun0: LQM: deflink: Output: Nov 22 02:34:04 datawest3 ppp.prev[52036]: tun0: LQM: Magic: 860c0808 LastOutLQRs: 00000000 Nov 22 02:34:04 datawest3 ppp.prev[52036]: tun0: LQM: LastOutPackets: 00000000 LastOutOctets: 00000000 Nov 22 02:34:04 datawest3 ppp.prev[52036]: tun0: LQM: PeerInLQRs: 00000000 PeerInPackets: 00000006 Nov 22 02:34:04 datawest3 ppp.prev[52036]: tun0: LQM: PeerInDiscards: 00000000 PeerInErrors: 00000000 Nov 22 02:34:04 datawest3 ppp.prev[52036]: tun0: LQM: PeerInOctets: 00000076 PeerOutLQRs: 00000001 Nov 22 02:34:04 datawest3 ppp.prev[52036]: tun0: LQM: PeerOutPackets: 0000000a PeerOutOctets: 00000141 Looking at the source code this implies the LQM_LQR method. Nov 22 02:34:44 datawest3 ppp.prev[52036]: tun0: LQM: deflink: Output (again): Nov 22 02:34:44 datawest3 ppp.prev[52036]: tun0: LQM: Magic: 860c0808 LastOutLQRs: 00000000 Nov 22 02:34:44 datawest3 ppp.prev[52036]: tun0: LQM: LastOutPackets: 00000000 LastOutOctets: 00000000 Nov 22 02:34:44 datawest3 ppp.prev[52036]: tun0: LQM: PeerInLQRs: 00000000 PeerInPackets: 00000011 Nov 22 02:34:44 datawest3 ppp.prev[52036]: tun0: LQM: PeerInDiscards: 00000000 PeerInErrors: 00000000 Nov 22 02:34:44 datawest3 ppp.prev[52036]: tun0: LQM: PeerInOctets: 00000196 PeerOutLQRs: 00000001 Nov 22 02:34:44 datawest3 ppp.prev[52036]: tun0: LQM: PeerOutPackets: 00000017 PeerOutOctets: 000002ef Nov 22 02:34:48 datawest3 ppp.prev[52036]: tun0: LCP: deflink: RecvEchoRequest(4) state = Opened Nov 22 02:34:48 datawest3 ppp.prev[52036]: tun0: LCP: deflink: SendEchoReply(4) state = Opened Nov 22 02:34:55 datawest3 ppp.prev[52036]: tun0: LQM: deflink: Output (again): Nov 22 02:34:55 datawest3 ppp.prev[52036]: tun0: LQM: Magic: 860c0808 LastOutLQRs: 00000000 Nov 22 02:34:55 datawest3 ppp.prev[52036]: tun0: LQM: LastOutPackets: 00000000 LastOutOctets: 00000000 Nov 22 02:34:55 datawest3 ppp.prev[52036]: tun0: LQM: PeerInLQRs: 00000000 PeerInPackets: 00000012 Nov 22 02:34:55 datawest3 ppp.prev[52036]: tun0: LQM: PeerInDiscards: 00000000 PeerInErrors: 00000000 Nov 22 02:34:55 datawest3 ppp.prev[52036]: tun0: LQM: PeerInOctets: 000001a5 PeerOutLQRs: 00000001 Nov 22 02:34:55 datawest3 ppp.prev[52036]: tun0: LQM: PeerOutPackets: 00000019 PeerOutOctets: 00000335 Nov 22 02:34:58 datawest3 ppp.prev[52036]: tun0: LCP: deflink: RecvEchoRequest(5) state = Opened Nov 22 02:34:58 datawest3 ppp.prev[52036]: tun0: LCP: deflink: SendEchoReply(5) state = Opened Nov 22 02:35:05 datawest3 ppp.prev[52036]: tun0: Phase: deflink: ** Too many LQR packets lost ** Nov 22 02:35:05 datawest3 ppp.prev[52036]: tun0: LQM: deflink: Too many LQR packets lost Nov 22 02:35:05 datawest3 ppp.prev[52036]: tun0: CCP: deflink: State change Stopped --> Closed But as you can see from the logs, it doesnt work. The host never responds and ppp hangs up even though it really is connected. This compared to a connection against an SMS Nov 22 02:38:02 gastest ppp[5018]: tun0: LQM: deflink: LQR/ECHO LQR not negotiated Note in this case, ppp reverts to "plan B" where it will just send LCP echos. Now, getting back to the machine that connects to the ERX, if I make the following change to the source code, datawest3# diff -u lcp.c lcp.c.patch --- lcp.c Sat Aug 31 22:12:28 2002 +++ lcp.c.patch Fri Nov 21 19:50:07 2003 @@ -575,6 +575,11 @@ log_Printf(LogLCP, "%s: LayerUp\n", fp->link->name); physical_SetAsyncParams(p, lcp->want_accmap, lcp->his_accmap); lqr_Start(lcp); + + + p->hdlc.lqm.method = LQM_ECHO; + + hdlc_StartTimer(&p->hdlc); fp->more.reqs = fp->more.naks = fp->more.rejs = lcp->cfg.fsm.maxreq * 3; datawest3# i.e. to force "plan b", where it just uses LCP echos everything against the ERX works! It sends and receives LCP echo requests just fine, and if it does not hear 5 back in a row, it determines the links is down. As a work around, is there any way to force the LQM_ECHO method through the ppp.conf ? Has anyone else run into this ? ---Mike -------------------------------------------------------------------- Mike Tancsa, tel +1 519 651 3400 Sentex Communications, mike@sentex.net Providing Internet since 1994 www.sentex.net Cambridge, Ontario Canada www.sentex.net/mike
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?6.0.1.1.0.20031121221443.08ed0860>