Date: Sun, 09 Dec 2007 13:59:38 -0800 From: Julian Elischer <julian@elischer.org> To: cpghost <cpghost@cordula.ws> Cc: Brian Somers <brian@Awfulhak.org>, Alexander Motin <mav@FreeBSD.org>, freebsd-stable@freebsd.org Subject: Re: "no matching session" in ng_pppoe.c 1.74.2.4? (RELENG_6) Message-ID: <475C654A.2070607@elischer.org> In-Reply-To: <20071209215013.3cd794fe@epia-2.farid-hajji.net> References: <20071206030500.746c782d@epia-2.farid-hajji.net> <4757E39C.8020009@FreeBSD.org> <20071206161107.3c0c9a82@epia-2.farid-hajji.net> <20071209173359.710ea5bd@epia-2.farid-hajji.net> <475C3E49.6000906@elischer.org> <20071209215013.3cd794fe@epia-2.farid-hajji.net>
next in thread | previous in thread | raw e-mail | index | archive | help
cpghost wrote: > On Sun, 09 Dec 2007 11:13:13 -0800 > Julian Elischer <julian@elischer.org> wrote: > >> cpghost wrote: >>> On Thu, 6 Dec 2007 16:11:07 +0100 >>> cpghost <cpghost@cordula.ws> wrote: >>> >>>> On Thu, 06 Dec 2007 13:57:16 +0200 >>>> Alexander Motin <mav@FreeBSD.org> wrote: >>>> >>>>> cpghost wrote: >>>>>> The problem is that the last mile carrier of the PPP provider >>>>>> that this router is attached to disconnects the ppp session >>>>>> forcibly once every 24h. Before the update, ppp would detect >>>>>> this and reconnect immediately. After the update, ppp doesn't >>>>>> recover gracefully from this anymore, but spits out on the >>>>>> console: >>>>>> >>>>>> ng_pppoe[5]: no matching session >>>>>> >>>>>> for hours, and tries to connect again every two minutes without >>>>>> success, until I manually stop and restart the userland ppp >>>>>> daemon (and then the connection is immediately restored with a >>>>>> new session). I've tried this for a few days now, and it is >>>>>> always the same: it's definitely not a problem on the provider's >>>>>> side: As soon as ppp restarts, it gets a new session without any >>>>>> problems and connects again. >>>>>> >>>>>> Since the last working sources were from 2007/09/25, and >>>>>> ng_pppoe.c was at rev. 1.74.2.3; and the new revision of >>>>>> ng_pppoe.c is now at 1.74.2.4; I'm suspecting that whatever >>>>>> was changed there could be the cause (because this "no matching >>>>>> session" is being logged from there). >>>>> I have tested and unable to reproduce that myself with ppp -> mpd >>>>> or mpd >>>>> - -> mpd PPPoE connections. Actually I am not sure about any >>>>> difference between reconnect and ppp restart. From the ng_pppoe >>>>> node point of view it should be the same. >>>>> >>>>> Could you provide tcpdump output for connection tries from your >>>>> Ethernet interface? Use "-pes 0" options please. >>>> Will do; but I'll first have to wait 24h from now to get a >>>> forcibly disconnected session (I've just had to restart ppp >>>> again). >>> All right, I've got a good tcpdump now: >>> >>> # tcpdump -i sis0 -n -pes 0 >>> tcpdump: verbose output suppressed, use -v or -vv for full protocol >>> decode listening on sis0, link-type EN10MB (Ethernet), capture size >>> 65535 bytes >>> >>> 17:06:08.400881 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff, >>> ethertype PPPoE D (0x8863), length 32: PPPoE PADI [Host-Uniq >>> 0xC0C263C1] [Service-Name] >>> >>> 17:06:08.400891 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff, ethertype >>> PPPoE D (0x8863), length 32: PPPoE PADI [Host-Uniq 0xC0ED45C1] >>> [Service-Name] >>> >>> 17:06:08.400898 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff, ethertype >>> PPPoE D (0x8863), length 32: PPPoE > PADI [Host-Uniq 0x40C263C1] >>> [Service-Name] >>> >>> 17:06:08.400904 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff, >>> ethertype PPPoE D (0x8863), length 32: PPPoE PADI [Host-Uniq >>> 0x80CA63C1] [Service-Name] >>> >>> 17:06:08.400910 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff, ethertype >>> PPPoE D (0x8863), length 32: PPPoE PADI [Host-Uniq 0x80C963C1] >>> [Service-Name] >> >> why are we sending PADIs for 4 different sessions? >> what does ngctl list >> show? > > Right now, with the working connection: > > # ngctl list > There are 6 total nodes: > Name: ngctl53522 Type: socket ID: 000002ec Num hooks: 0 > Name: <unnamed> Type: socket ID: 000002eb Num hooks: 1 > Name: <unnamed> Type: pppoe ID: 00000005 Num hooks: 2 > Name: sis2 Type: ether ID: 00000003 Num hooks: 0 > Name: sis1 Type: ether ID: 00000002 Num hooks: 0 > Name: sis0 Type: ether ID: 00000001 Num hooks: 1 > >>> 17:06:08.528227 00:90:1a:a0:15:b7 > 00:00:24:c2:45:74, ethertype >>> PPPoE D (0x8863), length 66: PPPoE PADO [AC-Name "DSSX43-erx"] >>> [Host-Uniq 0xC0C263C1] [Service-Name] [AC-Cookie "..7\t.K.,.!y.y.E"] >>> >> >> they respond to the first one we sent. >> this info should now be sent the ppp daemon. >> >> >>> 17:08:08.488679 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff, ethertype >>> PPPoE D (0x8863), length 32: PPPoE PADI [Host-Uniq 0x806320C1] >>> [Service-Name] >>> >>> 17:08:08.488690 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff, ethertype >>> PPPoE D (0x8863), length 32: PPPoE > PADI [Host-Uniq 0x40D063C1] >>> [Service-Name] >>> >>> 17:08:08.488696 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff, >>> ethertype PPPoE D (0x8863), length 32: PPPoE PADI [Host-Uniq >>> 0x00C063C1] [Service-Name] >>> >>> 17:08:08.488702 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff, ethertype >>> PPPoE D (0x8863), length 32: PPPoE PADI [Host-Uniq 0x40CE63C1] >>> [Service-Name] >>> >>> 17:08:08.488708 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff, ethertype >>> PPPoE D (0x8863), length 32: PPPoE PADI [Host-Uniq 0x80EC45C1] >>> [Service-Name] >> >> hey these are 4 more new pppoe sessions making 8 >> Are they just accumlating from each try? maybe we are not cleaning >> up? >> >>> 17:08:08.552036 00:90:1a:a0:15:b7 > 00:00:24:c2:45:74, ethertype >>> PPPoE D (0x8863), length 66: PPPoE > PADO [AC-Name "DSSX43-erx"] >>> [Host-Uniq 0x806320C1] [Service-Name] [AC-Cookie >>> "..7\t.K.,.!y.y.E"] >>> >>> 17:08:08.557191 00:90:1a:a0:15:b7 > 00:00:24:c2:45:74, ethertype >>> PPPoE D (0x8863), length 66: PPPoE PADO [AC-Name "DSSX43-erx"] >>> [Host-Uniq 0x40D063C1] [Service-Name] [AC-Cookie "..7\t.K.,.!y.y.E"] >>> >>> 17:08:08.572971 00:90:1a:a0:15:b7 > 00:00:24:c2:45:74, ethertype >>> PPPoE D (0x8863), length 66: PPPoE PADO [AC-Name "DSSX43-erx"] >>> [Host-Uniq 0x00C063C1] [Service-Name] [AC-Cookie "..7\t.K.,.!y.y.E"] >>> >>> 17:08:08.577148 00:90:1a:a0:15:b7 > 00:00:24:c2:45:74, ethertype >>> PPPoE D (0x8863), length 66: PPPoE PADO [AC-Name "DSSX43-erx"] >>> [Host-Uniq 0x40CE63C1] [Service-Name] [AC-Cookie "..7\t.K.,.!y.y.E"] >>> >>> 17:08:08.581343 00:90:1a:a0:15:b7 > 00:00:24:c2:45:74, ethertype >>> PPPoE D (0x8863), length 66: PPPoE PADO [AC-Name "DSSX43-erx"] >>> [Host-Uniq 0x80EC45C1] [Service-Name] [AC-Cookie "..7\t.K.,.!y.y.E"] >>> >> router has responded to 4 more.. they should all be reporting back to >> whatever started them. >> >> >> some retries.. >> I haven't looked at all the host-unique fields to see which are new >> and which are retries.. whatever it is.. we have way too many >> sessions. >> >> >>> 17:10:08.577488 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff, ethertype >>> PPPoE D (0x8863), length 32: PPPoE PADI [Host-Uniq 0x80D063C1] >>> [Service-Name] >>> >>> 17:10:08.577499 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff, >>> ethertype PPPoE D (0x8863), length 32: PPPoE PADI [Host-Uniq >>> 0x80C463C1] [Service-Name] >>> >>> 17:10:08.577505 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff, ethertype >>> PPPoE D (0x8863), length 32: PPPoE PADI [Host-Uniq 0xC0CC63C1] >>> [Service-Name] >>> >>> 17:10:08.577511 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff, ethertype >>> PPPoE D (0x8863), length 32: PPPoE > PADI [Host-Uniq 0x00E745C1] >>> [Service-Name] >>> >>> 17:10:08.577517 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff, >>> ethertype PPPoE D (0x8863), length 32: PPPoE PADI [Host-Uniq >>> 0x40EC45C1] [Service-Name] >>> >>> 17:10:08.631297 00:90:1a:a0:15:b7 > 00:00:24:c2:45:74, ethertype >>> PPPoE D (0x8863), length 66: PPPoE PADO [AC-Name "DSSX43-erx"] >>> [Host-Uniq 0x80D063C1] [Service-Name] [AC-Cookie "..7\t.K.,.!y.y.E"] >>> >>> 17:10:08.637181 00:90:1a:a0:15:b7 > 00:00:24:c2:45:74, ethertype >>> PPPoE D (0x8863), length 66: PPPoE PADO [AC-Name "DSSX43-erx"] >>> [Host-Uniq 0x80C463C1] [Service-Name] [AC-Cookie "..7\t.K.,.!y.y.E"] >>> >>> 17:10:08.646060 00:90:1a:a0:15:b7 > 00:00:24:c2:45:74, ethertype >>> PPPoE D (0x8863), length 66: PPPoE PADO [AC-Name "DSSX43-erx"] >>> [Host-Uniq 0xC0CC63C1] [Service-Name] [AC-Cookie "..7\t.K.,.!y.y.E"] >>> >>> 17:10:08.649262 00:90:1a:a0:15:b7 > 00:00:24:c2:45:74, ethertype >>> PPPoE D (0x8863), length 66: PPPoE PADO [AC-Name "DSSX43-erx"] >>> [Host-Uniq 0x00E745C1] [Service-Name] [AC-Cookie "..7\t.K.,.!y.y.E"] >>> >>> 17:10:08.660599 00:90:1a:a0:15:b7 > 00:00:24:c2:45:74, ethertype >>> PPPoE D (0x8863), length 66: PPPoE PADO [AC-Name "DSSX43-erx"] >>> [Host-Uniq 0x40EC45C1] [Service-Name] [AC-Cookie "..7\t.K.,.!y.y.E"] >>> >>> ----------- manually restarting ppp(1), then: >>> ------------------------ >>> >>> 17:10:47.306928 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff, ethertype >>> PPPoE D (0x8863), length 32: PPPoE PADI [Host-Uniq 0x40C663C1] >>> [Service-Name] >>> >>> 17:10:47.306939 00:00:24:c2:45:74 > ff:ff:ff:ff:ff:ff, ethertype >>> PPPoE D (0x8863), length 32: PPPoE PADI [Host-Uniq 0xC06220C1] >>> [Service-Name] >>> >> we still have 2 sessions instead of 1, but there is less confusion >> so things sort themselves out. >> >> Hey, notice that the host-unique fields are in reverse order... >> the first probably shoudl be: C1 63 C6 40 but instead is: 40 C6 63 >> C1 (It's a based on a kernel memory address and can't start with 40) >> this is not a problem, it just needs to eb unique, >> just intersting that it's getting put out without a htonl() >> >>> 17:10:47.359585 00:90:1a:a0:15:b7 > 00:00:24:c2:45:74, >>> ethertype PPPoE D (0x8863), length 66: PPPoE PADO [AC-Name >>> "DSSX43-erx"] [Host-Uniq 0x40C663C1] [Service-Name] [AC-Cookie >>> "..7\t.K.,.!y.y.E"] >>> >>> 17:10:47.365720 00:90:1a:a0:15:b7 > 00:00:24:c2:45:74, ethertype >>> PPPoE D (0x8863), length 66: PPPoE PADO [AC-Name "DSSX43-erx"] >>> [Host-Uniq 0xC06220C1] [Service-Name] [AC-Cookie "..7\t.K.,.!y.y.E"] >>> >> the router responds to both with PADO >> >>> 17:10:47.365911 00:00:24:c2:45:74 > 00:90:1a:a0:15:b7, ethertype >>> PPPoE D (0x8863), length 66: PPPoE PADR [Host-Uniq 0xC06220C1] >>> [AC-Cookie "..7\t.K.,.!y.y.E"] [AC-Name "DSSX43-erx"] [Service-Name] >>> >> We take one and run with it.. teh other is allowed to languish. >> >> I think MAV may be able to see more.. >> He will need your mpd config files too. >> (or are you using ppp(8)?) > > I'm using userland ppp(8). ok, so I'm including brian (mr ppp) in the CCs > >> either way, >> config files, and output of ngctl list >> >> will be interesting. > > "ngctl list" see above (right now, the connection is up; I'd need > to wait again until 17:05 UTC+1 until the next forced disconnect > and hope to hit this condition again for a "ngctl list" with more > sessions. > > /etc/ppp/ppp.conf: > > default: > set log Phase Chat LCP IPCP CCP tun command > ident user-ppp VERSION (built COMPILATIONDATE) > > sling2: > set device PPPoE:sis0 > #set MTU 1492 > #set MRU 1492 > set MTU 1460 > set MRU 1460 > set dial > set crtscts off > set speed sync > #accept lqr > #enable lqr > #enable lqr echo > disable lqr > set echoperiod 30 > enable echo > disable deflate > disable pred1 > disable vjcomp > disable acfcomp > disable protocomp > set log Phase LCP IPCP CCP Warning Error Alert > set ifaddr 10.0.0.1/0 10.0.0.2/0 0.0.0.0 0.0.0.0 > set login > set authname XXXXXXXXXXXXXXXXXXXXX > set authkey XXXXXXXXX > set timeout 0 > add default HISADDR > set server /var/run/internet "" 0177 > nat enable yes > > /etc/rc.conf: > ppp_enable="YES" > ppp_profile="sling2" > ppp_user="root" > ppp_mode="ddial" > ppp_nat="YES" > > (The MTU/MRU of 1460 instead of 1492 is required by ISP) > > As said, this is exactly the configuration that worked up until > Sep 25th 2007 without any problems. > >>> 17:10:47.463059 00:90:1a:a0:15:b7 > 00:00:24:c2:45:74, ethertype >>> PPPoE D (0x8863), length 66: PPPoE PADS [ses 0x1906] [Service-Name] >>> [Host-Uniq 0xC06220C1] [AC-Name "DSSX43-erx"] [AC-Cookie >>> "..7\t.K.,.!y.y.E"] >>> >>> 17:10:49.338956 00:00:24:c2:45:74 > 00:90:1a:a0:15:b7, ethertype >>> PPPoE S (0x8864), length 36: PPPoE [ses 0x1906] LCP (0xc021), >>> length 16: LCP, Conf-Request (0x01), id 1, length 16 >>> >>> 17:10:49.570907 00:90:1a:a0:15:b7 > 00:00:24:c2:45:74, ethertype >>> PPPoE S (0x8864), length 64: PPPoE [ses 0x1906] LCP (0xc021), >>> length 20: LCP, Conf-Request (0x01), id 175, length 20 >>> >>> 17:10:49.571646 00:90:1a:a0:15:b7 > 00:00:24:c2:45:74, ethertype >>> PPPoE S (0x8864), length 64: PPPoE [ses 0x1906] LCP (0xc021), >>> length 16: LCP, Conf-Ack (0x02), id 1, length 16 >>> >>> 17:10:49.584762 00:00:24:c2:45:74 > 00:90:1a:a0:15:b7, ethertype >>> PPPoE S (0x8864), length 40: PPPoE [ses 0x1906] LCP (0xc021), >>> length 20: LCP, Conf-Ack (0x02), id 175, length 20 >>> >>> ------- More LCP packets from ses 0x1906 then regular traffic >>> -------- >>> >>> HTH... >>> >>> Thanks, >>> -cpghost. >
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?475C654A.2070607>