Date: Sun, 01 May 2011 19:00:08 +0200
From: Schoch Christian <e0326715@student.tuwien.ac.at>
To: Michael Tüxen <Michael.Tuexen@lurchi.franken.de>
Cc: freebsd-net@freebsd.org
Subject: Re: [SCTP] ICMP unreachable message reenables data transmit
Message-ID: <20110501190008.179970yneogqya3c@webmail.tuwien.ac.at>
In-Reply-To: <13E5D4BB-5B2C-42B3-A43E-0F260317DE6B@lurchi.franken.de>
References: <20110430091148.31393q3py4j4bg38@webmail.tuwien.ac.at>
 <FD1FC82D-29D1-4186-A0AC-504653C28D85@lurchi.franken.de>
 <20110430121518.25761cpmtrp0jtpy@webmail.tuwien.ac.at>
 <F4802366-A9B4-4515-8E6D-E5C80C06408B@lurchi.franken.de>
 <20110501131048.22413db5jyxywss8@webmail.tuwien.ac.at>
 <13E5D4BB-5B2C-42B3-A43E-0F260317DE6B@lurchi.franken.de>

Quoting Michael Tüxen:

> On May 1, 2011, at 1:10 PM, Schoch Christian wrote:
>
>>> On Apr 30, 2011, at 12:15 PM, Schoch Christian wrote:
>>>>
>>>>> On Apr 30, 2011, at 9:11 AM, Schoch Christian wrote:
>>>>>
>>>>>> During a measurement with CMT-SCTP and PF I found that
>>>>>> sometimes an ICMP Destination unreachable message triggers a
>>>>>> message transmission on an inactive data path that has been
>>>>>> primary before.
>>>>>>
>>>>>> It looks as if the ICMP message is resetting the inactive state
>>>>>> back to active without resetting the RTO.
>>>>>>
>>>>>> This behavior is triggered by a returning heartbeat message
>>>>>> when no ICMP unreachable caused by data was sent shortly before.
>>>>>>
>>>>>> The test systems are two multi-homed hosts with FreeBSD 8.1 and a
>>>>>> WANem host in between.
>>>>>>
>>>>>> A wireshark log can be provided on demand (quite large).
>>>>> Hi Christian,
>>>>>
>>>>> any chance to upgrade the FreeBSD machines to head or to use newer
>>>>> SCTP sources, which I could provide? It would require a recompilation
>>>>> of the kernel...
>>>>
>>>> It is possible, but the results could not be provided until next week
>>>> if a reboot is necessary.
>>>> I can use any sources you could provide me since nothing else is
>>>> done on these systems.
>>> OK, but maybe I can try to understand what is going on.
>>>
>>> How many paths do you have? One is inactive, but was primary, so it
>>> is confirmed. On another one, you get an ICMP (which one? Port unreachable,
>>> host unreachable, ...). Do you have more than two paths?
>>
>> Setup looks like this:
>>
>>         --------       ----cut---
>> Host A           WANem            Host B
>>         --------       ----------
>>
>> Transfer is running on both paths from A to B until the primary link
>> is cut between WANem and the receiver and the whole transfer
>> switches to the second path. The ICMP message (Host not reachable
>> with a Heartbeat as attachment) is received on the primary
>> interface from the WANem host.
> OK, understood.
>>
>> As I tested this morning, the primary path is switching to
>> unreachable due to the ICMP message, but it should have been in this
>> state well before by exceeding path.max_retrans.
>> So this ICMP message does two things:
>> - Sets the primary path to unreachable
>> - Triggers something to retry data transfer on the primary path.
> After looking at the tracefile, I somewhat agree.
> * Do you see something like
>   ICMP (thresh ??/??) takes interface ?? down
>   on the console? This would be printed if the ICMP takes the
>   path to unreachable. (It should also be in /var/log/messages)
> * If the path is already unreachable, nothing should happen
>   in response to the ICMP message.
>

    08:04:18 kernel: ICMP (thresh 2/3) takes interface 0xc4e20510 down

Same timestamp as the faulty start in the tracefile.

> So the question is: Is the path unreachable before the ICMP message
> is received?

Given the time difference between the first retransmission and the ICMP
message, it should already be in the unavailable state. But it seems that
too many retransmissions occur and that it is the ICMP message that moves
the path to the unavailable state.

I kept an eye on the RTO of the primary path and found the following:

initial state:           rto.min = 100ms, RTO = 100ms
after cutting the link:  RTO rises to 200ms and 400ms as expected,
                         but not higher (rto.max=60000)

Another test with path_rxt_max = 1 worked as expected.
So I assume there is some problem with the retransmission counter when it
is larger than 1 (something like count = 1 instead of count >= 1).

> Is your application monitoring the SCTP notification?
> What about the above printout from the kernel?

Yes, the notifications are monitored and logged (sctp_menu) - the
notification for SCTP_PEER_ADDR_CHANGE comes right after the ICMP.

Best regards,
Christian

> Best regards
> Michael
>>
>>> The ICMP message would not reset the RTO, since you need an ACKed TSN
>>> or a HB-ACK to do that. Since it is inactive, it is missing these.
>>>
>>> Sending on an inactive path is OK, as soon as you enter the dormant
>>> state, which means all your paths are inactive.
>>>
>> Transfer is still running on second link which is active.
> That sounds good.
>>
>>> Are you using the PF support for CMT?
>>
>> Yes, but without NR-SACK and DAC.
> OK.
>>
>> I uploaded the pcap file to:
>> http://37116.vs.webtropia.com/cmt_2.pcap
> That was helpful!
>>
>> Best regards,
>> Christian
>>
>>>
>>> Best regards
>>> Michael
>>>>
>>>>>
>>>>> Are you using IPv4 or IPv6?
>>>>>
>>>>
>>>> IPv4
>>>>
>>>>
>>>>> Best regards
>>>>> Michael
>>>>>>
>>>>>> Regards,
>>>>>> Schoch Christian
>>>>>> _______________________________________________
>>>>>> freebsd-net@freebsd.org mailing list
>>>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>>>>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
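
For reference, here is a minimal sketch of the timer behaviour one would expect from a plain reading of RFC 4960 (section 6.3.3 for the RTO doubling on T3-rtx expiry, section 8.2 for the path error counter). It is not the FreeBSD kernel code and says nothing about where the actual problem lies. The function name simulate_timeouts and the Path.Max.Retrans value of 3 are assumptions for illustration only; rto.min = 100 ms and rto.max = 60000 ms are the values reported in the thread.

```python
def simulate_timeouts(rto_initial_ms, rto_max_ms, path_max_retrans, timeouts):
    """Model consecutive retransmission timeouts on a single path.

    Returns a list of tuples
        (timeout_number, rto_ms_that_expired, error_count, path_state).
    This follows a reading of RFC 4960, not any particular implementation.
    """
    rto = rto_initial_ms
    error_count = 0
    state = "active"
    history = []
    for n in range(1, timeouts + 1):
        expired = rto
        # Each expiry without an acknowledgement counts as one path error.
        error_count += 1
        # RFC 4960, 8.2: the path becomes inactive once the error counter
        # *exceeds* Path.Max.Retrans (strictly greater than).
        if error_count > path_max_retrans:
            state = "inactive"
        # RFC 4960, 6.3.3: double the RTO on expiry, capped at RTO.Max.
        rto = min(rto * 2, rto_max_ms)
        history.append((n, expired, error_count, state))
    return history


if __name__ == "__main__":
    # path_max_retrans=3 is an assumed value, not taken from the thread.
    for row in simulate_timeouts(100, 60000, 3, 6):
        print(row)
```

Under these assumptions the expiring RTO would keep doubling (100, 200, 400, 800, ...) until it reaches rto.max, and the path would turn inactive on the fourth consecutive timeout. The trace described above instead shows the RTO stopping at 400 ms, which is the discrepancy being discussed.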