From: Borja Marcos
Subject: Re: Fix Emulex "oce" driver in CURRENT
Date: Mon, 7 Jul 2014 13:57:07 +0200
To: Luigi Rizzo
Cc: "freebsd-net@freebsd.org", freebsd-current, Stable Stable
Message-Id: <6C8CF68D-68E2-4168-AA0A-6A629D363371@sarenet.es>
References: <453BA9EC-BB63-4258-8141-847F41315E1E@sarenet.es>

On Jul 7, 2014, at 1:23 PM, Luigi Rizzo wrote:

> On Mon, Jul 7, 2014 at 1:03 PM, Borja Marcos wrote:
>
> we'll try to investigate, can you tell us more about the environment you use ?
> (FreeBSD version, card model (PCI id perhaps), iperf3 invocation line,
> interface configuration etc.)
>
> The main differences between 10.0.747.0 and the code in head (after
> our fix) is the use of drbr_enqueue/dequeue versus the peek/putback
> in the transmit routine.
>
> Both drivers still have issues when the link flaps because the
> transmit queue is not cleaned up properly (unlike what happens in the
> linux driver and all FreeBSD drivers for different hardware), so it
> might well be that you are seeing some side effect of that or other
> problem which manifests itself differently depending on the environment.
>
> 'instant panic' by itself does not tell us anything about what could
> be the problem you experience (and we do not see it with either driver).

The environment details are here:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=183391

The way I produce an instant panic is:

1) Connect to another machine (cross connect cable)
2) iperf3 -s on the other machine
   (The other machine is different, it has an "ix" card)
3) iperf3 -t 30 -P 4 -c 10.0.0.1 -N

In less than 30 seconds, panic.

mierda dumped core - see /var/crash/vmcore.0

Mon Jul 7 13:06:44 CEST 2014

FreeBSD mierda 10.0-STABLE FreeBSD 10.0-STABLE #2: Mon Jul 7 11:41:45 CEST 2014 root@mierda:/usr/obj/usr/src/sys/GENERIC amd64

panic: sbsndptr: sockbuf 0xfffff800a70489b0 and mbuf 0xfffff801a3326e00 clashing

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
panic: sbsndptr: sockbuf 0xfffff800a70489b0 and mbuf 0xfffff801a3326e00 clashing
cpuid = 12
KDB: stack backtrace:
#0 0xffffffff8092a470 at kdb_backtrace+0x60
#1 0xffffffff808ef9c5 at panic+0x155
#2 0xffffffff80962710 at sbdroprecord_locked+0
#3 0xffffffff80a8ba8c at tcp_output+0xdbc
#4 0xffffffff80a8987f at tcp_do_segment+0x30ff
#5 0xffffffff80a85b34 at tcp_input+0xd04
#6 0xffffffff80a1af57 at ip_input+0x97
#7 0xffffffff809ba512 at netisr_dispatch_src+0x62
#8 0xffffffff809b1ae6 at ether_demux+0x126
#9 0xffffffff809b278e at ether_nh_input+0x35e
#10 0xffffffff809ba512 at netisr_dispatch_src+0x62
#11 0xffffffff81c19ab9 at oce_rx+0x3c9
#12 0xffffffff81c19536 at oce_rq_handler+0xb6
#13 0xffffffff81c1bb1c at oce_intr+0xdc
#14 0xffffffff80938b35 at taskqueue_run_locked+0xe5
#15 0xffffffff809395c8 at taskqueue_thread_loop+0xa8
#16 0xffffffff808c057a at fork_exit+0x9a
#17 0xffffffff80ccb51e at fork_trampoline+0xe
Uptime: 51m20s

Borja.