From: "Lewis, Fred"
To: "freebsd-net@freebsd.org"
CC: "Sundararajan, Lakshmi", "Kothavade, Pushkar", "Pokala, Ravi", "Lewis, Fred"
Subject: Kernel panic from lagg_ioctl and lagg_port_ioctl
Date: Tue, 12 Jan 2016 21:13:23 +0000

Hi FreeBSD Networking folks.

We are seeing kernel panics on stable/10 that are caused by lagg_ioctl()
and lagg_port_ioctl().

The panic occurs when moving from an lacp configuration to, say, a failover
configuration.

Please double-check me, but what appears to be happening is that the software
context is not getting cleaned up properly on a mode change, and lacp_portreq()
is getting called when the lagg is set to failover mode. In particular,
sc->sc_portreq is left pointing to lacp_portreq when the mode is no longer lacp.

In earlier versions of lagg_ioctl() (e.g. stable/10/r171247) all of the callout
vectors are set to NULL, which I think will prevent the problem. Similar NULLing
code is also in stable/7. I didn't check other releases.

	case SIOCSLAGG:
		if (sc->sc_proto != LAGG_PROTO_NONE) {
			LAGG_WLOCK(sc);
			error = sc->sc_detach(sc);
			/* Reset protocol and pointers */
			sc->sc_proto = LAGG_PROTO_NONE;
			sc->sc_detach = NULL;
			sc->sc_start = NULL;
			sc->sc_input = NULL;
			sc->sc_port_create = NULL;
			sc->sc_port_destroy = NULL;
			sc->sc_linkstate = NULL;
			sc->sc_init = NULL;
			sc->sc_stop = NULL;
			sc->sc_lladdr = NULL;
			sc->sc_req = NULL;
			sc->sc_portreq = NULL;
		}

Looks like the above code was taken out via r287723.

Evidently this has been made moot in HOL via r272170 and r272178 (maybe others).

Here is one of the backtrace snippets:

panic() at panic+0x155/frame 0xfffffe201e3df2e0
trap_fatal() at trap_fatal+0x38f/frame 0xfffffe201e3df340
trap_pfault() at trap_pfault+0x308/frame 0xfffffe201e3df3e0
trap() at trap+0x47a/frame 0xfffffe201e3df5f0
calltrap() at calltrap+0x8/frame 0xfffffe201e3df5f0
--- trap 0xc, rip = 0xffffffff804b9811, rsp = 0xfffffe201e3df6b0, rbp = 0xfffffe201e3df730 ---
__mtx_lock_sleep() at __mtx_lock_sleep+0x1a1/frame 0xfffffe201e3df730
__mtx_lock_flags() at __mtx_lock_flags+0x5a/frame 0xfffffe201e3df750
lacp_portreq() at lacp_portreq+0x2f/frame 0xfffffe201e3df780
lagg_port2req() at lagg_port2req+0x62/frame 0xfffffe201e3df7b0
lagg_port_ioctl() at lagg_port_ioctl+0x14b/frame 0xfffffe201e3df820
ifioctl() at ifioctl+0x162b/frame 0xfffffe201e3df8e0
kern_ioctl() at kern_ioctl+0x255/frame 0xfffffe201e3df950
sys_ioctl() at sys_ioctl+0x13c/frame 0xfffffe201e3df9a0

Is there any chance of getting this fixed in stable/10 before code freeze?

We have tested a set of diffs that fix the issue and will submit them for
review shortly.

Thanks,
-Fred
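
For readers who want to see the suspected failure mode in isolation, here is a minimal user-space sketch of it. It is not the kernel code and not the diffs mentioned above. Every name in it (struct softc, lacp_state, attach_lacp, attach_failover, detach, port2req) is hypothetical and only loosely modelled on the lagg softc. It assumes that the protocol detach frees the protocol-private state, so a hook left installed after a mode change would dereference freed memory. Resetting the hook in detach, plus a NULL check at the call site, avoids that.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical user-space model; none of these names are the kernel's. */
enum proto { PROTO_NONE, PROTO_LACP, PROTO_FAILOVER };

struct softc {
	enum proto	  proto;
	void		 *psc;				/* protocol-private state */
	int		(*portreq)(struct softc *);	/* optional per-protocol hook */
};

struct lacp_state {
	int	key;
};

/* Only safe to call while sc->psc points at a live lacp_state. */
static int
lacp_portreq_model(struct softc *sc)
{
	struct lacp_state *ls = sc->psc;

	return (ls->key);
}

static int
attach_lacp(struct softc *sc)
{
	struct lacp_state *ls;

	assert(sc->psc == NULL);	/* previous protocol must be detached */
	ls = malloc(sizeof(*ls));
	if (ls == NULL)
		return (-1);
	ls->key = 42;
	sc->psc = ls;
	sc->portreq = lacp_portreq_model;
	sc->proto = PROTO_LACP;
	return (0);
}

/* In this model, failover installs no portreq hook at all. */
static void
attach_failover(struct softc *sc)
{
	sc->proto = PROTO_FAILOVER;
}

static void
detach(struct softc *sc)
{
	free(sc->psc);
	sc->psc = NULL;
	sc->proto = PROTO_NONE;
	/* Reset the hook so the next protocol cannot inherit a stale one. */
	sc->portreq = NULL;
}

/* Call site: invoke the hook only if the current protocol installed one. */
static int
port2req(struct softc *sc)
{
	if (sc->portreq != NULL)
		return (sc->portreq(sc));
	return (0);
}
```

In this model, attaching lacp, detaching, and then attaching failover leaves portreq NULL, so port2req() returns 0 and never touches the freed state. If the reset in detach() is removed, the same sequence calls lacp_portreq_model() on a NULL psc, which is the same shape of fault as the one in the backtrace.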