From owner-freebsd-bugs@FreeBSD.ORG Fri Apr 25 22:40:01 2014
Return-Path:
Delivered-To: freebsd-bugs@smarthost.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
    (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
    (No client certificate requested)
    by hub.freebsd.org (Postfix) with ESMTPS id B127DFED
    for ; Fri, 25 Apr 2014 22:40:01 +0000 (UTC)
Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87])
    (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
    (Client did not present a certificate)
    by mx1.freebsd.org (Postfix) with ESMTPS id 8DBC511AC
    for ; Fri, 25 Apr 2014 22:40:01 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
    by freefall.freebsd.org (8.14.8/8.14.8) with ESMTP id s3PMe1Rt034345
    for ; Fri, 25 Apr 2014 22:40:01 GMT
    (envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
    by freefall.freebsd.org (8.14.8/8.14.8/Submit) id s3PMe1Gh034344;
    Fri, 25 Apr 2014 22:40:01 GMT
    (envelope-from gnats)
Resent-Date: Fri, 25 Apr 2014 22:40:01 GMT
Resent-Message-Id: <201404252240.s3PMe1Gh034344@freefall.freebsd.org>
Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer)
Resent-To: freebsd-bugs@FreeBSD.org
Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, Alan Somers
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
    (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
    (No client certificate requested)
    by hub.freebsd.org (Postfix) with ESMTPS id 50DA8F6A
    for ; Fri, 25 Apr 2014 22:33:57 +0000 (UTC)
Received: from cgiserv.freebsd.org (cgiserv.freebsd.org [IPv6:2001:1900:2254:206a::50:4])
    (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
    (Client did not present a certificate)
    by mx1.freebsd.org (Postfix) with ESMTPS id 3D88F116A
    for ; Fri, 25 Apr 2014 22:33:57 +0000 (UTC)
Received: from cgiserv.freebsd.org ([127.0.1.6])
    by cgiserv.freebsd.org (8.14.8/8.14.8) with ESMTP id s3PMXv0j083843
    for ; Fri, 25 Apr 2014 22:33:57 GMT
    (envelope-from nobody@cgiserv.freebsd.org)
Received: (from nobody@localhost)
    by cgiserv.freebsd.org (8.14.8/8.14.8/Submit) id s3PMXvVM083834;
    Fri, 25 Apr 2014 22:33:57 GMT
    (envelope-from nobody)
Message-Id: <201404252233.s3PMXvVM083834@cgiserv.freebsd.org>
Date: Fri, 25 Apr 2014 22:33:57 GMT
From: Alan Somers
To: freebsd-gnats-submit@FreeBSD.org
X-Send-Pr-Version: www-3.1
Subject: kern/189003: Page fault in lacp_req() while the lagg is being destroyed
X-BeenThere: freebsd-bugs@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Bug reports
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 25 Apr 2014 22:40:01 -0000

>Number:         189003
>Category:       kern
>Synopsis:       Page fault in lacp_req() while the lagg is being destroyed
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Apr 25 22:40:01 UTC 2014
>Closed-Date:
>Last-Modified:
>Originator:     Alan Somers
>Release:        11.0 CURRENT
>Organization:
Spectra Logic
>Environment:
FreeBSD alans-fbsd-head 11.0-CURRENT FreeBSD 11.0-CURRENT #53 r264920M: Fri Apr 25 13:52:21 MDT 2014 alans@ns1.eng.sldomain.com:/vmpool/obj/usr/home/alans/freebsd/head/sys/GENERIC amd64
>Description:
If you do an "ifconfig -am" in one thread while doing an "ifconfig lagg0
destroy" in another thread, at least two panics may result.  One is in
lacp_req(), caused by NULL == lsc.

What happens is that the "ifconfig lagg0 destroy" thread does this:
1) lagg_clone_destroy() acquires LAGG_WLOCK(sc)
2) lagg_clone_destroy() calls lagg_lacp_detach, which calls lacp_detach,
   which sets sc->sc_psc = NULL
3) lagg_clone_destroy() calls LAGG_WUNLOCK(sc)

Then the "ifconfig status" thread does this:
1) calls lagg_ioctl(SIOCGLAGG)
2) lagg_ioctl() acquires LAGG_RLOCK(sc, &tracker)
3) lagg_ioctl() calls sc->sc_req, which dereferences to lacp_req
4) lacp_req does *lsc = LACP_SOFTC(sc), which returns NULL
5) lacp_req dereferences lsc, and panics

db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe009781d380
kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe009781d430
witness_warn() at witness_warn+0x4b5/frame 0xfffffe009781d4f0
trap_pfault() at trap_pfault+0x59/frame 0xfffffe009781d590
trap() at trap+0x4d5/frame 0xfffffe009781d7a0
calltrap() at calltrap+0x8/frame 0xfffffe009781d7a0
--- trap 0xc, rip = 0xffffffff81eb9b44, rsp = 0xfffffe009781d860, rbp = 0xfffffe009781d890 ---
lacp_req() at lacp_req+0x14/frame 0xfffffe009781d890
lagg_ioctl() at lagg_ioctl+0x270/frame 0xfffffe009781d970
ifioctl() at ifioctl+0xbf7/frame 0xfffffe009781da30
kern_ioctl() at kern_ioctl+0x22b/frame 0xfffffe009781da90
sys_ioctl() at sys_ioctl+0x13c/frame 0xfffffe009781dae0
amd64_syscall() at amd64_syscall+0x25a/frame 0xfffffe009781dbf0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe009781dbf0
--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x800fa045a, rsp = 0x7fffffffd808, rbp = 0x7fffffffe290 ---
>How-To-Repeat:
First, back out change 253687.  That will increase the likelihood of
hitting this panic.  Run this script:

#! /usr/local/bin/bash
ifconfig tap0 create
sleep .2
ifconfig tap1 create
sleep .2
ifconfig tap2 create
sleep .2
ifconfig tap0 up
sleep .2
ifconfig tap1 up
sleep .2
ifconfig tap2 up
sleep .2
while true; do
    echo "About to create"
    ifconfig lagg0 create
    #sleep 0.2
    echo "About to up"
    ifconfig lagg0 up laggproto lacp laggport tap0 laggport tap1 laggport tap2 192.0.0.2/24
    sleep 0.2
    echo "About to destroy"
    ifconfig lagg0 destroy
    sleep 0.2
done &
while true; do
    ifconfig -am > /dev/null
done
>Fix:
The purpose of lacp_req is to return LACP property information to userland
when you do "ifconfig lagg0".  So I think that it would be OK if it returned
a block full of zeros.  This would only happen while the interface is being
destroyed, and userland should be able to deal with that.  So my proposed
fix (attached) is to simply check for NULL == lsc and return early.

Patch attached with submission follows:

Index: sys/net/ieee8023ad_lacp.c
===================================================================
--- sys/net/ieee8023ad_lacp.c	(revision 264920)
+++ sys/net/ieee8023ad_lacp.c	(working copy)
@@ -590,10 +590,20 @@
 {
 	struct lacp_opreq *req = (struct lacp_opreq *)data;
 	struct lacp_softc *lsc = LACP_SOFTC(sc);
-	struct lacp_aggregator *la = lsc->lsc_active_aggregator;
+	struct lacp_aggregator *la;
 
+	bzero(req, sizeof(struct lacp_opreq));
+
+	/*
+	 * If the LACP softc is NULL, return with the opreq structure full of
+	 * zeros.  It is normal for the softc to be NULL while the lagg is
+	 * being destroyed.
+	 */
+	if (NULL == lsc)
+		return;
+
+	la = lsc->lsc_active_aggregator;
 	LACP_LOCK(lsc);
-	bzero(req, sizeof(struct lacp_opreq));
 	if (la != NULL) {
 		req->actor_prio = ntohs(la->la_actor.lip_systemid.lsi_prio);
 		memcpy(&req->actor_mac, &la->la_actor.lip_systemid.lsi_mac,


>Release-Note:
>Audit-Trail:
>Unformatted:
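
The early-return approach proposed under >Fix can be illustrated outside the
kernel with a small user-space model.  This is only a sketch under stated
assumptions: every model_* name below is a hypothetical stand-in invented for
illustration, not a definition from sys/net/ieee8023ad_lacp.c, and kernel
locking is deliberately left out.  It shows the ordering the patch relies on:
zero the reply buffer first, then return if the protocol softc pointer has
already been cleared by detach.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct lacp_opreq. */
struct model_opreq {
	uint16_t actor_prio;
	uint16_t partner_prio;
};

/* Hypothetical stand-in for the LACP protocol softc. */
struct model_lacp_softc {
	uint16_t active_prio;
};

/* Hypothetical stand-in for the lagg softc; sc_psc is NULL after detach. */
struct model_lagg_softc {
	struct model_lacp_softc *sc_psc;
};

/* Models the detach step from the report: the pointer is cleared. */
static void
model_lacp_detach(struct model_lagg_softc *sc)
{
	sc->sc_psc = NULL;
}

/*
 * Models the patched request handler: the reply is zeroed before anything
 * else, and a NULL protocol softc results in an all-zero reply instead of
 * a NULL dereference.
 */
static void
model_lacp_req(struct model_lagg_softc *sc, void *data)
{
	struct model_opreq *req = data;
	struct model_lacp_softc *lsc = sc->sc_psc;

	memset(req, 0, sizeof(*req));
	if (lsc == NULL)
		return;

	req->actor_prio = lsc->active_prio;
}
```

Calling model_lacp_req() before model_lacp_detach() fills in actor_prio from
the softc; calling it afterwards leaves the reply all zeros, which is the
behavior the report argues userland should be able to handle.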