From owner-freebsd-bugs@FreeBSD.ORG Fri Mar 2 15:30:07 2012 Return-Path: Delivered-To: freebsd-bugs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 881281065675 for ; Fri, 2 Mar 2012 15:30:07 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 5F5A88FC19 for ; Fri, 2 Mar 2012 15:30:07 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q22FU7D3026366 for ; Fri, 2 Mar 2012 15:30:07 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q22FU7kE026363; Fri, 2 Mar 2012 15:30:07 GMT (envelope-from gnats) Resent-Date: Fri, 2 Mar 2012 15:30:07 GMT Resent-Message-Id: <201203021530.q22FU7kE026363@freefall.freebsd.org> Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer) Resent-To: freebsd-bugs@FreeBSD.org Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, Vladislav Movchan Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8ADDE1065670 for ; Fri, 2 Mar 2012 15:26:30 +0000 (UTC) (envelope-from nobody@FreeBSD.org) Received: from red.freebsd.org (red.freebsd.org [IPv6:2001:4f8:fff6::22]) by mx1.freebsd.org (Postfix) with ESMTP id 6BD508FC08 for ; Fri, 2 Mar 2012 15:26:30 +0000 (UTC) Received: from red.freebsd.org (localhost [127.0.0.1]) by red.freebsd.org (8.14.4/8.14.4) with ESMTP id q22FQUKw087151 for ; Fri, 2 Mar 2012 15:26:30 GMT (envelope-from nobody@red.freebsd.org) Received: (from nobody@localhost) by red.freebsd.org (8.14.4/8.14.4/Submit) id q22FQUmv087150; Fri, 2 Mar 2012 15:26:30 GMT (envelope-from nobody) Message-Id: <201203021526.q22FQUmv087150@red.freebsd.org> Date: Fri, 2 Mar 2012 15:26:30 GMT From: Vladislav Movchan To: freebsd-gnats-submit@FreeBSD.org X-Send-Pr-Version: www-3.1 Cc: Subject: kern/165630: [ndis][panic][patch] IRQL_NOT_GREATER_THAN X-BeenThere: freebsd-bugs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Bug reports List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 02 Mar 2012 15:30:07 -0000 >Number: 165630 >Category: kern >Synopsis: [ndis][panic][patch] IRQL_NOT_GREATER_THAN >Confidential: no >Severity: serious >Priority: medium >Responsible: freebsd-bugs >State: open >Quarter: >Keywords: >Date-Required: >Class: sw-bug >Submitter-Id: current-users >Arrival-Date: Fri Mar 02 15:30:06 UTC 2012 >Closed-Date: >Last-Modified: >Originator: Vladislav Movchan >Release: FreeBSD 10.0-CURRENT >Organization: >Environment: FreeBSD starlight 10.0-CURRENT FreeBSD 10.0-CURRENT #0 r232379: Fri Mar 2 13:24:25 EET 2012 root@starlight:/usr/obj/usr/src/sys/Mephistopheles amd64 >Description: There is a race condition inside of sys/compat/ndis/subr_hal.c:KfRaiseIrql() function which could lead to "IRQL_NOT_GREATER_THAN" panic. Race could happen if after execution of this line: oldirql = KeGetCurrentIrql(); kthread got preempted and later appeared on another CPU core which has another "irq level". In this case "oldirql != DISPATCH_LEVEL" check will have no much sense as on current CPU core "irq level" could be different. As result we will skip sched_pin + mtx_lock lines when it was necessary execute them, or execute then when it necessary to skip. To fix this race we pin to current CPU core before obtaining "irq level", then we made "oldirql != DISPATCH_LEVEL" check and later we "unpin" (if we find that previous sched_pin() was not actually necessary). Panic message: panic: IRQL_NOT_GREATER_THAN cpuid = 3 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a kdb_backtrace() at kdb_backtrace+0x37 panic() at panic+0x1cd KfLowerIrql() at KfLowerIrql+0x62 ntoskrnl_workitem_thread() at ntoskrnl_workitem_thread+0xab fork_exit() at fork_exit+0x189 fork_trampoline() at fork_trampoline+0xe --- trap 0, rip = 0, rsp = 0xffffff80d111acf0, rbp = 0 --- KDB: enter: panic Backtrace: #0 doadump (textdump=-787372224) at /usr/src/sys/kern/kern_shutdown.c:268 268 if (textdump && textdump_pending) { (kgdb) #0 doadump (textdump=-787372224) at /usr/src/sys/kern/kern_shutdown.c:268 #1 0xffffffff802de36c in db_fncall (dummy1=Variable "dummy1" is not available. ) at /usr/src/sys/ddb/db_command.c:573 #2 0xffffffff802de6a1 in db_command (last_cmdp=0xffffffff80dcd560, cmd_table=Variable "cmd_table" is not available. ) at /usr/src/sys/ddb/db_command.c:449 #3 0xffffffff802de8f0 in db_command_loop () at /usr/src/sys/ddb/db_command.c:502 #4 0xffffffff802e0ad4 in db_trap (type=Variable "type" is not available. ) at /usr/src/sys/ddb/db_main.c:229 #5 0xffffffff806d3bb1 in kdb_trap (type=3, code=0, tf=0xffffff80d111a970) at /usr/src/sys/kern/subr_kdb.c:629 #6 0xffffffff80942f68 in trap (frame=0xffffff80d111a970) at /usr/src/sys/amd64/amd64/trap.c:591 #7 0xffffffff8092c1cf in calltrap () at /usr/src/sys/amd64/amd64/exception.S:228 #8 0xffffffff806d36ab in kdb_enter (why=0xffffffff80a3fb16 "panic", msg=0x80
) at cpufunc.h:63 #9 0xffffffff8069aea6 in panic (fmt=Variable "fmt" is not available. ) at /usr/src/sys/kern/kern_shutdown.c:633 #10 0xffffffff840d6ff2 in KfLowerIrql (oldirql=Variable "oldirql" is not available. ) at /usr/src/sys/modules/ndis/../../compat/ndis/subr_hal.c:417 #11 0xffffffff840dba5b in ntoskrnl_workitem_thread (arg=Variable "arg" is not available. ) at /usr/src/sys/modules/ndis/../../compat/ndis/subr_ntoskrnl.c:2766 #12 0xffffffff8066b039 in fork_exit ( callout=0xffffffff840db9b0 , arg=0xfffffe00206a9cc0, frame=0xffffff80d111ac40) at /usr/src/sys/kern/kern_fork.c:992 #13 0xffffffff8092c6fe in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:602 #14 0x0000000000000000 in ?? () #15 0x0000000000000000 in ?? () #16 0x0000000000000001 in ?? () #17 0x0000000000000000 in ?? () #18 0x0000000000000000 in ?? () #19 0x0000000000000000 in ?? () #20 0x0000000000000000 in ?? () #21 0x0000000000000000 in ?? () #22 0x0000000000000000 in ?? () #23 0x0000000000000000 in ?? () #24 0x0000000000000000 in ?? () #25 0x0000000000000000 in ?? () #26 0x0000000000000000 in ?? () #27 0x0000000000000000 in ?? () #28 0x0000000000000000 in ?? () #29 0x0000000000000000 in ?? () #30 0x0000000000000000 in ?? () #31 0x0000000000000000 in ?? () #32 0x0000000000000000 in ?? () #33 0x0000000000000000 in ?? () #34 0x0000000000000000 in ?? () #35 0x0000000000000000 in ?? () #36 0x0000000000000000 in ?? () #37 0x0000000000000000 in ?? () #38 0xffffffff8240f200 in tdq_cpu () #39 0x0000000000000000 in ?? () #40 0xffffffff8240f1c0 in affinity () #41 0xfffffe0020971900 in ?? () #42 0xffffff80d111a980 in ?? () #43 0xffffff80d111a928 in ?? () #44 0xfffffe00082bf900 in ?? () #45 0xffffffff806c64ac in sched_switch (td=0xfffffe00206a9cc0, newtd=0xffffffff840db9b0, flags=Variable "flags" is not available. ) at /usr/src/sys/kern/sched_ule.c:1890 Previous frame inner to this frame (corrupt stack?) (kgdb) >How-To-Repeat: There are no special conditions necessary to reproduce this panic - just load ndis interface with traffic and wait for the race. I've used benchmarks/iperf running 10 parallel threads ("-P 10" flag) with tcp window set to 1Mb ("-w 1m" flag). Machine with ndis interface was a "client" of iperf (so it was mostly sending). Adding bpf(4) consumers (trafshow or tcpdump) allows to trigger this race/panic faster. Using faster NIC (1Gbps NIC instead of wifi) allows to trigger this race/panic faster. >Fix: Attached patch fixed this problem for me. Patch attached with submission follows: Index: /usr/src/sys/compat/ndis/subr_hal.c =================================================================== --- /usr/src/sys/compat/ndis/subr_hal.c (revision 232379) +++ /usr/src/sys/compat/ndis/subr_hal.c (working copy) @@ -392,16 +392,17 @@ { uint8_t oldirql; + sched_pin(); oldirql = KeGetCurrentIrql(); /* I am so going to hell for this. */ if (oldirql > irql) panic("IRQL_NOT_LESS_THAN"); - if (oldirql != DISPATCH_LEVEL) { - sched_pin(); + if (oldirql != DISPATCH_LEVEL) mtx_lock(&disp_lock[curthread->td_oncpu]); - } + else + sched_unpin(); /*printf("RAISE IRQL: %d %d\n", irql, oldirql);*/ return (oldirql); >Release-Note: >Audit-Trail: >Unformatted: