From: Alexandr Matveev
Date: Wed, 05 Dec 2012 19:55:23 +0400
To: freebsd-hackers@freebsd.org
Subject: sleepq problem
Message-ID: <50BF6E6B.2070203@timon.net.nz>

Hello,

I'm writing a storage controller driver for 9.0-RELEASE-p4, and I'm using
sleepq at initialization to sleep until the controller has processed a
command:

    struct command {
        <...>
        uint8_t done;
    };

    void
    send_command_and_wait(struct command *cmd)
    {
        cmd->done = 0;
        send_command(cmd);
        for (;;) {
            sleepq_lock(&cmd->done);
            if (cmd->done)
                break;
            sleepq_add(&cmd->done, NULL, "wait for completion",
                SLEEPQ_SLEEP, 0);
            sleepq_wait(&cmd->done, 0);
        }
        sleepq_release(&cmd->done);
    }

The interrupt handler calls a special function when the command has been
processed:

    void
    command_finish(struct command *cmd)
    {
        sleepq_lock(&cmd->done);
        cmd->done = 1;
        sleepq_signal(&cmd->done, SLEEPQ_SLEEP, 0, 0);
        sleepq_release(&cmd->done);
    }

This code panics very often with the following messages:

    Sleeping thread (tid 100248, pid 1859) owns a non-sleepable lock
    sched_switch() at sched_switch+0xf1
    mi_switch() at mi_switch+0x170
    sleepq_wait() at sleepq_wait+0x44
    send_command_and_wait() at send_command_with_retry+0x77
    <...>
    panic: sleeping thread
    cpuid = 1
    KDB: stack backtrace:
    db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
    kdb_backtrace() at kdb_backtrace+0x37
    panic() at panic+0x187
    propagate_priority() at propagate_priority+0x161
    turnstile_wait() at turnstile_wait+0x1b8
    _mtx_lock_sleep() at _mtx_lock_sleep+0xb0
    _mtx_lock_flags() at _mtx_lock_flags+0x96
    softclock() at softclock+0x25e
    intr_event_execute_handlers() at intr_event_execute_handlers+0x66
    ithread_loop() at ithread_loop+0x96
    fork_exit() at fork_exit+0x11d
    fork_trampoline() at fork_trampoline+0xe
    --- trap 0, rip = 0, rsp = 0xffffff80002fad00, rbp = 0 ---

Here tid 100248 is my driver thread, which is sleeping and waiting for
command completion:

    db> show thread 100248
    Thread 100243 at 0xfffffe0146aa98c0:
     proc (pid 1859): 0xfffffe02a6815488
     name: kldload
     stack: 0xffffff8464bf2000-0xffffff8464bf5fff
     flags: 0x4 pflags: 0
     state: INHIBITED: {SLEEPING}
     wmesg: wait for completion wchan: 0xffffff8464c1e244
     priority: 127
     container lock: sleepq chain (0xffffffff81101af8)

But I can't understand what goes wrong. The sleepq chain lock is owned by
another thread:

    db> show lock 0xffffffff81101af8
     class: spin mutex
     name: sleepq chain
     flags: {SPIN, RECURSE}
     state: {OWNED}
     owner: 0xfffffe0008377000 (tid 100019, pid 12, "swi4: clock")

Unfortunately, I can't find any examples of using sleepq in drivers.
What am I missing or misunderstanding?

-- 
Alexandr Matveev