From: Attilio Rao
To: Andriy Gapon
Date: Sun, 6 Dec 2009 20:04:08 +0100
Message-ID: <3bbf2fe10912061104j53ef5be2yb1019699308b0473@mail.gmail.com>
In-Reply-To: <4B1BBEC4.7040906@icyb.net.ua>
References: <4B1B9600.4080709@icyb.net.ua> <4B1BBEC4.7040906@icyb.net.ua>
Cc: freebsd-current@freebsd.org
Subject: Re: process stuck in stat/../cache_lookup: ktorrent, zfs

2009/12/6 Andriy Gapon:
> on 06/12/2009 13:31 Andriy Gapon said the following:
>> System is recent 9-current, amd64.
>> I see that sometimes ktorrent gets stuck during heavy download (multiple files
>> in parallel, high speed).  It is completely unresponsive and not killable even
>> with SIGKILL.
> [snip]
>> #0  sched_switch (td=0xffffff012a6c5700, newtd=0xffffff0001533380,
>> flags=Variable "flags" is not available.
>> ) at /usr/src/sys/kern/sched_ule.c:1865
>> #1  0xffffffff80374baf in mi_switch (flags=260, newtd=0x0) at
>> /usr/src/sys/kern/kern_synch.c:449
>> #2  0xffffffff803a795b in sleepq_switch (wchan=Variable "wchan" is not available.
>> ) at /usr/src/sys/kern/subr_sleepqueue.c:509
>> #3  0xffffffff803a8645 in sleepq_wait (wchan=0xffffff0105b457f8, pri=80) at
>> /usr/src/sys/kern/subr_sleepqueue.c:588
>> #4  0xffffffff80351184 in __lockmgr_args (lk=0xffffff0105b457f8, flags=2097408,
>> ilk=0xffffff0105b45820, wmesg=Variable "wmesg" is not available.
>> ) at /usr/src/sys/kern/kern_lock.c:216
>
> So some more data:
> (kgdb) fr 4
>
> #4  0xffffffff80351184 in __lockmgr_args (lk=0xffffff0105b457f8, flags=2097408,
> ilk=0xffffff0105b45820, wmesg=Variable "wmesg" is not available.
> ) at /usr/src/sys/kern/kern_lock.c:216
> 216                     sleepq_wait(&lk->lock_object, pri);
> (kgdb) p *lk
> $8 = {lock_object = {lo_name = 0xffffffff80ad55b6 "zfs", lo_flags = 91947008,
> lo_data = 0, lo_witness = 0x0}, lk_lock = 3, lk_timo = 51, lk_pri = 80}
> (kgdb) p/x flags
> $9 = 0x200100
> (kgdb) p/x lk->lock_object.lo_flags
> $12 = 0x57b0000
>
> Apparently sleeplk is inlined into __lockmgr_args.
>
> So it looks like this is a LK_SHARED|LK_INTERLOCK lockmgr call which has not
> taken any easy path and ended up in sleepq_wait, but wakeup never comes for it,
> perhaps missed?

I think that a 'missed wakeup' is too hasty (and wrong) a conclusion.
The problem here is that the lock is held in shared mode
(lk->lk_lock = 3), so you would need to know what happened to the
owners once they got the lock.
With shared acquisitions, though, the only way you can do that is to
reproduce the problem with WITNESS on.
Once you have that data we can dig further.

Attilio

-- 
Peace can only be achieved by understanding - A. Einstein
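
The quoted kgdb session prints the lockmgr flags once in decimal (flags=2097408) and once in hex (0x200100), and reads them as LK_SHARED|LK_INTERLOCK. A minimal sketch of that decoding follows. The two bit values are assumptions taken to mirror sys/sys/lockmgr.h of that era and should be checked against your own tree; decode_flags and KNOWN_FLAGS are hypothetical names used only for illustration.

```python
# Assumed flag values (believed to match sys/sys/lockmgr.h around 9-CURRENT);
# verify against your own source tree before relying on them.
LK_INTERLOCK = 0x000100
LK_SHARED = 0x200000

KNOWN_FLAGS = {"LK_INTERLOCK": LK_INTERLOCK, "LK_SHARED": LK_SHARED}


def decode_flags(value):
    """Return (names of known bits that are set, bits not in the table)."""
    ordered = sorted(KNOWN_FLAGS.items(), key=lambda kv: kv[1], reverse=True)
    names = [name for name, bit in ordered if value & bit]
    leftover = value
    for bit in KNOWN_FLAGS.values():
        leftover &= ~bit
    return names, leftover


if __name__ == "__main__":
    flags = 2097408  # decimal value shown in frame #4 of the backtrace
    names, leftover = decode_flags(flags)
    print(hex(flags), "|".join(names), "unknown bits:", hex(leftover))
```

Any nonzero leftover would mean the table above is missing a flag, so the sketch reports it rather than silently dropping it.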