From owner-freebsd-current@FreeBSD.ORG Mon Jun 28 17:45:38 2010
Date: Mon, 28 Jun 2010 19:45:29 +0200
From: Attilio Rao
To: John Baldwin
Cc: freebsd-current@freebsd.org, pluknet, Anton Yuzhaninov
Subject: Re: panic in deadlkres
In-Reply-To: <201006281132.57541.jhb@freebsd.org>
References: <201006281132.57541.jhb@freebsd.org>
List-Id: Discussions about the use of FreeBSD-current
Content-Type: text/plain; charset=UTF-8

2010/6/28 John Baldwin:
> On Friday 25 June 2010 4:52:22 pm pluknet wrote:
>> On 25 June 2010 13:50, Anton Yuzhaninov wrote:
>> > I've got panic on 9-current from Jun 25 2010
>> >
>> > May be this is bug in deadlock resolver
>> >
>> > panic: blockable sleep lock (sleep mutex) process lock @
>> > /usr/src/sys/kern/kern_clock.c:203
>> >
>> > db> show alllocks
>> > Process 0 (kernel) thread 0xc4dcd270 (100047)
>> > shared sx allproc (allproc) r = 0 (0xc0885ebc) locked @
>> > /usr/src/sys/kern/kern_clock.c:193
>> >
>> > db> show lock 0xc4dcd270
>> >  class: spin mutex
>> >  name: D
>> >  flags: {SPIN, RECURSE}
>> >  state: {OWNED}
>> >
>> > (kgdb) bt
>> > #0  doadump () at pcpu.h:248
>> > #1  0xc05ae59f in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:416
>> > #2  0xc05ae825 in panic (fmt=Variable "fmt" is not available.
>> > ) at /usr/src/sys/kern/kern_shutdown.c:590
>> > #3  0xc048ff45 in db_panic (addr=Could not find the frame base for "db_panic".
>> > ) at /usr/src/sys/ddb/db_command.c:478
>> > #4  0xc0490533 in db_command (last_cmdp=0xc086ef1c, cmd_table=0x0, dopager=1) at /usr/src/sys/ddb/db_command.c:445
>> > #5  0xc0490662 in db_command_loop () at /usr/src/sys/ddb/db_command.c:498
>> > #6  0xc04923ef in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_main.c:229
>> > #7  0xc05dade6 in kdb_trap (type=3, code=0, tf=0xc4b31bd0) at /usr/src/sys/kern/subr_kdb.c:535
>> > #8  0xc078696b in trap (frame=0xc4b31bd0) at /usr/src/sys/i386/i386/trap.c:692
>> > #9  0xc076ca0b in calltrap () at /usr/src/sys/i386/i386/exception.s:165
>> > #10 0xc05daf30 in kdb_enter (why=0xc07ea02d "panic", msg=0xc07ea02d "panic") at cpufunc.h:71
>> > #11 0xc05ae806 in panic (fmt=0xc07efd94 "blockable sleep lock (%s) %s @ %s:%d") at /usr/src/sys/kern/kern_shutdown.c:573
>> > #12 0xc05ee30b in witness_checkorder (lock=0xc5148088, flags=9, file=0xc07e3b20 "/usr/src/sys/kern/kern_clock.c", line=203, interlock=0x0)
>> >    at /usr/src/sys/kern/subr_witness.c:1067
>> > #13 0xc05a093c in _mtx_lock_flags (m=0xc5148088, opts=0, file=0xc07e3b20 "/usr/src/sys/kern/kern_clock.c", line=203)
>> >    at /usr/src/sys/kern/kern_mutex.c:200
>> > #14 0xc05706a9 in deadlkres () at /usr/src/sys/kern/kern_clock.c:203
>> > #15 0xc0588721 in fork_exit (callout=0xc05705ea, arg=0x0, frame=0xc4b31d38) at /usr/src/sys/kern/kern_fork.c:843
>> > #16 0xc076ca80 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:270
>>
>> Hi!
>>
>> [throw in ideas (just ignore them if they're dumb, thinking badly atm).]
>>
>> AFAIK, that indicates that some thread already has
>> a spin mutex and then it tries to acquire a sleep mutex.
>>
>> Looks like kern/kern_clock.c v1.213 (SVN rev 206482)
>> has a regression in handling ticks wrap-up
>> w.r.t. it doesn't release a thread mutex, does it?
>
> This looks like a correct analysis to me.
>
>> From subr_witness.c:
>> 1062:                 * Since spin locks include a critical section, this check
>> 1063:                 * implicitly enforces a lock order of all sleep locks before
>> 1064:                 * all spin locks.
>> 1065:                 */
>> 1066:                if (td->td_critnest != 0 && !kdb_active)
>> 1067:                        panic("blockable sleep lock (%s) %s @ %s:%d",
>> 1068:                            class->lc_name, lock->lo_name, file, line);
>>
>> From kern_clock.c, v1.213 (in several places, while holding a thread lock):
>> +                                    /* Handle ticks wrap-up. */
>> +                                    if (ticks < td->td_blktick)
>> +                                            continue;
>>
>> Should not it be like the next:
>> +                                    /* Handle ticks wrap-up. */
>> +                                    if (ticks < td->td_blktick) {
>> +                                            thread_unlock(td);
>> +                                            continue;
>> +                                    }
>>
>> The precondition idea to reproduce it is to lock a subject thread
>> in some deadlkres callout, handle re-wrap condition, then try
>> to lock a process to which the thread belongs in (n+m)'th deadlkres
>> callout, or in different context.

Thanks, that may be fixed in r209577.

Attilio


-- 
Peace can only be achieved by understanding - A. Einstein
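
For readers following the thread, the pattern pluknet describes can be sketched in user space. This is only an illustration under stated assumptions, not the kernel code and not necessarily what r209577 changed: `fake_thread`, `fake_thread_lock()`, `fake_thread_unlock()`, `scan_buggy()` and `scan_fixed()` are hypothetical names, and the `critnest` counter merely stands in for the `td_critnest` value that witness checks before a sleep lock is taken.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a thread; not the kernel's struct thread. */
struct fake_thread {
	int blktick;	/* tick at which the thread blocked */
	int locked;	/* 1 while its simulated thread lock is held */
};

/* Simulated critical-section nesting: number of spin locks held. */
static int critnest;

static void
fake_thread_lock(struct fake_thread *td)
{
	td->locked = 1;
	critnest++;
}

static void
fake_thread_unlock(struct fake_thread *td)
{
	td->locked = 0;
	critnest--;
}

/*
 * Buggy scan: the wrap-up check skips the thread with `continue`
 * while its lock is still held. Returns the nesting count left over.
 */
int
scan_buggy(struct fake_thread *tds, size_t n, int ticks)
{
	critnest = 0;
	for (size_t i = 0; i < n; i++) {
		fake_thread_lock(&tds[i]);
		if (ticks < tds[i].blktick)
			continue;	/* lock leaked here */
		fake_thread_unlock(&tds[i]);
	}
	return (critnest);
}

/* Fixed scan: drop the lock before skipping the thread. */
int
scan_fixed(struct fake_thread *tds, size_t n, int ticks)
{
	critnest = 0;
	for (size_t i = 0; i < n; i++) {
		fake_thread_lock(&tds[i]);
		if (ticks < tds[i].blktick) {
			fake_thread_unlock(&tds[i]);
			continue;
		}
		fake_thread_unlock(&tds[i]);
	}
	return (critnest);
}
```

A nonzero return from `scan_buggy()` corresponds to the situation in the report: the next attempt to take a sleep mutex would find the nesting count nonzero and trip the "blockable sleep lock" check.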