From owner-freebsd-stable@FreeBSD.ORG Sat Oct 27 20:46:56 2012 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 403DF524; Sat, 27 Oct 2012 20:46:56 +0000 (UTC) (envelope-from asmrookie@gmail.com) Received: from mail-la0-f54.google.com (mail-la0-f54.google.com [209.85.215.54]) by mx1.freebsd.org (Postfix) with ESMTP id 7665A8FC0A; Sat, 27 Oct 2012 20:46:55 +0000 (UTC) Received: by mail-la0-f54.google.com with SMTP id e12so3868801lag.13 for ; Sat, 27 Oct 2012 13:46:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:reply-to:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=DjIENDFgGyXEh8Clk8GL7SthpRi8J2NZKdSizlEo4jY=; b=vSrMBISN3/KXeLsZJl9KcfX4E7bRg7RRMabRTMrx5WrZpK+xqRrhCQ353xnSrl/6Dr NrytOvWs0slBQgpBFXNQnF0Xrn2sVibZ95oofVN5FRHK0FJavNCkIXDdqluf8vWT/2V+ EmYq80xewrxqdvduVRvxGFrY4icAoonE9u6Y35+L00dnB0cRtVmj3hAYfWAG8eoVoOHg fIjszpy5xAi8N9cIrsVF+NXvt5qbTp90tgGtlfc8zPITuNKw0B4y44sxPPxxE4WOYIkY rR4kChdCBzyUeg/2ObQzMgKLNbqkXxNjwz0q7ryxS2TQKn4hNS94IxlDZgSOwEApbAfu ODRg== MIME-Version: 1.0 Received: by 10.152.132.3 with SMTP id oq3mr23150138lab.18.1351370813789; Sat, 27 Oct 2012 13:46:53 -0700 (PDT) Sender: asmrookie@gmail.com Received: by 10.112.30.37 with HTTP; Sat, 27 Oct 2012 13:46:53 -0700 (PDT) In-Reply-To: References: <5022840B.3060708@omnilan.de> <5048C6D1.8020007@omnilan.de> Date: Sat, 27 Oct 2012 21:46:53 +0100 X-Google-Sender-Auth: q43k9p_yS0mpXCrqUN8hSyviGr4 Message-ID: Subject: Re: lock violation in unionfs (9.0-STABLE r230270) From: Attilio Rao To: Harald Schmalzbauer Content-Type: text/plain; charset=UTF-8 Cc: stable@freebsd.org, daichi@freebsd.org, Pavel Polyakov X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: attilio@FreeBSD.org List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 27 Oct 2012 20:46:56 -0000 On Sat, Sep 8, 2012 at 12:48 AM, Attilio Rao wrote: > On Thu, Sep 6, 2012 at 4:52 PM, Harald Schmalzbauer > wrote: >> schrieb Attilio Rao am 09.08.2012 20:26 (localtime): >>> On 8/8/12, Harald Schmalzbauer wrote: >>>> schrieb Pavel Polyakov am 06.03.2012 11:20 (localtime): >>>>>>> mount -t unionfs -o noatime /usr /mnt >>>>>>> >>>>>>> insmntque: mp-safe fs and non-locked vp: 0xfffffe01d96704f0 is not >>>>>>> exclusive locked but should be >>>>>>> KDB: enter: lock violation >>>>>> Pavel, >>>>>> can you give a spin to this patch?: >>>>>> http://www.freebsd.org/~attilio/unionfs_missing_insmntque_lock.patch >>>>>> >>>>>> I think that the unlocking is due at that point as the vnode lock can >>>>>> be switch later on. >>>>>> >>>>>> Let me know what you think about it and what the test does. >>>>> Thanks! >>>>> This patch fixes the problem with lock violation. Sorry I've tested it so >>>>> late. >>>> Hello, >>>> >>>> this patch still applies cleanly to RELENG_9_1. Was there another fix >>>> for the issue or has it just not been PR-sent and thus forgotten? >>> Can you and Pavel try the attached patch? Unfortunately I had no time >>> to test it, I just made in 5 free mins from a non-FreeBSD workstation, >> >> Sorry, couldn't test earlier, but now I did: >> With this patch applied the machine hangs without debug kernel and the >> latter gives the following panic: >> System call nmount returning with the following locks held: >> exclusive lockmgr ufs (ufs) r = 0 (0xc5438278) locked @ >> src/sys/fs/unionfs/union_vnops.c:1938 >> panic: witness_warn >> cpuid = 0 >> KDB: stack backtrace: >> db_trace_self_wrapper(c0a04f7f,c0c112c4,d1de3bb4,c097aa8c,fc,...) at >> db_trace_self_wrapper+0x26 >> kdb_backtrace(c0a4965f,0,c09c2ede3c1c,0,...) at kdb_backtrace+0x2a >> witness_warn(2,0,c0a4ac34,c0a0990a,286,...) at witness_warn+0x1e4 >> syscall(d1de3d08) ar syscall+0x415 >> Xint0x80_syscall() at Xint0x80_syscall+0x21 >> --- syscall (0, FreeBSD ELF32, nosys), eip = 0x280b883f,esp = >> 0xbfbfe46c, ebp = 0xbfbfede8 --- >> KDB: enter: panic >> [ thread pid 86 tid 100054 ] >> Stopped ad kdb_enter+0x3a: movl $0,kdb_why >> db> bt >> Tracing pid 86 tid 100054 td 0xc541b000 >> kdb_enter(c0a00d16,c0a09130,0,0,0,...) at panix+0x190 >> witness_warn(2,0,x0a4ac34,c0a0990a,286,...) at witness_warn+0x1e4 >> syscall(d1de3d08) at syscall+0x415 >> Xint0x80_syscall() at Xint0x80_syscall+0x21 >> >> Hmm, I guess I forgot to install kernel debug symbols... >> Coming back if I have more > > Unfortunately unionfs does very wrong things with the insmntque() locking. > It basically expects the vnode to return locked in the same way > requested by the precedent namei() (when that happens) but when you do > insmntque() you can only have an LK_EXCLUSIVE lock on the vnode. Hello, the following patch should workout the issues around unionfs_nodeget() a bit: http://www.freebsd.org/~attilio/unionfs_nodeget2.patch Unfortunately unionfs code is rather messy in the lookup path about locking requirements so follow what it needs to be done there is a bit difficult. I have no way to test this patch, so it is just test-compiled at the moment, but I would need that you also test lookup path (so directory "ls", find(1) on the whole unionfs volume, etc.) to validate it someway. If it panics again, please provide the kernel.debug and the vmcore.X file. Thanks, Attilio -- Peace can only be achieved by understanding - A. Einstein