Date: Sat, 25 Jan 2014 16:54:59 +0100 From: =?utf-8?Q?Hans_Petter_Selasky?= <hans.petter.selasky@bitfrost.no> To: =?utf-8?Q?freebsd-fs=40freebsd=2Eorg?= <freebsd-fs@freebsd.org> Subject: FW: lock order reversals w/ backtrace Message-ID: <zarafa.52e3de53.0da5.7253d5c71778a401@mail.lockless.no>
next in thread | raw e-mail | index | archive | help
FYI - can anyone comment?
--HPS
-----Original message-----
> From:Thomas Hoffmann <trh411@gmail.com <mailto:trh411@gmail.com> >
> Sent: Saturday 25th January 2014 15:55
> To: freebsd-current <freebsd-current@freebsd.org <mailto:freebsd-current@freebsd.org> >
> Cc: Hans Petter Selasky <hans.petter.selasky@bitfrost.no <mailto:hans.petter.selasky@bitfrost.no> >
> Subject: Re: lock order reversals w/ backtrace
>
> On Fri, Jan 24, 2014 at 11:47 PM, Thomas Hoffmann <trh411@gmail.com <mailto:trh411@gmail.com> > wrote:
>
> > On Fri, Jan 24, 2014 at 9:52 AM, Thomas Hoffmann <trh411@gmail.com <mailto:trh411@gmail.com> > wrote:
> >
> >> On Thu, Jan 23, 2014 at 11:49 AM, Hans Petter Selasky <
> >> hans.petter.selasky@bitfrost.no <mailto:hans.petter.selasky@bitfrost.no> > wrote:
> >>
> >>> Hi,
> >>>
> >>> Can you see if you can snap some keywords of the backtraces, like usb_xxx usbdx_xxx cam scsi or something like that.
> >>>
> >>> Else I believe there are some sysctl options to prevent the final reboot somehow so that you can write down the messages.
> >>>
> >>>
> >>> --HPS
> >>>
> >>>
> >>> -----Original message-----
> >>> > From:Thomas Hoffmann <trh411@gmail.com <mailto:trh411@gmail.com> >
> >>> > Sent: Thursday 23rd January 2014 17:15
> >>>
> >>> > To: freebsd-current <freebsd-current@freebsd.org <mailto:freebsd-current@freebsd.org> >
> >>> > Subject: lock order reversals w/ backtrace
> >>> >
> >>> > A few days ago I started running 11.0-CURRENT at r260971 for the first time.
> >>>
> >>>
> >>> >
> >>> > The last couple of times I shutdown my system I noticed 2 or 3 short "lock
> >>> > order reversal" messages with accompanying backtraces scroll by. Do these
> >>> > messages represent a problem that I should report or can I ignore them as
> >>>
> >>>
> >>> > debug in nature? If I should report them, how or where do these messages
> >>> > get logged? I can find no reference to them in syslog or anywhere else upon
> >>> > my subsequent reboot.
> >>> >
> >>> > I also had a couple of these messages pop up the other day while
> >>>
> >>>
> >>> > mounting/umounting USB thumb drives. I did not think anything of them at
> >>> > the time as the mounts/umounts completed successfully.
> >>> >
> >>> > Please advise. Thanks.
> >>> >
> >>> > -Tom
> >>>
> >>> I managed to snap a photo of my screen during shutdown.
> >> Here is the full text of the first of two lock order reversals I got last
> >> night during system shutdown, both of which are zfs-related to (it
> >> appears?) unmounts. A few lines got chopped as they were out-of-frame when
> >> I took the photo:
> >>
> >> shutting down local daemons:
> >> lock order reversal:
> >> 1st 0xfffff801194f67c8 zfs (zfs) @ /usr/src/sys/kern/vfs_mount.c:1237
> >> 2nd 0xfffff801194f6420 syncer (syncer) @
> >> /usr/src/sys/kern/vfs_subr.c:22[..chopped...]
> >> KDB: stack backtrace:
> >> db_trace_self_wrapper() at db_trace_delf_wrapper+0x2b/frame
> >> 0xfffffe01ac78[...chopped...]
> >> kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe01ac784650
> >> witness_checkorder() at witness_checkorder+0xd23/frame 0xfffffe01ac7846e0
> >> __lockmgr_args() __lockmgr_args+0x878/frame 0xfffffe01ac784810
> >> vop_stdlock() at vop_stdlock+0x3c/frame 0xfffffe01ac784830
> >> VOP_LOCK1_APV() at VOP_LOCK1_APV+0xf5/frame 0xfffffe01ac784860
> >> _vn_lock() at _vn_lock+0xab/frame 0xfffffe01ac7848d0
> >> vputx() at vputx+0x240/frame 0xfffffe01ac784930
> >> dounmount at dounmount+0x327/frame 0xfffffe01ac7849b0
> >> sys_unmount() at sys_unmount+0x356/frame 0xfffffe01ac784ac0
> >> amd64_syscall() at amd64_syscall+0x265/frame 0xfffffe01ac784bf0
> >> Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe01ac784bf0
> >> --- syscall (22, FreeBSD ELF64, sys_unmount, rip = 0x80191c72a,
> >> rsp[...chopped...]
> >>
> >> I have a zpool on an external USB HDD that I export at shutdown via
> >> rc.shutdown.local. Don't know if that is contributing to this or not. When
> >> I get a chance to bring down the system I will manually export it before
> >> shutdown to see if that makes any difference.
> >>
> >
> > Was able to capture the full text of the lock order reversal I've been
> > experiencing at shutdown. Turns out to be associated with the export of one
> > of my zpools that is hosted on an external USB HDD. Normally I export the
> > zpool from rc.shutdown.local, but tonight I exported it manually before I
> > shutdown and got the following.
> >
> > lock order reversal:
> > : 1st 0xfffff8011952b5f0 zfs (zfs) @ /usr/src/sys/kern/vfs_mount.c:1237
> > : 2nd 0xfffff8011952b068 syncer (syncer) @
> > /usr/src/sys/kern/vfs_subr.c:2212
> > KDB: stack backtrace:
> > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
> > 0xfffffe01add6e5a0
> > kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe01add6e650
> > witness_checkorder() at witness_checkorder+0xd23/frame 0xfffffe01add6e6e0
> > __lockmgr_args() at __lockmgr_args+0x878/frame 0xfffffe01add6e810
> > vop_stdlock() at vop_stdlock+0x3c/frame 0xfffffe01add6e830
> > VOP_LOCK1_APV() at VOP_LOCK1_APV+0xf5/frame 0xfffffe01add6e860
> > _vn_lock() at _vn_lock+0xab/frame 0xfffffe01add6e8d0
> > vputx() at vputx+0x240/frame 0xfffffe01add6e930
> > dounmount() at dounmount+0x327/frame 0xfffffe01add6e9b0
> > sys_unmount() at sys_unmount+0x356/frame 0xfffffe01add6eae0
> > amd64_syscall() at amd64_syscall+0x265/frame 0xfffffe01add6ebf0
> > Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe01add6ebf0
> > --- syscall (22, FreeBSD ELF64, sys_unmount), rip = 0x80191c72a, rsp =
> > 0x7fffffffc4c8, rbp = 0x7fffffffc960 ---
> >
> > When imported and mounted, the zpool reports no errors and I have not
> > experienced any anomalies reading or writing to the zpool. Should I be
> > concerned?
> >
> > Hmm, seems like I'm the only one interested in this.
>
> I think I have resolved the problem with the zpool on the external USB HDD.
> I copied off the data, destroyed the datasets and zpool, recreated the
> zpool and datasets and restored the data. I have executed several exports
> on the zpool and have received no "lock order reversal" messages.
>
> I am still getting lock order reversal messages for my bootpool and zroot
> zpools at shutdown. Recreating bootpool would be relatively easy, but if
> I'm going to recreate zroot, I might as well do a reinistall, am I am not
> prepared to do at this time. So I'm still in search of an alternative way
> to resolve this issue. My zpools all show "no known data errors", scrub
> cleanly, zpool upgrade reports that all zpools support all features, and
> I've not experienced any other issues related to my zpools.
>
> Based on the above lock order reversal data, the error is in this block of
> code (, specifically /usr/src/sys/kern/vfs_mount.c:1237)
>
> if ((coveredvp = mp->mnt_vnodecovered) != NULL) {
> mnt_gen_r = mp->mnt_gen;
> VI_LOCK(coveredvp);
> vholdl(coveredvp);
> vn_lock(coveredvp, LK_EXCLUSIVE | LK_INTERLOCK | LK_RETRY);
> /* LINE 1237 */
> vdrop(coveredvp);
> /*
> * Check for mp being unmounted while waiting for the
> * covered vnode lock.
> */
> if (coveredvp->v_mountedhere != mp ||
> coveredvp->v_mountedhere->mnt_gen != mnt_gen_r) {
> VOP_UNLOCK(coveredvp, 0);
> return (EBUSY);
> }
> }
>
>
> -Tom
> _______________________________________________
> freebsd-current@freebsd.org <mailto:freebsd-current@freebsd.org> mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current <http://lists.freebsd.org/mailman/listinfo/freebsd-current>
> To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org <mailto:freebsd-current-unsubscribe@freebsd.org> "
>
From owner-freebsd-fs@FreeBSD.ORG Sat Jan 25 16:28:19 2014
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
[IPv6:2001:1900:2254:206a::19:1])
(using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
(No client certificate requested)
by hub.freebsd.org (Postfix) with ESMTPS id 68EDA7F4;
Sat, 25 Jan 2014 16:28:19 +0000 (UTC)
Received: from tensor.andric.com (unknown
[IPv6:2001:7b8:3a7:1:2d0:b7ff:fea0:8c26])
(using TLSv1 with cipher ADH-CAMELLIA256-SHA (256/256 bits))
(No client certificate requested)
by mx1.freebsd.org (Postfix) with ESMTPS id 131141E7F;
Sat, 25 Jan 2014 16:28:19 +0000 (UTC)
Received: from [IPv6:2001:7b8:3a7::719e:bbe8:ba8a:34e0] (unknown
[IPv6:2001:7b8:3a7:0:719e:bbe8:ba8a:34e0])
(using TLSv1 with cipher AES128-SHA (128/128 bits))
(No client certificate requested)
by tensor.andric.com (Postfix) with ESMTPSA id E54C85C44;
Sat, 25 Jan 2014 17:28:12 +0100 (CET)
Subject: Re: BUG: possible NULL pointer dereference in nfs server
Mime-Version: 1.0 (Mac OS X Mail 7.1 \(1827\))
Content-Type: multipart/signed;
boundary="Apple-Mail=_70801F41-810B-4C7F-9BA3-A9451B541D04";
protocol="application/pgp-signature"; micalg=pgp-sha1
X-Pgp-Agent: GPGMail 2.1 (6062eb4)
From: Dimitry Andric <dim@FreeBSD.org>
In-Reply-To: <1577222508.16050114.1390610315654.JavaMail.root@uoguelph.ca>
Date: Sat, 25 Jan 2014 17:27:58 +0100
Message-Id: <97D8A7EA-12F3-4EFC-A933-BD8528B6306A@FreeBSD.org>
References: <1577222508.16050114.1390610315654.JavaMail.root@uoguelph.ca>
To: Rick Macklem <rmacklem@uoguelph.ca>
X-Mailer: Apple Mail (2.1827)
Cc: Roman Divacky <rdivacky@freebsd.org>, fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 25 Jan 2014 16:28:19 -0000
--Apple-Mail=_70801F41-810B-4C7F-9BA3-A9451B541D04
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
charset=us-ascii
On 25 Jan 2014, at 01:38, Rick Macklem <rmacklem@uoguelph.ca> wrote:
> Roman Divacky wrote:
>> Hi,
>>
>> In nfs_nfsdstate.c:nfsrv_lockctrl() we call
>>
>> getlckret = nfsrv_getlockfh(vp, new_stp->ls_flags, NULL, &nfh, p);
>>
>> then in nfsrv_getlockfh() we, based on the value of flags, might
>> dereference the NULL pointer:
>>
>>
>> nfsrv_getlockfh(vnode_t vp, u_short flags,
>> struct nfslockfile **new_lfpp, fhandle_t *nfhp, NFSPROC_T *p)
>>
>>
>> if (flags & NFSLCK_OPEN) {
>> new_lfp = *new_lfpp;
>> fhp = &new_lfp->lf_fh;
>>
>>
> I took a look and don't see a problem. NFSLCK_OPEN is only set for Opens
> { nfsrvd_open() } and it never calls nfsrv_lockctrl(). nfsrv_lockctrl() is
> always called with other flags in new_stp->ls_flags set, but never NFSLCK_OPEN,
> because nfsrv_lockctrl() is handling byte range lock cases after a file has been
> opened { which means the "else" case for this "if" always applies.
The point is that the compiler cannot always know this, when it inlines
nfsrv_getlockfh(). In the first invocation of nfsrv_getlockfh(), there
is no problem:
712 APPLESTATIC void
713 nfsrv_dumplocks(vnode_t vp, struct nfsd_dumplocks *ldumpp, int maxcnt,
714 NFSPROC_T *p)
715 {
...
726 ret = nfsrv_getlockfh(vp, 0, NULL, &nfh, p);
Here both the 'flags' and 'new_lfpp' arguments are zero, so the part in
nfsrv_getlockfh() where it writes to a NULL pointer can never be
executed, therefore it can be optimized away safely.
In the next invocation, the state of the 'flags' argument is not known
to the compiler, however:
1369 tryagain:
1370 if (new_stp->ls_flags & NFSLCK_LOCK)
1371 MALLOC(other_lop, struct nfslock *, sizeof (struct nfslock),
1372 M_NFSDLOCK, M_WAITOK);
...
1387 getlckret = nfsrv_getlockfh(vp, new_stp->ls_flags, NULL, &nfh, p);
If it inlines this, the result looks approximately like:
1 {
2 fhandle_t *fhp = NULL;
3 struct nfslockfile *new_lfp;
4 int error;
5
6 if (new_stp->ls_flags & NFSLCK_OPEN) {
7 new_lfp = *NULL;
8 fhp = &new_lfp->lf_fh;
9 } else if (&nfh) {
10 fhp = &nfh;
11 } else {
12 panic("nfsrv_getlockfh");
13 }
14 error = nfsvno_getfh(vp, fhp, p);
15 NFSEXITCODE(error);
16 getlckret = error;
17 }
The code in line 7 is the problematic part. Since this is undefined,
the compiler inserts a trap instruction here. I think the problem Roman
encountered is that on sparc64, there is no equivalent to x86's ud2
instruction, so it inserts a call to abort() instead, and that function
is not available in kernel-land.
...
> Sorry, I'm not a compiler guy, so I don't know why a compiler would
> generate a trap instruction, but since new_lfpp is never NULL when
> this is executed, I don't see a problem.
>
> If others feel that this needs to be re-coded, please let me know what
> you think the code should look like? (A test for non-NULL with a panic()
> before it is used?)
>
> Is a trap instruction that never gets executed a problem?
It's better to avoid undefined behavior in any case. Just add a NULL
check, that should be sufficient.
-Dimitry
--Apple-Mail=_70801F41-810B-4C7F-9BA3-A9451B541D04
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
filename=signature.asc
Content-Type: application/pgp-signature;
name=signature.asc
Content-Description: Message signed with OpenPGP using GPGMail
-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.22 (Darwin)
iEYEARECAAYFAlLj5hkACgkQsF6jCi4glqO2hACeKUFQagxvdOrYIgc3YqdmERwq
VbIAnR5Td7N+Dqjj4BAiAN8nVTmqkW2A
=9r0K
-----END PGP SIGNATURE-----
--Apple-Mail=_70801F41-810B-4C7F-9BA3-A9451B541D04--
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?zarafa.52e3de53.0da5.7253d5c71778a401>
