Date: Mon, 29 Jul 2013 22:48:37 -0400
From: J David <j.david.lists@gmail.com>
To: Rick Macklem <rmacklem@uoguelph.ca>
Cc: Konstantin Belousov <kostikbel@gmail.com>, Steven Hartland <killing@multiplay.co.uk>, freebsd-stable@freebsd.org, re <re@freebsd.org>, Michael Tratz <michael@esosoft.com>
Subject: Re: NFS deadlock on 9.2-Beta1
Message-ID: <CABXB=RTSKtUUkmrq409ASO7xRXH0kLDepz2qoB11pK8XAhxXvw@mail.gmail.com>
In-Reply-To: <1710471570.3603170.1375141032147.JavaMail.root@uoguelph.ca>
References: <F20E755D-EE01-4411-8790-1E2BC7D8CD5D@esosoft.com> <1710471570.3603170.1375141032147.JavaMail.root@uoguelph.ca>

If it is helpful, we have 25 nodes testing the 9.2-BETA1 build. Without especially trying to exercise this bug, we found sendfile()-using processes deadlocked in WCHAN newnfs on 5 of the 25 nodes. The nodes with the highest uptime (about 3 days) seem most affected, so it does look like a "sooner or later" type of thing.

Hopefully the fix is easy and it won't be an issue, but it definitely seems like a problem 9.2-RELEASE would be better off without.

Unfortunately, we are not in a position to capture the requested debugging information at this time; none of those nodes are running a debug version of the kernel. If Michael is unable to get the information as he hopes, we can try to do that, possibly over the weekend.

For the time being, we will roll back r250907 on half the machines to try to confirm that doing so resolves the issue.

Thanks all! If one has to encounter a problem like this, it is nice to come to the list and find the research already so well underway!