From owner-freebsd-stable@FreeBSD.ORG Sat Feb 6 17:20:34 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 28F201065676; Sat, 6 Feb 2010 17:20:34 +0000 (UTC) (envelope-from marius@alchemy.franken.de) Received: from alchemy.franken.de (alchemy.franken.de [194.94.249.214]) by mx1.freebsd.org (Postfix) with ESMTP id B0A028FC08; Sat, 6 Feb 2010 17:20:33 +0000 (UTC) Received: from alchemy.franken.de (localhost [127.0.0.1]) by alchemy.franken.de (8.14.3/8.14.3/ALCHEMY.FRANKEN.DE) with ESMTP id o16HKTst062500; Sat, 6 Feb 2010 18:20:30 +0100 (CET) (envelope-from marius@alchemy.franken.de) Received: (from marius@localhost) by alchemy.franken.de (8.14.3/8.14.3/Submit) id o16HKTOW062499; Sat, 6 Feb 2010 18:20:29 +0100 (CET) (envelope-from marius) Date: Sat, 6 Feb 2010 18:20:29 +0100 From: Marius Strobl To: Rick Macklem Message-ID: <20100206172029.GK77522@alchemy.franken.de> References: <20100126073336.GA1955@server.vk2pj.dyndns.org> <20100131010618.GA1864@server.vk2pj.dyndns.org> <20100131162854.GC77522@alchemy.franken.de> <201002010946.57253.jhb@freebsd.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i Cc: rmacklem@freebsd.org, dfr@freebsd.org, freebsd-stable@freebsd.org, Peter Jeremy , John Baldwin Subject: Re: uma_zalloc_arg complaining about non-sleepable locks X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 06 Feb 2010 17:20:34 -0000 On Mon, Feb 01, 2010 at 11:54:54AM -0500, Rick Macklem wrote: > > > On Mon, 1 Feb 2010, John Baldwin wrote: > > >>>I'd say that your patch works. > >> > >>John, are you okay with that patch? > >>http://people.freebsd.org/~marius/fha_extract_info_realign2.diff > >> > >>It's intention is to: > >>- Move nfs_realign() from the NFS client to the shared NFS code and > >> remove the NFS server version in order to reduce code duplication. > >> The shared version now uses a second parameter how, which is passed > >> on to m_get(9) and m_getcl(9) as the server used M_WAIT while the > >> client requires M_DONTWAIT, and replaces the the previously unused > >> parameter hsiz. > > I don't think the client side can use M_WAIT/M_WAITOK if it wants to. > (I believe that it was once called from the socket upcall and that was > why M_DONTWAIT was used, but that isn't how it is called in the client > now.) > > I mentioned this to dfr@ because I use a version shared between client > and server for the experimental one that does M_WAIT, but he didn't > think the regular client needed to be changed. Basically, I think the > current code in the regular client can return ENOMEM for I/O system calls, > which probably isn't what the app. expected. In the NFS client, you end up > with this "do I wait forever?" or "do I return an error to an I/O system > call which an app. doesn't expect" issue cropping up here and there. I > don't know the correct answer to this, but tend to lean in the "wait > forever" direction. Well, as demonstrated by the problem with fha_extract_info() "waiting forever" must also match the locking so I'd rather leave chanGing the regular NFS client to someone who understands it better than I do :) Btw., newnfs_realign() should use #ifdef __NO_STRICT_ALIGNMENT instead of #if defined(__i386__) in order to bypass realigning on platforms which can deal with unaligned accesses. > > >>- Change nfs_realign() to use nfsm_aligned() so as with other NFS code > >> the alignment check isn't actually performed on platforms without > >> strict alignment requirements for performance reasons because as the > >> comment suggests only occasionally occurs with TCP. > >>- Change fha_extract_info() to use nfs_realign() with M_NOWAIT rather > >> than M_DONTWAIT because it's called with the RPC sp_lock held. > >> > > Btw, from an historical perspective, this was originally added for the > DEC PMAX port (MIPS R2000), which would trap whenever an unaligned ptr > was used. Then, there was this involved chunk of MIPS assembly code that > worked around the unaligned ptr access and the trap returned. Obviously > this was a major performance hit and happened fairly frequently for NFS > over ISO TP4. As you've seen, it happens infrequently enough over TCP > (back then, I think it was only when the TCP window size ended up really > small, although I'm way out of date on the TCP stack, so I have no idea > now:-) that I'd only do the re-alignment on achitectures that will crash > if it isn't done? > > I missed the beginning of this. Was there crashes occurring because > the alignment wasn't being done or problems because the alignment > wasn't working correctly? (I guess I'm trying to say that, if the > arch works without nfs_realign(), then you might consider just not > doing it for that arch. I suppose that isn't a good enough reason > to not fix the function, though?;-) The original problem with fha_extract_info() was that it didn't realign the mbuf data at all but called nfsm_srvmtofh_xx() which assumes the 4-byte alignment required by XDR. As mentioned above, changing nfs_realign() to use nfsm_aligned() has the effect you propose of not performing the realignment on platforms which can deal with unaligned acesses. Actually one could take this one step further and #ifdef the whole nfs_realign() like you do with newnfs_realign() but nfsm_aligned() already existed and I'm reluctant to rototill foreign code too much. Also the compiler should be smart enough to optimize this down to just incrementing nfs_realign_test when __NO_STRICT_ALIGNMENT is defined. Marius