Date: Wed, 1 Jul 2015 21:14:02 -0400 (EDT)
From: Rick Macklem
To: Graham Allan
Cc: freebsd-fs@freebsd.org
Subject: Re: Strange NFS problem implicating nfsuserd?
Message-ID: <1203156989.2786078.1435799642755.JavaMail.zimbra@uoguelph.ca>
In-Reply-To: <55948916.4080405@physics.umn.edu>

Graham Allan wrote:
> On 7/1/2015 7:10 PM, Rick Macklem wrote:
> >>
> >> I've reproduced this across 4-5 different servers and a similar number
> >> of different client systems. I'm wondering if any plausible explanation
> >> suggests itself?
> >>
> >
> > As far as I know, the domain is only set when
> > the nfsuserd is started and it just uses the domain part of the machine's
> > host name if not explicitly defined by "-domain". Maybe there is some bug
> > in nfsuserd.c that gets tickled by the option, although I just looked and
> > the argument parsing looks ok.
> >
> > If your xxx.yyy.zzz is identical, then I can't see how this would affect
> > anything.
> >
> > What will cause intermittent mapping problems is having more than one
> > username that maps to the same uid. (One of them will be cached at random.)
> > (There was a common case of both "root" and "toor" in the password database
> > for uid == 0.)
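A rough way to check for the duplicate-uid case described above, as a sketch only. The function names are made up for illustration, and whether LDAP-backed accounts show up in `pwd.getpwall()` depends on how the name service is configured on the machine, so treat an empty result as a hint rather than proof.

```python
from collections import defaultdict


def find_shared_uids(entries):
    """Return {uid: [names, ...]} for every uid claimed by more than one name.

    `entries` is any iterable of (name, uid) pairs, so the check can be fed
    from the local password database, an LDAP dump, or test data.
    """
    names_by_uid = defaultdict(list)
    for name, uid in entries:
        names_by_uid[uid].append(name)
    # Keep only the uids that more than one username maps to.
    return {uid: names for uid, names in names_by_uid.items() if len(names) > 1}


def local_passwd_entries():
    """Yield (name, uid) pairs from the local password database (Unix only)."""
    import pwd  # imported here so find_shared_uids stays usable anywhere

    for entry in pwd.getpwall():
        yield entry.pw_name, entry.pw_uid
```

Running `find_shared_uids(local_passwd_entries())` on the server would list any uid with more than one name, such as the "root"/"toor" pair for uid 0.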
>
> Yes, on the face of it this report appears crazy to me too :-)
>
> If I hadn't tried a dozen other things including reverting FreeBSD patch
> level, linux kernel/package versions, tweaking/checking ldap lookup
> settings (nslcd etc), before simply removing the "domain=" argument to
> nfsuserd, I wouldn't believe it possible. I also took a quick look
> through nfsuserd.c and couldn't see anything to explain it. I want to
> think something else must be going on, but adding or removing that
> parameter appears to toggle the problem on and off deterministically.
>
> I was always able to get a failure within 10-60 minutes or so, so having
> the nfsuserd cache timeout at 600 minutes seems like it should eliminate
> any intermittent id lookup issues.
>
I'll take another look at nfsuserd.c. Maybe it does something stupid like
getting the length of the argument wrong (trailing blank or null or something
like that, that doesn't show up when it is printed out).
All I can think of is a subtle bug in nfsuserd.c when the argument is specified.

> I guess I could try...
> (1) rpcdebug on the linux client, though I'm not sure which flags to
> enable to log idmapping issues.
> (2) watch nfsuserd with truss and look for different behaviors.
> (3) capture NFS traffic, examine with wireshark
>
I'd try #3 if I were you and see if the owner and owner_group names look right.

I'll post if I find anything in nfsuserd.c, rick

> Graham
>
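If it helps when going through the capture, here is a small sketch for checking owner/owner_group strings copied out of wireshark. It assumes the usual NFSv4 "name@domain" form and that the strings are pasted in by hand; the function names and the exact-match comparison are my own assumptions, not anything taken from nfsuserd.c. Printing with `repr` is what makes a trailing blank or NUL visible.

```python
def split_owner(owner):
    """Split an NFSv4 owner string of the form 'name@domain'.

    Returns (name, domain); domain is None when there is no '@' at all.
    """
    name, sep, domain = owner.rpartition("@")
    if not sep:
        return owner, None
    return name, domain


def owner_problems(owner, expected_domain):
    """Return a list of human-readable problems found in one owner string."""
    name, domain = split_owner(owner)
    if domain is None:
        return ["no @domain part"]

    problems = []
    # Strip the characters that would not show up when the value is printed.
    cleaned = domain.strip(" \t\r\n\x00")
    if cleaned != domain:
        problems.append(f"hidden whitespace or NUL in domain: {domain!r}")
    # Exact comparison is an assumption; adjust if the domains may differ in case.
    if cleaned != expected_domain:
        problems.append(f"domain mismatch: {cleaned!r} != {expected_domain!r}")
    return problems
```

For example, `owner_problems("alice@example.org ", "example.org")` reports the hidden trailing blank, while a clean string returns an empty list.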