Date:      Fri, 5 Dec 2014 15:52:41 +0000
From:      Mike Gelfand <Mike.Gelfand@LogicNow.com>
To:        "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>
Cc:        Konstantin Belousov <kostikbel@gmail.com>, "hackers@freebsd.org" <hackers@freebsd.org>
Subject:   Re: [BUG] Getting path to program binary sometimes fails
Message-ID:  <27C465FC-E8C7-44CB-A812-65213BB8AC9F@logicnow.com>
In-Reply-To: <2066750.N3TZpYSHCy@ralph.baldwin.cx>
References:  <91809230-5E81-4A6E-BFD6-BE8815A06BB2@logicnow.com> <201411201125.30087.jhb@freebsd.org> <BC392D92-5DD4-4012-90D4-17C4BC1566CE@logicnow.com> <2066750.N3TZpYSHCy@ralph.baldwin.cx>

On Dec 5, 2014, at 6:19 PM, John Baldwin <jhb@freebsd.org> wrote:

> On Friday, December 05, 2014 12:01:15 PM Mike Gelfand wrote:
>> John,
>> 
>> Sorry for late reply.
>> 
>> On Nov 20, 2014, at 7:25 PM, John Baldwin <jhb@freebsd.org> wrote:
>>>> Since you’re saying that current behavior is not a defect, maybe
>>>> documentation is wrong (incomplete, misleading) then? I will readily
>>>> accept the “not a defect” explanation, but only if one wouldn’t have
>>>> to ask you every time this oddity is met. If this is the expected
>>>> error condition, what should I do to get the path reliably? Should I
>>>> retry (and how many times)? You’re saying cache is being purged; does
>>>> it mean that when I ask for path then cache is populated again? Does
>>>> it guarantee then that I’ll be able to get the path on next call?
>>>> Could you guarantee that I’ll be able to get the path at all if I fail
>>>> two or more times? Should I rely on ENOENT specifically when retrying?
>>>
>>> Is this over NFS?  NFS is more aggressive than local filesystems in
>>> purging name cache entries because there are inherent races in NFS with
>>> certain fileservers (ones that don't use sub-second timestamps), so by
>>> default entries always expire after about a minute.  You can change that
>>> via the 'nametimeo' mount option (takes a count in seconds).
>> 
>> No, not NFS but ZFS. Could that be an issue? The FreeBSD 8 machine I
>> mentioned before has UFS.
>> 
>> Also, as you can see from the video I recorded (and from the code I
>> provided), path resolution succeeds and fails within fractions of a second
>> after process startup.
> 
> Are you seeing vnodes being actively recycled?  In particular, do you see 
> vfs.numvnodes close to kern.maxvnodes?  You can try raising kern.maxvnodes.  
> If vfs.numvnodes grows up to the limit then as long as you can stomach the RAM 
> of having more vnodes around that would increase the chances of your paths 
> remaining valid.

When the call works, sysctl returns:
    vfs.numvnodes: 59638
    kern.maxvnodes: 204723
When it doesn't, the output is:
    vfs.numvnodes: 60017
    kern.maxvnodes: 204723
These are the maximum values I observed in each case. I monitored with
    while sysctl vfs.numvnodes kern.maxvnodes; do sleep 0.1; done

So it seems that's not related, correct? 60K is much less than 200K.
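
To make the comparison concrete, here is a minimal sketch that expresses vnode usage as a percentage of the limit. The helper name vnode_usage_pct is hypothetical and not part of any FreeBSD tool; the two numbers are the peak values quoted above.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print the integer percentage
# of the vnode limit in use, given numvnodes ($1) and maxvnodes ($2).
vnode_usage_pct() {
    echo $(( $1 * 100 / $2 ))
}

# Peak values quoted in this message (the failing case).
vnode_usage_pct 60017 204723   # prints 29
```

On a live FreeBSD system the same helper could presumably be fed from sysctl, e.g. vnode_usage_pct "$(sysctl -n vfs.numvnodes)" "$(sysctl -n kern.maxvnodes)". At roughly 29% of the limit, the count is far from kern.maxvnodes in both the working and the failing case.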
