Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 29 Jun 2018 15:17:25 -0700
From:      Devin Teske <dteske@FreeBSD.org>
To:        Warner Losh <imp@bsdimp.com>
Cc:        Devin Teske <dteske@FreeBSD.org>, Gleb Smirnoff <glebius@freebsd.org>, Warner Losh <imp@freebsd.org>, src-committers <src-committers@freebsd.org>, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   Re: svn commit: r335690 - head/sys/kern
Message-ID:  <63B2C22B-9B7C-4F8C-A153-8B4A6618558A@FreeBSD.org>
In-Reply-To: <CANCZdfoTY3kFYj7kthBwhZfudPvOuynd6O_V12XDftC7qBQ3aA@mail.gmail.com>
References:  <201806270411.w5R4B9ZB078994@repo.freebsd.org> <20180627235216.GO1165@FreeBSD.org> <CANCZdfpdBOin=w0qEGXM%2BciTAbETezYTOPq1nuEV1pT5QATEHA@mail.gmail.com> <16B9F36D-E92A-4FEC-B096-5E24A4E180FC@FreeBSD.org> <CANCZdfpJzFn0tj-PGY8Zp%2BMT7AZrYxTR5DEjHsFeDFfiDCnxGw@mail.gmail.com> <623F6077-0A89-4D35-9E83-76BF40CC290F@FreeBSD.org> <CANCZdfonitWdaAXXoQnPZtMd_ytumGdWr=WYfX5CHYMytsXuOA@mail.gmail.com> <7C4AEADC-1973-482B-A7A5-75F23397D44E@FreeBSD.org> <CANCZdfoTY3kFYj7kthBwhZfudPvOuynd6O_V12XDftC7qBQ3aA@mail.gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help

> On Jun 28, 2018, at 12:36 AM, Warner Losh <imp@bsdimp.com> wrote:
>=20
>=20
>=20
> On Thu, Jun 28, 2018 at 12:54 AM, Devin Teske <dteske@freebsd.org =
<mailto:dteske@freebsd.org>> wrote:
>=20
>> On Jun 27, 2018, at 7:35 PM, Warner Losh <imp@bsdimp.com =
<mailto:imp@bsdimp.com>> wrote:
>>=20
>>=20
>>=20
>> On Wed, Jun 27, 2018 at 8:14 PM, Devin Teske <dteske@freebsd.org =
<mailto:dteske@freebsd.org>> wrote:
>>=20
>>> On Jun 27, 2018, at 6:58 PM, Warner Losh <imp@bsdimp.com =
<mailto:imp@bsdimp.com>> wrote:
>>>=20
>>>=20
>>>=20
>>> On Wed, Jun 27, 2018 at 7:49 PM, Devin Teske <dteske@freebsd.org =
<mailto:dteske@freebsd.org>> wrote:
>>>=20
>>>> On Jun 27, 2018, at 5:59 PM, Warner Losh <imp@bsdimp.com =
<mailto:imp@bsdimp.com>> wrote:
>>>>=20
>>>>=20
>>>>=20
>>>> On Wed, Jun 27, 2018 at 5:52 PM, Gleb Smirnoff <glebius@freebsd.org =
<mailto:glebius@freebsd.org>> wrote:
>>>> On Wed, Jun 27, 2018 at 04:11:09AM +0000, Warner Losh wrote:
>>>> W> Author: imp
>>>> W> Date: Wed Jun 27 04:11:09 2018
>>>> W> New Revision: 335690
>>>> W> URL: https://svnweb.freebsd.org/changeset/base/335690 =
<https://svnweb.freebsd.org/changeset/base/335690>;
>>>> W>=20
>>>> W> Log:
>>>> W>   Fix devctl generation for core files.
>>>> W>  =20
>>>> W>   We have a problem with vn_fullpath_global when the file =
exists. Work
>>>> W>   around it by printing the full path if the core file name =
starts with /,
>>>> W>   or current working directory followed by the filename if not.
>>>>=20
>>>> Is this going to work when a core is dumped not at current working =
directory,
>>>> but at absolute path? e.g. kern.corefile=3D/var/log/cores/%N.core
>>>>=20
>>>> Yes. That works.
>>>> =20
>>>> Looks like the vn_fullpath_global needs to be fixed rather than =
problem
>>>> workarounded.
>>>>=20
>>>> It can't be fixed reliably. FreeBSD does not and cannot map a vnode =
to a name. The only reason we're able to at all is due to the name =
cache. And when we recreate a file, we invalidate the name cache. And =
even if we fixed that, there's no guarantee the name cache won't get =
flushed before we translate the name.... Linux can do this because it =
keeps the path name associated with the inode. FreeBSD simply doesn't.
>>>>=20
>>>=20
>>> They said it couldn't be done, but I personally have done it ...
>>>=20
>>> I map vnodes to full paths in dwatch. It's not impossible, just =
implausibly hard.
>>>=20
>>> I derived my formula by reading the C code which was very =
twisty-turny and rather hard to read at times (and I have been using C =
since 1998 on everything from AIX and OSF/1 to HP/UX, Cygwin, MinGW, =
FreeBSD, countless Linux-like things, IRIX, and a God-awful remainder of =
many others; the vnode code was an adventure to say the least).
>>>=20
>>> You're welcome to see how it's done in =
/usr/libexec/dwatch/vop_create
>>>=20
>>> I load up a clause-local variable named "path" with a path =
constructed from a pointer to a vnode structure argument to =
VOP_CREATE(9).
>>>=20
>>> The D code could easily be rewritten back into C to walk the vnode =
to completion (compared to the D code which is bounded by depth-limit =
since DTrace doesn't provide loops; so you have to, as I have done, use =
a higher-level language wrapper to repeat the code to some desired =
limit; dwatch defaulting to 64 for directory depth limit).
>>>=20
>>> Compared this, say, to vfssnoop.d from Chapter 5 of the DTrace book =
which utilizes the vfs:namei:lookup: probes. My approach in dwatch is =
far more accurate, produces full paths, and I've benchmarked it at lower =
overhead with better results.
>>>=20
>>> Maybe some of this can be useful? If not, just ignore me.
>>>=20
>>> IMHO, there's no benefit from the crazy hoops than the super simple =
heuristic I did: if the path starts with / print it. If the path doesn't =
print cwd / and then the path.
>>>=20
>>> I look forward to seeing your conversion of the D to C that works, =
even when there's namespace cache pressure.
>>>=20
>>=20
>> I was looking at the output of "dwatch -dX vop_create" (the -d flag =
to dwatch causes the generated dwatch to be printed instead of executed) =
and thinking to myself:
>>=20
>> I could easily convert this, as I can recognize the flattened loop =
structure. However, not many people will be able to do it. I wasn't =
really volunteering, but now I'm curious what other people see when they =
run "dwatch -dX vop_create". If it's half as bad as I think it is, then =
I'll gladly do it -- but will have to balance it against work =
priorities.
>>=20
>> We don't have the data you do in vop_create anymore, just the vp at =
the point in the code where we want to do the reverse translation... But =
we also have a name that's been filled in from the template and cwd, =
which gives us almost the same thing as we'd hoped to dig out of the =
name cache...
>>=20
>=20
> Given:
>=20
> 	struct vnode *vp, const char *fi_name
>=20
> Here's a rough (not optimized; unchecked) crack at a C equivalent:
>=20
> Thanks, but this code suffers from the same problem that =
vn_fullpath_global() suffers from: we clear out parts of the cache when =
we open an existing file with O_CREAT. This one NULL leads to a cascade =
of failures, as show below.
>=20
>=20
>=20
> 	#include <sys/mount.h>
> 	#include <sys/vnode.h>
>=20
> 	#include <limits.h>
>=20
> 	int i, max_depth =3D 64;
> 	char *d_name =3D NULL;
> 	char *fi_fs =3D NULL;
> 	char *fi_mount =3D NULL;
> 	char *cp;
> 	struct namecache *ncp =3D NULL;
> 	struct mount *mount =3D NULL;
> 	struct vnode *dvp =3D NULL;
> 	char *nc[max_depth+1];
> 	char path[PATH_MAX] =3D __DECONST(char *, "");
>=20
> 	if (vp !=3D NULL) {
> 		ncp =3D vp->v_cache_dst.tqh_first;
>=20
> For my case, this is NULL since we clear out the cache when we =
re-create the file. I confirmed this by adding a printf where I put in =
my workaround and I get:
> vp->v_cache_dst.tqh_first is 0xfffff80003bd7a10 (first time, no =
cat.core exists)
> vp->v_cache_dst.tqh_first is 0 (second time, cat.core does exist)
>=20
> 		mount =3D vp->v_mount; /* ptr to vfs we are in */
> 		if (vp->v_cache_dd !=3D NULL)
> 			d_name =3D vp->v_cache_dd->nc_name;
> 	}
> 	if (mount !=3D NULL) {
> 		fi_fs =3D mount->mnt_stat.f_fstypename;
> 		if (fi_fs !=3D NULL && *fi_fs !=3D '\0') {
> 			if (!strcmp(fi_fs, "devfs")) ncp =3D NULL;
> 		} else {
> 			if (fi_name =3D=3D NULL || *fi_name =3D=3D '\0') =
ncp =3D NULL;
> 		}
> 		fi_mount =3D mount->mnt_stat.f_mntonname;
> 	} else {
> 		ncp =3D NULL;
> 	}
> 	for (i =3D 0; i <=3D max_depth; i++) {
> 		nc[i] =3D NULL;
> 	}
> 	if (ncp !=3D NULL) {
> 		/* Reach into vnode of parent of name */
> 		if (ncp->nc_dvp !=3D NULL)
> 			dvp =3D ncp->nc_dvp->v_cache_dst.tqh_first;
> 		if (dvp !=3D NULL) nc[0] =3D dvp->nc_name;
> 	}
>=20
> since ncp is NULL here, we never set nc[0]
>=20
> =20
> 	if (nc[0] =3D=3D NULL || *nc[0] =3D=3D '\0' || !strcmp(nc[0], =
"/"))
> 		dvp =3D NULL;
>=20
> so dvp is NULL
> =20
> 	/*
> 	 * BEGIN Pathname-depth loop
> 	 */
> 	for (i =3D 1; i < max_depth; i++) {
> 		if (dvp =3D=3D NULL) break;
>=20
> and we break out of this loop without setting any other nc[] entries.
> =20
> 		if (dvp->nc_dvp !=3D NULL)
> 			dvp =3D dvp->nc_dvp->v_cache_dst.tqh_first;
> 		else
> 			dvp =3D NULL;
> 		if (dvp !=3D NULL) nc[i] =3D dvp->nc_name;
> 	}
> 	/*
> 	 * END Pathname-depth loop
> 	 */
> 	if (dvp !=3D NULL) {
> 		if (dvp->nc_dvp !=3D NULL)
> 			dvp =3D dvp->nc_dvp->v_cache_dst.tqh_first;
> 		else
> 			dvp =3D NULL;
> 		if (dvp !=3D NULL && dvp->nc_dvp !=3D NULL)
> 			nc[max_depth] =3D __DECONST(char *, "...");
> 	}
> 	if (fi_mount !=3D NULL) {
> 		/*
> 		 * Join full path
> 		 * NB: Up-to but not including the parent directory =
(joined below)
> 		 */
> 		strcat(path, fi_mount);
> 		if (strcmp(fi_mount, "/")) strcat(path, "/");
> 		for (i =3D max_depth; i >=3D 0; i--) {
> 			char *namei =3D nc[i];
> 			strcat(path, namei =3D=3D NULL ? "" : namei);
>=20
> which joins a lot of empty strings together
> =20
> 			if (namei !=3D NULL) strcat(path, "/");
> 		}
> 		/* Join the parent directory name */
> 		if (d_name !=3D NULL && *d_name !=3D '\0') {
> 			strcat(path, d_name);
> 			strcat(path, "/");
> 		}
> 		/* Join the entry name */
> 		if (fi_name !=3D NULL && *fi_name !=3D '\0')
> 			strcat(path, fi_name);
>=20
> since we don't have the fi_name from the VOP_CREATE where we want to =
do this, we have nothing for here.
> =20
> 	}
>=20
> So I appreciate the effort here, but I'm not sure that it's useful in =
this context. vn_fullpath* already can do this sort of thing, but can't =
in the case where vp->v_cache_dst.tqh_first is NULL.
>=20


Thank you so much for the thoughtful response. It means a lot really.
--=20
Cheers,
Devin=



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?63B2C22B-9B7C-4F8C-A153-8B4A6618558A>