From owner-svn-src-all@freebsd.org Fri Jun 29 22:17:29 2018 Return-Path: Delivered-To: svn-src-all@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id CDDA7F7F150; Fri, 29 Jun 2018 22:17:28 +0000 (UTC) (envelope-from devin@shxd.cx) Received: from shxd.cx (mail.shxd.cx [64.201.244.140]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 54AE97E9CD; Fri, 29 Jun 2018 22:17:28 +0000 (UTC) (envelope-from devin@shxd.cx) Received: from [74.217.198.10] (port=57357 helo=[10.1.4.66]) by shxd.cx with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.77 (FreeBSD)) (envelope-from ) id 1fYsN2-000CII-2u; Fri, 29 Jun 2018 12:19:24 +0000 From: Devin Teske Message-Id: <63B2C22B-9B7C-4F8C-A153-8B4A6618558A@FreeBSD.org> Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\)) Subject: Re: svn commit: r335690 - head/sys/kern Date: Fri, 29 Jun 2018 15:17:25 -0700 In-Reply-To: Cc: Devin Teske , Gleb Smirnoff , Warner Losh , src-committers , svn-src-all@freebsd.org, svn-src-head@freebsd.org To: Warner Losh References: <201806270411.w5R4B9ZB078994@repo.freebsd.org> <20180627235216.GO1165@FreeBSD.org> <16B9F36D-E92A-4FEC-B096-5E24A4E180FC@FreeBSD.org> <623F6077-0A89-4D35-9E83-76BF40CC290F@FreeBSD.org> <7C4AEADC-1973-482B-A7A5-75F23397D44E@FreeBSD.org> X-Mailer: Apple Mail (2.3273) Sender: devin@shxd.cx Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.26 X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.26 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 29 Jun 2018 22:17:29 -0000 > On Jun 28, 2018, at 12:36 AM, Warner Losh wrote: >=20 >=20 >=20 > On Thu, Jun 28, 2018 at 12:54 AM, Devin Teske > wrote: >=20 >> On Jun 27, 2018, at 7:35 PM, Warner Losh > wrote: >>=20 >>=20 >>=20 >> On Wed, Jun 27, 2018 at 8:14 PM, Devin Teske > wrote: >>=20 >>> On Jun 27, 2018, at 6:58 PM, Warner Losh > wrote: >>>=20 >>>=20 >>>=20 >>> On Wed, Jun 27, 2018 at 7:49 PM, Devin Teske > wrote: >>>=20 >>>> On Jun 27, 2018, at 5:59 PM, Warner Losh > wrote: >>>>=20 >>>>=20 >>>>=20 >>>> On Wed, Jun 27, 2018 at 5:52 PM, Gleb Smirnoff > wrote: >>>> On Wed, Jun 27, 2018 at 04:11:09AM +0000, Warner Losh wrote: >>>> W> Author: imp >>>> W> Date: Wed Jun 27 04:11:09 2018 >>>> W> New Revision: 335690 >>>> W> URL: https://svnweb.freebsd.org/changeset/base/335690 = >>>> W>=20 >>>> W> Log: >>>> W> Fix devctl generation for core files. >>>> W> =20 >>>> W> We have a problem with vn_fullpath_global when the file = exists. Work >>>> W> around it by printing the full path if the core file name = starts with /, >>>> W> or current working directory followed by the filename if not. >>>>=20 >>>> Is this going to work when a core is dumped not at current working = directory, >>>> but at absolute path? e.g. kern.corefile=3D/var/log/cores/%N.core >>>>=20 >>>> Yes. That works. >>>> =20 >>>> Looks like the vn_fullpath_global needs to be fixed rather than = problem >>>> workarounded. >>>>=20 >>>> It can't be fixed reliably. FreeBSD does not and cannot map a vnode = to a name. The only reason we're able to at all is due to the name = cache. And when we recreate a file, we invalidate the name cache. And = even if we fixed that, there's no guarantee the name cache won't get = flushed before we translate the name.... Linux can do this because it = keeps the path name associated with the inode. FreeBSD simply doesn't. >>>>=20 >>>=20 >>> They said it couldn't be done, but I personally have done it ... >>>=20 >>> I map vnodes to full paths in dwatch. It's not impossible, just = implausibly hard. >>>=20 >>> I derived my formula by reading the C code which was very = twisty-turny and rather hard to read at times (and I have been using C = since 1998 on everything from AIX and OSF/1 to HP/UX, Cygwin, MinGW, = FreeBSD, countless Linux-like things, IRIX, and a God-awful remainder of = many others; the vnode code was an adventure to say the least). >>>=20 >>> You're welcome to see how it's done in = /usr/libexec/dwatch/vop_create >>>=20 >>> I load up a clause-local variable named "path" with a path = constructed from a pointer to a vnode structure argument to = VOP_CREATE(9). >>>=20 >>> The D code could easily be rewritten back into C to walk the vnode = to completion (compared to the D code which is bounded by depth-limit = since DTrace doesn't provide loops; so you have to, as I have done, use = a higher-level language wrapper to repeat the code to some desired = limit; dwatch defaulting to 64 for directory depth limit). >>>=20 >>> Compared this, say, to vfssnoop.d from Chapter 5 of the DTrace book = which utilizes the vfs:namei:lookup: probes. My approach in dwatch is = far more accurate, produces full paths, and I've benchmarked it at lower = overhead with better results. >>>=20 >>> Maybe some of this can be useful? If not, just ignore me. >>>=20 >>> IMHO, there's no benefit from the crazy hoops than the super simple = heuristic I did: if the path starts with / print it. If the path doesn't = print cwd / and then the path. >>>=20 >>> I look forward to seeing your conversion of the D to C that works, = even when there's namespace cache pressure. >>>=20 >>=20 >> I was looking at the output of "dwatch -dX vop_create" (the -d flag = to dwatch causes the generated dwatch to be printed instead of executed) = and thinking to myself: >>=20 >> I could easily convert this, as I can recognize the flattened loop = structure. However, not many people will be able to do it. I wasn't = really volunteering, but now I'm curious what other people see when they = run "dwatch -dX vop_create". If it's half as bad as I think it is, then = I'll gladly do it -- but will have to balance it against work = priorities. >>=20 >> We don't have the data you do in vop_create anymore, just the vp at = the point in the code where we want to do the reverse translation... But = we also have a name that's been filled in from the template and cwd, = which gives us almost the same thing as we'd hoped to dig out of the = name cache... >>=20 >=20 > Given: >=20 > struct vnode *vp, const char *fi_name >=20 > Here's a rough (not optimized; unchecked) crack at a C equivalent: >=20 > Thanks, but this code suffers from the same problem that = vn_fullpath_global() suffers from: we clear out parts of the cache when = we open an existing file with O_CREAT. This one NULL leads to a cascade = of failures, as show below. >=20 >=20 >=20 > #include > #include >=20 > #include >=20 > int i, max_depth =3D 64; > char *d_name =3D NULL; > char *fi_fs =3D NULL; > char *fi_mount =3D NULL; > char *cp; > struct namecache *ncp =3D NULL; > struct mount *mount =3D NULL; > struct vnode *dvp =3D NULL; > char *nc[max_depth+1]; > char path[PATH_MAX] =3D __DECONST(char *, ""); >=20 > if (vp !=3D NULL) { > ncp =3D vp->v_cache_dst.tqh_first; >=20 > For my case, this is NULL since we clear out the cache when we = re-create the file. I confirmed this by adding a printf where I put in = my workaround and I get: > vp->v_cache_dst.tqh_first is 0xfffff80003bd7a10 (first time, no = cat.core exists) > vp->v_cache_dst.tqh_first is 0 (second time, cat.core does exist) >=20 > mount =3D vp->v_mount; /* ptr to vfs we are in */ > if (vp->v_cache_dd !=3D NULL) > d_name =3D vp->v_cache_dd->nc_name; > } > if (mount !=3D NULL) { > fi_fs =3D mount->mnt_stat.f_fstypename; > if (fi_fs !=3D NULL && *fi_fs !=3D '\0') { > if (!strcmp(fi_fs, "devfs")) ncp =3D NULL; > } else { > if (fi_name =3D=3D NULL || *fi_name =3D=3D '\0') = ncp =3D NULL; > } > fi_mount =3D mount->mnt_stat.f_mntonname; > } else { > ncp =3D NULL; > } > for (i =3D 0; i <=3D max_depth; i++) { > nc[i] =3D NULL; > } > if (ncp !=3D NULL) { > /* Reach into vnode of parent of name */ > if (ncp->nc_dvp !=3D NULL) > dvp =3D ncp->nc_dvp->v_cache_dst.tqh_first; > if (dvp !=3D NULL) nc[0] =3D dvp->nc_name; > } >=20 > since ncp is NULL here, we never set nc[0] >=20 > =20 > if (nc[0] =3D=3D NULL || *nc[0] =3D=3D '\0' || !strcmp(nc[0], = "/")) > dvp =3D NULL; >=20 > so dvp is NULL > =20 > /* > * BEGIN Pathname-depth loop > */ > for (i =3D 1; i < max_depth; i++) { > if (dvp =3D=3D NULL) break; >=20 > and we break out of this loop without setting any other nc[] entries. > =20 > if (dvp->nc_dvp !=3D NULL) > dvp =3D dvp->nc_dvp->v_cache_dst.tqh_first; > else > dvp =3D NULL; > if (dvp !=3D NULL) nc[i] =3D dvp->nc_name; > } > /* > * END Pathname-depth loop > */ > if (dvp !=3D NULL) { > if (dvp->nc_dvp !=3D NULL) > dvp =3D dvp->nc_dvp->v_cache_dst.tqh_first; > else > dvp =3D NULL; > if (dvp !=3D NULL && dvp->nc_dvp !=3D NULL) > nc[max_depth] =3D __DECONST(char *, "..."); > } > if (fi_mount !=3D NULL) { > /* > * Join full path > * NB: Up-to but not including the parent directory = (joined below) > */ > strcat(path, fi_mount); > if (strcmp(fi_mount, "/")) strcat(path, "/"); > for (i =3D max_depth; i >=3D 0; i--) { > char *namei =3D nc[i]; > strcat(path, namei =3D=3D NULL ? "" : namei); >=20 > which joins a lot of empty strings together > =20 > if (namei !=3D NULL) strcat(path, "/"); > } > /* Join the parent directory name */ > if (d_name !=3D NULL && *d_name !=3D '\0') { > strcat(path, d_name); > strcat(path, "/"); > } > /* Join the entry name */ > if (fi_name !=3D NULL && *fi_name !=3D '\0') > strcat(path, fi_name); >=20 > since we don't have the fi_name from the VOP_CREATE where we want to = do this, we have nothing for here. > =20 > } >=20 > So I appreciate the effort here, but I'm not sure that it's useful in = this context. vn_fullpath* already can do this sort of thing, but can't = in the case where vp->v_cache_dst.tqh_first is NULL. >=20 Thank you so much for the thoughtful response. It means a lot really. --=20 Cheers, Devin=