Date: Thu, 8 Dec 2016 15:34:38 +1100 (EST) From: Bruce Evans <brde@optusnet.com.au> To: Eric van Gyzen <vangyzen@freebsd.org> Cc: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r309676 - in head: bin/ps lib/libkvm sys/compat/freebsd32 sys/kern sys/sys usr.bin/procstat usr.bin/top Message-ID: <20161208130730.O1089@besplex.bde.org> In-Reply-To: <201612071504.uB7F4MCi020382@repo.freebsd.org> References: <201612071504.uB7F4MCi020382@repo.freebsd.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, 7 Dec 2016, Eric van Gyzen wrote: > Log: > Export the whole thread name in kinfo_proc > > kinfo_proc::ki_tdname is three characters shorter than > thread::td_name. Add a ki_moretdname field for these three > extra characters. Add the new field to kinfo_proc32, as well. > Update all in-tree consumers to read the new field and assemble > the full name, except for lldb's HostThreadFreeBSD.cpp, which > I will handle separately. Bump __FreeBSD_version. tdname stuff was very badly implemented. It abuses the ki_ocomm field for ki_tdname. ki_ocomm was reserved for the old command name to support old applications. It should have been populated with an intelligently abbreviated copy of ki_comm, so that old applications which don't know about the ki_comm renaming see a useful name. It was actually populated by blindly truncating p_comm. This was good enough in most cases. Then when thread support was added, ki_ocomm was broken copying td_name to it. For the non-threaded case, this makes no difference, but for the threaded case this breaks ki_ocomm for old applications and it doesn't even work right for the threaded case since ki_ocomm is too short. This is the problem that you are partly fixing now. ki_ocomm should have remained as the truncated ki_comm. ki_comm should have been modified to contain an intelligently abbreviated copy of the command and thread names, so that not so old applications see any extra info in the thread name. A new field was needed for the thread name, and perhaps for the previous (non-mixed) command name, so that applications can see the raw names p_comm and td_name if the understand threads. Renaming ki_comm was rather gratuitous. It was only expanded from 16+1 to 19+1 bytes. This was done mainly because 16+1 gives either misalgnment or wastes space. The latter occurred for p_comm[] in struct proc. So we expanded to 19+1 but no more to get a slightly longer name at no cost in struct proc. But there was a large cost for using this. ac_comm[] in struct acct still has size 16+0 (it is not NUL-terminated). And the expansion didn't just work in user.h since the ki_comm[16+1] was packed there. The expansion was done cleanly by keeping renaming the old field but renaming it to ki_ocomm[16+1], and adding a new field ki_comm[19+1]. Then abusing ki_ocomm made a mess. ki_ocomm and OCOMMLEN are under a bogus BURN_BRIDGES ifdef in user.h. This doesn't even break the ABI. It only breaks the API, but that is less important and can be done in the head branch without any ifdefs (just after you update all uses of the API in src). The ABI was broken by abusing ki_ocomm. There is also much ugliness and loss of info by combining ki_comm and ki_tdname unintelligently and and then truncating. The combined length of 19+16 (now +3 more) is too wide for most displays. The unintelligences also adds excessive markup like '[]{}:' characters, giving lengths of 40+. Then blind truncation tends to lost the most useful info at the end. Your 'more' field instead of renaming again not as clean or wasteful of space as renaming again, but it is less invasive. > Modified: head/bin/ps/print.c > ============================================================================== > --- head/bin/ps/print.c Wed Dec 7 14:35:05 2016 (r309675) > +++ head/bin/ps/print.c Wed Dec 7 15:04:22 2016 (r309676) > @@ -120,11 +120,12 @@ command(KINFO *k, VARENT *ve) > if (cflag) { > /* If it is the last field, then don't pad */ > if (STAILQ_NEXT(ve, next_ve) == NULL) { > - asprintf(&str, "%s%s%s%s", > + asprintf(&str, "%s%s%s%s%s", > k->ki_d.prefix ? k->ki_d.prefix : "", > k->ki_p->ki_comm, > (showthreads && k->ki_p->ki_numthreads > 1) ? "/" : "", > - (showthreads && k->ki_p->ki_numthreads > 1) ? k->ki_p->ki_tdname : ""); > + (showthreads && k->ki_p->ki_numthreads > 1) ? k->ki_p->ki_tdname : "", > + (showthreads && k->ki_p->ki_numthreads > 1) ? k->ki_p->ki_moretdname : ""); This expands the style bugs by copying a too-long line and expanding it further. With intelligent combination, you couldn't do it in a big printf() arg list. FreeBSD cluster machines seem to have stopped showing system processes (even idle) in ps and top, so I can't easily see many bad messes from unintelligent combination or debug things related to system load. Locally, I get for top -SH: PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND 10 root 155 ki31 0K 64K CPU3 3 1:59 100.27% idle{idle: cpu3} 10 root 155 ki31 0K 64K CPU5 5 1:59 100.27% idle{idle: cpu5} 10 root 155 ki31 0K 64K CPU6 6 1:59 100.27% idle{idle: cpu6} 10 root 155 ki31 0K 64K CPU7 7 1:59 100.27% idle{idle: cpu7} 10 root 155 ki31 0K 64K CPU2 2 1:59 100.27% idle{idle: cpu2} 10 root 155 ki31 0K 64K CPU1 1 1:58 100.27% idle{idle: cpu1} 10 root 155 ki31 0K 64K RUN 0 1:58 100.27% idle{idle: cpu0} 10 root 155 ki31 0K 64K CPU4 4 1:58 100.27% idle{idle: cpu4} 0 root -16 - 0K 160K swapin 5 0:29 0.00% kernel{swapper} 13 root -68 - 0K 120K - 0 0:00 0.00% usb{usbus2} 13 root -68 - 0K 120K - 5 0:00 0.00% usb{usbus1} 0 root 8 - 0K 160K - 0 0:00 0.00% kernel{thread ta 1 root 8 0 732K 372K wait 4 0:00 0.00% init 11 root -60 - 0K 168K WAIT 0 0:00 0.00% intr{swi4: clock 11 root -88 - 0K 168K WAIT 4 0:00 0.00% intr{irq265: ahc 13 root -68 - 0K 120K - 0 0:00 0.00% usb{usbus0} 12 root -8 - 0K 24K - 0 0:00 0.00% geom{g_event} 678 bde 8 0 6344K 6168K wait 6 0:00 0.00% bash This suffers some truncation, but freefall is much worse: some undisplayed username is wider, so 5 columns are wasted for USERNAME and only 11 are left. for COMMAND. The above shows the common lossage that the first part of ki_tdname is the same as the whole ki_comm. ki_tdname needs to be longer to hold this, but is shorter. Unintelligent combination duplicates ki_comm and then truncates info. The other names are less consistently bad. "usb" is duplicated, and "intr" is duplicated as "irq". Other names mostly are just too long. ki_tdname is sometimes a description, not a name. ki_comm doesn't claim to be a name, but more usually is. It takes -a to get args for top. top -a and ps l suffer from to much truncation of COMMAND to actually show many args. > } else > str = strdup(k->ki_p->ki_comm); > > @@ -172,14 +173,16 @@ ucomm(KINFO *k, VARENT *ve) > char *str; > > if (STAILQ_NEXT(ve, next_ve) == NULL) { /* last field, don't pad */ > - asprintf(&str, "%s%s%s%s", > + asprintf(&str, "%s%s%s%s%s", > k->ki_d.prefix ? k->ki_d.prefix : "", > k->ki_p->ki_comm, > (showthreads && k->ki_p->ki_numthreads > 1) ? "/" : "", > - (showthreads && k->ki_p->ki_numthreads > 1) ? k->ki_p->ki_tdname : ""); > + (showthreads && k->ki_p->ki_numthreads > 1) ? k->ki_p->ki_tdname : "", > + (showthreads && k->ki_p->ki_numthreads > 1) ? k->ki_p->ki_moretdname : ""); As above. > } else { > if (showthreads && k->ki_p->ki_numthreads > 1) > - asprintf(&str, "%s/%s", k->ki_p->ki_comm, k->ki_p->ki_tdname); > + asprintf(&str, "%s/%s%s", k->ki_p->ki_comm, > + k->ki_p->ki_tdname, k->ki_p->ki_moretdname); > else > str = strdup(k->ki_p->ki_comm); > } Here you fixed the formatting style bug, but it might be considered a style bug to use strdup() instead of asprintf() for the simple case. style(9) says to always use fprintf() and not fputs()...or whatever, and we can do the same things for asprintf() to get uniform logic for the allocation. The printf() part is just not so uniform when its arg list depends on (showthreads && k->ki_p->ki_numthreads > 1). > @@ -192,7 +195,8 @@ tdnam(KINFO *k, VARENT *ve __unused) > char *str; > > if (showthreads && k->ki_p->ki_numthreads > 1) > - str = strdup(k->ki_p->ki_tdname); > + asprintf(&str, "%s%s", k->ki_p->ki_tdname, > + k->ki_p->ki_moretdname); > else > str = strdup(" "); Here always using asprintf() is clearly better because it avoids having to count the spaces in the literal string. > Modified: head/lib/libkvm/kvm_proc.c > ============================================================================== > --- head/lib/libkvm/kvm_proc.c Wed Dec 7 14:35:05 2016 (r309675) > +++ head/lib/libkvm/kvm_proc.c Wed Dec 7 15:04:22 2016 (r309676) > @@ -426,8 +426,6 @@ nopgrp: > kp->ki_pri.pri_native = mtd.td_base_pri; > kp->ki_lastcpu = mtd.td_lastcpu; > kp->ki_wchan = mtd.td_wchan; > - if (mtd.td_name[0] != 0) > - strlcpy(kp->ki_tdname, mtd.td_name, MAXCOMLEN); This seems to have been dead code. > kp->ki_oncpu = mtd.td_oncpu; > if (mtd.td_name[0] != '\0') > strlcpy(kp->ki_tdname, mtd.td_name, sizeof(kp->ki_tdname)); > In my version, most string lengths are not #defined and/or not exported in user.h, so as to force use of sizeof() like here. > Modified: head/sys/compat/freebsd32/freebsd32.h > ============================================================================== > --- head/sys/compat/freebsd32/freebsd32.h Wed Dec 7 14:35:05 2016 (r309675) > +++ head/sys/compat/freebsd32/freebsd32.h Wed Dec 7 15:04:22 2016 (r309676) > @@ -315,7 +315,8 @@ struct kinfo_proc32 { > char ki_comm[COMMLEN+1]; > char ki_emul[KI_EMULNAMELEN+1]; > char ki_loginclass[LOGINCLASSLEN+1]; > - char ki_sparestrings[50]; > + char ki_moretdname[MAXCOMLEN-TDNAMLEN+1]; > + char ki_sparestrings[46]; It's inconsistent to use #define'd values to calculate 3+1 for the size of the new field, then hard code this 3+1 in the adjustment of 50 to 46. I prefer to hard-code all the array sizes and then use sizeof() to recover them. I only do this in usee.h, not here. > int ki_spareints[KI_NSPARE_INT]; The #define's for the spare counts are more actively harmful. You needed to not have one for the char spares to make the above magic adjustment of 4 as easy as possible. > int ki_oncpu; > int ki_lastcpu; > > Modified: head/sys/kern/kern_proc.c > ============================================================================== > --- head/sys/kern/kern_proc.c Wed Dec 7 14:35:05 2016 (r309675) > +++ head/sys/kern/kern_proc.c Wed Dec 7 15:04:22 2016 (r309676) > @@ -1020,7 +1020,14 @@ fill_kinfo_thread(struct thread *td, str > strlcpy(kp->ki_wmesg, td->td_wmesg, sizeof(kp->ki_wmesg)); > else > bzero(kp->ki_wmesg, sizeof(kp->ki_wmesg)); > - strlcpy(kp->ki_tdname, td->td_name, sizeof(kp->ki_tdname)); > + if (strlcpy(kp->ki_tdname, td->td_name, sizeof(kp->ki_tdname)) >= > + sizeof(kp->ki_tdname)) { > + strlcpy(kp->ki_moretdname, > + td->td_name + sizeof(kp->ki_tdname) - 1, > + sizeof(kp->ki_moretdname)); > + } else { > + bzero(kp->ki_moretdname, sizeof(kp->ki_moretdname)); > + } I think this bzero() and the one above for wmesg and a couple of others are bogus. There should have been a bzero() of the whole struct to fill any padding, and that fills all zero fields too. > Modified: head/sys/sys/user.h > ============================================================================== > --- head/sys/sys/user.h Wed Dec 7 14:35:05 2016 (r309675) > +++ head/sys/sys/user.h Wed Dec 7 15:04:22 2016 (r309676) > @@ -180,12 +180,13 @@ struct kinfo_proc { > char ki_comm[COMMLEN+1]; /* command name */ > char ki_emul[KI_EMULNAMELEN+1]; /* emulation name */ > char ki_loginclass[LOGINCLASSLEN+1]; /* login class */ > + char ki_moretdname[MAXCOMLEN-TDNAMLEN+1]; /* more thread name */ > /* > * When adding new variables, take space for char-strings from the > * front of ki_sparestrings, and ints from the end of ki_spareints. > * That way the spare room from both arrays will remain contiguous. > */ > - char ki_sparestrings[50]; /* spare string space */ > + char ki_sparestrings[46]; /* spare string space */ > int ki_spareints[KI_NSPARE_INT]; /* spare room for growth */ > int ki_oncpu; /* Which cpu we are on */ > int ki_lastcpu; /* Last cpu we were on */ > The #define's for the sizes are actually sort of needed. We are (ab)using the MI values in the compat headers. The values should be, but aren't required to be the same, especially for the spares. They are just what they have to be to preserve the ABI, and the ABIs have the same number of spares for historical reasons. Only strings are sure to have the same ABI for individual fields. > Modified: head/usr.bin/procstat/procstat.c > ============================================================================== > --- head/usr.bin/procstat/procstat.c Wed Dec 7 14:35:05 2016 (r309675) > +++ head/usr.bin/procstat/procstat.c Wed Dec 7 15:04:22 2016 (r309676) > @@ -35,6 +35,7 @@ > #include <libprocstat.h> > #include <stdio.h> > #include <stdlib.h> > +#include <string.h> > #include <sysexits.h> > #include <unistd.h> > > @@ -126,6 +127,21 @@ kinfo_proc_sort(struct kinfo_proc *kipp, > qsort(kipp, count, sizeof(*kipp), kinfo_proc_compare); > } > > +const char * > +kinfo_proc_thread_name(const struct kinfo_proc *kipp) > +{ > + static char name[MAXCOMLEN+1]; Don't use the style bug of no spaces around binary operators when nearby code sdoesn't already have it. > + > + strlcpy(name, kipp->ki_tdname, sizeof(name)); > + strlcat(name, kipp->ki_moretdname, sizeof(name)); Should probably use 1 snprintf(). > ... > Modified: head/usr.bin/procstat/procstat_threads.c > ============================================================================== > --- head/usr.bin/procstat/procstat_threads.c Wed Dec 7 14:35:05 2016 (r309675) > +++ head/usr.bin/procstat/procstat_threads.c Wed Dec 7 15:04:22 2016 (r309676) > @@ -71,11 +71,10 @@ procstat_threads(struct procstat *procst > xo_open_container(threadid); > xo_emit("{dk:process_id/%5d/%d} ", kipp->ki_pid); > xo_emit("{:thread_id/%6d/%d} ", kipp->ki_tid); > - xo_emit("{d:command/%-16s/%s} ", strlen(kipp->ki_comm) ? > + xo_emit("{d:command/%-19s/%s} ", strlen(kipp->ki_comm) ? > kipp->ki_comm : "-"); > - xo_emit("{:thread_name/%-16s/%s} ", (strlen(kipp->ki_tdname) && > - (strcmp(kipp->ki_comm, kipp->ki_tdname) != 0)) ? > - kipp->ki_tdname : "-"); > + xo_emit("{:thread_name/%-19s/%s} ", > + kinfo_proc_thread_name(kipp)); > if (kipp->ki_oncpu != 255) > xo_emit("{:cpu/%3d/%d} ", kipp->ki_oncpu); > else if (kipp->ki_lastcpu != 255) > It's actually more invasive to make applications concatentate the names, but if you added a new field with the correct length you would have to change lots of hard-coded 16's for the old wrong size. Hard-coding these in printf formats is easiest and almost OK since the ABI prevents the size changing often. > Modified: head/usr.bin/top/machine.c > ============================================================================== > --- head/usr.bin/top/machine.c Wed Dec 7 14:35:05 2016 (r309675) > +++ head/usr.bin/top/machine.c Wed Dec 7 15:04:22 2016 (r309676) > @@ -991,8 +991,8 @@ format_next_process(caddr_t handle, char > if (!(flags & FMT_SHOWARGS)) { > if (ps.thread && pp->ki_flag & P_HADTHREADS && > pp->ki_tdname[0]) { > - snprintf(cmdbuf, cmdlen, "%s{%s}", pp->ki_comm, > - pp->ki_tdname); > + snprintf(cmdbuf, cmdlen, "%s{%s%s}", pp->ki_comm, > + pp->ki_tdname, pp->ki_moretdname); > } else { > snprintf(cmdbuf, cmdlen, "%s", pp->ki_comm); > } procstat has a little more intelligence to produce '-' when the thread name just duplicates the command name, but according to my top output it rarely does. I think it mainly produces '-' by finding a null thread name for the unthreaded case. procstat uses strcmp(). Comparing prefixes could do more. The above uses the null thread name test and other things to produce a different format instead of a thread name of '-'. It is not so easy to use a general function to prepare the names for special formatting. top doesn't have the hard-coded 16's and 19's in format strings. It would have just worked (except for bad combination) with a clean rename. Bruce
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20161208130730.O1089>