From owner-svn-src-all@FreeBSD.ORG Thu Feb 16 08:39:59 2012 Return-Path: Delivered-To: svn-src-all@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AD949106564A; Thu, 16 Feb 2012 08:39:59 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail07.syd.optusnet.com.au (mail07.syd.optusnet.com.au [211.29.132.188]) by mx1.freebsd.org (Postfix) with ESMTP id 2F4458FC0C; Thu, 16 Feb 2012 08:39:58 +0000 (UTC) Received: from c211-30-171-136.carlnfd1.nsw.optusnet.com.au (c211-30-171-136.carlnfd1.nsw.optusnet.com.au [211.30.171.136]) by mail07.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q1G8dtvU024795 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 16 Feb 2012 19:39:56 +1100 Date: Thu, 16 Feb 2012 19:39:55 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Eitan Adler In-Reply-To: <201202160511.q1G5BZNk099785@svn.freebsd.org> Message-ID: <20120216181210.K1423@besplex.bde.org> References: <201202160511.q1G5BZNk099785@svn.freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org Subject: Re: svn commit: r231814 - in head/sys: kern sys X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 16 Feb 2012 08:39:59 -0000 On Thu, 16 Feb 2012, Eitan Adler wrote: > Log: > Add a timestamp to the msgbuf output in order to determine when when > messages were printed. > > This can be enabled with the kern.msgbuf_show_timestamp sysctl Apart from being fundamentally broken, this adds lots of bloat and style bugs. The msgbuf is a very low-level interface, and was careful not to do stuff like this. I'm still waiting for the previous round of breakage of it, that replaces careful atomic ops by spinlocks, to be backed out. Code at this level cannot use any normal locking, and used to be carefully written to not do so. The spinlocks break it, for example, if there is a trap while holding the lock and the trap handler wants to use the message buffer. Interrupts are possible too, but they are disabled on the same CPU for technical reasons, so they cannot cause deadlock here. The main possiblities for traps are NMIs. The NMI handler shouldn't call printf, but perhaps it does. It might cause a panic. Then panic can only print by blowing open the locks. It might be a STOP IPI. The stop function uses printf to diagnose other blockages, at least with certain options. Anyway printf must never block endlessly, so msgbuf functions must not never block endlessly either, so neither can use any normal locking, since normal locking can easily block endlessly in deadlock conditions, by the definition of deadlock. > Modified: head/sys/kern/subr_msgbuf.c > ============================================================================== > --- head/sys/kern/subr_msgbuf.c Thu Feb 16 05:06:07 2012 (r231813) > +++ head/sys/kern/subr_msgbuf.c Thu Feb 16 05:11:35 2012 (r231814) > @@ -47,6 +49,14 @@ > static u_int msgbuf_cksum(struct msgbuf *mbp); > > /* > + * > + */ First style bug: 3 lines for a content-free comment. > +static int msgbuf_show_timestamp = 0; > +SYSCTL_INT(_kern, OID_AUTO, msgbuf_show_timestamp, CTLFLAG_RW | CTLFLAG_TUN, > + &msgbuf_show_timestamp, 0, "Show timestamp in msgbuf"); > +TUNABLE_INT("kern.msgbuf_show_timestamp", &msgbuf_show_timestamp); Not too bad. > @@ -149,6 +159,33 @@ msgbuf_do_addchar(struct msgbuf *mbp, u_ > *seq = MSGBUF_SEQNORM(mbp, *seq + 1); > } > > +static inline void Bogus inlining. This function is about 10 times too large to be worth inlining, and gcc-4 will already do excessive inlining for it if it is only used once. > +msgbuf_do_addchar(struct msgbuf * const mbp, u_int * const seq, const int c) > +{ In KNF, local variables for a function go here... > + > + if (msgbuf_show_timestamp && > + (mbp->msg_flags & MSGBUF_NEXT_NEW_LINE) != 0) { > + char buf[32]; > + char const *bufp; > + struct timespec ts; > + int err; ...not here. The struct would be sorted first in KNF. > + > + getnanouptime(&ts); By calling this, you open the msgbuf to any deadlock possibilities that are in getnanouptime(). There are none currently, but this is not guaranteed. There are some for nanoutuptime(), since some timecounters use normal locking. `witness' is careful not to use any timecounter functions, though timestamps for it would be considerably more useful than here. I thought that it used timestamps, but can find any in it now. And shouldn't this be a human-readable real time instead of the monotonic time? `ticks' is just as good for a human-unreadable monotonic time (except it overflows after 24+ days and wraps at 48+ days with HZ = 1000). > + err = snprintf(buf, sizeof (buf), "[%jd.%ld] ", > + (intmax_t)ts.tv_sec, ts.tv_nsec / 1000); This can convert `ticks' to much the same human-unreadable time that getnanouptime() returns. Using getnanouptime() is bogus because the time in microseconds is printed. getmicrouptime() would return that directly and not need to be converted. Who cares about the microseconds part? If anyone cares, then it should be printed in human-readable %06ld format. One bug in %ld format is that it gives ".1" for 1 microseconds, but ".1" looks like 1/10. With %06ld format, it is fairly clear that the precision is in microseconds, and interpreting the time as in seconds with a fractional part gives the correct time. Other representation problems: - my idea of using the real time is no good. Times in seconds since the Epoch are more than a billion, so they are not human-readable. - times since boot time are fairly human-unreadable too, after running for a few hours makes them large. - the microseconds precision is bogus. getnanouptime() only provides a resolution of 1/HZ, with garbage in the low bits. If anyone cares about the fractional part, then when HZ = 100 it should be printed using %02ld format like time(1) bogusly does (the resolution hasn't been 1/100 for time(1) for almost 20 years), and when HZ = 1000 it should be printed using %03ld format. > + > + for (bufp = buf; *bufp != '\0'; bufp++) > + __msgbuf_do_addchar(mbp, seq, *bufp); > + > + mbp->msg_flags &= ~MSGBUF_NEXT_NEW_LINE; > + } > + > + __msgbuf_do_addchar(mbp, seq, c); > + In KNF, statements are separated by a single newline, not 2. The change to spinlocks also added a lot of style bugs of this. > ... No further style problems in the .c file. > Modified: head/sys/sys/msgbuf.h > ============================================================================== > --- head/sys/sys/msgbuf.h Thu Feb 16 05:06:07 2012 (r231813) > +++ head/sys/sys/msgbuf.h Thu Feb 16 05:11:35 2012 (r231814) > @@ -46,7 +46,9 @@ struct msgbuf { > u_int msg_cksum; /* checksum of contents */ > u_int msg_seqmod; /* range for sequence numbers */ > int msg_lastpri; /* saved priority value */ > - int msg_needsnl; /* set when newline needed */ > + u_int msg_flags; In KNF, the primary indentation is 1 tab, but this was broken for msgbuf in the same commit that broke its locking; this commit follows KNF, but this is inconsistent. The previous breakage was not in the usual way which expands 1 tab to 2 -- here it expands 1 tab to 1 tab plus 3 spaces, giving a weird indentatation of 11 columns. Weird indentations like this are difficult to be consistent with. > +#define MSGBUF_NEEDNL 0x01 /* set when newline needed */ > +#define MSGBUF_NEXT_NEW_LINE 0x02 In KNF, the primary indentation is 1 tab after #define. This is still followed for all other #define's in this file. The style of these #define's is very inconsistent. Only one has a comment. One abbreviates newline as NL, and the other verbosifies it as NEW_LINE. > struct mtx msg_lock; /* mutex to protect the buffer */ > }; So the fundamental error in this commit is just the logic error of calling the possibly-unsafe function getnanouptime(), and my complaint is mainly about the previous breakage. I first thought that this change was about print_uptime() in shutdown. This uses a lot of code to unprettily print the uptime in a primitive human-readable format (days, hours, minutes and seconds with dhms tags but no separators like ':'). It doesn't bother printing fractional seconds. It doesn't bogusly cast tv_sec to intmax_t. It bogusly casts to long instead. intmax_t is excessive whenever tv_sec is a delta-time. Here casting to plain int is enough for uptimes of up to 68 years, since 16-bit ints are not supported. print_uptime() is even less needed with uptimes in the msgbuf. It was almost useless precisely because it wasn't in the msgbuf. You had to watch the console to see it, but if you do that then you probably remember when it was booted, or just rebooted manually and should have checked the boot time first. Bruce