From owner-freebsd-arch@FreeBSD.ORG Thu Apr 19 09:31:16 2007
X-Original-To: arch@FreeBSD.org
Delivered-To: freebsd-arch@FreeBSD.ORG
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id A4D7E16A404;
	Thu, 19 Apr 2007 09:31:16 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [209.31.154.42])
	by mx1.freebsd.org (Postfix) with ESMTP id 51CFA13C46E;
	Thu, 19 Apr 2007 09:31:16 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from fledge.watson.org (fledge.watson.org [209.31.154.41])
	by cyrus.watson.org (Postfix) with ESMTP id C5AC447413;
	Thu, 19 Apr 2007 05:31:15 -0400 (EDT)
Date: Thu, 19 Apr 2007 10:31:15 +0100 (BST)
From: Robert Watson
X-X-Sender: robert@fledge.watson.org
To: Diomidis Spinellis
In-Reply-To: <46231C64.9010707@aueb.gr>
Message-ID: <20070419101815.Y2913@fledge.watson.org>
References: <461958CC.4040804@aueb.gr>
	<20070414170218.M76326@fledge.watson.org>
	<4621E826.6050306@aueb.gr>
	<20070415105157.J84174@fledge.watson.org>
	<46231C64.9010707@aueb.gr>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: arch@FreeBSD.org, re@FreeBSD.org
Subject: Re: Accounting changes
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussion related to FreeBSD architecture
X-List-Received-Date: Thu, 19 Apr 2007 09:31:16 -0000

On Mon, 16 Apr 2007, Diomidis Spinellis wrote:

> Robert Watson wrote:
>
>>>> What do you think of the idea of changing the file format a little to
>>>> include a short file header at the front, and that the first field of
>>>> that head is zero-filled u_int32_t, and the second a version number?
>>>> Right now, the first field of the acct structure is the name of the
>>>> command, which will always be a non-nul string, so always have a first
>>>> character non-nul.
>>>> If we see non-nul data in the file header's first
>>>> field, we use the old structure layout, and otherwise we check the
>>>> version number and use the new layout? This would provide backwards
>>>> compatibility for reading old accounting data, which I would think would
>>>> generally be desirable, and allow us to explicitly version the file in
>>>> the future.
>>
>> The sites I know of that use accounting don't care about CPU use in the
>> sa(8) sense at all. They care about tracking commands run. While acct(5)
>> doesn't do this extraordinarily well, it does it well enough to allow basic
>> command execution logging and analysis. Hence the desire to be able to
>> continue reading preserved acct(5) data files in the future.
>
> I see three options for satisfying this requirement.
>
> One is to move the existing acct.h into usr.bin/lastcomm, and add to
> lastcomm(1) an option to read legacy files. I don't like this approach,
> because it doesn't include sa(8) in the picture, and, more importantly, it
> doesn't scale well for future changes. Every time we change the type of a
> field of acct.h (for example widening ac_gid) we will have to add
> architecture-specific code in the legacy file reading module.

If we're willing to assume architectures can only read their own accounting
files (the status quo), the above argument doesn't really make sense. You end
up with a series of versions of "struct acct", and that code is
architecture-neutral.

Thinking about it more, I'm not sure a per-file header is even required or
desired (as I had previously suggested); a per-record versioning scheme
would do, and it would allow a reboot onto a new kernel to continue to write
to the existing accounting data. Read the first 16 bytes: if the first byte
is non-0 then it's the original "struct acct" layout, and otherwise the
second byte is the version number to use.
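
To make the per-record check concrete, here is a rough sketch of the
dispatch on the first 16 bytes. All of the names (ACCT_HDR_LEN,
acct_probe_record, and so on) are made up for illustration and are not the
actual acct.h layout; it only shows the "first byte non-0 means legacy,
otherwise second byte is the version" rule.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not the real acct.h layout. */
#define ACCT_HDR_LEN 16	/* bytes examined at the start of each record */

enum acct_layout {
	ACCT_LAYOUT_LEGACY,	/* original "struct acct": begins with command name */
	ACCT_LAYOUT_VERSIONED	/* new-style record: zero byte, then version byte */
};

struct acct_probe {
	enum acct_layout layout;
	uint8_t version;	/* meaningful only for ACCT_LAYOUT_VERSIONED */
};

/*
 * Classify one record from its first ACCT_HDR_LEN bytes.  A legacy record
 * starts with a non-empty command name, so its first byte is never 0; a
 * versioned record starts with a 0 byte followed by the version number.
 */
static struct acct_probe
acct_probe_record(const unsigned char hdr[ACCT_HDR_LEN])
{
	struct acct_probe p = { ACCT_LAYOUT_LEGACY, 0 };

	assert(hdr != NULL);
	if (hdr[0] != 0)
		return (p);		/* old layout, no version byte */
	p.layout = ACCT_LAYOUT_VERSIONED;
	p.version = hdr[1];
	return (p);
}
```

A reader would call this on each record before deciding which structure
layout to decode, so old and new records can sit in the same file.
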
Or, in the interests of forward compatibility, include a length parameter in
another 16 bytes so you can skip over records if necessary. That would allow
the kernel to move back and forward across file versions if there's a problem
after the upgrade.

> A variation of the above approach would be to create a library for reading
> legacy accounting data formats. I think this is an overkill, given that the
> two users are sa(8) and lastcomm(1), and of the two lastcomm appears to be
> really needed.

Sounds like overkill. All you really need is a common routine to return the
next record in the current native version given a file descriptor for the
open file, and that one routine can handle the versioning concerns easily. No
need to have a library, just compile a common .c file from the lastcomm
directory into the sa directory (or vice versa). Notice that sa's decoding
routine already does conversion from the file type to C types for
computation.

> The approach I favor is to add to lastcomm an option to dump an accounting
> file in text format, and a second option to read text accounting data from
> stdin and write it out in the current accounting file format. Users can
> then either store accounting data in (compressed) text files, or pipe them
> through a pipeline that will transform the legacy format into the current
> one. (In the latter case they will need to keep through an upgrade a
> lastcomm(1) binary compiled to read the legacy format - I can provide the
> appropriate cvs incantation in UPDATING). This approach also simplifies the
> writing of test cases.

You're putting the burden on the people with data they need to preserve to
deal with checking out specific revisions of accounting source code from CVS,
getting it building on whatever the current rev is (perhaps requiring a
buildworld to get build tools and libraries), etc.? Your basic assumption in
all of this is that no one uses or preserves accounting data, and I think that
is a false assumption.
On all of my server boxes, I keep at least five days of back accounting data,
and I know of sites that keep back accounting data for months or years. I
don't think you should be assuming no one cares about this data and breaking
compatibility. Since there's a structured file format, it's easy to provide
compatibility (and we can make it easier in the future by adding versioning
information this time).

I certainly don't object to the text export, but I don't think it really
addresses the problem of backward compatibility at all.

Robert N M Watson
Computer Laboratory
University of Cambridge