Date: Fri, 25 Aug 2006 21:27:03 +0300
From: "Matti J. Karki"
Sender: mjkarki@gmail.com
To: Kyrre Nygård
Cc: "W. D.", questions@freebsd.org
Subject: Re: Code beautifiers, anyone?

On 8/25/06, Kyrre Nygård wrote:
>
> In your script, do these comments look alright then?
>
> (I simplified them a bit)
>
> inbuffer = re.sub('\) *?\n\{', ') {', inbuffer) # Move curly brackets to the end of lines
> inbuffer = re.sub('\) *?{', ') {', inbuffer) # Remove spaces between closing brackets and opening curly brackets
> inbuffer = re.sub('else *?\n{', 'else {\n', inbuffer) # Fix curly brackets in `else' clauses
> inbuffer = re.sub('{ *?(.+?\n)', '{\n\g<1>', inbuffer) # Break up the content of curly brackets
> inbuffer = re.sub('(\n.+?)}', '\g<1>\n}', inbuffer) # Take care of closing brackets from the above rule

Looks OK, except...

> inbuffer = re.sub('\n +', '\n', inbuffer) # Strip trailing whitespace

This will strip spaces at the _beginning_ of each line (leading spaces).

> inbuffer = re.sub('\t+', '', inbuffer) # Strip trailing tabs

And this will strip _all_ tabs. This is important to remember. More on
that later...
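
To make the difference concrete, here is a small sketch comparing the
two quoted rules with a pattern that strips only trailing whitespace.
The sample string, the variable names and the '[ \t]+\n' pattern are
assumptions made up for illustration; they are not taken from the
script.

```python
import re

# Hypothetical sample input: leading spaces, trailing spaces, and tabs.
sample = "if (x) {\n    y = 1;   \n\tz = 2;\t\n}\n"

# The two quoted rules. Despite their comments, neither targets
# trailing whitespace.
leading_stripped = re.sub('\n +', '\n', sample)  # spaces right AFTER a newline
tabs_stripped = re.sub('\t+', '', sample)        # every tab, wherever it is

# One possible way to strip only trailing spaces and tabs
# (an assumption, not a rule from the script).
trailing_stripped = re.sub('[ \t]+\n', '\n', sample)

print(repr(leading_stripped))
print(repr(tabs_stripped))
print(repr(trailing_stripped))
```

In the first result the indentation before "y" is gone but the spaces
after "y = 1;" are still there; in the second, both the leading and the
trailing tab around "z = 2;" have disappeared.
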
>
> And also, I noticed you put <'\n +', '\n', inbuffer> twice,
> is one enough like in the above example?
>

Yes and no. I usually do this operation in two steps. First, before
anything else, I strip all leading spaces from every line of code. Then
I apply all the other rules. And after everything else is in place, I
strip all leading spaces _again_, because some of my rules produce
spaces that would mess up the indentation process later.

>
> After that, I can't wait to run it over the FreeBSD codebase and watch
> the added value it gets. Then I can start selling the script to governments.
> Just kidding :) But it would be nice to reverse engineer all those commercial
> code parsers that hunt for bugs and create my own that I'll eventually
> hook up with some artificial intelligence.
>

Well, it's good to have ambitious goals...

A word of warning. Above, I mentioned that it's important to remember
that my example removes _all_ tab characters from the text. This means
that, for example, any indented lines inside code comments will be
messed up (remember, a tab character usually equals 8 spaces, and I'm
removing tabs altogether).

This leads me to my point: my code does not handle multi-line comments
at all. They may end up looking messed up. A single curly bracket inside
a comment will throw the indentation code out of sync. The code also
does not handle line continuations ("\"), so macro definitions, for
example, will not be indented correctly.

So it will take some additional work before the script can be run
without side effects. The script was meant to be run on source code
with poor style and with no (or very few) comments.

-Matti
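
The curly-bracket-in-a-comment caveat can be illustrated with a minimal
sketch. The naive_indent helper below is hypothetical, written only for
this illustration, and is not the script discussed in this thread; it
assumes the indentation step simply counts curly brackets on each line.

```python
def naive_indent(source, width=4):
    """Hypothetical brace-counting indenter; NOT the script from the thread."""
    depth = 0
    out = []
    for line in source.split('\n'):
        line = line.strip()
        closes_first = line.startswith('}')
        # A line that starts with '}' closes a block before it is printed.
        if closes_first:
            depth = max(depth - 1, 0)
        out.append(' ' * (width * depth) + line if line else '')
        # Count every remaining bracket on the line, including brackets
        # that sit inside comments. This is the weakness being shown.
        depth += line.count('{')
        depth -= line.count('}') - (1 if closes_first else 0)
        depth = max(depth, 0)
    return '\n'.join(out)

clean = "int f() {\nreturn 1;\n}\nint g;"
commented = "int f() {\n/* a stray { here */\nreturn 1;\n}\nint g;"

print(naive_indent(clean))
print(naive_indent(commented))
```

With the clean input, "int g;" ends up back at column zero. With the
commented input, the stray bracket is counted as an opening block, so
everything after it, including "int g;", stays one level too deep. A
real fix would need to skip comments (and string literals) before
counting brackets.
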