From: Dag-Erling Smørgrav
To: Garrett Cooper
Cc: Andrew Brampton, freebsd-hackers@freebsd.org
Subject: Re: sysctl with regex?
Date: Wed, 10 Feb 2010 12:24:57 +0100
Message-ID: <86fx59jpti.fsf@ds4.des.no>
In-Reply-To: <26049703-8844-4476-B277-776A4EFC0A53@gmail.com>
References: <86tytqvwky.fsf@ds4.des.no> <26049703-8844-4476-B277-776A4EFC0A53@gmail.com>

Garrett Cooper writes:
> C-shell globs as some programming languages referring to it as,
> i.e. perl (which this is a subset of the globs concept) allow for
> expansion via `*' to be `anything'. Regexp style globs for what you're
> looking for would be either .* (greedy) or .+ (non-greedy), with it
> being most likely the latter case.

Uh, not quite.

Formally, a regular expression is a textual representation of a finite
state machine that describes a regular language. A glob pattern can be
trivially translated to a regular expression, but not the other way
around. Basically, * in a glob pattern corresponds to [^/]*, ?
corresponds to ., and [abcd] and [^abcd] have the same meaning as in a
regular expression. The glob pattern syntax has no equivalent for +, ?,
{m,n}, (foo|bar), etc.

Some shells implement something that resembles alternation, where
{foo,bar} corresponds to (foo|bar), but these are expanded before the
glob pattern. For instance, /tmp/{*,*} is expanded to /tmp/* /tmp/*,
which is then expanded to two complete copies of the list of files and
directories in /tmp.

There is no such thing as a "regexp style glob", and I have no idea what
you mean by "a subset of the globs concept" or where Perl fits into the
discussion.

Finally, .* and .+ are *both* greedy. Perl's regular expression syntax
includes non-greedy variants for both (.*? and .+? respectively).

Note that the [], +, ? and {m,n} notations are merely shorthand for
expressions which can be expressed using only concatenation, alternation
and the Kleene star, which are the only operations available in formal
regular expressions.

> I'll see if I can whip up a quick patch in the next day or so -- but
> before I do that, does it make more sense to do globs or regular
> expressions? There are pluses and minuses to each version and would
> require some degree of parsing (and potentially escaping).

I think you'll find that, at least in this particular case, regular
expressions are an order of magnitude harder to implement than glob
patterns.

DES
-- 
Dag-Erling Smørgrav - des@des.no
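
The glob-to-regex correspondence described in the message above (* to
[^/]*, ? to ., bracket expressions carried over unchanged) can be
illustrated with a minimal Python sketch. This is not code from the
thread or from sysctl(8); the function name glob_to_regex is
hypothetical, the sysctl names in the usage note are only sample
strings, and the sketch assumes bracket expressions are already written
in regex form ([^abcd] rather than the shell's [!abcd]).

```python
import re


def glob_to_regex(pattern: str) -> str:
    """Translate a glob pattern into an anchored regular expression.

    Hypothetical helper following the mapping in the message:
    '*' -> '[^/]*', '?' -> '.', and '[...]' copied through as-is.
    Every other character is escaped so it matches literally.
    """
    out = []
    i = 0
    while i < len(pattern):
        c = pattern[i]
        if c == "*":
            out.append("[^/]*")
        elif c == "?":
            out.append(".")
        elif c == "[":
            # Look for the closing ']' (skipping one character so that
            # a ']' placed first in the set is treated as a member).
            j = pattern.find("]", i + 2)
            if j == -1:
                # Unterminated bracket: treat '[' as a literal.
                out.append(re.escape(c))
            else:
                # Simplification: the set is copied verbatim.
                out.append(pattern[i:j + 1])
                i = j
        else:
            out.append(re.escape(c))
        i += 1
    return "^" + "".join(out) + "$"
```

With this sketch, glob_to_regex("kern.ipc.*") yields a regular
expression that matches the string "kern.ipc.maxsockbuf" but not
"kern.maxfiles". Going the other way is not possible in general, since
a glob pattern has no way to express +, {m,n} or (foo|bar).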