Date: Mon, 14 Jul 2003 09:08:16 -0500
From: D J Hawkey Jr
To: questions at FreeBSD
Subject: Attn: sed(1) regular expression gurus
Message-ID: <20030714140816.GA27439@sheol.localdomain>
Reply-To: hawkeyd@visi.com

Hi all.

I'm getting really frustrated by a seemingly simple problem. I'm doing this
under FreeBSD 4.5.

Given these portions of an e-mail's multi-line Received header as tests:

    by some.host.at.a.com (Postfix) with ESMTP id 3A4E07B03
    by some.host.at.a.com (8.11.6) ESMTP;
    by some.host.at.a.different.com (8.11.6p2/8.11.6) ESMTP;
    by some.host.at.another.com ([123.4.56.789]) id 3A4E07B03
    by some.host.at.yet.another.com (123.4.56.789) id 3A4E07B03

I want to isolate the addresses (one for the 1st through 3rd, two for the
4th and 5th). Here's the sed(1) command I'm playing with:

    echo "by nospam.mc.mpls.visi.com (Postfix) with ESMTP id 3A4E07B03" \
      |sed -E \
        -e "s/by[[:space:]]+//" \
        -e "s/(\((\[?([0-9]{1,3}\.){3}[0-9]{1,3}\]?){0}\)|id|with|E?SMTP).*//"

In all cases, the parenthetical word is returned, when only the last two
should return the parenthetical word. The idea behind the first branch of
the second sed(1) command is to match anything that isn't a
"digits.digits.digits.digits" pattern. I've tried simpler expressions like
"\(\[?[^0-9.]+\]?\)", but it fails on the third example.

What the devil am I doing wrong?? Am I exercising known bugs in GNU's sed(1)?
Can anyone dream up a different solution - please, no Perl, but awk(1) is
fine.

Thanks,
Dave

-- 
D. J. HAWKEY JR.
hawkeyd@visi.com
http://www.visi.com/~hawkeyd/
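
A note on the expression above, with one possible alternative. In a POSIX
extended regular expression, a `{0}` bound means "exactly zero repetitions",
which matches the empty string; it does not negate the group. The first branch
can therefore only match a literal "()", and in each of the five tests one of
the other branches (id, with, E?SMTP) matches after the parenthetical word, so
that word is always left in place. The sketch below avoids negation entirely:
it tests each line for a parenthesized dotted quad with a sed address and only
keeps the parenthetical when that test succeeds. The function name
`extract_addrs` is made up for illustration, and this has not been checked
against the sed(1) shipped with FreeBSD 4.5; it assumes only a sed that accepts
-E.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption, not from the original post).
# Reads "by host (something) ..." fragments on stdin and prints the host,
# plus the parenthetical only when it is an optionally bracketed dotted quad.
extract_addrs() {
    sed -E \
        -e 's/^[[:space:]]*by[[:space:]]+//' \
        -e '/^[^[:space:]]+[[:space:]]+\(\[?([0-9]{1,3}\.){3}[0-9]{1,3}\]?\)/!s/[[:space:]].*//' \
        -e 's/^([^[:space:]]+[[:space:]]+\(\[?([0-9]{1,3}\.){3}[0-9]{1,3}\]?\)).*/\1/'
}

# Usage with the five test fragments from the post.
extract_addrs <<'EOF'
by some.host.at.a.com (Postfix) with ESMTP id 3A4E07B03
by some.host.at.a.com (8.11.6) ESMTP;
by some.host.at.a.different.com (8.11.6p2/8.11.6) ESMTP;
by some.host.at.another.com ([123.4.56.789]) id 3A4E07B03
by some.host.at.yet.another.com (123.4.56.789) id 3A4E07B03
EOF
# Prints:
#   some.host.at.a.com
#   some.host.at.a.com
#   some.host.at.a.different.com
#   some.host.at.another.com ([123.4.56.789])
#   some.host.at.yet.another.com (123.4.56.789)
```

The second expression uses an address with `!` so that the "cut everything
after the host" substitution runs only on lines without a dotted quad in
parentheses; the third expression then trims whatever follows the closing
parenthesis on the remaining lines. The pattern checks shape only (four groups
of one to three digits), not whether the octets are valid.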