Date: Mon, 14 Jul 2003 09:08:16 -0500
From: D J Hawkey Jr <hawkeyd@visi.com>
To: questions at FreeBSD <freebsd-questions@freebsd.org>
Message-ID: <20030714140816.GA27439@sheol.localdomain>
Subject: Attn: sed(1) regular expression gurus

Hi all.

I'm getting really frustrated by a seemingly simple problem. I'm doing
this under FreeBSD 4.5.

Given these portions of an e-mail's multi-line Received header as tests:

  by some.host.at.a.com (Postfix) with ESMTP id 3A4E07B03
  by some.host.at.a.com (8.11.6) ESMTP;
  by some.host.at.a.different.com (8.11.6p2/8.11.6) ESMTP;
  by some.host.at.another.com ([123.4.56.789]) id 3A4E07B03
  by some.host.at.yet.another.com (123.4.56.789) id 3A4E07B03

I want to isolate the addresses (one for the 1st through 3rd, two for
the 4th and 5th). Here's the sed(1) command I'm playing with:

  echo "by nospam.mc.mpls.visi.com (Postfix) with ESMTP id 3A4E07B03" \
      |sed -E \
        -e "s/by[[:space:]]+//" \
        -e "s/(\((\[?([0-9]{1,3}\.){3}[0-9]{1,3}\]?){0}\)|id|with|E?SMTP).*//"

In all cases, the parenthetical word is returned, when only the last
two should return it. The idea behind the first branch of the second
sed(1) expression is to match any parenthetical that isn't a
"digits.digits.digits.digits" pattern. I've tried simpler expressions
like "\(\[?[^0-9.]+\]?\)", but they fail on the third example.
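
For what it's worth, in POSIX extended regular expressions a {0} bound
means "zero occurrences" rather than "anything but this", so a branch
of the form \((...){0}\) can only ever match a literal "()". A minimal
sketch of one way around that, assuming a sed(1) that accepts -E and
input lines shaped like the five tests above (extract_addrs is a
hypothetical name, not an existing tool): rewrite the dotted-quad case
first, then strip whatever parenthetical is left.

```shell
# Hypothetical helper: reads Received "by" lines on stdin and prints the
# host name, followed by the address when the parenthetical is a dotted quad.
# Assumes every line carries a parenthetical right after the host name.
#   1st expression: drop the leading "by".
#   2nd expression: "(d.d.d.d)" or "([d.d.d.d])" -> keep host and address.
#   3rd expression: any other parenthetical -> keep the host only.
extract_addrs() {
  sed -E \
    -e 's/^[[:space:]]*by[[:space:]]+//' \
    -e 's/^([^[:space:]]+)[[:space:]]+\(\[?(([0-9]{1,3}\.){3}[0-9]{1,3})\]?\).*/\1 \2/' \
    -e 's/^([^[:space:]]+)[[:space:]]+\(.*/\1/'
}
```

Usage would look like this, with the order of the last two expressions
doing the work that the {0} bound was meant to do:

  echo "by some.host.at.another.com ([123.4.56.789]) id 3A4E07B03" \
      |extract_addrs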

What the devil am I doing wrong? Am I exercising known bugs in GNU's
sed(1)? Can anyone dream up a different solution? Please, no Perl, but
awk(1) is fine.
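
As a rough idea of what an awk(1) answer might look like, here is a
sketch that assumes the host name is always the second
whitespace-separated field and the parenthetical the third, as in the
five tests above (extract_addrs_awk is a hypothetical name). It tests
the third field against a dotted-quad pattern instead of trying to
negate one.

```shell
# Hypothetical helper: reads Received "by" lines on stdin and prints the
# host name, plus the address when the parenthetical is a dotted quad.
extract_addrs_awk() {
  awk '{
    host = $2            # field 1 is "by", field 2 the host name
    addr = $3            # field 3 is the parenthetical, e.g. "(Postfix)"
    # Accept only "(d.d.d.d)" or "([d.d.d.d])".
    if (addr ~ /^\(\[?[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+\]?\)$/) {
      gsub(/[^0-9.]/, "", addr)   # drop the parentheses and brackets
      print host, addr
    } else {
      print host
    }
  }'
}
```

A parenthetical containing spaces would be split across fields, so this
only holds for single-word parentheticals like the ones shown.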

Thanks,
Dave

-- 
  ______________________                         ______________________
  \__________________   \    D. J. HAWKEY JR.   /   __________________/
     \________________/\     hawkeyd@visi.com    /\________________/
                      http://www.visi.com/~hawkeyd/