Date: Sun, 28 Mar 2010 22:39:45 -0400
From: Mark Shroyer <subscriber+freebsd@markshroyer.com>
To: freebsd-questions@freebsd.org
Subject: Re: procmail regex help ... sometimes works, sometimes doesn't...
Message-ID: <4BB012F1.6020202@markshroyer.com>
In-Reply-To: <471394.79697.qm@web111611.mail.gq1.yahoo.com>
References: <471394.79697.qm@web111611.mail.gq1.yahoo.com>

On 3/28/2010 6:34 PM, George Sanders wrote:
> I have added a very standard, very common regex line to my
> .procmailrc to filter character sets I can't read:
>
> UNREADABLE='[^?"]*big5|iso-2022-jp|ISO-2022-KR|euc-kr|gb2312|ks_c_5601-1987|ks_c_5601|3Deuc-kr|koi8'
> :0:
> * ^Content-Type:.*multipart
> * B ?? $ ^Content-Type:.*^?.*charset="?($UNREADABLE)
> unreadable_messages
>
> I know that this works because my "unreadable_messages" mail file is
> now full of messages with headers like:
>
> From: =?GB2312?B?xMLTq9Or?= <uigvrutit@heki.net>
> Subject: =?GB2312?B?MjAxMMTqyMvBptfK1LS4w9bYytPKssO0?=
> To: "me" <me@me.com>
> Content-Type: text/html;
>         charset="gb2312"
>
> However, a lot of mail gets through to my inbox that matches:
>
> From: "osdeiiftnvpp@gmail.com" <xjyfgzyjm@gmail.com>
> Reply-To: "osdeiiftnvpp@gmail.com" <xjyfgzyjm@gmail.com>
> Message-ID: <533pbxxy2oc>
> To: me <me@me.com>
> Subject: Fw: \xb8\xf2\xad\xe8\xa5X\xa8\xd3\xbd\xe6~\xb1o\xb4\xa9\xa9f\xaa\xb1\xb5L\xaeM\xa4\xba\xaeg\xb2n\xa7o
> X-Mailer: inhalation
> Organization: Microsoft Outlook Express 6.00.2462.0000
> Mime-Version: 1.0
> Content-Type: multipart/alternative;
>         boundary="1-104247307-2712732737=:8213"
> Status: RO
> X-Status:
> X-Keywords:
> X-UID: 63502
>
> --1-104247307-2712732737=:8213
> Content-Type: text/plain; charset="big5"
> Content-Transfer-Encoding: quoted-printable
>
> However, "big5" is very clearly listed in my regex above, and as far
> as I can tell, this mail should match perfectly...
>
> I cannot see why these "big5" emails are not matching my procmail
> regex ... is it obvious to anyone ?

This is just a shot in the dark, but do you find that the unreadable
messages that this rule successfully matches have the relevant
Content-Type header in the message's "main" header group, whereas the
messages that should match but fail to do so have the Content-Type
header in a MIME attachment, as in your example?  (Apologies for the
imprecise terminology.)

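For what it's worth, one way to check this is to look at where each
message actually declares its charset. Below is a rough sketch in Python
using only the standard library's email package. The function name
charset_locations is made up for illustration, and this is a diagnostic
aid only, not a replacement for the procmail recipe.

```python
from email import message_from_string


def charset_locations(raw_message):
    """Report where a message declares its charset.

    Returns a tuple (top_level, parts):
      top_level -- charset from the main header's Content-Type, or None
      parts     -- charsets declared by MIME parts inside the body
    The name and return shape are illustrative assumptions.
    """
    msg = message_from_string(raw_message)

    # Charset parameter of the top-level Content-Type header, lowercased.
    top_level = msg.get_content_charset()

    parts = []
    if msg.is_multipart():
        for part in msg.walk():
            if part is msg:
                continue  # walk() yields the container itself first
            charset = part.get_content_charset()
            if charset:
                parts.append(charset)

    return top_level, parts
```

Saving one message from each group to a file and passing its contents to
this function should show whether the ones that slip through only declare
their charset inside a MIME part rather than in the main header.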
-- 
Mark Shroyer
http://markshroyer.com/contact/
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?4BB012F1.6020202>