Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 17 Sep 2000 17:40:29 -0500
From:      David Drum <david@mu.org>
To:        freebsd-isp@freebsd.org
Subject:   Re: Mail box trimming tool
Message-ID:  <20000917174029.A94914@elvis.mu.org>
In-Reply-To: <200009151707.NAA32297@eviloverlord.org>; from mgoward@eviloverlord.org on Fri, Sep 15, 2000 at 01:07:08PM -0400
References:  <200009151707.NAA32297@eviloverlord.org>

next in thread | previous in thread | raw e-mail | index | archive | help

--vkogqOf2sHV7VnPd
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

Quoth Matt Goward:

> Does anyone know of a tool that given a username and quota, will trim a 
> mail box by deleting the oldest message until it is at the quota?  

Based on some statistics I generated for the 60,000-plus mailboxes I
am responsible for, the median size is usually very small (under 1KB)
but the standard deviation is very large (around 1.5MB).  So I don't know
that expiring based on a space quota is appropriate, since mailbox sizes
vary greatly.  We expire messages that are over 14 days old (based on
the headers).  I've attached the script we use.

Regards,

David Drum
david@mu.org

--vkogqOf2sHV7VnPd
Content-Type: application/x-perl
Content-Disposition: attachment; filename="expire_mail.pl"

#!/usr/local/bin/perl
#
# Copyright (c) Information Systems, The Press Association Limited 1993
# Portions Copyright (c) Computer Newspaper Services Limited 1993
# All rights reserved.
# 
# License to use, copy, modify, and distribute this work and its
# documentation for any purpose and without fee is hereby granted,
# provided that you also ensure modified files carry prominent notices
# stating that you changed the files and the date of any change, ensure
# that the above copyright notice appear in all copies, that both the
# copyright notice and this permission notice appear in supporting
# documentation, and that the name of Computer Newspaper Services not
# be used in advertising or publicity pertaining to distribution or use
# of the work without specific, written prior permission from Computer
# Newspaper Services.
# 
# By copying, distributing or modifying this work (or any derived work)
# you indicate your acceptance of this license and all its terms and
# conditions.
# 
# THIS SOFTWARE IS PROVIDED "AS IS", WITHOUT ANY WARRANTIES OF ANY KIND,
# EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO ANY IMPLIED
# WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND
# NONINFRINGEMENT OF THIRD PARTY RIGHTS.  THE ENTIRE RISK AS TO THE QUALITY
# AND PERFORMANCE OF THE SOFTWARE, INCLUDING ANY DUTY TO SUPPORT OR
# MAINTAIN, BELONGS TO THE LICENSEE.  SHOULD ANY PORTION OF THE SOFTWARE
# PROVE DEFECTIVE, THE LICENSEE (NOT THE COPYRIGHT OWNER) ASSUMES THE
# ENTIRE COST OF ALL SERVICING, REPAIR AND CORRECTION.  IN NO EVENT SHALL
# THE COPYRIGHT OWNER BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL
# DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR
# PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS
# ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
# THIS SOFTWARE.
#
#
# $Id: expire_mail.pl,v 1.20 1999/02/03 00:13:40 root Exp $
#

#
# Information Systems Engineering Group
# Phil Male
#

local($_rcsid) = '$Id: expire_mail.pl,v 1.20 1999/02/03 00:13:40 root Exp $';
local($_copyright) = 'Copyright (c) Information Systems, The Press Association Limited 1993';

use  Getopt::Long;			# option handling
use Time::ParseDate;			# Time parsing
require "ctime.pl";			# time conversion
require "stat.pl";			# file status

# Perl mail expire.
# This program removes old messages from system mailboxes.
# It assumes the format of mailboxes to be standard
# sendmail format mail with a blank line followed by a `From ' line
# starting each and every message. Mailbox locking is via flock.
# Works under SunOS.
#
# Options as follows:
# -verbose		verbose output
# -version		display version information and quit
# -debug		debug mode (no change to mailbox)
# -cron			display messages for crontab output
# -zero			do not delete zero length mailboxes
# -noreset		do not reset access and modification times on mailbox
# -alwaysopen		always open mailbox, never just test modification date
# -appendmsg		append a message detailing deleted messages for the user
# -deliverydate		do not record delivery of mail summary on mailbox date
# -write directory      write the expired mail to a tmporary file in directory
# -older days		messages whose age is greater than days are expired
# -from user		only consider messages from user (regexp)
# -status read|old	only consider messages with status `old' or `read'
# -subject subject	only consider messages with subject (regexp)
#
# Based on expire_mail by Steve Mitchell (steve_mitchell@csufresno.edu)
#

#####
#
# Definitions
#
#####

open (CW, "/etc/sendmail.cw");
$cw = <CW>;
close (CW);
chomp $cw;

# site postmaster - XXX change this as required
$postmaster = "helpdesk";
$postmaster .= "\@$cw" if ($cw ne "");

# current user
$me = getlogin || (getpwuid($<))[0] || "unknown";
$home = $ENV{'HOME'};

# default mailbox for a user - XXX change this as required
$default_mailbox = $ENV{'MAIL'} || "/var/mail/$me";

# notice to append to list of deleted messages
$notice = "
Please read your mail on a regular basis. Old mail should be deleted,
or be filed in your personal mail folders. If you do not know how to
use mail folders, please contact the helpdesk.

If you have any other queries regarding the mail system, please send
mail to $postmaster.

Processed by $_rcsid";

# set the umask for temp files
umask( 0177 );

# make stdout unbuffered
select(STDOUT); $| = 1;

$LOCK_EX = 2;				# lock
$LOCK_UN = 8;				# unlock
$START_TIME = time;			# time right now
$SEC_PER_DAY = 24 * 60 * 60;		# seconds in a day
$line_buffer = "";			# empty line buffer

#####
#
# Support
#
#####

# line buffer for look-ahead

sub get_line
{
  local( $line ) = "";			# line to return

  if( ! ($line_buffer eq "") ) {
    $line = $line_buffer;
    $line_buffer = "";
  } else {
    $line = <MBOX>;
  }
  return $line;
}

# read message from mailbox

sub read_message
{
  local( $msg ) = "";			# message to send back
  local( $seen_from ) = 0;		# seen a from line
  local( $line ) = "";			# current line

  # reset some globals
  $msg_status = "";
  $msg_subject = "";
  $msg_date = "";

  while( $line = &get_line ) {
    
    # if( $line =~ /^From\s+([^\s]+)\s+([A-Za-z]+\s+[A-Za-z]+\s+[0-9]+\s+[0-9:]+.*\s+[0-9]+)$/ ) {
    if( $line =~ /^From\s+(\S+)\s+.*?([\w\s]*\d\d:\d\d:\d\d[\w\s]*).*?$/ ) {
      # if previous line was blank, then legal from line
        # if already seen a legal from line, then this is next message
	if( $seen_from ) {
	  # pushback this from line
	  $line_buffer = $line;
	  return $msg;
	}
	$seen_from++;
        # From line found, extract information
        ( $msg_from, $msg_date ) = ( $1, $2 );
	$msg_stamp = &rctime( $msg_date );
	print "Unable to parse date: $msg_date\n" if (!$msg_stamp);
        $msg_age = &days_old( $msg_stamp );
    } elsif( $line =~ /^[Ss]tatus: ([A-Za-z]+)/ ) {
      ( $msg_status ) = ( $1 );
    } elsif( $line =~ /^[Ss]ubject: (.*)$/ ) {
      ( $msg_subject ) = ( $1 );
    }

    $msg .= $line;
  }

  return $msg;
}

# write a message into a mailbox

sub write_message
{
  print TMPF "@_";
}

# parse the ctime string into a time value
# From line contains local time

sub rctime
{
  local( $pt ) = @_;			# time to convert
  local( $ct ) = -1;			# converted time

  $ct = parsedate ($pt, WHOLE => 1, DATE_REQUIRED => 1, TIME_REQUIRED => 1, NO_RELATIVE => 1, PREFER_PAST => 1);
  # take another crack at it in case the header is malformed
  $ct = parsedate ($pt, DATE_REQUIRED => 1, TIME_REQUIRED => 1, NO_RELATIVE => 1, PREFER_PAST => 1) unless ($ct);
  return $ct;
}

# age in days

sub days_old
{
  local( $agev ) = @_;			# time to convert

  return( ( $START_TIME - $agev ) / $SEC_PER_DAY );
}

# basename

sub basename
{
  local( $path ) = @_;			# path to find the base of
  local( $base ) = rindex( $path, "/" );

  if( $base < 0 ) {
	$base = $path;
  } else {
	$base = substr($path, $base + 1);
  }

  return $base;
}

# usage message

sub usage
{
  print STDERR "usage: expire_mail [-verbose -cron -version] [-zero -alwaysopen -noreset -deliverydate -appendmsg] [-debug -write directory] { [-older days] [-from user] [-status read|old] [-subject subject] } mailbox...\n";
  exit 0;
}

#####
#
# Main
#
#####

&GetOptions("verbose", "version", "debug",
	    "cron", "zero", "noreset",
	    "appendmsg", "deliverydate",
	    "older=i", "from=s", "subject=s",
	    "status=s", "alwaysopen", "write=s") || &usage;

# compat
$opt_older = $opt_older if ($opt_older && !$opt_older);

# check version
if( $opt_version ) {
  print "expire_mail: mail expiry agent\n";
  print "expire_mail: $_rcsid\n";
  &usage;
}

# use default mailbox if non supplied
if( $#ARGV < $[ ) {
  $ARGV[0] = "$default_mailbox";
}

# decode status option
if( $opt_status ) {
  if( $opt_status eq "old" ) {
    $opt_status = "O";
  } elsif( $opt_status eq "read" ) {
    $opt_status = "R";
  } else {
    print STDERR "expire_mail: status may only be one of `old' or `unread'\n";
    &usage;
  }
}

# check we are actually doing some processing
if(!defined $opt_older && !defined $opt_from && 
   !defined $opt_status && !defined $opt_subject ) {
  print STDERR "expire_mail: must specify at least one of -older, -from, -status or -subject\n";
  &usage;
}

# debug mode implies verbose mode
if( $opt_debug ) { $opt_verbose = 1; }

# foreach mailbox...
while( $mailbox = shift ) {

  if( $opt_verbose ) { print STDOUT "Checking mailbox $mailbox\n"; }

  # does mailbox exist
  if( ! -f $mailbox ) { next; }

  # stat the mailbox
  @sb = &Stat($mailbox);

  # can it be deleted now?
  if( !$opt_alwaysopen && $opt_older ) {
    # check the modification date
    $age = &days_old($sb[$ST_MTIME]);
    if( $age > $opt_older ) {
      if( $opt_verbose ) { print STDOUT "Expiring mailbox $mailbox\n"; }
      if( !$opt_debug ) {
        if( $opt_zero ) {
          open( MBOX, ">$mailbox" ) || 
	    print STDERR "expire_mail: failed to truncate $mailbox\n";
	  close( MBOX );
        } else {
          unlink( $mailbox ) ||
	    print STDERR "expire_mail: failed to remove $mailbox\n";
        }
      }
      next;
    }
  }

  # open the mailbox
  if( !open( MBOX, "+<$mailbox" ) ) {
    print STDERR "expire_mail: unable to open $mailbox\n";
    next;
  }

  # lock the mailbox
  if( !flock( MBOX, $LOCK_EX ) ) {
    print STDERR "expire_mail: unable to lock $mailbox\n";
    close( MBOX );
    next;
  }

  # open the temporary file
  $tmpname = "${mailbox}.exp.$$";
  if( !open( TMPF, "+>$tmpname" ) ) {
    print STDERR "expire_mail: unable to create temporary file for $mailbox\n";
    close( MBOX );
    next;
  }

  if(defined $opt_write) {
    my $tmp = $mailbox;
    @a    = split(/\//,$tmp);
    $ntmp = "$opt_write/${a[$#a]}.expwrite.$$";
    if( !open( EXPIRED, "+>$ntmp") ) {
	print STDERR "expire_mail: unable to create debug file for $mailbox\n";
	close MBOX;
	next;
    }
  }
  # init counters
  $count = 0;
  $exp = 0;
  @subjects = ();

  # read each message in turn
  while( $msg = &read_message ) {

    $count++;

    # looking for specific from users
    if( $opt_from ) {
      if( ! ($msg_from =~ /$opt_from/i) ) {
        if( $opt_verbose ) {
	  print STDOUT "\tMsg #$count: from   \r";
	}
	&write_message( $msg );
	next;
      }
    }

    # check message status
    if( $opt_status ) {
      if( !($msg_status =~ /$opt_status/i) ) {
	if( $opt_verbose ) {
	  print STDOUT "\tMsg #$count: status   \r";
	}
	&write_message( $msg );
	next;
      }
    }

    # check message subject
    if( $opt_subject ) {
      if( ! ($msg_subject =~ /$opt_subject/i) ) {
        if( $opt_verbose ) {
	  print STDOUT "\tMsg #$count: subject   \r";
	}
        &write_message( $msg );
        next;
      }
    }

    # only other thing to check is message age
    if( $opt_older ) {
      if( $msg_age <= $opt_older ) {
        if( $opt_verbose ) {
	  print STDOUT "\tMsg #$count: newer   \r";
	}
        &write_message( $msg );
        next;
      }
    }

    # save the expired messages
    if (defined $opt_write) {
	print EXPIRED $msg;
    }

    # log the expiry
    if( $opt_verbose ) {
	print STDOUT "\tMsg #$count: expired   \r";
    }

    # copy message across if in debug
    if( $opt_debug ) {
      &write_message( $msg );
    } else {
      # record the mail message from and subject line
      $pad = ' ' x (25 - length($msg_from) );
      $npad = ' ' x ( 4 - length($count) );
      $subjects[$exp] = "$npad$count $msg_from$pad $msg_date\n     $msg_subject\n";
    }

    # increment the expired message count
    $exp++;
  }

  if( !$opt_debug ) {

    # if sending mail to the owner of the mailbox, append message on the end

    if( defined $opt_appendmsg && $exp > 0 ) {
      chomp( $lt = localtime(time) );
      print TMPF "\nFrom $postmaster $lt\n";
      print TMPF "From: $postmaster (Mail Expiry Agent)\n";
      chomp( $ct = &ctime(time) );
      @ct = split (/\s+/, $ct);
      $ct = join (' ', $ct[0] . ",", $ct[2], $ct[1], $ct[5], $ct[3], "(" . $ct[4] . ")");
      print TMPF "Date: $ct\n";
      $to = &basename( $mailbox );
      print TMPF "To: $to\n";
      print TMPF "Subject: Expired Mail Summary\n\n";
      print TMPF "The following messages have been automatically removed from your\n";
      print TMPF "mailbox by the mail expiry agent.\n\n";
      # fitted to $subjects layout
      print TMPF " Msg From & Subject            Dated\n\n";
      foreach $msg ( @subjects ) {
        print TMPF "$msg\n";
      }
      print TMPF "$notice\n\n";

      if( !$opt_deliverydate ) {
        # set the modification time for the mailbox to be now
        $sb[$ST_MTIME] = time;
      }
    }

    # copy data back into mailbox to preserve permissions, creation time
    # and user and group id

    # zero length the mailbox
    truncate( MBOX, 0 );
    # *** START Critical
    # any data to copy?
    if( $exp <= $count ) {
      # restart both files
      seek(MBOX, 0, 0);
      seek(TMPF, 0, 0);
      # copy file into mailbox, better with sysread/syswrite?
      while( <TMPF> ) {
	print MBOX $_;
      }
    } elsif( !defined $opt_zero ) {
      unlink( $mailbox );
    }
    # *** END Critical

  }

  # unlock mailbox
  flock( MBOX, $LOCK_UN );

  # close files
  close( MBOX );
  close( TMPF );

  unlink( $tmpname );
  
  # reset access and modification dates
  # if we have sent mail, then the modification time is the time of the mail
  if( !defined $opt_noreset ) {
    utime( $sb[$ST_ATIME], $sb[$ST_MTIME], $mailbox );
  }

  # show counters
  if( $opt_verbose || ( $opt_cron && $exp ) ) {
    print "$mailbox contained $count messages, expired $exp messages\n";
  }
}



--vkogqOf2sHV7VnPd--


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-isp" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20000917174029.A94914>