#!/usr/bin/perl -w #-=-=-=-=-=-=- # Author: Fred Morris # Date: 21-May-2004 # Version: 2.7 # =pod =head1 perljacket A processor/filter callable from procmail to do a variety of things: =over 4 =item * Insert X-Headers for BCC and mailing list handling. =item * Support updating of the BCC/mailing list filter via e-mail. =item * Fire handlers at the conclusion of message and MIME header blocks for rewriting multipart/parallel or multipart/alternative as multipart/mixed, rewriting non-multipart text/html bodies as attachments, dropping text/html multipart/alternative parts when a text/plain part exists, ensuring file disposition rather than in-lining of non-text MIME entities, killing VCARDS and similar junk. =back =cut # # #-=-=-=-=-=-=- use strict; use Fcntl qw(:DEFAULT :flock); # -=-=-=-=- Global Variables. use vars qw( %HeaderBlocks @BlockHeaders @OrigHeaders @CurrentHeaders @OrderedHeaders %Boundaries $MIMESepCache @From @To $Rewrite $UseFile %Interpolations @SeparatorRewrite %Options); $Rewrite = 0; =pod =head2 Global Variables The following global variables may be useful to you when writing your own part handlers: =over 4 =item %HeaderBlocks This hash provides subscripts into the following four arrays for block specifiers of the form 0 (for the mail headers), 0.1 for an entity at the top MIME level or a multipart header at that level, 0.1.1 for the first entity subpart of the first MIME part, 0.1.2 for the second sub part of the first MIME part, etc. =item @BlockHeaders Reverses the operation of %HeaderBlocks, providing the key values. =item @OrigHeaders Indexes hashes of the original (unmodified) headers of the part. =item @CurrentHeaders Indexes hashes of the edited headers of the part. =item @OrderedHeaders An array of arrays of the names of headers for the part, in the order encountered. =item $Rewrite If true, then content rewrites are performed. Toggled by the No-Rewrite and Rewrite header cliches. =item %Interpolations Contains strings extracted from headers which are available for substitution into headers. =item @SeparatorRewrite Some idiot spammers thnk it's cute to mangle MIME separators, so we mangle them some more. =back =cut # -=-=-=-=- Arguments. use Getopt::Std; getopts( "fdes:t:", \%Options ); $UseFile = ($Options{"f"}) ? 1 : 0; # FWM, 05-May-2004 =pod =head2 Command Line Options The program takes the following switches: =over 4 =item f If supplied, then input is read from the file C instead of from STDIN. =item d If supplied, then a debugging line is printed each time a handler is fired. =item e If supplied, then each line is echoed before being processed. =item s integer If supplied, then the value is used for additional randomization when generating verification cookies for mail-based filter updates. =item t integer The time to live in seconds for randomized verification cookies. If not supplied, then mail-based filter updates are disabled. =back =cut srand (time() ^ $Options{"s"}) if (defined $Options{"s"}); # -=-=-=-=- Prototypes. # Utility subroutines. sub line($$); sub end_of_file(); sub level($$); sub handlers($$); sub header_match($$$); sub update_filter($); sub build_interpolation($$$&); # The handler subroutines. sub at_Header($); sub at_Multipart($$); sub at_Entity($$); =pod =head2 Part Handlers Three part handlers are defined for processing mail. The firing of these handlers is (mostly) controlled by two special header item flags in perljacket.filter. =over 4 =item at_Header This handler is called before the (global) mail headers are written back out. =item at_Multipart This handler is called before the headers for a multipart entity are written back out. This means that it may be called at the global level as well, after the at_Header handler. =item at_Entity This handler is called before the headers for a (final) entity are written back out. It is ONLY called for MIME final entities, it will NOT be called for the body of a mail message which is not a MIME part. =back =cut # Routines useful for writing handlers sub parent($); sub self_match($$@); sub sibling_match($$@); sub parent_match($$@); sub get_body(); sub insert_body($); sub delete_part(); sub set_header($$$;$); sub get_header($$); sub delete_header($$); sub read_from_file($&&); =pod =head2 Routines for Writing part handlers Some routines are supplied which should be handy for writing part handlers: =over 4 =item parent( entity-spec ) Returns a string containing the "dotted" parent of the passed string. =item self_match, sibling_match, parent_match( matchexpr, subscript, header-list ) Each of these routines matches one or more headers, and returns the subscript where found, and header key. The functions do a bare return if there is no match. The arguments are: =over 4 =item match expression The regular expression to be matched. =item subscript The subscript of the caller, "me". =item headers Array-by-value of the headers to be searched. =back =item get_body() Returns a string with the body of the part at this level, consuming the part. =item insert_body( new-body ) Inserts the string argument at the beginning of the body for this part. =item delete_part() Suppresses the part. =item set_header( header, new-text, subscript, instance ) Sets the specified header. The instance is optional, and has the following semantics: =over 4 =item (-1) Create a new instance. =item (0) Replace/create last instance. =item (1..n) Replace/create (n-1)th instance. =item No instance Equivalent to instance 0. =back =item get_header( header, subscript ) Returns the header array reference for the specified header name and part. =item delete_header( header, subscript ) Deletes all instances of the specified header. =item read_from_file( arg-hash, error-handler, reader-handler ) Encapsulates reading from a file, with error handling and locking. The first argument is a hash, and read_from_file will expect to see: subscript => $subscript filename => name of the file to read Additionally, the error handler can expect to see: error => an error message and the reader handler can expect to see: file => a reference to the filehandle =back =cut # -=-=-=-=- Scope Cliches (pod at bottom of file) @From = qw/ From: Received: Message-ID: Resent-From: Resent-Sender: Sender: Reply-To: /; @To = qw/ To: Cc: Bcc: Resent-To: Resent-Cc: Resent-Bcc: /; # -=-=-=-=- Main processing is at the bottom... # -=-=-=-=- Utility Functions and Subroutines # Function which returns the next line from STDIN, concatenating header lines # as needed. Notice the private variables, which are used for data persistence. # Also notice companion function to return end-of-file condition, and another one # to push a line back into the input stream. { my $is_eof = undef; my @lookahead = (); my $readline = sub { if (@lookahead <= 1) { my $line = undef; if ($UseFile) { # FWM, 05-May-2004 $is_eof = 1 if ($is_eof || (!($line = ))); } else { $is_eof = 1 if ($is_eof || (!($line = ))); } push @lookahead, $line if (defined $line); } return shift @lookahead; }; sub line($$) { # Returns (the line, blank). my $want_header = shift; my $nesting = shift; my $blank = 0; my $line; if ($want_header) { $blank = ($line = &$readline()) =~ /^\s*$/o; # Concatenate continued header lines. if (! $blank) { $line .= &$readline() while ((@lookahead > 0) && ($lookahead[0] =~ /^\s+\S/o)); } return ( $line, $blank ); } else { # For the body, just return the line. $line = &$readline(); } # +++ FWM, 05-May-2004 # Mangle mangled separators. foreach my $sep (@SeparatorRewrite) { my $k; my $v; ($k,$v) = @{$sep}[0 .. 1]; $line =~ s/$k$/$v/; $line =~ s/$k--$/$v--/; } # --- FWM, 05-May-2004 return ( $line, $blank ); } # &line sub unline($) { # Pushes the line back into the input stream. my $line = shift; unshift @lookahead, $line; } # &unline sub end_of_file() { # Persistently returns true on an eof condition. if (defined $is_eof) { return (@lookahead == 0); } else { my $line; if ($UseFile) { # FWM, 05-May-2004 $is_eof = ( $line = ) ? 0 : 1 ; } else { $is_eof = ( $line = ) ? 0 : 1 ; } push @lookahead, $line if (! $is_eof); return $is_eof; } } # &end_of_file } # Function which returns true if there is a match with a header. sub header_match($$$) { my $header = shift; my $matchexpr = shift; my $subscript = shift; my $result; my $instance; # Need to first do a check for the header existing at all, lest it get # auto-vivified by the reference to its array. return 0 if (! exists $OrigHeaders[$subscript]{"\U$header"}); # Onward.. foreach $instance (@{ $OrigHeaders[$subscript]{"\U$header"} }) { last if ($result = ( $instance =~ $matchexpr)); } return $result; } # &header_match # Function which returns the header block code of the parent part. sub parent($) { my @lineage = split( /\./o, shift ); pop @lineage; return join( ".", @lineage ); } # Functions which checks in any number of headers, and returns the locations found as # a subscript and a key. sub self_match($$@) { my $matchexpr = shift; my $subscript = shift; my @headers = @_; my $header; foreach $header (@headers) { return ( $subscript, $header ) if (header_match( $header, $matchexpr, $subscript )); } return; # Not found. } # &self_match sub sibling_match($$@) { my $matchexpr = shift; my $subscript = shift; my @headers = @_; my $i = 1; my $predecessor = parent( $BlockHeaders[$subscript] ); my $header; while (my $sibling = $HeaderBlocks{"$predecessor.$i"}) { if ($sibling != $subscript) { foreach $header (@headers) { return ( $sibling, $header ) if (header_match( $header, $matchexpr, $sibling )); } } $i += 1; } return; # Not found. } # &sibling_match sub parent_match($$@) { my $matchexpr = shift; my $subscript = shift; my @headers = @_; my $predecessor = $HeaderBlocks{ parent( $BlockHeaders[$subscript] ) }; # usage not same as prev rtn my $header; foreach $header (@headers) { return ( $predecessor, $header ) if (header_match( $header, $matchexpr, $predecessor )); } return; # Not found. } # &parent_match # Sets a header. sub set_header($$$;$) { my $header = shift; my $new_text = shift; my $subscript = shift; # # For instance: # -1 = Create a new instance. # 0 = Replace/create last instance. # 1..n = Replace/create (n-1)th instance. # No instance is equivalent to instance 0. my $instance = shift || 0; if (defined $CurrentHeaders[$subscript]{"\U$header"} ) { if ($instance == -1) { $instance = @{$CurrentHeaders[$subscript]{"\U$header"}}; } elsif ($instance == 0) { $instance = $#{$CurrentHeaders[$subscript]{"\U$header"}}; } else { # $instance > 0 $instance -= 1; $instance = @{$CurrentHeaders[$subscript]{"\U$header"}} if ($instance > @{$CurrentHeaders[$subscript]{"\U$header"}}); } } else { $instance = -1; } # If this is a "new-new" header, then use the capitalization specified in the argument. # Otherwise, preserve the existing case. if ($instance == -1) { $CurrentHeaders[$subscript]{"\U$header"}[0] = $new_text; # Since it's "new" new, put it in the list of Ordered Headers for the part. push @{ $OrderedHeaders[$subscript] }, $header; } else { $CurrentHeaders[$subscript]{"\U$header"}[$instance] = $new_text; } } # &set_header # Gets a header array. sub get_header($$) { my $header = shift; my $subscript = shift; return $OrigHeaders[$subscript]{"\U$header"}; } # &get_header sub delete_header($$) { my $header = shift; my $subscript = shift; delete $CurrentHeaders[$subscript]{"\U$header"}; for( my $i = 0; $i < @{$OrderedHeaders[$subscript]}; $i++ ) { splice @{$OrderedHeaders[$subscript]}, $i, 1 if (uc($header) eq uc($OrderedHeaders[$subscript][$i])); } } # &delete_header # Builds an interpolation symbol and stores it into # %Interpolations. FWM, 20-Dec-2003 sub build_interpolation($$$&) { my $header = shift; my $subscript = shift; my $symbol = shift; my $builder = shift; my $aH = get_header( $header, $subscript ); my @matches = (); my $match; foreach my $line (@{$aH}) { push @matches, $match if (defined($match = &$builder( $line ))); } if (@matches > 1) { $Interpolations{$symbol} = '(' . join( '|', @matches ) . ')'; } elsif (@matches == 1) { $Interpolations{$symbol} = $matches[0]; } # and if the array is empty, there is nothing to substitute } # &build_interpolation # Encapsulates reading from a file. sub read_from_file($&&) { my $rhArgs = shift; my $rsError = shift; my $rsRead = shift; my $error = "ok"; my $initial_error; for( my $state = "init"; $state ne "done"; do { # No errors, everything is sequential logically speaking. if ($error eq "ok") { $state = "done" if ($state eq "close"); $state = "close" if ($state eq "read"); $state = "read" if ($state eq "lock"); $state = "lock" if ($state eq "open"); $state = "open" if ($state eq "init"); } else { if (!defined $initial_error) { $rhArgs->{"error"} = $error; &$rsError( $rhArgs ); $initial_error = $error; } $state = "done" if ($state eq "close"); $state = "close" if ($state eq "read") || ($state eq "lock"); $state = "done" if ($state eq "open") || ($state eq "init"); # We're done. return 0 if ($state eq "done"); } }) { ($state eq "init") && do { # Just seein' if you're payin' attention! 1; } or ($state eq "open") && do { my $filename = $rhArgs->{"filename"}; open(RFF, "<$filename") or $error = "open: $!"; 1; } or ($state eq "lock") && do { flock(RFF, LOCK_SH) or $error = "lock: $!"; 1; } or ($state eq "read") && do { $rhArgs->{"file"} = \*RFF; # Weird SuSE 8.2/Perl 5.8 thing. FWM, 21-May-2004 $! = 0; # &$rsRead( $rhArgs ); if ($!) { $error = "read: $!"}; delete $rhArgs->{"file"}; 1; } or ($state eq "close") && do { close(RFF) or $error = "close: $!"; 1; } or $error = "Bozo state = $state"; } # for $state loop return 1; } # &read_from_file # Subroutine which calls the appropriate handler(s). sub handlers($$) { my $part = shift; my $subscript = shift; my $entity = 1; if ($part eq '0') { at_Header($subscript); $entity = 0; } if ( header_match( 'Content-Type:', qr/\s*Multipart/io, $subscript ) ) { at_Multipart($part,$subscript); $entity = 0; } if ($entity) { at_Entity($part,$subscript); } } # &handlers # Function which is called at each MIME level and companion routines. { my $current_nesting; my $emit_headers; my $new_body; # Effectively deletes the part. sub delete_part() { $emit_headers = 0; undef $new_body; } # &delete_part # Defines content to be inserted after the headers. sub insert_body($) { $new_body = shift; } # Read-ahead function to consume the body of the part at this level. sub get_body() { my $body = ""; while (! end_of_file()) { ( my $line, undef ) = line( 0, $current_nesting ); # Until a MIME boundary, loop. if ( $line =~ /^--([^\s-]*(?:(?!--\s*$)-[^\s-]*)*)(--)*\s*$/o ) { if (defined $Boundaries{$1}) { unline( $line ); # push it back last; } } $body .= $line; } # while return $body; } # &get_body # Processing loop which is recursively called for each MIME part. sub level($$) { # Returns ( found level, final separator ) my $nesting = shift; my $part = shift; my $subparts = 0; my $want_header = 1; my $boundary = undef; my $foundsep; my $foundlevel; my $finalsep; my $i; # Set up structures for tracking the headers. $i = @BlockHeaders; $HeaderBlocks{$part} = $i; $BlockHeaders[$i] = $part; # Get the headers. LINE: while (! end_of_file()) { ( my $line, my $blank ) = line( $want_header, $nesting ); print "read: $line" if (defined $Options{"e"}); if ($want_header) { # Until a blank line, want headers. if ($blank) { # Done with headers. $want_header = 0; # Call any appropriate handlers. $current_nesting = $nesting; $emit_headers = 1; handlers($part,$i); # On with the show. # At the start of the body, we'll write all of the headers for the part. if ($emit_headers) { if (defined $MIMESepCache) { print "--$MIMESepCache\n"; undef $MIMESepCache; } foreach my $headertype (@{ $OrderedHeaders[$i] }) { foreach my $header (@{ $CurrentHeaders[$i]{"\U$headertype"} }) { print "$headertype$header"; } # for $header } # for $headertype print "\n"; # A blank line. } # emit headers if (defined $new_body) { print $new_body; undef $new_body; } next LINE; } # Add the header to the header hash, etc. my $matched = (my @header = ( $line =~ /^([^\s:]*)\s*:(.*)/so )); $header[0] .= ':' if ($matched); @header = ( $line =~ /^(\S+)(.*)/so ) if (!$matched); # Look for MIME-part headers. if ( $line =~ /^Content-Type\s*:\s*multipart[^;]*;/igo ) { # Look for the boundary parameter, skipping any intervening parameters. if ( $line =~ /;[\s\n]*boundary\s*="*((?(?<=")[^"]*|[^;\s]*))/ios ) { # " <-- LSE fodder # Treat multilines as single. FWM, 21-May-2002 # Just doing my part to keep people with no self respect who work # for spammers employed.. FWM, 05-May-2004 $boundary = $1; if ($boundary =~ m/--$/o) { my $orig = $boundary; $boundary =~ s/--$/sprintf "%x", rand() * 1000/e; push @SeparatorRewrite, [ qr/^--$orig/, "--$boundary" ]; $header[1] =~ s/boundary\s*="$orig/boundary="$boundary/is; } $Boundaries{$boundary} = $nesting; } } push @{ $OrderedHeaders[$i] }, $header[0] if (!exists $OrigHeaders[$i]{"\U$header[0]"}); push @{ $OrigHeaders[$i]{"\U$header[0]"} }, $header[1]; push @{ $CurrentHeaders[$i]{"\U$header[0]"} }, $header[1]; next LINE; } # $want_header # Until a MIME boundary, loop. if ( $line =~ /^--([^\s-]*(?:(?!--\s*$)-[^\s-]*)*)(--)*\s*$/o ) { # Emit finals. if (defined($finalsep = $2)) { print $line; } # Is it a recognized MIME separator? unless (defined($foundlevel = $Boundaries{$1})) { print $line; # Stop it from eating lines which consist of two # dashes. FWM, 23-Dec-2002 next LINE; } } else { print $line; next LINE; } # Cache non-final MIME separators, don't want to emit them yet. $MIMESepCache = $1 unless (defined $finalsep); # MIME part separator at this level? if ($foundlevel == $nesting) { while ( ( $foundlevel == $nesting) && (!(defined $finalsep)) ) { $subparts += 1; ( $foundlevel, $finalsep ) = level( $nesting + 1, "$part.$subparts"); last if (!(defined $foundlevel)); } if (!(defined $foundlevel)) { # eof delete $Boundaries{$boundary} if (defined $boundary); return ( undef, undef ); } if ($foundlevel < $nesting) { delete $Boundaries{$boundary} if (defined $boundary); return ( $foundlevel, $finalsep ); } if (defined $finalsep) { delete $Boundaries{$boundary} if (defined $boundary); return ( $nesting, undef ); } next LINE; } # Well then, whose is it? delete $Boundaries{$boundary} if (defined $boundary); return ( $foundlevel, $finalsep ); } # !end_of_file() delete $Boundaries{$boundary} if (defined $boundary); return( undef, undef ); } # &level } # -=-=-=-=- Mail-based filter maintenance. # This subroutine handles requests to update the filter. Returns 1 if the mail was # an update-related request, and 0 otherwise. sub update_filter( $ ) { my $subscript = shift; # BCC/Mailing List Cookie Request if (header_match( "Subject:", qr/^\s*REQUEST UPDATE COOKIE/io, $subscript )) { # Throw away the current body, munge some headers. set_header( "Content-Type:", " text/plain\n", $subscript ); my $junk = get_body(); undef $junk; # Calculate the cookie. if (! defined($Options{"t"})) { set_header( "Subject:", " REQUEST UPDATE COOKIE FAILED\n", $subscript ); set_header( "X-PerlJacket:", " Error: mail-based updates disabled.\n", $subscript, -1 ); my $body = "'t' option not supplied when perljacket was invoked.\n" . "Mail-based updates are disabled.\n\n"; insert_body( $body ); return 1; } my $cookie = int(rand 999999) + 1; my $time_to_live = time() + $Options{"t"}; # Save it. Use a state engine to manage the task. my $error = "ok"; my $initial_error; for( my $state = "init"; $state ne "done"; do { # No errors, everything is sequential logically speaking. if ($error eq "ok") { $state = "done" if ($state eq "close"); $state = "close" if ($state eq "write"); $state = "write" if ($state eq "trunc"); $state = "trunc" if ($state eq "lock"); $state = "lock" if ($state eq "open"); $state = "open" if ($state eq "init"); } else { if (!defined $initial_error) { # Make a body for the e-mail explaining the problem. set_header( "Subject:", " REQUEST UPDATE COOKIE FAILED\n", $subscript ); set_header( "X-PerlJacket:", " Error: request can't write cookie file.\n", $subscript, -1 ); my $body = "Can't write cookie file, $error\n\n"; insert_body( $body ); $initial_error = $error; } $state = "done" if ($state eq "close"); $state = "close" if ($state eq "write") || ($state eq "trunc") || ($state eq "lock"); $state = "done" if ($state eq "open") || ($state eq "init"); # We're done. return 1 if ($state eq "done"); } }) { ($state eq "init") && do { # Just seein' if you're payin' attention! 1; } or ($state eq "open") && do { sysopen(COOKIE, "perljacket.cookie", O_WRONLY | O_CREAT) or $error = "open: $!"; 1; } or ($state eq "lock") && do { flock(COOKIE, LOCK_EX) or $error = "lock: $!"; 1; } or ($state eq "trunc") && do { truncate(COOKIE, 0) or $error = "trunc: $!"; 1; } or ($state eq "write") && do { print COOKIE join(",", $cookie, $time_to_live),"\n" or $error = "write: $!"; 1; } or ($state eq "close") && do { close(COOKIE) or $error = "close: $!"; 1; } or $error = "Bozo state = $state"; } # for $state loop # Make a new body with the contents of the current filter. Similar to what # I just did to write to a file, but this one occurs in multiple places, so I # made the state logic a subroutine, and I pass it handlers. Once a Pascal # programmer, always a Pascal programmer. read_from_file( { subscript => $subscript, filename => "perljacket.filter"} , sub { # Error Handler. my $rhArgs = shift; # Make a body for the e-mail explaining the problem. set_header( "Subject:", " REQUEST UPDATE COOKIE FAILED\n", $rhArgs->{"subscript"} ); set_header( "X-PerlJacket:", " Error: request can't read filter file.\n", $subscript, -1 ); my $error = $rhArgs->{"error"}; my $body = "Can't read filter file, $error\n\n"; insert_body( $body ); }, sub { # Reader. my $rhArgs = shift; my $body = "To delete a line, put the word DELETE at the beginning of the line.\n" . "To add a line, put the word ADD at the beginning. To change a line,\n" . "first DELETE it and then ADD it again.\n\n"; my $ffilter = $rhArgs->{"file"}; while ( my $line = <$ffilter> ) { if ( (my $expr, my $scope, my $header) = $line =~ m%(.*).*(.*).*
(.*)
%io) { $body .= "$expr\t$scope\t
$header
\n"; } } $body .= "\n"; insert_body( $body ); } ) or return 1; # Write a new Subject line. set_header( "Subject:", " UPDATE COOKIE=$cookie TTL=$time_to_live\n", $subscript ); # All done. return 1; } # BCC/Mailing List Add/Delete if (header_match( "Subject:", qr/^\s*Re: UPDATE COOKIE/io, $subscript )) { # Get the current body, munge some headers. set_header( "Content-Type:", " text/plain\n", $subscript ); my $body = get_body(); # Get the cookie from the subject line. my $subject = get_header( "Subject:", $subscript ); my $cookie; my $ttl; if (!(($cookie, $ttl) = ${$subject}[0] =~ /UPDATE COOKIE=(\d+)\s+TTL=(\d+)/io)) { set_header( "Subject:", " Re: UPDATE COOKIE FAILED\n", $subscript); set_header( "X-PerlJacket:", " Error: update failure parsing subject line.\n", $subscript, -1 ); my $body = "Unable to successfully parse cookie and time to live from subject:\n" . ${$subject}[0] . "\n\n"; insert_body($body); return 1; } # Read what the cookie is supposed to be and the TTL from the file. my %args = ( subscript => $subscript, filename => "perljacket.cookie" ); read_from_file( \%args, sub { # Error Handler. my $rhArgs = shift; # Make a body for the e-mail explaining the problem. set_header( "Subject:", " Re: UPDATE COOKIE FAILED\n", $rhArgs->{"subscript"} ); set_header( "X-PerlJacket:", " Error: update can't read cookie file.\n", $subscript, -1 ); my $error = $rhArgs->{"error"}; my $body = "Can't read cookie file, $error\n\n"; insert_body( $body ); }, sub { # Reader. my $rhArgs = shift; my $fcookie = $rhArgs->{"file"}; my $line = <$fcookie>; ( my $cookie, my $ttl ) = split /,/, $line, 2; $rhArgs->{"cookie"} = $cookie if (defined $cookie); $rhArgs->{"ttl"} = $ttl if (defined $ttl); } ) or return 1; # Validate the cookie. if (! ((defined $args{cookie}) && (defined $args{ttl})) ) { set_header( "Subject:", " Re: UPDATE COOKIE FAILED\n", $subscript); set_header( "X-PerlJacket:", " Error: update failure parsing cookie file.\n", $subscript, -1 ); my $body = "Unable to successfully parse cookie and time to live from file.\n\n"; insert_body($body); return 1; } if ($args{cookie} != $cookie) { set_header( "Subject:", " Re: UPDATE COOKIE FAILED\n", $subscript); set_header( "X-PerlJacket:", " Error: update failure matching cookie.\n", $subscript, -1 ); my $body = "Failed to match cookie in subject:\n" . ${$subject}[0] . "\n\n"; insert_body($body); return 1; } if (time() > $args{ttl}) { set_header( "Subject:", " Re: UPDATE COOKIE FAILED\n", $subscript); set_header( "X-PerlJacket:", " Error: update TTL expired.\n", $subscript, -1 ); my $body = "Time To Live has expired.\n\n"; insert_body($body); return 1; } my $time_to_live = $args{ttl}; undef %args; # Build a list of the updates from the e-mail body. my @updates; foreach my $line (split /^/, $body) { if ((my $op, my $expr, my $scope, my $header) = $line =~ m%^(ADD|DELETE).*(.*).*\s*(.*).*
\s*(.*)
%io) { my @sorted; if ($scope !~ /=\w+=/o) { @sorted = sort {lc($a) cmp lc($b)} split /,\s*/o, $scope; } else { @sorted = ( $scope ); } push @updates, { oper => uc($op), expr => $expr, scope => \@sorted, header => $header, matchkey => join( "", $expr, @sorted, $header) }; } } # We're done with the original body now, so we can undef it to free up space. $body = ""; # Another state engine... pity .*n.*x doesn't have a file locking model like # oooohhhh, VMS. This one locks the file, reads it in, messes with it, then # writes it back out and finally unlocks it. my $error = "ok"; my $initial_error; my @filter; my %matchkeys; for( my $state = "init"; $state ne "done"; do { # No errors, everything is sequential logically speaking. if ($error eq "ok") { $state = "done" if ($state eq "close"); $state = "close" if ($state eq "trunc"); $state = "trunc" if ($state eq "write"); $state = "write" if ($state eq "seek"); $state = "seek" if ($state eq "update"); $state = "update" if ($state eq "read"); $state = "read" if ($state eq "lock"); $state = "lock" if ($state eq "open"); $state = "open" if ($state eq "init"); } else { if (!defined $initial_error) { # Make a body for the e-mail explaining the problem. set_header( "Subject:", " Re: UPDATE COOKIE FAILED\n", $subscript); set_header( "X-PerlJacket:", " Error: update failure updating filter file.\n", $subscript, -1 ); $body .= "Can't update filter file, $error\n\n"; insert_body( $body ); $initial_error = $error; } $state = "done" if ($state eq "close"); $state = "close" if ($state eq "trunc") || ($state eq "write") || ($state eq "seek") || ($state eq "update") || ($state eq "read") || ($state eq "lock"); $state = "done" if ($state eq "open") || ($state eq "init"); # We're done. return 1 if ($state eq "done"); } }) { ($state eq "init") && do { # Just seein' if you're payin' attention! 1; } or ($state eq "open") && do { sysopen(FILTER, "perljacket.filter", O_RDWR | O_CREAT) or $error = "open: $!"; 1; } or ($state eq "lock") && do { flock(FILTER, LOCK_EX) or $error = "lock: $!"; 1; } or ($state eq "read") && do { # Load the filter into memory. while ( my $line = ) { if ((my $expr, my $scope, my $header) = $line =~ m%(.*).*(.*).*
(.*)
%io) { my @sorted; if ($scope !~ /=\w+=/o) { @sorted = sort {lc($a) cmp lc($b)} split /,\s*/o, $scope; } else { @sorted = ( $scope ); } push @filter, { expr => $expr, scope => \@sorted, header => $header }; $matchkeys{join("", $expr, @sorted, $header)} = $#filter; } else { # Preserve comment lines. push @filter, { line => $line }; } } 1; } or ($state eq "update") && do { # Apply the updates, keeping track of the # success/failure of each change. my $update; foreach $update (@updates) { ($update->{oper} eq "ADD") && do { # Make sure it doesn't already exist. if (defined $matchkeys{ $update->{matchkey} }) { $body .= "duplicate record: " . "$update->{expr}\t" . "" . join( ",", @{$update->{scope}} ) . "\t" . "
$update->{header}
\n"; } else { # Add it to the filter. push @filter, { expr => $update->{expr}, scope => $update->{scope}, header => $update->{header} }; # Add it to the master hash. $matchkeys{ $update->{matchkey} } = $#filter; $body .= "added: " . "$update->{expr}\t" . "" . join( ",", @{$update->{scope}} ) . "\t" . "
$update->{header}
\n"; } 1; } or ($update->{oper} eq "DELETE") && do { # Delete it if it exists. if (defined $matchkeys{ $update->{matchkey} }) { # Mark the record as deleted, so it won't get written back # out. $filter[ $matchkeys{ $update->{matchkey} } ]{deleted} = 1; # Delete it from the master hash. delete $matchkeys{ $update->{matchkey} }; $body .= "deleted: " . "$update->{expr}\t" . "" . join( ",", @{$update->{scope}} ) . "\t" . "
$update->{header}
\n"; } else { $body .= "not found: " . "$update->{expr}\t" . "" . join( ",", @{$update->{scope}} ) . "\t" . "
$update->{header}
\n"; } 1; } or # Otherwise.. $error = "keyword unknown: $update->{oper}\n", last; } 1; } or ($state eq "seek") && do { seek( FILTER, 0, 0 ) or $error = "seek: $!"; 1; } or ($state eq "write") && do { # Write it back out. foreach my $line (@filter) { next if (defined $line->{deleted}); if (defined $line->{line}) { print FILTER $line->{line} or $error = "write: $!", last; } else { print FILTER "$line->{expr}\t" . "" . join( ",", @{$line->{scope}} ) . "\t" . "
$line->{header}
\n" or $error = "write: $!", last; } } 1; } or ($state eq "trunc") && do { truncate(FILTER, tell(FILTER)) or $error = "trunc: $!"; 1; } or ($state eq "close") && do { close(FILTER) or $error = "close: $!"; 1; } or $error = "Bozo state = $state"; } # for $state loop # Set an appropriate subject line. # No newline looks better here for some reason. FWM, 21-May-2004 set_header( "Subject:", " UPDATE STATUS: COOKIE=$cookie TTL=$time_to_live", $subscript ); # Make a new body with the status of the changes. insert_body( $body ); # All done. return 1; } # Return indicating that this wasn't a filter update. return 0; } # &update_filter # # # # # -=-=-=-=- Handler subroutines. # Called after all of the (global) mail headers have been loaded. sub at_Header($) { my $subscript = shift; # pointless, will always be zero... print "****** AT HEADER 0 ******\n" if ($Options{"d"}); # -=-=-=-=- BCC/Mailing List Maintenance -=-=-=-=- return if update_filter( $subscript ); # -=-=-=-=- BCC/Mailing List Filtering -=-=-=-=- # Read in the filter file. my %args = ( subscript => $subscript, filename => "perljacket.filter" ); read_from_file( \%args, sub { # Error Handler. my $rhArgs = shift; # Set a X-PerlJacket header explaining the problem, since we're going to # continue to process the message regardless. my $error = $rhArgs->{"error"}; set_header( "X-PerlJacket:", " Error: bcc filter can't read filter file.\n", $subscript, -1 ); # ' set_header( "X-PerlJacket:", " $error\n", $subscript, -1 ); }, sub { # Reader. my $rhArgs = shift; my @filter; my $ffilter = $rhArgs->{"file"}; while ( my $line = <$ffilter> ) { if ((my $expr, my $scope, my $header) = $line =~ m%^(.*).*(.*).*
(.*)
%io) { my @scope = (); if ($scope =~ /=(\w+)=/o) { no strict 'refs'; if (defined @{$1}) { push @scope, @{$1}; } else { set_header( "X-PerlJacket:", " Error: bcc filter '$1' not a recognized cliche.\n", $subscript, -1 ); } } else { @scope = split /,\s*/o, $scope; } push @filter, { expr => $expr, scope => \@scope, header => $header } if (($scope[0]) || ($expr eq '=All=')); # FWM, 19-May-2002 } } $rhArgs->{filter} = \@filter; } ) or do { delete $args{filter} }; # Calculate some special variables which can be interpolated into # the filter expressions. FWM, 20-Dec-2003 %Interpolations = (); # from_domain, FWM, 20-Dec-2003 build_interpolation( 'From:', $subscript, 'from_domain', sub { my $line = shift; (my $domain) = $line =~ m/\@([\w.-]*)/o; return undef unless (defined $domain); (my $tld) = $domain =~ m/([\w-]+\.[\w-]+\.\w\w)$/o; ($tld) = $domain =~ m/([\w-]*\.[\w-]*)$/o if (! defined $tld); return $tld; } ); # Apply the filter. if (defined $args{filter}) { foreach my $filter ( @{$args{filter}} ) { # Rewrite flag. FWM, 18-May-2002 my $matched = 0; if ($filter->{expr} eq '=All=') { $matched = 1; } else { (my $neg, my $expr) = $filter->{expr} =~ /(!?)(.*)/o; # Certain predefined values cause the expression to be substituted prior # to evaluation. FWM, 20-Dec-2003 $expr =~ s/\$(\w+)/defined($Interpolations{$1}) ? $Interpolations{$1} : $1/eg; # Compile and bozo-check the expression. my $re = eval { qr/$expr/i } || do { set_header( "X-PerlJacket:", " Error: bcc filter bad expression.\n", $subscript, -1 ); set_header( "X-PerlJacket:", " $filter->{expr}\n", $subscript, -1 ); 0; }; # Test for it. if ($re) { $matched = 1 if ( (((undef, undef) = self_match( $re, $subscript, @{$filter->{scope}} )) ? 1 : 0) ^ (($neg) ? 1 : 0) ); } } if ($matched) { if ($filter->{header} eq '=No-Rewrite=') { $Rewrite = 0; } elsif ($filter->{header} eq '=Rewrite=') { $Rewrite = 1; } else { set_header( $filter->{header}, " $filter->{expr}\n", $subscript, -1 ) } } } } undef %args; # -=-=-=-=- text/html Rewrite -=-=-=-=- if (($Rewrite) && (header_match( "Content-Type:", qr%^\s*text/html%io, $subscript ))) { # This will be our MIME boundary.. we will hope for no collisions. my $jacketMIME = "*##*perljacket*##*"; # Munge some headers. set_header( "Content-Type:", " multipart/mixed; boundary=\"$jacketMIME\"\n", $subscript ); set_header( "X-PerlJacket:", " text/html rewrite\n", $subscript, -1 ); delete_header( "Content-Disposition:", $subscript ); delete_header( "Content-Transfer-Encoding:", $subscript ); # Build some stuff to create headers for the text/html "attachment". my $a; $a = get_header( "Subject:", $subscript ); ${$a}[0] .= "no subject"; ${$a}[0] =~ m/^\W*(.*)/; (my $filename = $1) =~ s/\W/_/g; $filename = substr( $filename, 0, 10 ) . ".html"; $a = get_header( "Content-Type:", $subscript ); (my $charset) = ${$a}[0] =~ /charset=(\S*)/i; $charset = (defined $charset) ? "; charset=$charset" : ""; # FWM, 21-May-2002 my $content_type = "Content-Type: text/html; name=\"$filename\"$charset\n"; my $content_disposition = "Content-Disposition: attachment; filename=\"$filename\"\n"; my $content_transfer_encoding = ""; $a = get_header( "Content-Transfer-Encoding:", $subscript ); $content_transfer_encoding = "Content-Transfer-Encoding: ${$a}[0]" # FWM, 19-May-2002 if (defined $a); # Rewrite the body. my $body = "--$jacketMIME\n" . "Content-Type: text/plain\n\n" . "text/html body rewritten as attachment $filename.\n\n" . "\n--$jacketMIME\n" . $content_type . $content_transfer_encoding . $content_disposition . "\n" . get_body() . "\n--$jacketMIME--\n\n"; # Write the body back to the message. insert_body( $body ); } } # &at_Header # Called after all of the headers for a MIME multipart have been loaded. # Global mail headers may serve as a MIME multipart declaration, so this handler # may be called after the header for global mail headers. sub at_Multipart($$) { my $part = shift; my $subscript = shift; print "****** AT MULTIPART $part ******\n" if ($Options{"d"}); # -=-=-=-=- multipart/alternative|parallel Rewrite -=-=-=-=- if (($Rewrite) && (header_match( "Content-Type:", qr%^\s*multipart/(?:alternative|parallel)%io, $subscript ))) { my $a = get_header( "Content-Type:", $subscript ); (my $rewrite = ${$a}[0]) =~ s%(^\s*multipart/)(alternative|parallel)(.*)%$1mixed$3%i; set_header( "Content-Type:", $rewrite, $subscript ); set_header( "X-PerlJacket:", " multipart/alternative|parallel rewrite.\n", $subscript, -1 ); } } # &at_Multipart # Called after the headers for a (final) MIME entity have been loaded. sub at_Entity($$) { my $part = shift; my $subscript = shift; print "****** AT ENTITY $part ******\n" if ($Options{"d"}); # -=-=-=-=- text/html Part Of multipart/alternative Drop -=-=-=-=- if (($Rewrite) && (header_match( "Content-Type:", qr%^\s*text/html%io, $subscript ))) { if ((my $parent_subscript,undef) = parent_match( qr%^\s*multipart/alternative%io, $subscript, "Content-Type:" )) { if ((undef,undef) = sibling_match( qr%^\s*text/plain%io, $subscript, "Content-Type:" )) { # Can't set the parent's headers, because they've already been emitted. #set_header( "X-PerlJacket:", " dropped text/html part of multipart/alternative\n", # $parent_subscript, -1 ) if (defined $parent_subscript); my $dummy = get_body(); undef $dummy; # The following ends up being part of the previous part... print "**PerlJacket** deleted text/html part.\n\n"; # .. but this part will be deleted. delete_part(); } } } # -=-=-=-=- VCARD and Related Garbage Drop -=-=-=-=- my $vcard = 0; # FWM, 18-Aug-2002 if (($Rewrite) && ( ( header_match( "Content-Type:", qr%(?:application/ms-tnef|vcard)%io, $subscript ) ) || ( (undef,undef) = self_match( qr/winmail\.dat/io, $subscript, "Content-Type:", "Content-Disposition:" ) ) ) ) { # Can't retroactively rewrite message headers, because they've already been # emitted. Instead, we'll rewrite the part as text/plain and hope for the best # that our message gets through. #set_header( "X-PerlJacket:", " dropped VCARD, tnef, winmail.\n", # $HeaderBlocks{ parent( $part ) }, -1 ); # #delete_part(); # Nuke all headers for the part. Close your eyes, click your heels, and repeat: # "garbage collection... garbage collection... garbage collection..." # .. damn that just doesn't have the same ring as "gen-u-flect..gen-u-flect..gen-u-flect.." # Some people miss George Harrison, but I miss Tom Lehrer! $OrderedHeaders[$subscript] = [ "Content-Type:" ]; $CurrentHeaders[$subscript] = { "CONTENT-TYPE:" => [" text/plain\n"] }; # Reconstitute as a text/plain part with narrative... oh, wait a sec, we started that # above.. my $body = get_body(); $body = "**PerlJacket** VCARD removed.\n\n"; insert_body( $body ); $vcard = 1; # FWM, 18-Aug-2002 } # -=-=-=-=- Verify File Disposition of Everything but text/plain -=-=-=-=- if (($Rewrite) && (! $vcard) # FWM, 18-Aug-2002 && (! header_match( "Content-Type:", qr%^\s*text/plain%io, $subscript ))) { my $a = get_header( "Content-Disposition:", $subscript ); if (defined $a) { undef $a if (${$a}[0] !~ /^\s*attachment\s*;.*filename=/io); } if (! defined $a) { # Use any filename which is part of the Content-Type header. FWM, 18-May-2002 my $filename = undef; # Look for a name specified as part of the Content-Type header. Deprecated but # common. $a = get_header( "Content-Type:", $subscript ); if (defined $a) { ${$a}[0] =~ m/\s*name=("*)((?(?<=")[^"]*|[^\s;]*))/io; $filename = $2 if ($2); } # If no name has been found, dummy one up. if (! defined $filename) { $a = get_header( "Subject:", 0 ); ${$a}[0] .= "no subject"; ${$a}[0] =~ m/^\W*(.*)/; ($filename = $1) =~ s/\W/_/g; $filename = substr( $filename, 0, 10 ) . ".att"; } set_header( "Content-Disposition:", " attachment; filename=\"$filename\"\n", $subscript ); set_header("X-PerlJacket:", " disposition to file of non-text/plain part.\n", $subscript, -1 ); # After Content-Disposition. FWM, 22-May-2002 } } } # &at_Entity # # # # # # -=-=-=-=- Main processing. { # This should handle all of the "proper" contents of the message. if ($UseFile) { # FWM, 05-May-2004 open( INFILE, 'test.txt' ) or die "Cant open text.txt: $!\n"; } level(0,'0'); # Take care of anything after the toplevel final MIME marker. while (!end_of_file()) { ( my $line, undef ) = line(0,0); print $line; } if ($UseFile) { # FWM, 05-May-2004 close INFILE or die "Cant close test.txt: $!\n"; } exit(0); } __END__ =pod =head2 Mail-Based Filter Updates perljacket supports mail-based updates of the filter file. To use this feature, perljacked must be invoked with the -t option. To perform a mail-based update, you send a message with the subject line REQUEST UPDATE COOKIE and perljacket returns the message with a new subject line containing a cookie and time-to-live, and with the body of the message containing the current filters and instructions on how to add and delete filter records (basically you put ADD or DELETE at the beginning of the line). To apply the changes, you then send the e-mail back with a subject line of Re: UPDATE COOKIE... basically tacking "Re: " onto the front of the subject line of the message that perljacket sent to you. perljacket will then send a message detailing which changes were successfully applied and which ones were not. =head2 The perljacket.filter File perljacket uses a file containing rules to scan headers and add header lines to each message for rules which match. The file has the following format. =head3 Any line not conforming to the format of a filter definition is a comment. That should be self-explanatory! =head3 Format of filter definition lines Filter definition lines follow a modified XML syntax where the following three tags must occur in the order given, and all on one line: =over 4 =item EXPR The regular expression to be matched. A "!" at the beginning of the pattern negates it. =item SCOPE A comma-separated list of headers, or else a scope cliche (below). =item HEADER The header to be written. The EXPR will be used as the right-hand side of the header line. =back =head2 Special Expr substitutions These work more or less like ordinary Perl variables being interpolated into the expressions. =over 4 =item $from_domain Represents the domain of the sender as parsed from the From: line. For example, with the address C, $from_domain will interpolate C. =back =head2 Special Expr Cliches A special Expr cliche is provided primarily for use with turning on and off rewriting. =over 4 =item =All= Causes the rest of the rule to be honored for all messages, rather than based on an expression match. =back =head2 Special Scope Cliches Certain cliches are provided for commonly-used groups of headers: =over 4 =item =From= Matches From: Received: Message-ID: Resent-From: Resent-Sender: Sender: Reply-To: =item =To= Matches To: Cc: Bcc: Resent-To: Resent-Cc: Resent-Bcc: =back =head2 Special Header Cliches Two cliches are provided for the Header item which control rewriting (deletion of text/html parts, rewriting of alternative and parallel parts as attachments, etc.). These cliches are applied sequentially. For example if you say apply rewrites to all, and then have a line later in the filter file saying don't apply rewrites when "foo" matches, then the rewrite will be performed in all cases except where "foo" matches. However, if you first had a rule saying don't apply rewrites when "foo" matches, and then later had a line saying apply rewrites to all, then the rewrite would be performed regardless of whether or not "foo" matches. Basically, rewriting is a flag which toggles on and off as rules are evaluated. (The default state is no rewrites) =over 4 =item =Rewrite= Toggles the rewrite flag on. =item =No-Rewrite= Toggles the rewrite flag off. =back =head2 Modification History v2.00 FWM 18-May-2002 at_Header() Make the mail-based filtering stuff a subroutine to get it out of the main-line at_Header processing. v2.00 FWM 18-May-2002 at_Entity() Check the Content-Type for a name to use for attachments. v2.00 FWM 19-May-2002 at_Header(), at_Multipart(), at_Entity() Implement rewrite switch. v2.00 FWM 19-May-2002 at_Header() Get rid of extra newline during text/html rewrite. v2.01 FWM 21-May-2002 at_Header() text/html rewrite: get rid of some extra quotes around the Content-Type charset. v2.01 FWM 21-May-2002 level() MIME boundary detection: \G didn't seem to be reliable on some platforms with long multiline Content-Type headers. v2.02 FWM 22-May-2002 at_Entity() Write the X-PerlJacket header after the Content-Type header for superstitious reasons. v2.03 FWM 18-Aug-2002 at_Entity() Fix a problem with some tnefs still getting through. v2.04 FWM 14-Dec-2002 ... Technically, you cant exit out of a do.. construct with next. It looks good, and works fine, but in the interest of correctness I've gone through the places where I was using do.. to implement case structures and gone to something a little more couth. v2.04 FWM 23-Dec-2002 level() Stop it from eating lines which consist of nothing but two dashes. (Tip 'o th' hat to BLB of NLRC for bringing this to my attention) v2.05 FWM 20-Dec-2003 build_interpolation(),at_header() Implement ability to substitute certain Perlish symbols cadged from the headers into expressions, starting with from_domain. v2.06 FWM 05-May-2004 ... Implement -f which causes PJ to read from C instead of STDIN... useful if invoking with perl -d. v2.06 FWM 05-May-2004 line(),level() Silly rabbit, tricks are for pigs! v2.07 FWM 21-May-2004 read_from_file() Weird SuSE 8.2/Perl 5.8 thing. It wants to do some MIDI ioctl thing with ordinary files, and leaves $! set after successful opens. Do I know why? Do I care?? =head2 Author, Copyright, and Terms Of Use PerlJacket written by and (c) 2002-2004 Fred Morris, Seattle WA. USA e-mail: m3047@inwa.net telephone: 206.297.6344 You are granted a royalty-free, perpetual license as follows: You may use PerlJacket at your own risk, subject to your own determination that it performs acceptably for your needs. PerlJacket is provided as-is, and with no warranty as to performance or fitness for a particular use. PerlJacket does not conform strictly to the specifications for mail and MIME defined in the IETF RFCs; it performs as the author desires it to perform. If it does not perform as you wish it to, you must cease use, or modify it to suit your own needs. As a condition of this license you agree to hold the author and Fred Morris Consulting harmless for any and all damages, whether direct, consequential or arising from any other theory of law. You may copy, redistribute, modify or create derivative works provided that you give the author proper attribution and furthermore agree to indemnify him against any claims arising from its use by you or others. You have not been required to compensate the author or Fred Morris Consulting in consideration for this license, and you waive all rights to monetary compensation should a claim be upheld. You agree that any claims or other actions pertaining to PerlJacket, its use or this license will be brought in a court of competent jurisdiction in the State of Washington, USA. =cut