perljacket

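A processor/filter callable from procmail. It scans mail headers against filter rules and adds header lines for the rules which match, optionally rewrites MIME content, and supports mail-based updates of the filter file.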

Global Variables

The following global variables may be useful to you when writing your own part handlers:

%HeaderBlocks
This hash maps block specifiers to subscripts into the following four arrays. Block specifiers have the form 0 (the mail headers), 0.1 (an entity at the top MIME level, or a multipart header at that level), 0.1.1 (the first subpart of the first MIME part), 0.1.2 (the second subpart of the first MIME part), and so on.

@BlockHeaders
The inverse of %HeaderBlocks: indexed by subscript, it returns the corresponding block specifier (the %HeaderBlocks key).
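
As a rough illustration of how these two structures relate, the following Python sketch models %HeaderBlocks as a dict and @BlockHeaders as a list. perljacket itself is written in Perl; the sample specifiers, the Python names, and the order in which subscripts are handed out (order of first encounter) are assumptions made for the example.

```python
# Hypothetical model of %HeaderBlocks / @BlockHeaders (not perljacket's code).
# Assumption: subscripts are handed out in the order blocks are encountered.
specs_in_order = ["0", "0.1", "0.1.1", "0.1.2", "0.2"]

header_blocks = {}   # block specifier -> subscript   (models %HeaderBlocks)
block_headers = []   # subscript -> block specifier   (models @BlockHeaders)

for spec in specs_in_order:
    header_blocks[spec] = len(block_headers)
    block_headers.append(spec)

# The same subscript would index the per-part arrays
# (@OrigHeaders, @CurrentHeaders, @OrderedHeaders).
subscript = header_blocks["0.1.2"]
print(subscript, block_headers[subscript])
```

Looking a specifier up in the hash and then indexing the array with the result gets you back to the specifier you started with.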

@OrigHeaders
Indexes hashes of the original (unmodified) headers of the part.

@CurrentHeaders
Indexes hashes of the edited headers of the part.

@OrderedHeaders
An array of arrays of the names of headers for the part, in the order encountered.

$Rewrite
If true, then content rewrites are performed. Switched on and off by the =Rewrite= and =No-Rewrite= header cliches.

%Interpolations
Contains strings extracted from headers which are available for substitution into headers.

@SeparatorRewrite
Some idiot spammers think it's cute to mangle MIME separators, so we mangle them some more.

Command Line Options

The program takes the following switches:

f
If supplied, then input is read from the file test.txt instead of from STDIN.

d
If supplied, then a debugging line is printed each time a handler is fired.

e
If supplied, then each line is echoed before being processed.

s integer
If supplied, then the value is used for additional randomization when generating verification cookies for mail-based filter updates.

t integer
The time to live in seconds for randomized verification cookies. If not supplied, then mail-based filter updates are disabled.

Part Handlers

Three part handlers are defined for processing mail. The firing of these handlers is (mostly) controlled by two special header item flags in perljacket.filter.

at_Header
This handler is called before the (global) mail headers are written back out.

at_Multipart
This handler is called before the headers for a multipart entity are written back out. This means that it may be called at the global level as well, after the at_Header handler.

at_Entity
This handler is called before the headers for a (final) entity are written back out. It is ONLY called for MIME final entities; it will NOT be called for the body of a mail message which is not a MIME part.

Routines for Writing Part Handlers

Some routines are supplied which should be handy for writing part handlers:

parent( entity-spec )
Returns a string containing the ``dotted'' parent of the passed string.
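
The idea can be sketched in a few lines of Python. This is a model of the behavior described above, not the Perl routine itself; in particular, what the real parent() returns for the top-level specifier 0 is not stated here, so the empty string used below is an assumption.

```python
def parent(spec: str) -> str:
    """Return the dotted parent of a block specifier, e.g. "0.1.2" -> "0.1"."""
    head, _, _ = spec.rpartition(".")
    # Assumption: a specifier with no dot (the top-level "0") has no parent,
    # modelled here as an empty string; the real routine may behave differently.
    return head

print(parent("0.1.2"))  # 0.1
print(parent("0.1"))    # 0
```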

self_match, sibling_match, parent_match( matchexpr, subscript, header-list )
Each of these routines matches one or more headers, and returns the subscript where the match was found and the key of the matching header. The functions do a bare return if there is no match.

The arguments are:

match expression
The regular expression to be matched.

subscript
The subscript of the caller, ``me''.

headers
Array-by-value of the headers to be searched.

get_body()
Returns a string with the body of the part at this level, consuming the part.

insert_body( new-body )
Inserts the string argument at the beginning of the body for this part.

delete_part()
Suppresses the part.

set_header( header, new-text, subscript, instance )
Sets the specified header. The instance is optional, and has the following semantics:
(-1)
Create a new instance.

(0)
Replace/create last instance.

(1..n)
Replace/create (n-1)th instance.

No instance
Equivalent to instance 0.
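
One way to read these instance rules is sketched below in Python, operating on the list of values held for a single header. It is only a model: the function name is hypothetical, ``(n-1)th instance'' is read as a zero-based position n-1, and ``create'' is taken to mean appending a new value when the addressed instance does not exist. The real set_header may differ on either point.

```python
def set_instance(values, new_text, instance=0):
    """Model of the instance argument, applied to one header's list of values."""
    values = list(values)              # work on a copy
    if instance == -1 or not values:
        values.append(new_text)        # create a new instance
    elif instance == 0:
        values[-1] = new_text          # replace the last instance
    else:
        index = instance - 1           # (1..n) addresses zero-based slot n-1
        if index < len(values):
            values[index] = new_text   # replace an existing instance
        else:
            values.append(new_text)    # assumption: "create" means append
    return values

received = ["from a.example", "from b.example"]
print(set_instance(received, "from c.example", -1))
print(set_instance(received, "from c.example"))
print(set_instance(received, "from c.example", 1))
```

The three calls show, in turn, a third value being added, the last value being replaced, and the first value being replaced.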

get_header( header, subscript )
Returns the header array reference for the specified header name and part.

delete_header( header, subscript )
Deletes all instances of the specified header.

read_from_file( arg-hash, error-handler, reader-handler )
Encapsulates reading from a file, with error handling and locking. The first argument is a hash, and read_from_file will expect to see:
  subscript     => $subscript
  filename      => name of the file to read

Additionally, the error handler can expect to see:

  error         => an error message
  
and the reader handler can expect to see:
  file          => a reference to the filehandle

Mail-Based Filter Updates

perljacket supports mail-based updates of the filter file. To use this feature, perljacket must be invoked with the -t option.

To perform a mail-based update, you send a message with the subject line

  REQUEST UPDATE COOKIE
  
and perljacket returns the message with a new subject line containing a cookie and
time-to-live, and with the body of the message containing the current filters and
instructions on how to add and delete filter records (basically you put ADD or DELETE
at the beginning of the line).

To apply the changes, you then send the e-mail back with a subject line of

  Re: UPDATE COOKIE...
  
basically tacking "Re: " onto the front of the subject line of the message that perljacket
sent to you. perljacket will then send a message detailing which changes were successfully
applied and which ones were not.

The perljacket.filter File

perljacket uses a file containing rules to scan headers and add header lines to each message for rules which match. The file has the following format.

Any line not conforming to the format of a filter definition is a comment.

That should be self-explanatory!

Format of filter definition lines

Filter definition lines follow a modified XML syntax where the following three tags must occur in the order given, and all on one line:

EXPR
The regular expression to be matched. A ``!'' at the beginning of the pattern negates it.

SCOPE
A comma-separated list of headers, or else a scope cliche (below).

HEADER
The header to be written. The EXPR will be used as the right-hand side of the header line.

Special Expr substitutions

These work more or less like ordinary Perl variables being interpolated into the expressions.

$from_domain
Represents the domain of the sender as parsed from the From: line. For example, with the address chuckie123@mail.somewhere.net, $from_domain interpolates as somewhere.net.
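
A minimal Python sketch of an extraction that reproduces this example is shown below. How perljacket actually parses the From: line is not documented here; keeping the last two dot-separated labels of the host part is an assumption, and it would give the wrong answer for domains such as example.co.uk.

```python
def from_domain(address: str) -> str:
    """Model of $from_domain for a bare address (no display name or brackets)."""
    host = address.rsplit("@", 1)[-1].strip().lower()
    labels = host.split(".")
    # Assumption: the "domain" is the last two labels of the host name.
    return ".".join(labels[-2:])

print(from_domain("chuckie123@mail.somewhere.net"))  # somewhere.net
```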

Special Expr Cliches

A special Expr cliche is provided, primarily for turning rewriting on and off.

=All=
Causes the rest of the rule to be honored for all messages, rather than based on an expression match.

Special Scope Cliches

Certain cliches are provided for commonly-used groups of headers:

=From=
Matches From: Received: Message-ID: Resent-From: Resent-Sender: Sender: Reply-To:

=To=
Matches To: Cc: Bcc: Resent-To: Resent-Cc: Resent-Bcc:
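
The way an Expr and a Scope combine can be modelled in Python as follows. This is a sketch under several assumptions, not perljacket's matching code: the function names are hypothetical, header names are compared case-insensitively, a negated rule is taken to match when no header in scope matches the pattern, and Python's regular expressions stand in for Perl's.

```python
import re

# Header groups as listed for the scope cliches.
SCOPE_CLICHES = {
    "=From=": ["From", "Received", "Message-ID", "Resent-From",
               "Resent-Sender", "Sender", "Reply-To"],
    "=To=": ["To", "Cc", "Bcc", "Resent-To", "Resent-Cc", "Resent-Bcc"],
}

def scope_headers(scope):
    """Expand a scope cliche, or split a comma-separated list of headers."""
    if scope in SCOPE_CLICHES:
        return SCOPE_CLICHES[scope]
    return [name.strip().rstrip(":") for name in scope.split(",")]

def rule_matches(expr, scope, headers):
    """headers: dict mapping header name -> list of values."""
    if expr == "=All=":
        return True                      # honored for all messages
    negate = expr.startswith("!")        # leading "!" negates the pattern
    pattern = re.compile(expr[1:] if negate else expr)
    names = {n.lower() for n in scope_headers(scope)}
    hit = any(pattern.search(value)
              for name, values in headers.items() if name.lower() in names
              for value in values)
    return hit != negate

mail = {"From": ["spam@bulk.example"], "To": ["me@home.example"]}
print(rule_matches(r"bulk\.example", "=From=", mail))
print(rule_matches(r"bulk\.example", "=To=", mail))
```

The first call matches because From: is in the =From= group; the second does not, because the pattern only occurs in a header outside the =To= group.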

Special Header Cliches

Two cliches are provided for the Header item which control rewriting (deletion of text/html parts, rewriting of alternative and parallel parts as attachments, etc.). These cliches are applied sequentially.

For example, if one rule applies rewrites to all messages and a later line in the filter file says not to apply rewrites when ``foo'' matches, then the rewrite is performed in all cases except where ``foo'' matches.

However, if a rule saying not to apply rewrites when ``foo'' matches comes first, and a later line applies rewrites to all, then the rewrite is performed regardless of whether ``foo'' matches.

Basically, rewriting is a flag which is switched on and off as rules are evaluated. (The default state is no rewrites.)

=Rewrite=
Toggles the rewrite flag on.

=No-Rewrite=
Toggles the rewrite flag off.
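
The sequential behavior of the two cliches can be illustrated with a short Python model. The rule representation (pairs of Expr and Header cliche) and the function names are hypothetical; only the ordering logic and the default state of no rewrites are taken from the description in this section.

```python
def final_rewrite_state(rules, matches):
    """rules: list of (expr, header_cliche) pairs in filter-file order.
    matches: function deciding whether expr matches the current message."""
    rewrite = False                      # default state: no rewrites
    for expr, cliche in rules:
        if expr != "=All=" and not matches(expr):
            continue                     # rule does not apply to this message
        if cliche == "=Rewrite=":
            rewrite = True
        elif cliche == "=No-Rewrite=":
            rewrite = False
    return rewrite

def message_matches_foo(expr):
    """Stand-in for a message in which only the expression "foo" matches."""
    return expr == "foo"

all_then_foo = [("=All=", "=Rewrite="), ("foo", "=No-Rewrite=")]
foo_then_all = [("foo", "=No-Rewrite="), ("=All=", "=Rewrite=")]
print(final_rewrite_state(all_then_foo, message_matches_foo))  # False
print(final_rewrite_state(foo_then_all, message_matches_foo))  # True
```

With the rules in the first order, a message matching ``foo'' ends up with rewriting off; with the order reversed, the later =All= rule switches it back on.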

Modification History

v2.00 FWM 18-May-2002 at_Header() Make the mail-based filtering stuff a subroutine to get it out of the main-line at_Header processing.

v2.00 FWM 18-May-2002 at_Entity() Check the Content-Type for a name to use for attachments.

v2.00 FWM 19-May-2002 at_Header(), at_Multipart(), at_Entity() Implement rewrite switch.

v2.00 FWM 19-May-2002 at_Header() Get rid of extra newline during text/html rewrite.

v2.01 FWM 21-May-2002 at_Header() text/html rewrite: get rid of some extra quotes around the Content-Type charset.

v2.01 FWM 21-May-2002 level() MIME boundary detection: \G didn't seem to be reliable on some platforms with long multiline Content-Type headers.

v2.02 FWM 22-May-2002 at_Entity() Write the X-PerlJacket header after the Content-Type header for superstitious reasons.

v2.03 FWM 18-Aug-2002 at_Entity() Fix a problem with some tnefs still getting through.

v2.04 FWM 14-Dec-2002 ... Technically, you can't exit out of a do.. construct with next. It looks good, and works fine, but in the interest of correctness I've gone through the places where I was using do.. to implement case structures and gone to something a little more couth.

v2.04 FWM 23-Dec-2002 level() Stop it from eating lines which consist of nothing but two dashes. (Tip 'o th' hat to BLB of NLRC for bringing this to my attention)

v2.05 FWM 20-Dec-2003 build_interpolation(),at_header() Implement ability to substitute certain Perlish symbols cadged from the headers into expressions, starting with from_domain.

v2.06 FWM 05-May-2004 ... Implement -f which causes PJ to read from test.txt instead of STDIN... useful if invoking with perl -d.

v2.06 FWM 05-May-2004 line(),level() Silly rabbit, tricks are for pigs!

v2.07 FWM 21-May-2004 read_from_file() Weird SuSE 8.2/Perl 5.8 thing. It wants to do some MIDI ioctl thing with ordinary files, and leaves $! set after successful opens. Do I know why? Do I care??

Author, Copyright, and Terms Of Use

PerlJacket written by and (c) 2002-2004 Fred Morris, Seattle WA. USA e-mail: m3047@inwa.net telephone: 206.297.6344

You are granted a royalty-free, perpetual license as follows:

You may use PerlJacket at your own risk, subject to your own determination that it performs acceptably for your needs. PerlJacket is provided as-is, and with no warranty as to performance or fitness for a particular use. PerlJacket does not conform strictly to the specifications for mail and MIME defined in the IETF RFCs; it performs as the author desires it to perform. If it does not perform as you wish it to, you must cease use, or modify it to suit your own needs. As a condition of this license you agree to hold the author and Fred Morris Consulting harmless for any and all damages, whether direct, consequential or arising from any other theory of law. You may copy, redistribute, modify or create derivative works provided that you give the author proper attribution and furthermore agree to indemnify him against any claims arising from its use by you or others. You have not been required to compensate the author or Fred Morris Consulting in consideration for this license, and you waive all rights to monetary compensation should a claim be upheld. You agree that any claims or other actions pertaining to PerlJacket, its use or this license will be brought in a court of competent jurisdiction in the State of Washington, USA.