

cmd:procmail


Info

http://directory.google.com/Top/Computers/Software/Internet/Clients/Mail/Unix/Procmail/

http://directory.google.com/Top/Computers/Software/Internet/Clients/Mail/Unix/Procmail/Tutorials/

Procmail Quick Reference Guide
http://www.ling.helsinki.fi/users/reriksso/procmail/quickref.html

http://www.sektorn.mooo.com/era/procmail/quickref.html

Debugging Procmail Recipes: Some Tips
http://www.ling.helsinki.fi/users/reriksso/mail/procmail-debug.html

my quick help

    # ########################################### &-procmail_usage_hints ###
    # ** Delivering

    # There are two kinds of recipes: delivering and non-delivering
    # recipes. If a delivering recipe is found to match, procmail
    # considers the mail (you guessed it) delivered and will cease
    # processing the rcfile after having successfully executed the
    # action line of the recipe. If a non-delivering recipe is found
    # to match, processing of the rcfile will continue after the
    # action line of this recipe has been executed.

    # Delivering recipes are those that cause header and/or body of
    # the mail to be: written into a file, absorbed by a program or
    # forwarded to a mailaddress.

    # Non-delivering recipes are: those that cause the output of a
    # program or filter to be captured back by procmail or those that
    # start a nesting block.

    # You can tell procmail to treat a delivering recipe as if it were
    # a non-delivering recipe by specifying the `c' flag on such a
    # recipe. This will make procmail generate a carbon copy of the
    # mail by delivering it to this recipe, yet continue processing
    # the rcfile.

    # ** Recipes

    # A line starting with ':' marks the beginning of a recipe. It has
    # the following format:

    #  :0 [flags] [ : [locallockfile] ]
    #  <zero or more conditions (one per line)>
    #  <exactly one action line>

    # Conditions start with a leading `*', everything after that
    # character is passed on to the internal egrep literally, except
    # for leading and trailing whitespace.

    # ** Quick Reference Guide
    # http://www.ling.helsinki.fi/users/reriksso/procmail/quickref.html

    # ** Debugging Procmail Recipes: Some Tips
    # http://www.ling.helsinki.fi/users/reriksso/mail/procmail-debug.html

Help

A line starting with `:' marks the beginning of a recipe. It has the following format:

    :0 [flags] [ : [locallockfile] ]
    <zero or more conditions (one per line)>
    <exactly one action line>

Conditions start with a leading `*'; everything after that character is passed on to the internal egrep literally, except for leading and trailing whitespace. These regular expressions are completely compatible with the normal egrep(1) extended regular expressions. See also Extended regular expressions.

Conditions are anded; if there are no conditions the result will be true by default.

Flags can be any of the following:

H 
       Egrep the header (default). 
B 
       Egrep the body. 
D 
       Tell the internal egrep to distinguish between upper and lower case
       (contrary to the default which is to ignore case). 
A 
       This recipe will not be executed unless the conditions on the last
       preceding recipe (on the current blocknesting level) without the
       `A' or `a' flag matched as well. This allows you to chain actions
       that depend on a common condition. 
a 
       Has the same meaning as the `A' flag, with the additional condition
       that the immediately preceding recipe must have been successfully
       completed before this recipe is executed. 
E 
       This recipe only executes if the immediately preceding recipe was
       not executed. Execution of this recipe also disables any
       immediately following recipes with the `E' flag. This allows you to
       specify `else if' actions. 
e 
       This recipe only executes if the immediately preceding recipe
       failed (i.e. the action line was attempted, but resulted in an error). 
h 
       Feed the header to the pipe (default). 
b 
       Feed the body to the pipe (default). 
f 
       Consider the pipe as a filter. 
c 
       Generate a carbon copy of this mail. This only makes sense on
       delivering recipes. The only non-delivering recipe this flag has an
       effect on is a nesting block: to generate a carbon copy, procmail
       clones the running process (lockfiles will not be inherited); the
       clone proceeds as usual and the parent jumps across the block. 
w 
       Wait for the filter or program to finish and check its exitcode
       (normally ignored); if the filter is unsuccessful, then the text will
       not have been filtered. 
W 
       Has the same meaning as the `w' flag, but will suppress any
       `Program failure' message. 
i 
       Ignore any write errors on this recipe (i.e. usually due to an early
       closed pipe). 
r 
       Raw mode, do not try to ensure the mail ends with an empty line,
       write it out as is. 

There are some special conditions you can use that are not straight regular expressions. To select them, the condition must start with:

! 
       Invert the condition. 
$ 
       Evaluate the remainder of this condition according to sh(1)
       substitution rules inside double quotes, skip leading whitespace,
       then reparse it. 
? 
       Use the exitcode of the specified program. 
< 
       Check if the total length of the mail is shorter than the specified (in
       decimal) number of bytes. 
> 
       Analogous to `<'. 
variablename ?? 
       Match the remainder of this condition against the value of this
       environment variable (which cannot be a pseudo variable). A special
       case is if variablename is equal to `B', `H', `HB' or `BH'; this
       merely overrides the default header/body search area defined by the
       initial flags on this recipe. 
\ 
       To quote any of the above at the start of the line. 

Mail Management With Procmail

By Vikram Vaswani
January 16, 2003

Printed from DevShed.com
URL: http://www.devshed.com/Server_Side/Administration/Procmail


Speaking Geek

Let's start with a very basic question - what is procmail, and how is it going to make your life more fun?

Before I can answer that question, there are three acronyms you need to learn. Here they are:

MTA: An MTA, or mail transfer agent, is a program that routes mail from one host to another on the Internet. An MTA accepts email, looks at the destination address, and either passes the message on to another MTA or hands it off to an MDA for delivery to a local mailbox. Examples of common MTAs are sendmail and qmail.

MDA: An MDA, or mail delivery agent, is the program that accepts email from an MTA and actually delivers it to the recipient's mailbox. When an MDA works locally - that is, delivers mail to user mailboxes local to that host only - it is sometimes referred to as an LDA, or local delivery agent. Examples of common MDAs are procmail and mail.local.

MUA: An MUA, or mail user agent, is the program users interact with to view email messages, reply to or forward them, compose new messages and otherwise manipulate the contents of a mailbox. Examples of common MUAs are pine and mutt.

In a typical *NIX environment, the MTA takes care of accepting email messages and figuring out how to deliver them to their destination. Once the email message actually reaches the destination host, control passes to the LDA, which takes care of delivering the incoming message to the specified user's mailbox (or "mail spool") on that system. The user may then use any MUA to view and manipulate the mailbox and its contents.

Where does procmail fit into this picture? Developed by Stephen R. van den Berg in 1990, procmail is an LDA, commonly used to deliver messages to local mailboxes. Don't think of it as just a delivery boy, though - procmail's built-in constructs allow you to convert it into a primitive mail robot, automatically scanning and sorting received messages according to pre-defined rules.

These procmail rulesets (or "recipes") are very powerful - they allow you to do all kinds of magical things to your email, including automatically forwarding it elsewhere, starting special programs on your system for different types of email, filtering it into different mailboxes or - this is the part I like the most - automatically trashing spam without you ever having to see it.

The end result? Automatic, transparent mail management that allows you to zero in on important email, leaving the chaff for later.


Canning The Spam

As you might imagine from the preceding discussion, procmail is a great tool to use if you're concerned about spam clogging your mailbox. The simplest solution here is to add a series of recipes to the beginning of your ".procmailrc" file, which can scan incoming email for known spammer addresses and automatically filter those messages out. As an example, consider the following set of recipes:

    :0:
    * ^From:.*angie76@spammer.com
    SPAM

    :0:
    * ^From:.*@known-bad-domain.com
    SPAM

    :0:
    * ^Subject:.*make money fast
    SPAM

As I stated earlier, it's usually a good idea to place these recipes near the top of your ".procmailrc" file, so that they are processed first.

If you get a large amount of spam, the technique above may seem inconvenient, as your ".procmailrc" file will rapidly grow in size as more and more spammers add your email address to their database. For convenience, you can either place these recipes in a separate file and include them into your ".procmailrc" (this technique is discussed a little further down) or you can create a so-called "black list" of known spammer addresses in a separate file, and have procmail scan through it looking for a match every time it receives email. This second technique is a little processor-intensive, but has the benefit of simplicity - it needs only a single recipe. Take a look:

    :0:
    * ? egrep -is -f /home/me/black-list.txt
    SPAM

The file "black-list.txt" is a simple list of email addresses, like so:

    angie76@spammer.com
    clouded_mind_99@hotmail.com
    o@o.com

In this example, the "egrep" program is called by procmail to see if the message headers contain a known spammer's email address. If a match is found, the message is moved to the "SPAM" mailbox. This is a much simpler approach than the one described previously, as all you need to do is update your black list on a regular basis.

Another technique of dealing with spam is so-called "reverse filtering", which uses a "white list" of known addresses. In this case, mail is only delivered to your mailbox if it matches an address in the white list; all non-matching email is treated as spam and either transferred to a spam folder for later review, or summarily deleted. The following example demonstrates:

    :0:
    * !? egrep -is -f /home/me/white-list.txt
    SPAM
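
Both list recipes above hinge on the exit status of the program named after `?'. The sketch below replays that check outside procmail so the lists can be tried safely. The scratch directory, addresses and sample headers are invented for illustration, and it swaps in "grep -i -q -F" for the article's "egrep -is": -F treats each list entry as a literal string, so the dots in an address match only dots.

```shell
#!/bin/sh
# Sketch: how the exit status of "grep -f LIST" drives the black-list recipe.
# The scratch directory, addresses and messages are made up for illustration.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT

# One entry per line, as in black-list.txt.
printf '%s\n' 'angie76@spammer.com' 'o@o.com' > "$tmpdir/black-list.txt"

# Two sample headers: one from a listed sender, one from somebody else.
printf 'From: Angie <ANGIE76@spammer.com>\nSubject: hello\n\n' > "$tmpdir/spam.hdr"
printf 'From: A Friend <friend@example.org>\nSubject: lunch\n\n' > "$tmpdir/ham.hdr"

# classify FILE: print "SPAM" if any listed address occurs in FILE, else "inbox".
# -i ignores case, -q keeps grep quiet, -F takes each list entry literally.
classify() {
    if grep -i -q -F -f "$tmpdir/black-list.txt" "$1"; then
        echo SPAM
    else
        echo inbox
    fi
}

spam_verdict=$(classify "$tmpdir/spam.hdr")
ham_verdict=$(classify "$tmpdir/ham.hdr")
echo "spam.hdr -> $spam_verdict"
echo "ham.hdr  -> $ham_verdict"
```

One thing worth checking in a real list file: an empty line is a pattern that matches every message, so a stray blank line would turn the black list into "everything is spam" and the white list into "everything is welcome".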

Robot

You've already seen how procmail can be set up to forward incoming email to another email address, or to append it to a mailbox. You might also remember that there's a third option - pipe the mail to an external program for further processing. More often than not, this program is "formail", a companion program that ships with the procmail distribution.

Formail is a mail formatter, used primarily to extract or manipulate the headers of an email message. Its applications include creating new email messages on the fly, checking the value of a header in a received message, rewriting existing headers with new data, removing unwanted headers, forcing messages into a standard mailbox format and splitting up a message digest or mailbox into individual messages.

Formail makes it possible to do some very creative things with your email - and one of its most common applications includes creating a simple auto-responder for your email when you're away from your computer. Consider the following recipe, which demonstrates:

    :0
    | (/usr/bin/formail -r ; cat autoresponse.txt) | /usr/sbin/sendmail -oi -t

In this case, every message received is piped through formail, which uses the contents of the file "autoresponse.txt" to generate a reply back to the original sender.

It should be noted that the recipe above is purely illustrative and should not be used in a live environment, as it does not include error handling for email loops or messages from mailing lists; refer to the procmail manual for a more comprehensive example.
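
For reference, here is a sketch of what loop protection might look like, adapted from the auto-reply example in the procmailex(5) manual page. The shell script only writes the recipe to a scratch rc file so it can be inspected or dry-run; the address me@example.org and the path $HOME/autoresponse.txt are placeholders, not values from the article.

```shell
#!/bin/sh
# Sketch: write a loop-protected auto-reply recipe to a scratch rc file.
# me@example.org and $HOME/autoresponse.txt are placeholders.
rcfile=$(mktemp)

# The quoted 'EOF' keeps the shell from expanding $HOME and $SENDMAIL;
# procmail expands them itself when it runs the recipe.
cat > "$rcfile" <<'EOF'
# Reply only to mail that does not come from a daemon or mailing list
# and that does not already carry our own X-Loop header.
:0 h c
* !^FROM_DAEMON
* !^X-Loop: me@example\.org
| (formail -r -I"Precedence: junk" \
     -A"X-Loop: me@example.org" ; \
   cat $HOME/autoresponse.txt) | $SENDMAIL -oi -t
EOF

echo "recipe written to $rcfile"
```

The `c' flag keeps the original message moving through the rest of the rcfile, and the X-Loop header added to each reply is what the second condition looks for if that reply ever comes back.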

You can also use procmail in combination with the SpamAssassin spam filter (http://www.spamassassin.org/) to automatically detect and block spam. All you need to do is pipe each message through SpamAssassin, and let it check to see if the message matches any of its spam rulesets. Based on the results of its heuristic tests, SpamAssassin automatically tags each message with an "X-Spam-Status" header indicating whether or not it is spam; this header can then be used by procmail to filter spam out of your regular mail spool into your "SPAM" mailbox, or send it straight to the trash can. Here's how this might work:

    :0fw: spamassassin.lock
    | /usr/local/bin/spamassassin

    :0:
    * ^X-Spam-Status: Yes
    SPAM

More information on how SpamAssassin can be used with procmail is available at http://www.spamassassin.org/


Tweaking The Engine

In the initial stages of setting up your procmail recipes, you'll need to keep an eye on what's happening so that you don't accidentally lose mail in case one of your recipes is a little off. To assist in the process, procmail comes with powerful logging capabilities, which allow you to see exactly what's happening with your mail messages.

This log is activated via the special "LOGFILE" and "VERBOSE" variables in your ".procmailrc" file, which specify the name of the log and the extent of detail in it, respectively. Consider the following example:

    LOGFILE=$HOME/procmail.log
    VERBOSE = yes

You can summarize the contents of this log file using the "mailstat" command, which also ships with the procmail distribution - take a look at the mailstat manual page for information on how to use the procmail logs to build different types of reports.

Procmail typically looks for mailboxes in your home directory. This doesn't usually work for me, since all my mailboxes are in a folder named "mail" under my home directory. If your situation is similar, consider telling procmail to adjust its default mailbox search path via the "MAILDIR" variable.

    MAILDIR=$HOME/mail

Finally, if you find it somewhat unsystematic to keep all your recipes in a single file, you can even split them up into separate files and merge them into your ".procmailrc" file via the "INCLUDERC" variable, as below:

    INCLUDERC = spam.procmailrc
    INCLUDERC = lists.procmailrc

If you have a lot of recipes, modularizing them in this manner makes them more manageable, and it also becomes easier to selectively include or exclude them from your ".procmailrc" file.


Closing Time

And that's about all we have time for. In this article, I introduced you to procmail, one of the more powerful mail processors on the *NIX platform, and demonstrated the process of compiling and installing it to your box. I showed you the basics of procmail recipes, explaining how they can be used to filter messages on the basis of specific characteristics, and send them to a mailbox, another email address or an external program. I also explained how procmail's recipes can be used to counter spam, to automatically organize messages from mailing lists into different mailboxes and to create email robots that reformat, reprocess and automatically respond to messages on the fly.

Of course, in all this, I barely scratched the surface - procmail is so versatile that its applications are almost infinite, and listing them all in a single place is well nigh impossible. The following links should provide good starting points, however:

The procmail mailing list, at
http://www.rosat.mpe-garching.mpg.de/mailing-lists/procmail/

The procmail FAQ, at
http://www.iki.fi/era/procmail/mini-faq.html

Sample recipes, at
http://www.math.fu-berlin.de/%7Eguckes/procmail/ and
http://www.uwasa.fi/~ts/info/proctips.html

This article copyright Melonfire 2000-2002. All rights reserved.

>> 2000.03.14 Tue 13:42:16


Cmd:procmail


variable assignment

Note well that an action that is a variable assignment always has to go inside a set of braces:

               { VAR=value }

Just VAR=value without the braces would save to a folder named VAR=value. (Yes, that is a valid file name under Unix.)


Dry run testing

A dry run means calling procmail directly with your test rcfile and a sample test mail:

             % procmail $HOME/pm/pm-test.rc < $HOME/tmp/test-mail.txt

Remember that you can define environment variables as well in the dry run call. Here's an example where procmail just executes the script and does nothing fancy.

             % procmail VERBOSE=on DEFAULT=/dev/null \
                 ~/pm/pm-test.rc < ~/txt/test-mail.txt

Suppose the script prints something to logfiles, but you'd instead like to get it all dumped to the screen. No problem:

             % procmail VERBOSE=on DEFAULT=/dev/null LOGFILE=`tty` \
                 ~/pm/pm-test.rc < ~/txt/test-mail.txt
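
A dry run is easier to repeat when the rcfile, the sample mail and the log all live in a throwaway directory. The following is a minimal sketch under that assumption; every file name in it is a placeholder, the one-recipe rcfile is invented for the test, and the script only calls procmail if it is installed.

```shell
#!/bin/sh
# Sketch of a dry-run harness; all file names are placeholders.
sandbox=$(mktemp -d)

# A tiny rcfile: anything with "test" in the Subject goes to a folder
# called "tests" inside the sandbox, never to a real mailbox.
cat > "$sandbox/pm-test.rc" <<EOF
MAILDIR=$sandbox
:0:
* ^Subject:.*test
tests
EOF

# A made-up sample message.
printf 'From: someone@example.org\nSubject: a test message\n\nHello.\n' \
    > "$sandbox/test-mail.txt"

if command -v procmail >/dev/null 2>&1; then
    # Same call as above, with the log sent to a file in the sandbox.
    procmail VERBOSE=on DEFAULT=/dev/null LOGFILE="$sandbox/log" \
        "$sandbox/pm-test.rc" < "$sandbox/test-mail.txt"
    cat "$sandbox/log"
else
    echo "procmail not found; rcfile and sample mail are in $sandbox"
fi
```

Because MAILDIR points into the sandbox and DEFAULT is /dev/null, a recipe that misfires can only touch files under the scratch directory.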


Testing variables

       If possible, perform positive tests rather than negative ones. A negative test looks like this: 

             * ! TEST_FLAG ?? yes

       Alternative with a positive test (equivalent only if TEST_FLAG is always either yes or no): 

             *  TEST_FLAG ?? no

[procmail] test on empty variable

here is a summary:

> How do you check if a var is empty in an procmail recipe?

Timo Salmi writes:

    * PWM ?? ^^^^

    #Test for an empty or missing subject
    :0:
    * SUBJ_ ?? ^^^^

Henning C. Nielsen wrote:

    Here is my suggestion:

    :0[...]
    * ? test -z "$var"

    Actually, a simple 

    * ! PWM ?? ^$

    will also work.


Timo's procmail tips and recipes

http://www.uwasa.fi/~ts/info/proctips.html


how do I make "or" rules?

#Accept email from Era Eriksson, the author of the major procmail FAQ
:0:
* ^From:.*reriksso@([-a-z0-9_]+\.)*helsinki\.fi|\
  ^From:.*era@iki\.fi
${DEFAULT}

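There are alternatives. Scoring could be used for the same purpose: each `1^0' condition that matches adds 1 to the score, and the recipe fires if the final score is positive.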

    :0:
    * 1^0 ^From:.*reriksso@([-a-z0-9_]+\.)*helsinki\.fi
    * 1^0 ^From:.*era@iki\.fi
    ${DEFAULT}
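
The alternation in the first recipe can be tried out with grep -E before it goes into an rcfile. This is only an approximation, since procmail's internal egrep is not identical to grep -E, and the sample From: lines are invented; only the regular expression comes from the recipe above.

```shell
#!/bin/sh
# Sketch: check the alternation from the "or" recipe with grep -E.
# The sample From: lines are made up; only the regexp comes from the recipe.
pattern='^From:.*reriksso@([-a-z0-9_]+\.)*helsinki\.fi|^From:.*era@iki\.fi'

# matches LINE: succeed if the extended regexp matches LINE,
# ignoring case as procmail does by default.
matches() {
    printf '%s\n' "$1" | grep -E -i -q -e "$pattern"
}

for line in \
    'From: reriksso@ling.helsinki.fi' \
    'From: Era <era@iki.fi>' \
    'From: someone@example.org'
do
    if matches "$line"; then
        echo "match:    $line"
    else
        echo "no match: $line"
    fi
done
```

The `|' operator binds more loosely than anything else in the expression, which is why each alternative carries its own `^From:' anchor.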

procmail tips

http://pm-doc.sourceforge.net/pm-tips-body.html


Disabling a recipe temporarily

If you have a recipe that you would like to disable for a while, there is an easy way. Just add the "false" condition line before any other conditions. The "!" also nicely visually flags that "this recipe is NOT used".

      #  This recipe stops at "!" and doesn't get past it.

      :0
      * !
      * condition
      * condition
      {
          ...
      }

The order of the flags

The order of the flags does not matter in practice, but here is one stylistic suggestion. The idea is that the most important flags go to the left: priority 1 goes to aAeE, which affect the recipe immediately. Priority 2 goes to flag f, which tells whether a recipe filters something. The (h)eader and (b)ody flags should immediately follow f; this is priority 3. In the middle there are the other flags, and the last flag is c, which ends the recipe or allows it to continue. In addition, according to [david]: "...I'm quite sure that putting anything other than the opening colon and the number to the left of AaEe will cause an error."

      :0 aAeE HBD fhb wWir c: LOCKFILE
         |    |   |   |    |
         |    |   |   |    (c)ontinue or (c)lone flag last.
         |    |   |   (w)ait and other flags
         |    |   (f)ilter flag and to filter what: (h)ead or (b)ody
         |    (H)eader and (B)ody match, possibly case sensitive (D)
         The `process' flags first. (A)nd or (E)lse recipe

You can write the flags side by side

      :0Afhw:$MYLOCK$LOCKEXT

Or, as suggested, leave flags in their own slot for more distinctive separation. Note that $LOCKEXT must be next to $MYLOCK, because it contains the string ".lock".

      :0 A fhw: $MYLOCK$LOCKEXT

Flags aAeE tutorial

[david] AaEe are mutually exclusive and no more than one should ever appear on a single recipe.

[philip] Actually, this is not true. e does not work with E or a (and procmail gives a warning if you try), and A is redundant if a is given, but at least some of the other combinations make sense and work.

These mnemonics might help:

# [philip] demonstrates `e'

      :0 :            # match, but action fails
      /etc/hosts/foo
          :0 A        # no match
          * -1^0
          /dev/null

      :0 e # this is skipped because the last tried recipe didn't match
      {
          ...whatever
      }

How they interact with one another when used consecutively has not been fully tested to my knowledge. Consider this:

      :0
      * conditions
      non-delivering-action1

          :0 a
          action2

      :0 e
      action3

Is action3 done if action2 failed or if action1 failed (or perhaps in both situations)? [philip] Action 3 is only done if action2 failed.

If the answer is action2, does this work to get action3 done if action1 failed? I think it does, but does it also run action3 if the conditions didn't match on the first recipe? [philip] Yes, and yes.

      :0             #   [david]
      * conditions
      non-delivering action1

          :0a
          action2

      :0E
      action3

[philip] If that's not what you want, combine some flags:

      :0
      * conditions
      non-delivering action1

          :0 Ae
          action3

      :0 a
      action2

If the conditions match, action1 will be executed. action3 will then execute if action1 failed, otherwise action2 will be executed [if action1 succeeded].

[david] I know what this structure does because I use it:

      :0
      * conditions
      non-delivering action1
          :0A
          action2

      :0E
      non-delivering action3
          :0A
          action 4

If the conditions match, action1 and action2 are performed and action4 is not (of course action3 is not either), even if action2 is non-delivering; if they fail, action3 and action4 are performed. The A on the fourth recipe refers back to the third and no farther. But I don't know about this:

      :0
      * conditions
      non-delivering action1
          :0A
          * more conditions
          action2

      :0E
      non-delivering action3
          :0A
          action 4

Now, suppose the conditions on the first recipe match but those on the second recipe do not match. Would the third recipe (and thus the fourth one) be attempted? I would expect so. [philip] Yes. The last tried recipe didn't match, therefore the E flag will be triggered.

If that isn't what you want, you can prevent it this way:

      :0
      * conditions
      {
          :0
          non-delivering-action1

          :0
          * more-conditions
          action2
      }

      :0 E # ignores mismatch inside braces, looks only at same level
      non-delivering action3

      :0 A
      action4

If that is what you want, you can be positive this way:

      # if action2 is non-delivering or vulnerable to error that
      # would cause fall-through

      DID2         # Kill variable

      :0
      * conditions
      non-delivering-action1

          :0 A
          action3

      :0
      * ! DID2 ?? (.)
      non-delivering-action3

          :0 A
          action4

      # if action2 is delivering and sure to succeed
      :0
      * conditions
      non-delivering-action1

          :0 A
          * more-conditions
          action2

      :0
      non-delivering-action3

          :0 A
          action4

[philip] For those who are interested, I'll note that there are only 3 combinations of the a, A, e, and E flags that aren't either illegal or redundant. They are Ae, aE, and AE. I've shown a use for Ae up above. Here's an example of AE:

      :0
      * condition1
      non-delivering action1

          :0 A
          * condition2
          non-delivering action2

      :0 AE
      action3

action3 will only be executed if condition1 matched but condition2 didn't match. Without the A flag, action3 would be executed if either of them failed. This can also be done with a instead of A with analogous results.

Procmail's "flow-control" flags may not be particularly easy to describe in straight terms (and this can all be made more complicated by throwing in a more varied mix of delivering vs non-delivering recipes), but I've found that it usually does what I expect it to do, and when it doesn't or I'm in doubt or I want to be particularly clear, I can always fall back to doing it explicitly via nesting blocks. Pick your poison...


procmail: forward header & first 10 lines of the body

:Are there some easy ways to forward first 10 lines of message body to
:another account while preserving all the message headers? -- I.e,
:on the receiving side, I want to see all the message travel history
:and only first 10 lines of the message body.

That makes an interesting exercise for my Procmail FAQ. The task is not trivial, since when you forward, the original message headers will be replaced by your forwarding headers. Therefore, you'll have to see to preserving also the original headers. Below is how I would solve the problem.

# A trick to extract the subject into a variable
SUBJ_=`formail -c -xSubject: | sed -e 's/^[ ]*//g' -e 's/[ ]*$//g'`

# The actual recipe to solve the exercise starts here
:0
* Whatever condition(s) you wish to select the messages for forwarding
# Avoid email loops
* ! ^X-Loop: myid@myhost\.mydom
{
  :0c:   #If you want to, preserve a full copy of the email, else omit
  ${DEFAULT}
  :0fwb  #Truncate the body of the message to ten lines
  | head -10
  :0 fwh #Insert a blank line at the beginning of the body for clarity
  | cat - ; echo ""
  :0fwh  #Store the original headers, quoting them to avoid problems
  | sed -e 's/^/\> /'
  :0fwh  #Insert some of your own information before forwarding
  | formail -A"X-Loop: myid@myhost.mydom" \
            -i"Subject: $SUBJ_ (fwd)"
  # Forward the email
  :0
  !my2ndId@myhost.mydom
}

All the best, Timo

Prof. Timo Salmi ftp & http://garbo.uwasa.fi/ archives 193.166.120.5
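
The sed half of the SUBJ_ assignment can be checked on its own. In this sketch the sample string stands in for whatever "formail -c -xSubject:" would print, which is an assumption made so the example runs without formail; the g flags of the original are dropped because an anchored pattern can match only once per line.

```shell
#!/bin/sh
# Sketch: what the sed half of the SUBJ_ pipeline does.
# The sample value stands in for the output of "formail -c -xSubject:".
raw='  Re: lunch on Friday   '

# Strip leading blanks, then trailing blanks.
subject=$(printf '%s\n' "$raw" | sed -e 's/^[ ]*//' -e 's/[ ]*$//')

echo "[$raw]"
echo "[$subject]"
```

The brackets in the output make the stripped blanks visible: the second line has none left at either end.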


procmail: forward header & first 10 lines of the body

Here is the preliminary outline. The trick (to make it easier) is to use "head" rather than "sed" as you tried to.

:0
* Whatever rule you wish for choosing the messages
# Avoid email loops
* ! ^X-Loop: myid@myhost\.mydom
{
  :0c:   #If you want to, preserve a full copy of the email, else omit
  ${DEFAULT}
  :0fwh  #Adjust some headers before forwarding
  | formail -A"X-Loop: myid@myhost.mydom" \
            -A"X-From-Origin: ${FROM_}" \
            -i"Subject: $SUBJ_ (fwd)" \
            -i"Content-Length:"
  :0fwb  #Truncate the body of the message
  | head -10
  # Forward the email
  :0
  !my2ndId@myhost.mydom
}

All the best, Timo


procmail: forward header & first 10 lines of the body

> N=10
> # sed ${N}q if you don't have head
> :0 bi
> {
> toplines = `head -$N`
> }
[...]

> The only problem I'm having now is that the rule I chose for the
> "toplines" is bi, (and it was originally Bi), but both gave me top 10
> lines of header instead of body.

I think the backticks (``) always pass the whole message to the "head -$N" command, regardless of flags on the recipe.

Collin Park


procmail: forward header & first 10 lines of the body

| N=10
| # sed ${N}q if you don't have head
| :0 bi
| {
| toplines = `head -$N`
| }

`b' won't carry into the braces. If you want the top ten lines of the body,

    :0bi
    toplines=| head -10 # or sed 10q

or if you prefer to put the number into a variable,

    LINECOUNT=10
    :0bi
    toplines=| head -$LINECOUNT # or sed ${LINECOUNT}q

David W. Tamkin
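
The "head or sed" remark in the recipes above is easy to confirm from a shell. The sketch below builds an invented 25-line body and compares the two commands; it uses the "head -n N" spelling, which newer systems prefer over "head -N".

```shell
#!/bin/sh
# Sketch: "head -n N" and "sed Nq" keep the same first N lines.
# The 25-line sample body is made up.
LINECOUNT=10
body=$(i=1; while [ "$i" -le 25 ]; do echo "body line $i"; i=$((i + 1)); done)

with_head=$(printf '%s\n' "$body" | head -n "$LINECOUNT")
with_sed=$(printf '%s\n' "$body" | sed "${LINECOUNT}q")

# Show how many lines survived, then compare the two results.
printf '%s\n' "$with_head" | wc -l
[ "$with_head" = "$with_sed" ] && echo "head and sed agree"
```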

<< 2000.08.07 Mon 11:36:07


Filtering Mail

>Is there a way to filter out e-mail that is simultaneously posted to a
>newsgroup? I hate getting e-mail that is simultaneously posted to a
>newsgroup. Especially if the header doesn't say it is.

How about setting up a procmail rule such as:

    :0
    * ^X-Also-Posted-To: 
    {
      # junk this mail:
      DELIVERED=yes
    }

Ken Pizzini


:spam

http://www.eskimo.com/~parents/filter/


procmail: What is the standard idiom for bouncing mail?

>^From.*spammer@spam.org
>{
> EXITCODE=77
> HOST=done
>}
>
> What exactly is the significance of the 77?

That's an EX_NOPERM error in <sysexits.h>. For bouncing mail you might try EX_NOUSER (67) instead.

Ken Pizzini


Any file size limit for procmail?

Newsgroups: comp.mail.misc
Date: Sun, 16 Mar 2003 20:42:25 GMT

I am trying to filter big HTML mails down to text only, but I am experiencing a very strange error in my procmail script. It took me nearly 3 hours to trace it down to the simplest question -- how big a file can procmail handle?

(extremely simplified) test case like this:

:0 H
* ^From: somewhere
{
  # message truncate
  :0 fbw
  | sed '30q' 
}

cat msg.wka | formail -s procmail VERBOSE=on ~/bin/.procmailrc.mf

It gave me the following error:

[...]
procmail: Executing "sed,30q"
procmail: Error while writing to "sed"
procmail: Rescue of unfiltered data succeeded
procmail: No match on ! "^$"
[...]

Then I tried to shrink it down:

sed '300q' < msg.wka > msg.wka.sht/

-rw-------    1    35400 Mar  5 02:15 msg.wka
-rw-rw----    1    12915 Mar 15 23:46 msg.wka.sht/

but the problem is still the same. Only after I do further shrinking did it pass through happily:

sed '100q' < msg.wka > msg.wka.sht/

-rw-------    1    35400 Mar  5 02:15 msg.wka
-rw-rw----    1     4857 Mar 15 23:47 msg.wka.sht/
$ cat msg.wka.sht/ | formail -s procmail VERBOSE=on ~/bin/.procmailrc.mf
[...]
procmail: Executing "sed,30q"
procmail: No match on ! "^$"
[...]

So, I concluded that my procmail has a severe limit on file size.

How can I make procmail handle big files, like in this example?


Any file size limit for procmail?

> procmail: Executing "sed,30q"
> procmail: Error while writing to "sed"

You want it to ignore this "Error while writing", so set the "i" flag in the recipe, as described here:

       i    Ignore  any write errors on this recipe (i.e. usually
            due to an early closed pipe).

In other words, something like this (untested!)

    :0 H
    * ^From: somewhere
    {
      # message truncate
      :0 fbwi
      | sed '30q' 
    }

Take a look at Jari Aalto's procmail tips page for more info.

Collin Park
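
The error message fits a filter that stops reading early: "sed 30q" exits after line 30, and whoever is still writing to it sees a closed pipe. A plausible reading, not confirmed in the thread, is that short messages fit into the pipe before sed exits, which would explain why only the larger files fail. Besides the `i' flag, another option is a filter that reads its whole input. The sketch below compares the two sed forms on an invented 5000-line body.

```shell
#!/bin/sh
# Sketch: two ways to keep the first 30 lines of a long input.
# The 5000-line input is made up to stand in for a large message body.
make_body() {
    i=1
    while [ "$i" -le 5000 ]; do
        echo "line $i of a long message body"
        i=$((i + 1))
    done
}

# sed 30q exits after line 30 and closes the pipe while the writer
# may still be sending.
quit_early=$(make_body | sed '30q')

# sed -n '1,30p' prints the same lines but keeps reading to the end
# of its input, so the writer never hits a closed pipe.
read_all=$(make_body | sed -n '1,30p')

[ "$quit_early" = "$read_all" ] && echo "same 30 lines either way"
```

The trade-off is that the second form has to read the entire message, which costs a little time on very large mails.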


Any file size limit for procmail?

> :0 H
> * ^From: somewhere
> {
> # message truncate
> :0 fbwi
> | sed '30q'
> }
>
> Take a look at Jari Aalto's procmail tips page for more info.

Another thing to do is to change the first line from

:0 H

to

:0

since H is part of the default flags (along with hb) and there's a bug in 3.22 related to using it.

Hope this helps,
Nancy

PROCMAIL <http://www.ii.com/internet/robots/procmail/qs/>


Any file size limit for procmail?

> > since H is part of the default flags (along with hb) and there's
> > a bug in 3.22 related to using it.
>
> Can you point me to where I can read more about this?

<http://www.ii.com/internet/robots/procmail/qs/#anatomy>

where I say:

If no flags are specified, Procmail uses the Hhb flags. It is usually best to not explicitly specify the default flags. For more about this, see the message Cannot get recipes to work properly in the Procmail mailing list. This describes a bug with the H and B flags and how to work around it.

and include links to more details.

> Pedantically,
>
> Wouldn't I get a bit of efficiency to use ':0 H', because egrep only
> works on the header, instead of the whole message, as ':0' does?

In theory

:0

is equivalent to

:0 H

and also equivalent to

:0 Hhb