Just for fun: Rewriting e-mails on the fly with LLMs, AWK and OpenSMTPD

Based on a recent interaction with a friend, I jokingly said “Usually I’m negative towards people using AI to write their emails, but in your case it might be relevant to make an exception. Maybe you should have an SMTP filter that wraps your email with ‘could you write this in a kinder way?’, and send it to an LLM”.

Then I thought, hey, that sounds like a perfectly fine weekend project, just for fun!

Let’s look into writing filters for OpenSMTPD, the SMTP server that is part of OpenBSD, which I use on my own mail servers.

Overview

In short, these are the steps needed to achieve what we want:

Detect if the sender is someone whose emails need to be rewritten in a kinder way
Intercept the email as it is being sent
Extract the email contents
Send the contents to an LLM, together with instructions to rewrite it in a kinder way
Replace the original email with the kind version
Send it as usual

This should of course be done transparently, without the user knowing.

Also, just to be clear, this is of course just a fun weekend project, which will also be reflected in the code quality below. It is not exactly production-ready :).

Some background about SMTP filters

OpenSMTPD supports filters that can interact with e-mails either when received, or when sent. In this case, we want to hijack the e-mail as it is being sent.

OpenSMTPD has a text-based protocol where you can add filters to various phases of the email flow. Being an OpenBSD project, the documentation is excellent as usual, and I really recommend reading the details about the protocol: https://man.openbsd.org/smtpd-filters. In short, the filtering can be done by an external process, written in any language, as long as it follows the protocol.

In our case, we will listen for the mail-from and data-line events, which correspond to the sender's email address and the actual email contents, respectively.
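
To make the protocol a bit more concrete, here is a small sketch of what such event lines can look like, and how the pipe-separated fields can be picked apart. The protocol version, timestamps, session ID and tokens in the sample lines are invented, and the function name describe_event is made up for this illustration. The field positions follow my reading of smtpd-filters(7), so double-check them against the man page.

```shell
#!/bin/sh
# Sketch: split a filter protocol line into its fields with awk.
# All values in the sample lines below are invented for illustration.

describe_event() {
    # $1 is one line, as smtpd would send it to the filter's stdin.
    printf '%s\n' "$1" | awk -F'|' '
        # Field 5 is the phase, 6 the session ID, 7 the token, 8 the parameter.
        $1 == "filter" && $5 == "mail-from" {
            print "sender=" $8 " session=" $6 " token=" $7
        }
        $1 == "filter" && $5 == "data-line" {
            print "line=" $8 " session=" $6 " token=" $7
        }'
}

describe_event 'filter|0.7|1700000000.000000|smtp-in|mail-from|abc123|tok456|rude-person@example.com'
describe_event 'filter|0.7|1700000000.000001|smtp-in|data-line|abc123|tok789|Hello there'
```

The filter answers each of these events with a line of its own, which has to contain the same session ID and token.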

As said before, the filter can be written in any language, but I’ve decided to write it in AWK, since I think it is a super fun language for these kinds of experiments.

I also got a lot of inspiration from http://blog.0xpebbles.org/Simple-OpenSMTPD-filter-example-in-awk, which describes another SMTP filter written in AWK, using DNS blocklists to block certain IP addresses from connecting.

The LLM filter in AWK

The major part of this effort is to write the filter itself. In short, it will detect if the sender is someone whose emails need to be rewritten, and in that case collect the full email, send it to an LLM wrapper shell script (more on this later), and then send back the changed text. There are also some temporary files involved, which store the email on disk while it is being collected.

I recommend just reading the code; it should be well-commented enough :).

#!/usr/bin/awk -f

# Note that you need to change the path for the llm_wrapper.sh later in the script.

# Usage in smtpd.conf:
#   filter <filter-name> proc-exec "/path/to/llm_filter.awk"

BEGIN {
    FS = "|"
    OFS = "|"
}

"config|ready" == $0 {
    # tell opensmtpd we need both mail contents and sender address.
    # All responses to opensmtpd are written to stdout, see smtpd-filters(7).
    print("register|filter|smtp-in|data-line") > "/dev/stdout"
    print("register|filter|smtp-in|mail-from") > "/dev/stdout"
    print("register|ready") > "/dev/stdout"
    next
}

"filter" == $1 && "mail-from" == $5 && $8 ~ /rude-person@example\.com/ {
    # start of new e-mail where sender is a rude person, store their e-mail in
    # a temporary file, so we can pass it on to the LLM.
    sess_id = $6
    cmd_mktemp = "mktemp"
    cmd_mktemp | getline temp_mail
    close(cmd_mktemp)
    tokens[sess_id] = temp_mail
    print "debug: matched rude person, added sess_id to tokens: " sess_id > "/dev/stderr"
    print "filter-result", $6, $7, "proceed" > "/dev/stdout"
    next
}

"filter" == $1 && "mail-from" == $5 {
    # start of new e-mail for non-rude person, just pass it along.
    print "filter-result", $6, $7, "proceed" > "/dev/stdout"
    next
}

"filter" == $1 && "data-line" == $5 && "." == $8 && $6 in tokens {
    # end marker for rude person's data, we've now collected all lines and can
    # pass them to the LLM and ask it to niceify the message.
    sess_id = $6
    resp_token = $7
    temp_mail = tokens[sess_id]

    # close the temporary file first, so everything is on disk before the
    # wrapper script reads it.
    close(temp_mail)

    print "debug: calling llm with temp mail " temp_mail > "/dev/stderr"
    cmd_llm = "/home/linus/llm_wrapper.sh " temp_mail  # TODO: change path
    while ((cmd_llm | getline line) > 0) {
        print "filter-dataline", sess_id, resp_token, line > "/dev/stdout"
        #print "debug: new line is" line > "/dev/stderr"
    }
    print "filter-dataline", sess_id, resp_token, "." > "/dev/stdout" # finalize e-mail
    close(cmd_llm)
    delete tokens[sess_id]

    system("rm " temp_mail)
    next
}

"filter" == $1 && "data-line" == $5 && $6 in tokens {
    # just keep collecting the rude person's message until the final dot.
    # We need to send the message in one piece to the LLM.
    sess_id = $6
    resp_token = $7

    temp_mail = tokens[sess_id]
    print $8 >> temp_mail  # TODO: will fail if e-mail contains |
    fflush()
    next
}

"filter" == $1 && "data-line" == $5 {
    # non-rude person's e-mail is just passed through unaltered.
    sess_id = $6
    resp_token = $7
    print "filter-dataline", sess_id, resp_token, $8 > "/dev/stdout" # TODO: will fail if e-mail contains |
}
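
The two TODO comments about | could probably be addressed by joining field 8 and everything after it back together again. Below is a minimal sketch of that idea, wrapped in a shell function so that it can be tried outside of smtpd. The function name payload_of and the sample line are made up, and I have not wired this into the filter above.

```shell
#!/bin/sh
# Sketch: recover the full data-line payload even when it contains "|".

payload_of() {
    # $1 is one data-line event as sent by smtpd (sample values only).
    printf '%s\n' "$1" | awk -F'|' '
        {
            # Fields 1-7 are protocol metadata, the e-mail line is the rest.
            payload = $8
            for (i = 9; i <= NF; i++)
                payload = payload "|" $i
            print payload
        }'
}

payload_of 'filter|0.7|1700000000.000000|smtp-in|data-line|abc123|tok456|a | b | c'
```

If something like this were used in the filter, the check for the final dot would also need to require NF == 8, so that a line starting with ".|" is not mistaken for the end of the message.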

The small LLM wrapper

The AWK script above calls a small script, llm_wrapper.sh. It might be possible to write everything in AWK, but I was just too lazy, so I wrote this small part as a shell script. As a bonus, this also means that it is easy to change to another LLM.

I’ve used Google AI Studio with Gemini for this example, since it has a simple free tier. Let’s just say I don’t believe enough in my idea to actually spend money on it ;).

The script takes as the first and only argument the temporary file with the original email, and then prints the new email to stdout.

The system instructions sent to the LLM are: “You will receive a full email with both headers and text in RFC 2822 format. Modify the text part of the email and make it sound nicer, if it is currently a bit rude. The output should also be in RFC 2822 format, with the headers unchanged. You should only rewrite the text part, and maybe the Subject part, but no other headers should be changed. Respond in the same language as the query. Respond only with the modified e-mail, nothing else.”

#!/bin/sh

GEMINI_API_KEY="USE_YOUR_OWN_KEY"
json="$(jq --arg email "$(cat "$1")" -n '{system_instruction:{parts:[{text:"You will receive a full email with both headers and text in RFC 2822 format. Modify the text part of the email and make it sound nicer, if it is currently a bit rude. The output should also be in RFC 2822 format, with the headers unchanged. You should only rewrite the text part, and maybe the Subject part, but no other headers should be changed. Respond in the same language as the query. Respond only with the modified e-mail, nothing else."}]},contents:[{parts:[{text:$email}]}]}')"
curl -s "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key=${GEMINI_API_KEY}" --json "${json}" | jq -r '.candidates[0].content.parts[0].text'
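
One thing to be aware of: if the API call fails, the last jq command prints null (or nothing at all), and the filter would happily send that along as the e-mail. A possible guard, sketched below, is to fall back to the original message in that case. The function name pick_mail is made up, and how to hook it into llm_wrapper.sh is left open.

```shell
#!/bin/sh
# Sketch: use the rewritten e-mail only if the LLM returned something usable.

pick_mail() {
    original_file="$1"   # temporary file holding the original e-mail
    rewritten="$2"       # text extracted from the LLM response
    if [ -z "$rewritten" ] || [ "$rewritten" = "null" ]; then
        # API error or empty answer: keep the original e-mail.
        cat "$original_file"
    else
        printf '%s\n' "$rewritten"
    fi
}
```

In the wrapper, this could be called as pick_mail "$1" "$rewritten", where rewritten holds the output of the curl and jq pipeline. The rude e-mail then gets through unchanged when the LLM is unavailable, which is arguably better than delivering the word null.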

Configuring OpenSMTPD to use the filter

The final thing remaining is to configure OpenSMTPD to actually use the filter.

This is done by adding one line to the OpenSMTPD configuration file, and modifying another.

# Add this one to describe the filter.
filter niceify proc-exec "/home/linus/llm_filter.awk"

# Add filter niceify to the end of an existing listen directive.
listen on all port submission tls-require pki smtp.example.com auth <users> filter niceify
# ... rest of config file and other listen directives etc.

End result

This shows the rude email as it is being composed in Thunderbird.

This is what the recipient sees.

Final remarks

It would be super depressing if someone actually thinks this is a good solution, instead of just working on their anger management skills :|