Automatically convert .msg → .eml

Too many folks these days rely on some sort of cloud-hosted e-mail through one of the Big Five. Mail files are thus no longer .eml files, but .msg, a proprietary Microsoft Outlook format. Ironically, you will share your mail with 766 (or so) external partners by using Outlook, yet .msg is not an interoperable file format. How to proceed? I built a simple script that automatically converts .msg files server-side.

First of all, I wanted to make the conversion automatic, i.e. when you save an .msg file on the server, the accompanying .eml appears instantly. To do stuff instantly after a file appears, Linux has the inotify kernel API and the inotifywait and fsnotifywait utilities.

The inotifywait utility only watches the directories or files you point it at, and even when it recurses with -r it needs a separate watch for every directory. On a server with 15K+ directories, that may be a bit too much. However, the newer fsnotifywait can watch a whole filesystem at once for changes. That is what we want, so we’ll use fsnotifywait.

As for which file operations to watch: an .msg conversion should take place immediately after a file has been written. When a file that was opened for writing is closed, the kernel reports a close_write event, so we will watch for that.

So our setup starts with:

fsnotifywait -q -S --format '%w%f' -e close_write -m --includei 'msg$' /home/

This spits out the names of files that have just been closed after a write. We can read the filenames and convert their contents. However, there are a few caveats. First of all, not many people seem to realize that a filename under Linux may contain almost any character, including a newline. So terminating each name in the fsnotifywait output with a newline (as fsnotifywait normally does) is not such a good idea. Let’s use a modified statement that terminates each name with a NUL byte (\0) instead, a character that can never occur in a path:

fsnotifywait -q -S --format '%w%f%0' --no-newline -e close_write -m --includei 'msg$' /home/
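To see why the terminator matters, here is a minimal sketch that needs neither fsnotifywait nor msgconvert. The helper name count_nul_records and the two sample paths are made up for illustration; IFS= and -r are my additions to keep whitespace and backslashes in the names intact.

```shell
#!/bin/bash
# count_nul_records: count NUL-terminated records on stdin (hypothetical helper).
count_nul_records() {
    local n=0 rec
    while IFS= read -r -d '' rec; do
        n=$((n + 1))
    done
    printf '%s\n' "$n"
}

# Two sample filenames, one of which contains a newline.
names=( $'/home/a/odd\nname.msg' '/home/a/plain.msg' )

# Newline-terminated: a line-based reader counts three records for two files.
printf '%s\n' "${names[@]}" | wc -l

# NUL-terminated: the reader counts the two real names.
printf '%s\0' "${names[@]}" | count_nul_records
```

With newline termination the odd name is split in two, so a line-based loop would hand a path that does not exist to the converter.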

Now we can feed that into msgconvert, like this:

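#!/bin/bash
# Convert every freshly written .msg file, unless its .eml already exists.
fsnotifywait -q -S --format '%w%f%0' --no-newline -e close_write -m --includei 'msg$' /home/ | while IFS= read -r -d '' fname; do
    eml="${fname%[mM][sS][gG]}eml"   # --includei ignores case, so strip msg/MSG/Msg alike
    [ -e "$eml" ] || msgconvert --outfile "$eml" "$fname"
done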
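The output path comes from a suffix-removal parameter expansion. Here is a small sketch of that step in isolation; the helper name eml_path_for is hypothetical, and the bracket pattern is my assumption for coping with upper-case extensions, since --includei matches them while a literal %msg pattern is case-sensitive.

```shell
#!/bin/bash
# eml_path_for: map a path ending in msg (any case) to the matching eml path.
# Hypothetical helper; assumes the caller only passes paths that end in "msg".
eml_path_for() {
    printf '%s\n' "${1%[mM][sS][gG]}eml"
}

eml_path_for '/home/alice/offer.msg'
eml_path_for '/home/alice/OFFER.MSG'

# For comparison: a literal pattern leaves an upper-case suffix untouched.
f='/home/alice/OFFER.MSG'
printf '%s\n' "${f%msg}eml"
```

The last line shows what goes wrong with a literal pattern: nothing is stripped, and eml is simply glued onto the original name.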

Looks good, doesn’t it? Just one more issue to tackle. When its stdout is a pipe rather than a terminal, fsnotifywait can buffer its output, i.e. the filenames sit in the program's own stdio buffer for a while (not in the pipe itself). Thus a file that has been closed and reported may well not be converted until the next bunch of files is added to the filesystem. We need a way to switch off the I/O buffer for stdout. That is what stdbuf will do for us:

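#!/bin/bash
# Same loop, with stdout buffering of fsnotifywait switched off (-o 0).
stdbuf -o 0 fsnotifywait -q -S --format '%w%f%0' --no-newline -e close_write -m --includei 'msg$' /home/ | while IFS= read -r -d '' fname; do
    eml="${fname%[mM][sS][gG]}eml"   # --includei ignores case, so strip msg/MSG/Msg alike
    [ -e "$eml" ] || msgconvert --outfile "$eml" "$fname"
done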

Run this inside a screen session and you’re done: any time anyone writes an .msg file to the filesystem, an accompanying .eml file is written next to it.
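Before leaving it running, the conversion loop can be tried without fsnotifywait by feeding it NUL-terminated names by hand. The following is only a sketch under a few assumptions: convert_stream, the CONVERTER variable and fake_convert are hypothetical names of mine, and the stand-in converter merely creates an empty output file instead of calling msgconvert.

```shell
#!/bin/bash
# convert_stream: read NUL-terminated paths on stdin and run a converter for
# each one whose eml counterpart is missing. CONVERTER is a hypothetical hook
# that defaults to msgconvert.
convert_stream() {
    local fname eml
    while IFS= read -r -d '' fname; do
        eml="${fname%[mM][sS][gG]}eml"
        [ -e "$eml" ] || "${CONVERTER:-msgconvert}" --outfile "$eml" "$fname"
    done
}

# Stand-in converter for a dry run; called as: fake_convert --outfile OUT IN
fake_convert() { : > "$2"; }

# Dry run in a scratch directory, with a space in the filename.
tmp=$(mktemp -d)
: > "$tmp/hello world.msg"
printf '%s\0' "$tmp/hello world.msg" | CONVERTER=fake_convert convert_stream
ls "$tmp"
rm -rf "$tmp"
```

Leaving CONVERTER unset makes the same function call the real msgconvert, so the loop that was tested by hand is the loop that runs behind fsnotifywait.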
