{"id":1095,"date":"2024-04-06T08:57:14","date_gmt":"2024-04-06T06:57:14","guid":{"rendered":"https:\/\/valentijn.sessink.nl\/?p=1095"},"modified":"2024-04-06T09:03:34","modified_gmt":"2024-04-06T07:03:34","slug":"automatically-convert-msg-%e2%86%92-eml","status":"publish","type":"post","link":"https:\/\/valentijn.sessink.nl\/?p=1095","title":{"rendered":"Automatically convert .msg \u2192 .eml"},"content":{"rendered":"\n<p>Too many folks these days rely on some sort of cloud-hosted e-mail through one of the Big Five. Mail files are thus no longer .eml files, but .msg &#8211; a proprietary Microsoft Outlook format. Ironically, you will share your mail with 766 (or so) external partners by using Outlook &#8211; but .msg is not an interoperable file format. How to proceed? I built a simple script to convert .msg files automatically on the server.<\/p>\n\n\n\n<!--more-->\n\n\n\n<p>First of all, I wanted to make the conversion automatic, i.e. when you save an .msg file on the server, the accompanying .eml appears instantly. To do stuff instantly after a file appears, Linux has the inotify kernel API and the inotifywait and fsnotifywait utilities.<\/p>\n\n\n\n<p>The <code>inotifywait<\/code> utility only watches the directories or files that you point it at. On a server with 15K+ directories, that may be a bit too much. However, the newer <code>fsnotifywait<\/code> can watch a <em>whole file system at once<\/em> for changes. That is what we want, so we&#8217;ll use <code>fsnotifywait<\/code>.<\/p>\n\n\n\n<p>As for which file operations to watch: a .msg conversion should take place immediately after a file has been written. 
A file that has just been written is closed, which raises a <code>close_write<\/code> event, so we will watch for that.<\/p>\n\n\n\n<p>So our setup starts with:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>fsnotifywait -q -S --format '%w%f' -e close_write -m --includei 'msg$' \/home\/<\/code><\/pre>\n\n\n\n<p>This spits out the names of files that have just been closed after a write. We can read the filenames and convert their contents. However, there are a few caveats. First of all, not many people seem to realize that a filename under Linux may contain almost any character, <em>including a newline<\/em>. So ending your fsnotifywait output with newlines (as fsnotifywait will normally do) is not such a good idea. Let&#8217;s use a modified statement that ends each filename with a <code>&lt;00><\/code> character instead:<\/p>\n\n\n\n<pre id=\"block-6679d20d-1483-4bdd-acc0-eabb9d919605\" class=\"wp-block-code\"><code>fsnotifywait -q -S --format '%w%f%0' --no-newline -e close_write -m --includei 'msg$' \/home\/<\/code><\/pre>\n\n\n\n<p>Now we can feed that into <code>msgconvert<\/code>, like this:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>#!\/bin\/bash\nfsnotifywait -q -S --format '%w%f%0' --no-newline -e close_write -m --includei 'msg$' \/home\/|while IFS= read -r -d '' fname; do\n&#91; -e \"${fname%msg}eml\" ] || msgconvert --outfile \"${fname%msg}eml\" \"$fname\"; done<\/code><\/pre>\n\n\n\n<p>Looks good, doesn&#8217;t it? Just one more issue to tackle. With a pipe between fsnotifywait and read, the output from fsnotifywait is buffered, i.e. output will wait in a buffer before it ever reaches the <code>|<\/code> character. Thus, a file that has been closed and reported may well not be converted until the next bunch of files is added to the filesystem. We need a way to make the IO-buffer for stdout zero. 
That is what <code>stdbuf<\/code> will do for us:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>#!\/bin\/bash\nstdbuf -o 0 fsnotifywait -q -S --format '%w%f%0' --no-newline -e close_write -m --includei 'msg$' \/home\/|while IFS= read -r -d '' fname; do\n&#91; -e \"${fname%msg}eml\" ] || msgconvert --outfile \"${fname%msg}eml\" \"$fname\"; done<\/code><\/pre>\n\n\n\n<p>Run this inside a screen session and you&#8217;re done: any time anyone writes an .msg file to the filesystem, an accompanying .eml file is written next to it.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Too many folks these days rely on some sort of cloud-hosted e-mail through one of the Big Five. Mail files are thus no longer .eml files, but .msg &#8211; a proprietary Microsoft Outlook format. Ironically, you will share your mail with 766 (or so) external partners by using Outlook &#8211; but&#8230;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3],"tags":[193,9,189,192,191,190],"class_list":["post-1095","post","type-post","status-publish","format-standard","hentry","category-happy-hacking","tag-automatic-msg-conversion","tag-linux","tag-msg","tag-msg-conversion","tag-msg-to-eml","tag-outlook"],"_links":{"self":[{"href":"https:\/\/valentijn.sessink.nl\/index.php?rest_route=\/wp\/v2\/posts\/1095","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/valentijn.sessink.nl\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/valentijn.sessink.nl\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/valentijn.sessink.nl\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/valentijn.sessink.nl\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1095"}],"version-history":[{"count":7,"href":"https:\/\/valentijn.sessink.nl\/index.php?rest
_route=\/wp\/v2\/posts\/1095\/revisions"}],"predecessor-version":[{"id":1115,"href":"https:\/\/valentijn.sessink.nl\/index.php?rest_route=\/wp\/v2\/posts\/1095\/revisions\/1115"}],"wp:attachment":[{"href":"https:\/\/valentijn.sessink.nl\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1095"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/valentijn.sessink.nl\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1095"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/valentijn.sessink.nl\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1095"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}