blah.... blah.... blah...

Location: Delhi, Delhi, India

I'm a hacker, a free software advocate, and a student.

26 March 2006


Outlook Express to mbox conversion

Yesterday, I was at my dad's office. He recently migrated some of his office PCs to Linux. One of the big problems encountered after the transition was migrating the e-mails from his previous e-mail client, Microsoft Outlook Express 6 (on Microsoft Windows). I've been using Evolution since November 2005 (since I got an internet connection at my place :-D). I had heard of Outport for exporting mail, so I downloaded and installed it on his Windows OS. When I started it, it terminated with an error saying Microsoft Outlook is not installed. Oops, Outport works with Microsoft Outlook, and we're using Outlook Express. So all hope of exporting mail from Outlook Express was lost, till I got an idea.

The idea for exporting a folder was to select all the mails in it and then, using the DnD (drag-and-drop) feature provided by Windows, drag them to an Explorer window and drop them there (a CommonDialog will probably work too, but I've not tried that). This copies all the mails as .EML files into the directory shown in the Explorer window. The good thing about these .EML files is that they're plain text files, each representing one mail in the RFC 822/MIME message format, which is the same format an mbox file stores its messages in.

An mbox file looks like the example below. When an mbox file is passed to the file utility, it is recognized as ASCII mail text, with very long lines. Whenever a new mail arrives, it is appended to the mbox file, preceded by a starting line similar to From tycoon@someorganization.tld Tue Jul 31 13:21:11 2008. This line is not part of the message itself (it is not an RFC 822/MIME header); it belongs to the mbox format.

From Tue Sep 25 07:45:12 2001
Return-Path: <>
Received: from (IDENT:mail@localhost []) by (8.9.3/8.9.3) with ESMTP id HAA20680; Tue, 25 Sep 2001
        07:45:12 -0400
Received: from ( []) by (8.9.3/8.9.3) with ESMTP id HAA20659 for
        <>; Tue, 25 Sep 2001 07:45:10 -0400
Received: (qmail 5610 invoked from network); 25 Sep 2001 11:45:02 -0000
Received: from (HELO localhost) ( by with SMTP; 25 Sep 2001 11:45:02 -0000
From: "The Evolution Team" <<
To: Evolution Users <>
Content-Type: multipart/related; type="multipart/alternative"; boundary="=-t4dRE6cqcdSBHOrMdTQ1"
X-Mailer: Evolution/1.1.99 (Preview Release)
Date: 7 September 2005 14:45:00 +0300
Message-Id: <1001418302.27070.20.camel@spectrolite>
Mime-Version: 1.0
Subject: Welcome to Evolution!

So, now you know where to hack. ;-) I just wrote a simple shell script (or an operating system driver ;-) ) that concatenates the files and delimits them with a line similar to From tycoon@someorganization.tld Tue Jul 31 12:40:33 2008. Most mail programs don't seem to treat the contents of this line as meaningful (I think so) and use it only as a mail delimiter, so a fake line resembling the one above needs to be added before each mail entry. The from address and date are instead extracted from the From: and Date: headers of the mail. So the basic pseudocode is:

  1. Create an empty [mbox file]
  2. Initialize default delimiter [line]
  3. For each [file] in [list of files] do {
  4. Echo [line] to the [mbox file] in append mode.(1)
  5. Type [file] to the [mbox file] in append mode.(1)
  6. }

I'm not providing any shell script to do this job for you, since one of the major pillars of GNU/Linux is the Software Toolbox philosophy. And if you're learning these utilities, this might be a good project for you. If you're lucky enough, somebody might have posted the script as a comment on my blog. ;-)

(1) HINT: cat abcd.txt >>file_opened_in_append_mode
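
Spoiler for the impatient: the pseudocode above could be turned into something like the sketch below. It is only one possible reading of those steps, not necessarily the script described in this post. The function name eml2mbox, the MAILER-DAEMON placeholder and the 1970 date in the delimiter line are made up for illustration. Stripping carriage returns (assuming the exported .EML files use Windows line endings), quoting body lines that start with "From ", and ending each message with a blank line are extra assumptions that go beyond the pseudocode.

```shell
#!/bin/sh
# eml2mbox (hypothetical name): concatenate .EML files into one mbox file.
# Usage: eml2mbox OUTPUT.mbox FILE.eml [FILE.eml ...]
eml2mbox() {
    mbox=$1
    shift

    # Step 1: create an empty mbox file (truncates an existing one).
    : > "$mbox"

    # Step 2: default (fake) delimiter line; sender and date are placeholders.
    line='From MAILER-DAEMON Thu Jan  1 00:00:00 1970'

    # Step 3: loop over the list of files.
    for file in "$@"; do
        # Step 4: append the delimiter line.
        printf '%s\n' "$line" >> "$mbox"

        # Step 5: append the mail itself. Carriage returns are removed and
        # lines starting with "From " are quoted as ">From " so that they
        # cannot be mistaken for a delimiter. Header lines such as
        # "From: someone" are left alone because of the colon.
        tr -d '\r' < "$file" | sed 's/^From />From /' >> "$mbox"

        # A blank line closes the message.
        printf '\n' >> "$mbox"
    done
}
```

After pasting the function into a shell (or sourcing the file), something like eml2mbox Inbox.mbox *.eml in the directory holding the dropped files should produce one mbox file, which can then be checked with the file utility before importing it.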


A useful thought

The only key to success (or, more precisely, to an optimised life) is precision. It doesn't matter how much hard work it took to achieve that precision; if you never achieved it, you're not living optimally. In the analysis of algorithms, it is known as tightly bounding the running time of an algorithm. In troubleshooting, it means accurately identifying problems. Optimization of code depends on a precise definition of what the code's objective is. Different words in different fields, but the same meaning.


11 March 2006


an AWKward day

A day towards learning AWK. I hadn't planned to spend my day learning AWK; it happened accidentally (or by chance ;-)). Today, I was in my practical class, and there I saw an old machine running Fedora Core 1. I wondered what I could do with that system, since it doesn't have any development tools, and the tools it does have come without documentation. Then I thought AWK might be there, so why not learn AWK? I've tried learning AWK many times before, but wasn't successful (not because it was tough, but because I didn't know what to do with the language). So today, I thought, why not XMLify /etc/passwd (one of the primary targets of AWK tutorials) and /etc/group? I tried to open its info pages by executing info awk, but oops, info displayed the man page instead. Then I recalled that on GNU/Linux machines AWK comes as GAWK, so I did info gawk and started reading Getting started.

So, within half an hour I had produced my AWK scripts to XMLify /etc/passwd and /etc/group. Here is my AWK script for /etc/passwd.

# passwd2xml.awk: An AWK script to transform /etc/passwd file into passwd.xml
# Fields in /etc/passwd are colon-separated, so FS is set before any input is read.
BEGIN { FS = ":"; print "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<passwd>"; }
/:/  {
 printf "\t<user id=\"%s\" uid=\"%s\" gid=\"%s\" home=\"%s\" comment=\"%s\" password=\"%s\" shell=\"%s\"/>\n", $1, $3, $4, $6, $5, $2, $7;
}
END { print "</passwd>"; }

You can execute this script and then pipe that output to xmllint to check for well-formedness of the document as shown below:

[wahjava@pc awk]$ awk -f passwd2xml.awk /etc/passwd |xmllint -

GAWK is also available for Windows here.

BTW, this script is not the correct way to XMLify /etc/passwd, since XMLifying requires replacing some characters (e.g. <, >, & and ") with their entities.
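
The missing entity handling could be sketched roughly like this, as a small shell wrapper around awk. The function names passwd2xml and esc are made up for illustration, and only the four characters that matter inside double-quoted attribute values (&, <, > and ") are handled. Treat it as a sketch under those assumptions, not a complete XML serializer.

```shell
#!/bin/sh
# passwd2xml (hypothetical name): like passwd2xml.awk, but with entity escaping.
# Usage: passwd2xml /etc/passwd
passwd2xml() {
    awk -F: '
    # esc(): replace XML special characters with entities. "&" must be
    # handled first, otherwise the "&" of later entities would be escaped
    # again. In a gsub() replacement "\\&" means a literal ampersand.
    function esc(s) {
        gsub(/&/, "\\&amp;", s)
        gsub(/</, "\\&lt;", s)
        gsub(/>/, "\\&gt;", s)
        gsub(/"/, "\\&quot;", s)
        return s
    }
    BEGIN { print "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<passwd>" }
    /:/ {
        printf "\t<user id=\"%s\" uid=\"%s\" gid=\"%s\" home=\"%s\" comment=\"%s\" password=\"%s\" shell=\"%s\"/>\n", \
            esc($1), esc($3), esc($4), esc($6), esc($5), esc($2), esc($7)
    }
    END { print "</passwd>" }
    ' "$@"
}
```

As before, the output can be piped to xmllint for a well-formedness check, e.g. passwd2xml /etc/passwd | xmllint -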

06 March 2006


Your own Google Logo !!

Create your own customized search page, including a customized Google logo, from here. BTW, this service is not provided by Google.

