author John Ankarström <john@ankarstrom.se> 2021-05-20 03:28:57 +0200
committer John Ankarström <john@ankarstrom.se> 2021-05-20 03:28:57 +0200
commit eb5b7969f4d455307f5e3d82807c461ebe18fb86
tree 4661d319ebbf57878d23468a32d3d3d216e07cfd
parent cdf4df2126bb0fe5c33bcb77992f3877da290c17
Add Message-Length header, remove custom headers from mbox
This way, the mbox file follows the traditional mbox format.
-rw-r--r-- doc/mum.ms 139
-rwxr-xr-x src/pop 26
2 files changed, 128 insertions, 37 deletions
diff --git a/doc/mum.ms b/doc/mum.ms
index 358bcab..78acf58 100644
--- a/doc/mum.ms
+++ b/doc/mum.ms
@@ -1,44 +1,147 @@
+.de Q
+\\$3\*Q\\$1\*U\\$2
+..
+.de EX
+.LD
+.ft C
+..
+.de EE
+.ft
+.DE
+..
.TL
mum \(en modern UNIX mail interface
.AU
John Ankarström
-.SH
+.SH \" -----------------------------------------------------------------
Introduction
.LP
Mum is a text-based e-mail client for UNIX and UNIX-like operating
systems that supports both plain-text and HTML e-mail.
-It introduces a new method for local storage of e-mail, called
-indexed mbox.
-Furthermore, it uses
-.I views
+It introduces a couple of innovations to the landscape of UNIX
+e-mail clients:
+.IP \h'2n'1.
+Reasonable support for HTML e-mail out of the box.
+.IP \h'2n'2.
+A new method for local storage of e-mail, called
+.I "indexed mbox" .
+.IP \h'2n'3.
+.I Views
\(en simple scripts that filter messages \(en instead of folders.
-.PP
+.LP
In this document, the fundamental concepts of mum are explained.
-.SH
+.SH \" -----------------------------------------------------------------
The indexed mbox format
.LP
There are two popular methods for local storage of e-mail on UNIX
systems: mbox and Maildir.
Maildir is a powerful but complicated solution, while mbox is a
-simple but inefficient solution. The "indexed mbox" format introduced
-by mum builds on the mboxcl2 format, but enhances it with an
-additional file called an
+simple but inefficient solution.
+.PP
+The
+.Q "indexed mbox"
+format introduced by mum builds on the traditional mbox format, but
+enhances it with an additional file called an
.I index ,
which carries the same name as the mbox plus the extension
.I .i .
The index contains all headers from the mbox file, including the
.I From_
line, without the actual contents of the corresponding messages.
-Each block of headers contains an additional header called
+Further, each block of headers contains three additional headers:
+.IP \h'2n'1.
+.I UID ,
+containing the unique identifier of the message provided by the
+mail server (optional).
+.IP \h'2n'2.
.I Offset ,
-which contains the position of the corresponding message in the
-mbox file, described as a byte offset.
-Additionally, a
-.I Content-Length
-header is included in both mbox and mbox.i.
-(Note further that the mbox and mbox.i files are append-only.)
-.PP
+containing the starting position of the corresponding message in
+the original mbox file, described as a byte offset.
+(It is important to note that the mbox and mbox.i files are
+append-only.)
+.IP \h'2n'3.
+.I Message-Length ,
+containing the length of the entire message in the mbox file,
+including both headers and body, in number of bytes.
+.LP
Mum and its associated view scripts use the index for most operations.
Whenever it is time to read the actual contents of a message, the
message is retrieved from the mbox using the offset specified in
the index.
+.SH \" -----------------------------------------------------------------
+Retrieval methods
+.LP
+Being extensible by nature, mum supports a potentially infinite
+number of methods for e-mail retrieval.
+By default, included with mum is a script called
+.I pop ,
+which downloads messages from a mail server via POP3, simultaneously
+creating an index for them.
+The Post Office Protocol or POP is the recommended e-mail retrieval
+method for mum.
+.PP
+If you want to save a copy of sent messages on the server, you can
+use IMAP instead of POP.
+The
+.I imap
+script, included with mum, synchronizes the mbox file with the mail
+server in an intelligent way:
+.IP \h'2n'a)
+For any new messages in the remote INBOX folder, it downloads and
+appends them to the mbox file.
+.IP \h'2n'b)
+For any messages in the mbox file that are sent by your own e-mail
+address, it uploads them to the remote Sent folder.
+.LP
+The default
+.I imap
+script does not support any folders other than INBOX and Sent, as
+mum eschews the concept of folders for scriptable views.
+.PP
+Additionally, mum can be used with locally stored mbox files.
+The default mum distribution includes the
+.I index
+script, which builds an index from a pre-existing mbox file.
+It supports a variety of mbox formats.
+.SH \" -----------------------------------------------------------------
+Views
+.LP
+What mum calls
+.Q views
+are simple scripts that filter the messages in the mbox index
+according to some criteria.
+The IMAP protocol, along with many e-mail clients, has a concept
+of folders: incoming mail is put in the Inbox folder, outgoing mail
+in the Sent folder, junk mail in the Junk folder and so forth.
+In mum, views serve the same purpose: a script named
+.I inbox
+extracts all mail sent from e-mail addresses other than your own,
+a script named
+.I sent
+all mail sent from your own e-mail address, a script named
+.I junk
+all mail with a certain header indicating that it is junk, and so
+forth.
+.PP
+However, because views are scripts, they are much more powerful and
+dynamic.
+One might have a script called
+.I amazon
+that extracts all mail sent from Amazon, or even a script called
+.I services
+that extracts all mail sent from a range of companies and services.
+The author of this document, for example, uses plus-addressing to
+separate mail sent from different vendors.
+With that assumption in mind, a
+.I services
+script might look like the following:
+.EX
+.in +5n
+#!/usr/bin/perl -00 -n
+print if /^Delivered-To: [^@]+\\+(amazon|apple|ebay|...)\\@/m
+.EE
+.LP
+On the author's system, this script takes circa 0.07 seconds to
+filter through an mbox index with 2000 messages (or, in other words,
+slightly less than the average time it takes for the Python interpreter
+just to start).
diff --git a/src/pop b/src/pop
index 3134c14..8151010 100755
--- a/src/pop
+++ b/src/pop
@@ -96,25 +96,11 @@ for my $id (@ids) {
$from = 'MAILER-DAEMON@' . hostname if not $from;
my $from_ = "From $from $date";
- # Add UID header
- unshift @msg, "UID: $uids{$id}"; $j++;
-
- # Add Content-Length header
- my ($header_length, $body_length, $content_length);
- $header_length += length($_)+1 for (@msg[0..$j-1]);
+ # Calculate message length
+ my ($head_length, $body_length, $message_length);
+ $head_length += length($_)+1 for (@msg[0..$j-1]);
$body_length += length($_)+1 for (@msg[$j..$#msg]);
- $content_length = length($from_) + 1 + $header_length + $body_length;
-
- # - Add length of Content-Length header to Content-Length
- my $new = $content_length;
- my $prev = 0;
- until ($new == $prev) {
- $prev = $new;
- $new = $content_length + length "Content-Length: $content_length\n";
- }
- $content_length = $new;
-
- unshift @msg, "Content-Length: $content_length"; $j++;
+ $message_length = length($from_) + 1 + $head_length + $body_length;
# Append message to mbox and index files
local $" = "\n";
@@ -126,7 +112,9 @@ $from_
MBOX
print $index <<INDEX;
$from_
+UID: $uids{$id}
Offset: $offset
+Message-Length: $message_length
@msg[0..$j-1]
INDEX
@@ -134,7 +122,7 @@ INDEX
exit 130 if $sigint;
# Set offset for next message
- $offset += $content_length + 1;
+ $offset += $message_length + 1;
}
print STDERR "\n";
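
The length and offset arithmetic in src/pop, together with the lookup that doc/mum.ms describes, can be sketched in a few lines of Perl. This is only an illustration under stated assumptions, not code from mum: the sample messages, the UIDs and the fetch helper are hypothetical, and it assumes that each message in the mbox is followed by exactly one blank line, which is what the "+ 1" in the offset update suggests.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up sample messages. The body starts with the empty line that
# separates it from the headers. Everything is ASCII, so length()
# equals the byte count.
my @messages = (
    { uid   => 'a1',
      from_ => 'From alice@example.org Thu May 20 03:28:57 2021',
      head  => ['Subject: first'],
      body  => ['', 'First body.'] },
    { uid   => 'b2',
      from_ => 'From bob@example.org Thu May 20 03:30:00 2021',
      head  => ['Subject: second'],
      body  => ['', 'Second body.'] },
);

# Build the mbox and its index in memory, using the same arithmetic
# as src/pop: every line counts as its length plus one newline.
my ($mbox, $index, $offset) = ('', '', 0);
for my $m (@messages) {
    my @msg = (@{ $m->{head} }, @{ $m->{body} });
    my $message_length = length($m->{from_}) + 1;
    $message_length += length($_) + 1 for @msg;

    # Assumption: one blank line follows each message in the mbox.
    $mbox .= join("\n", $m->{from_}, @msg) . "\n\n";
    $index .= join("\n", $m->{from_}, "UID: $m->{uid}",
        "Offset: $offset", "Message-Length: $message_length",
        @{ $m->{head} }) . "\n\n";

    # Skip the message and its trailing blank line.
    $offset += $message_length + 1;
}

# Hypothetical helper: find an index entry by UID, then read the
# message from the mbox using Offset and Message-Length.
sub fetch {
    my ($index, $mbox_fh, $uid) = @_;
    for my $entry (split /\n\n+/, $index) {
        next unless $entry =~ /^UID: \Q$uid\E$/m;
        my ($off) = $entry =~ /^Offset: (\d+)$/m;
        my ($len) = $entry =~ /^Message-Length: (\d+)$/m;
        seek($mbox_fh, $off, 0) or die "seek: $!";
        read($mbox_fh, my $buf, $len) == $len or die "short read";
        return $buf;
    }
    return;
}

open my $fh, '<', \$mbox or die "open: $!";
print fetch($index, $fh, 'b2');
```

Run as written, the sketch prints the second message, starting at its From_ line, without scanning the first one. Because the index stores both the offset and the length, a reader needs a single seek and a single read per message, and the mbox itself carries no headers beyond those of the traditional format.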