Description
Postsack currently uses mbox-reader for MBOX parsing, but mbox-reader does not implement the format correctly. It only checks for the string "From"
at the beginning of a line, which means any email whose body contains a line starting with "From"
is treated as two separate emails. According to RFC 4155, the correct way to detect a new email in an MBOX file is:
Each message in the mbox database MUST be immediately preceded
by a single separator line, which MUST conform to the following
syntax:
The exact character sequence of "From";
a single Space character (0x20);
the email address of the message sender (as obtained from the message envelope or other authoritative source), conformant with the "addr-spec" syntax from RFC 2822;
a single Space character;
a timestamp indicating the UTC date and time when the message was originally received, conformant with the syntax of the traditional UNIX 'ctime' output sans timezone (note that the use of UTC precludes the need for a timezone indicator);
an end-of-line marker.
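
To make the rule above concrete, here is a minimal sketch of a stricter separator check in Rust, using only the standard library. The function names (`is_separator_line`, `is_addr_like`, `is_ctime`) are hypothetical and are not part of mbox-reader or Postsack. The address check is a deliberate simplification of the RFC 2822 "addr-spec" grammar: it only requires a non-empty local part and domain around a single "@". The timestamp check assumes the fixed-width 24-character ctime layout, for example "Wed Jun 30 21:49:08 1993".

```rust
/// Day and month abbreviations used by the traditional UNIX `ctime` output.
const DAYS: [&str; 7] = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"];
const MONTHS: [&str; 12] = [
    "Jan", "Feb", "Mar", "Apr", "May", "Jun",
    "Jul", "Aug", "Sep", "Oct", "Nov", "Dec",
];

/// Checks the shape of a `ctime` timestamp without timezone,
/// e.g. "Wed Jun 30 21:49:08 1993". Value ranges are not validated.
fn is_ctime(ts: &str) -> bool {
    let b = ts.as_bytes();
    // Layout "Www Mmm dd hh:mm:ss yyyy" is exactly 24 ASCII bytes.
    if b.len() != 24 || !ts.is_ascii() {
        return false;
    }
    let digits = |r: std::ops::Range<usize>| b[r].iter().all(|c| c.is_ascii_digit());
    DAYS.contains(&&ts[0..3])
        && b[3] == b' '
        && MONTHS.contains(&&ts[4..7])
        && b[7] == b' '
        // ctime pads a single-digit day of month with a space ("Jun  3").
        && (b[8] == b' ' || b[8].is_ascii_digit())
        && b[9].is_ascii_digit()
        && b[10] == b' '
        && digits(11..13)
        && b[13] == b':'
        && digits(14..16)
        && b[16] == b':'
        && digits(17..19)
        && b[19] == b' '
        && digits(20..24)
}

/// Simplified stand-in for the RFC 2822 "addr-spec" check (assumption):
/// exactly one '@' with a non-empty local part and domain.
fn is_addr_like(addr: &str) -> bool {
    match addr.split_once('@') {
        Some((local, domain)) => {
            !local.is_empty() && !domain.is_empty() && !domain.contains('@')
        }
        None => false,
    }
}

/// Returns true if `line` has the shape of an RFC 4155 separator line:
/// "From", a space, a sender address, a space, a ctime timestamp, end of line.
fn is_separator_line(line: &str) -> bool {
    // Strip the end-of-line marker (LF or CRLF) if present.
    let line = line.strip_suffix('\n').unwrap_or(line);
    let line = line.strip_suffix('\r').unwrap_or(line);
    line.strip_prefix("From ")
        .and_then(|rest| rest.split_once(' '))
        .map_or(false, |(addr, ts)| is_addr_like(addr) && is_ctime(ts))
}
```

With a check like this, a body line such as "From the start, this was a bad idea." is no longer mistaken for a message boundary, because it carries neither an address nor a timestamp. Real-world MBOX files sometimes deviate from RFC 4155, so a production parser may need to be more lenient than this sketch.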