Implementor's note: It cannot be stressed enough that applications
using this standard should follow MIME's suggestion that you "be
conservative in what you generate, and liberal in what you accept."
In this particular case it means it would be wise for an
implementation to accept messages with any content-transfer-
encoding, but restrict generation to the 7-bit format required by
this memo. This will allow future compatibility in the event the
Internet SMTP framework becomes 8-bit friendly.
"The Robustness Principle" -- Internet RFC 2015 and many others ...
Many organizations would like to use email to provide services
and keep their members informed on a periodic basis.
Spurred by the popularity of HTML as a document interchange format and
aided by the MIME standards for multimedia-enriched email, some of
these same organizations would like to take advantage of the design
possibilities of HTML and its hyperlink facility to send more useful
and exciting messages to their members.
At first glance, nothing could seem simpler than sending HTML format
content to users by email. Just put your HTML content into a message
body, set the content type as "text/html," and send it. If you are
feeling virtuous, add a parameter of "charset=iso-8859-1" to the
content type. The miracle of Internet standards does the rest. The
recipient gets a beautiful, glossy web page delivered fresh into her
email inbox in the morning.
Unfortunately this is more difficult than it appears.
The Nitty Gritty World of HTML Email
Who Is Reading Your Email, And With What?
It turns out there are no Internet standards for
sending and displaying email that contains HTML code. This is why, for
example, you will rarely see relative URLs in HTML email hyperlinks
that you receive; in the absence of unambiguous standards for embedded
HTML, no one is sure how to make them work. The closest thing is the
MIME standard which describes how to encapsulate and identify the
message content types and encodings.
Most of the popular email reading clients have some capabilities to
display HTML. But the level of HTML they can render varies widely, from
the ability to run Internet Explorer as the rendering engine (Eudora
4.x) to being able to handle nothing more than perhaps a <P>
paragraph break and hyperlinks (AOL).
For example, ArsDigita client Away.com (http://www.away.com) sends a daily
email newsletter to around 600,000 subscribers. About 1,000 new users
register at the site every day, and more than half of them subscribe
to the newsletter on registering. The format of the newsletter is the
user's choice of either plain text or a MIME multipart/alternative
message, with a special version formatted for AOL users as described
below. The publisher made a policy decision to default the email type
to HTML for new subscribers, although that can be changed at the
user's preference.
As a result of this policy decision, almost all subscribed users
get the HTML format newsletter. Based on the number of complaints
received by customer service over a period of several months, we tried to
estimate the fraction of users who have problems reading the HTML mail
properly. Even if we assume that just one in a hundred of the users who
have difficulty would actually bother to send a complaint, we estimate
that more than 99.8% of the users were
able to read their HTML newsletters satisfactorily.
While your primary goal should be to form a message that as many users
as possible can view correctly, you will find you have to make
tradeoffs. For example, you may decide it is not worth trying to get
your mail to display correctly on WebTV, if that means restricting the
HTML features you use too severely. If there is a large population
tied to some platform with severe limitations, you may wish to send
out separately formatted email to them. The Away.com site sends out
newsletters to AOL users in a restricted subset of HTML which matches
the AOL email reader capabilities. Note that this kind of selective
formatting is really only feasible when you can easily distinguish the
target users, perhaps by matching their email domain names or by
convincing them to give you an explicit preference.
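The selective formatting described above might be sketched as follows. The class name, the enum values, and the choice of aol.com as the only restricted domain are illustrative assumptions, not the actual Away.com implementation.

```java
import java.util.Locale;

// Hypothetical sketch: choose a newsletter format for one subscriber.
class NewsletterFormatChooser {

    enum Format { PLAIN_TEXT, FULL_HTML, RESTRICTED_HTML }

    // explicitPreference may be null when the user never stated one.
    static Format choose(String emailAddress, Format explicitPreference) {
        if (explicitPreference != null) {
            return explicitPreference;      // the user's stated choice always wins
        }
        int at = emailAddress.lastIndexOf('@');
        String domain = (at < 0)
            ? ""
            : emailAddress.substring(at + 1).toLowerCase(Locale.ROOT);
        if (domain.equals("aol.com")) {
            return Format.RESTRICTED_HTML;  // assumed: AOL readers get the reduced subset
        }
        return Format.FULL_HTML;            // assumed publisher default for new subscribers
    }

    public static void main(String[] args) {
        System.out.println(choose("someone@AOL.com", null));
        System.out.println(choose("someone@example.com", Format.PLAIN_TEXT));
    }
}
```

Matching on the domain is only a fallback; a stored user preference is the more reliable signal whenever it exists.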
In varying proportions you will find the following classes of
email clients, arranged here in a roughly observed order:
- Microsoft system software: Outlook, Outlook Express, Exchange
- Windows-based non-Microsoft products: Eudora, Lotus Notes, etc.
- Webmail servers: Yahoo, Hotmail, AltaVista, Netscape, ACS Webmail
(see http://www.arsdigita.com/asj/webmail), etc.
- Everyone else: Unix emacs (rmail, vm, gnus), pine, elm; Macintosh Eudora or Outlook
The Robustness Principle tells us to be conservative in what we
generate. Restrict yourself to the minimal set of HTML
directives you find necessary. Each fancy feature that you incorporate
(tables, embedded images or the trouble-prone JavaScript) will
inevitably cause some email reader to fail to render the
message properly. You need to weigh how much you want those fancy web
pages versus how many people you are willing to exclude from the
content.
The rest of this article will discuss some specific issues with encoding
of HTML content as legal MIME messages and suggest some procedures to
make this easier and more robust.
Standards for Email Message Encoding, or What Was That RFC Again?
An SMTP email message is composed of three parts:
- The Header. This is the descriptive information associated with a
letter, also known as the "inside address" since the purpose is to
allow the final recipient to better understand the letter.
The email delivery systems are not
supposed to look at the header. The address in the header has no
direct connection to how the letter is delivered.
RFC-822 and its descendants have defined standard fields in the header
of the letter to allow programs to better process the messages. Since
the header information is only to provide information to the mail
reading program, a message can still be delivered without any header
at all even if it confuses the user by not having "from" and "to"
information.
- The Envelope. The envelope is less visible in an Internet message
since it is only used in transmitting the message from one system to
another and then discarded. The envelope itself contains important routing data, such
as to whom the mail is supposed to be delivered and who originated it.
This is the data which is exchanged in the SMTP control commands,
such as "MAIL FROM:" [RFC-821], as opposed to the message payload,
which is the content that is sent following the SMTP "DATA" command.
- The Body. This is the message itself.
Creating an email message requires some headers and some content. Let's
look at the headers first.
The To: Field
One of the most important headers is the recipient's email address.
This may seem quite simple, but you really should make sure you are
sending to a valid email address.
In ArsDigita Community System installations with hundreds of thousands
of users, we have seen what seems like every possible ASCII string
entered as an email address. There are mostly correct ones, and then
obviously bogus ones like *#$'12828 xxasdfM, and then
there are the ones which look like they could work if you massaged
them a little, like "Mary Smith @ cnn .com."
Although we don't want to have the email system try to 'fix' email
addresses that are not standards-compliant, we recommend that the
system implementors take some effort to try to pre-screen or verify
users' email addresses when they are entered, so as to have a lower
incidence of ill-formed email addresses in the database. Stripping
whitespace is a simple heuristic to fix many of the user entry errors.
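A minimal sketch of such pre-screening is shown here. The class and method names are hypothetical, and the pattern is deliberately loose: it only checks for one "@", no whitespace, and a dot in the domain, and makes no attempt at full RFC-822 address parsing.

```java
import java.util.regex.Pattern;

// Hypothetical sketch: screen an address at data-entry time.
class AddressScreen {

    // Loose plausibility test, not an RFC-822 parser.
    private static final Pattern PLAUSIBLE =
        Pattern.compile("^[^@\\s]+@[^@\\s.]+(\\.[^@\\s.]+)+$");

    // Remove every whitespace character, including interior ones.
    static String stripWhitespace(String raw) {
        return raw.replaceAll("\\s+", "");
    }

    static boolean looksPlausible(String address) {
        return PLAUSIBLE.matcher(address).matches();
    }

    public static void main(String[] args) {
        String cleaned = stripWhitespace("Mary Smith @ cnn .com");
        System.out.println(cleaned + " plausible? " + looksPlausible(cleaned));
        System.out.println(looksPlausible(stripWhitespace("*#$'12828 xxasdfM")));
    }
}
```

An address that fails the check is best bounced back to the user for correction at entry time rather than silently repaired.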
The From: and Reply-To: Fields
[RFC-822] has this to say about the From: field
4.4.1. FROM / RESENT-FROM
This field contains the identity of the person(s) who wished
this message to be sent. The message-creation process should
default this field to be a single, authenticated machine
address, indicating the AGENT (person, system or process)
entering the message. If this is not done, the "Sender" field
MUST be present. If the "From" field IS defaulted this way,
the "Sender" field is optional and is redundant with the
"From" field. In all cases, addresses in the "From" field
must be machine-usable (addr-specs) and may not contain named
lists (groups).
For email messages that go out to a wide audience, put an address
in the From field at which you are prepared to receive a lot of replies. The Reply-To
field is supported by most email clients these days,
so in principle you can include a Reply-To address that is separate from
your From address. However, we don't recommend relying on this, because some email clients
will inevitably not respect the Reply-To field, and you will have people replying
to the From address. Some people may also not hit "reply" at all, but copy the From address
manually when sending a reply.
Assume that the From
and Reply-To fields will be treated interchangeably by the user's mail reader,
and expect replies to one or the other to be equally likely.
Special Note on Envelope Return Paths
Many people are confused about the differences between the From field
and the envelope return-path. The issue arises when you want to know
what happens when an email message bounces.
The key point to understand is that mail system errors and notifications such as bounce
notices are sent back to the sender address in the envelope return path, not to the From
field of the message.
The built-in AOLserver mail routine ns_sendmail
treats the message From field and the envelope sender as the same
address, but that is not what you want for real automated
mail-handling production systems. The ACS bulkmail package, for
example, creates a special unique sender address for each outgoing
message which contains encoded information about to whom the mail was
sent and from which module and mailing run. That way the system can
automatically and unambiguously parse returned mail and match it with
the user email address it was sent to. It can then do useful things
like updating the user's email_bouncing_p flag in the
database.
This is vital to being able to automatically maintain a clean mailing list
with hundreds of thousands or millions of users.
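The idea of a per-message envelope sender can be sketched as follows. This is not the ACS bulkmail format: the address layout, the "bounces.example.com" domain, and the class name are assumptions made for illustration, and the sketch assumes module names contain no "-".

```java
// Hypothetical sketch: encode recipient, module, and mailing run
// into a unique envelope sender, and decode it from a bounce.
class BounceAddress {

    static final String BOUNCE_DOMAIN = "bounces.example.com"; // assumed domain

    static String encode(String module, int runId, String recipient) {
        // "=" stands in for "@" so the recipient fits inside a local part.
        return "bounce-" + module + "-" + runId + "-"
             + recipient.replace('@', '=') + "@" + BOUNCE_DOMAIN;
    }

    // Returns {module, runId, recipient}, or null if the address is not one of ours.
    static String[] decode(String returnPath) {
        String prefix = "bounce-";
        String suffix = "@" + BOUNCE_DOMAIN;
        if (!returnPath.startsWith(prefix) || !returnPath.endsWith(suffix)) {
            return null;
        }
        String local = returnPath.substring(
            prefix.length(), returnPath.length() - suffix.length());
        String[] fields = local.split("-", 3);   // the recipient may itself contain "-"
        if (fields.length != 3) {
            return null;
        }
        int eq = fields[2].lastIndexOf('=');      // a domain name cannot contain "="
        if (eq < 0) {
            return null;
        }
        fields[2] = fields[2].substring(0, eq) + "@" + fields[2].substring(eq + 1);
        return fields;
    }

    public static void main(String[] args) {
        String sender = encode("newsletter", 42, "mary-smith@foo.com");
        System.out.println(sender);
        System.out.println(String.join(" | ", decode(sender)));
    }
}
```

A bounce handler that receives mail for the assumed bounce domain can call decode on the envelope recipient and then flag the matching user record.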
SMTP Compliance
There isn't much you have to worry about in terms of SMTP compliance; that
should all be taken care of by your email sending routine. However it
is worth noting the following.
The SMTP transport protocol [RFC-821] states, for maximum line length:
text line
The maximum total length of a text line including the
<CRLF> is 1000 characters (but not counting the leading
dot duplicated for transparency).
****************************************************
* *
* TO THE MAXIMUM EXTENT POSSIBLE, IMPLEMENTATION *
* TECHNIQUES WHICH IMPOSE NO LIMITS ON THE LENGTH *
* OF THESE OBJECTS SHOULD BE USED. *
* *
****************************************************
So the standard says to keep your lines to at most 1000 characters,
including the CRLF. However, it also says that implementors of MTAs who make this
a built-in limit are being stupid. In practice, you can probably
send arbitrarily long lines to most email systems. However some may
give errors if you do. Some modern firewall-based virus detectors can
be triggered by overly long lines. If you must have very long lines,
use quoted-printable encoding, or some other form of content encoding.
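A quick way to find out whether a message body stays inside the RFC-821 limit is to measure its longest line before sending. The sketch below is an illustration with hypothetical names; 998 is simply the 1000-character limit minus the two characters of the CRLF.

```java
// Hypothetical sketch: check a body against the SMTP text line limit.
class LineLengthCheck {

    static final int SMTP_MAX_TEXT = 998; // 1000 characters minus the CRLF

    // Length of the longest line, not counting line terminators.
    static int longestLine(String body) {
        int longest = 0;
        for (String line : body.split("\r?\n", -1)) {
            longest = Math.max(longest, line.length());
        }
        return longest;
    }

    static boolean withinSmtpLimit(String body) {
        return longestLine(body) <= SMTP_MAX_TEXT;
    }

    public static void main(String[] args) {
        String body = "short line\r\na somewhat longer line\r\n";
        System.out.println(longestLine(body) + " " + withinSmtpLimit(body));
    }
}
```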
MIME Headers and Encoding
In order to send a MIME message, the standards say you must use at least the following
headers:
- Mime-Version (currently "Mime-Version: 1.0" is the only one supported)
- Content-Type
Building a Simple MIME Message
We've decided to send an email message with Content-Type of
"text/html." However, there are a couple of choices to be made as to
how we structure and encode this message.
The simplest structure for the message would be of the form:
To: Mary_Smith@foo.com
Subject: Great Deals on English Muffins
From: info@bar.com
MIME-Version: 1.0
Content-Type: text/html; charset="us-ascii"

<h1>Great Deals</h1>
There are some <i>great deals</i> on English Muffins
today.
Note: According to the RFCs:
Content-type: text/plain; charset=us-ascii (comment)
and
Content-type: text/plain; charset="us-ascii"
are completely equivalent.
I have seen examples of this kind of message sent in bulk mailings, such
as the American Express example at http://www.arsdigita.com/asj/mime/mime-examples/amex.txt.
There are two disadvantages to the simple structure and encoding methods used
above:
- The use of a "top level" Content-Type header of "text/html" means that
the entire message contains only HTML. If the recipient's email
client is incapable of rendering HTML, they will not be able to
read the message. You can alleviate this concern by using the "multipart"
MIME encoding described in the next section.
- Although many mailers in use today safely handle 8-bit character data,
a message labeled with the "us-ascii" character set may have
any characters with their high bit set mangled by an email server
which is not "8-bit clean." This can be solved either by making sure you
have only 7-bit characters in your content (i.e., character values less than 128) or
by using a Content-Transfer-Encoding like quoted-printable. The finer points of
this are also discussed in the next section.
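The second point can be checked mechanically before a message is sent. The following sketch, with hypothetical class and method names, picks "7bit" only when every character value is below 128 and otherwise falls back to quoted-printable. It looks at character values only and ignores the line-length limit.

```java
// Hypothetical sketch: pick a Content-Transfer-Encoding for text content.
class TransferEncodingChooser {

    // True when every character fits in 7 bits (value below 128).
    static boolean isSevenBit(String content) {
        for (int i = 0; i < content.length(); i++) {
            if (content.charAt(i) > 127) {
                return false;
            }
        }
        return true;
    }

    static String choose(String content) {
        return isSevenBit(content) ? "7bit" : "quoted-printable";
    }

    public static void main(String[] args) {
        System.out.println(choose("Great Deals on English Muffins"));
        System.out.println(choose("Keld J\u00F8rn Simonsen"));
    }
}
```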
Multipart MIME Messages
The MIME format allows you to create multipart messages, which
can contain multiple content parts with different content types.
For example, you can send an email message which contains a copy of
your newsletter in both "text/plain" and "text/html" formats.
Sending both plain-text and HTML versions of the message is a good option,
because it allows for a graceful degradation of the appearance of the
message for users whose email clients do not really support HTML but are
MIME-aware.
To send a multipart message, use the Content-Type "multipart." There
are a number of subtypes that can be used to modify this, but the two
we will consider are "multipart/mixed" or "multipart/alternative."
The multipart/mixed content type is used to assert that the message contains
several parts, all of which should be presented to the user.
The multipart/alternative asserts that the message contains several
representations of the same content, and the user's mail client should attempt to
show them the "best" one it can. In practice that means that a mail reader
given a multipart/alternative message with text/plain and text/html parts will
preferably display the text/html part. If it is unable to display the text/html part,
it should gracefully degrade to a type it can render.
The following example of a multipart MIME message is taken from [RFC-2049]:
MIME-Version: 1.0
From: Nathaniel Borenstein
To: Ned Freed
Date: Fri, 07 Oct 1994 16:15:05 -0700 (PDT)
Subject: A multipart example
Content-Type: multipart/mixed;
              boundary=unique-boundary-1

This is the preamble area of a multipart message.
Mail readers that understand multipart format
should ignore this preamble.

If you are reading this text, you might want to
consider changing to a mail reader that understands
how to properly display multipart messages.

--unique-boundary-1

  ... Some text appears here ...

[Note that the blank between the boundary and the start
 of the text in this part means no header fields were
 given and this is text in the US-ASCII character set.
 It could have been done with explicit typing as in the
 next part.]

--unique-boundary-1
Content-type: text/plain; charset=US-ASCII

This could have been part of the previous part, but
illustrates explicit versus implicit typing of body
parts.

--unique-boundary-1
Content-Type: multipart/parallel; boundary=unique-boundary-2

--unique-boundary-2
Content-Type: audio/basic
Content-Transfer-Encoding: base64

  ... base64-encoded 8000 Hz single-channel
      mu-law-format audio data goes here ...

--unique-boundary-2
Content-Type: image/jpeg
Content-Transfer-Encoding: base64

  ... base64-encoded image data goes here ...

--unique-boundary-2--

--unique-boundary-1
Content-type: text/enriched

This is enriched.
as defined in RFC 1896

Isn't it
cool?

--unique-boundary-1
Content-Type: message/rfc822

From: (mailbox in US-ASCII)
To: (address in US-ASCII)
Subject: (subject in US-ASCII)
Content-Type: Text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: Quoted-printable

  ... Additional text in ISO-8859-1 goes here ...

--unique-boundary-1--
Assume we are sending a multipart/alternative message. We still get
a choice of how to encode the content in each part, and which order to
put the parts in the mail message.
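On the question of order, the MIME specification for multipart/alternative says the parts should appear in increasing order of preference, so the plain text part goes first and the HTML part last. The following hand-rolled sketch shows that layout. It is an illustration only: the class name is hypothetical, the To, From, and Subject headers are left out, and it assumes both bodies are 7-bit text with CRLF line endings that do not contain the boundary string.

```java
// Hypothetical sketch: lay out a multipart/alternative message by hand.
class AlternativeMessageSketch {

    static final String CRLF = "\r\n";

    static String build(String boundary, String plainBody, String htmlBody) {
        StringBuilder m = new StringBuilder();
        m.append("MIME-Version: 1.0").append(CRLF);
        m.append("Content-Type: multipart/alternative; boundary=\"")
         .append(boundary).append("\"").append(CRLF);
        m.append(CRLF);                       // blank line ends the message headers
        // Least preferred part first, most preferred part last.
        appendPart(m, boundary, "text/plain; charset=us-ascii", plainBody);
        appendPart(m, boundary, "text/html; charset=us-ascii", htmlBody);
        m.append("--").append(boundary).append("--").append(CRLF); // closing boundary
        return m.toString();
    }

    private static void appendPart(StringBuilder m, String boundary,
                                   String contentType, String body) {
        m.append("--").append(boundary).append(CRLF);
        m.append("Content-Type: ").append(contentType).append(CRLF);
        m.append("Content-Transfer-Encoding: 7bit").append(CRLF);
        m.append(CRLF);                       // blank line ends the part headers
        m.append(body).append(CRLF);
    }

    public static void main(String[] args) {
        System.out.print(build("unique-boundary-1",
            "Great deals on English Muffins today.",
            "<h1>Great Deals</h1>"));
    }
}
```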
The encoding of the contents of a MIME part is specified by the
Content-Transfer-Encoding header [RFC-2045]. The encodings you
can rely on working are "7bit," "8bit," "quoted-printable," or "base64."
The quoted-printable encoding is generally considered the best way to
encode HTML or other content which is primarily legal ASCII text.
Quoted-printable provides some protection against some of the
errors that can be introduced by MTAs along the way, such as deletion
of whitespace or truncation of characters' high bits. It also leaves
"vanilla" ASCII text alone for the most part, so the message is still
mostly readable even when encoded, which is a big help for debugging
mail transport errors.
Given all of the previous warnings about email client capabilities,
you might have some concern that there are mail readers
that cannot properly decode the quoted-printable encoding. However, it is
generally safe to assume that any mail client that can render HTML
correctly can probably decode quoted-printable correctly. In fact, if a mail
reader exists that can render HTML but cannot decode quoted-printable, the affected
users should probably upgrade immediately.
The [RFC-2045] has this to say about line lengths and QP encoding:
(5) (Soft Line Breaks) The Quoted-Printable encoding
REQUIRES that encoded lines be no more than 76
characters long. If longer lines are to be encoded
with the Quoted-Printable encoding, "soft" line breaks
must be used. An equal sign as the last character on a
encoded line indicates such a non-significant ("soft")
line break in the encoded text.
So the recommendation is to keep QP-encoded lines to less than 77 columns.
This is very good advice, and you would be well advised to take it.
At one point, before
we started QP encoding the HTML, and the publisher was routinely including
lines of length 1000 or more, some recipients had trouble, usually from
an overly vigilant firewall virus detector. There are apparently a number
of security holes in Windows mail readers, and long line lengths in MIME
messages can be used as an exploit in some of them.
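To make the soft line break rule concrete, here is a minimal quoted-printable encoder written against the rules in RFC-2045. It is a sketch for illustration, not a replacement for a tested library such as JavaMail, and the class name is hypothetical. It breaks a little early, at 75 characters plus the trailing "=", so no encoded line exceeds 76 characters.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: quoted-printable encoding with soft line breaks.
class QuotedPrintableSketch {

    static String encode(String text, Charset charset) {
        StringBuilder out = new StringBuilder();
        String[] lines = text.split("\r?\n", -1);
        for (int n = 0; n < lines.length; n++) {
            byte[] bytes = lines[n].getBytes(charset);
            int column = 0;
            for (int i = 0; i < bytes.length; i++) {
                int b = bytes[i] & 0xFF;
                boolean lastOnLine = (i == bytes.length - 1);
                boolean isBlank = (b == ' ' || b == '\t');
                String token;
                if (b == '=' || (b < 32 && b != '\t') || b > 126
                        || (isBlank && lastOnLine)) {
                    token = String.format("=%02X", b);  // escape as =XX, upper-case hex
                } else {
                    token = String.valueOf((char) b);   // printable ASCII passes through
                }
                if (column + token.length() > 75) {     // leave room for the soft-break "="
                    out.append("=\r\n");
                    column = 0;
                }
                out.append(token);
                column += token.length();
            }
            if (n < lines.length - 1) {
                out.append("\r\n");                     // hard line break from the input
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("caf\u00E9 = 1", StandardCharsets.ISO_8859_1));
    }
}
```

Trailing spaces and tabs are escaped because some mail transports strip whitespace at the end of a line.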
There are numerous finer points about encoding a MIME message
in a standards-compliant way. They will not be addressed further here.
Rather, the discussion of structuring a compliant
message will be deferred to the JavaMail section below.
Embedded Images in HTML
One of the first things that publishers seem to want to do is put
images into the HTML they send in their email. This opens up a host
of issues and problems.
Currently there is no supported standard for embedding inline images
into MIME HTML messages. There is a new proposal [RFC-2557] "MIME
Encapsulation of Aggregate Documents, such as HTML (MHTML)," but
it is not clear that any major email clients support it yet.
Another striking issue with sending images with every message is the excessive
bandwidth that will be used;
the images will usually contain far more data than the text portion.
What does tend to be
supported, though by no means universally, is plain-old IMG tags
using live URL links. That is, you can put an absolute URL in the
body of your HTML message, such as
<IMG src="http://www.techrepublic.com/images/trlogo94_60.gif">
and many email readers will fetch the image and render it inline when
the user views the message.
Note however that this doesn't work if the user is offline! There
are users who have programs that dial up, grab their email, then
disconnect. So they will not see the pretty pictures in their
mail, and may in fact see ugly holes in the formatting and collapsed
layout where the images were supposed to go.
If you feel you must use inline images in your HTML mail,
remember also that every image will have to be retrieved from your
server. One ArsDigita client sent out 750,000 newsletters overnight,
each containing twenty or thirty images. That is on the order of 15 to 22
million image requests if every recipient opens the message. The next morning, their
server was a lot less responsive than usual!
Other Considerations
While not part of the encoding process, you should consider some other issues
with the content of your newsletters.
You should always provide a way for people to unsubscribe themselves from
a mailing list. They may have forgotten how they subscribed, or someone
may have maliciously subscribed them. It is best to make sure there are multiple
ways for the user to stop receiving the mail. The From and Reply-To
addresses should support email requests to unsubscribe. The message
content should have explicit instructions on how to unsubscribe as well, along
with an email address, URL, and maybe even a phone number. There is nothing
more frustrating than not being able to stop unwanted email from being sent to you!
The publisher should also provide an easy way for the recipients to
set their email type preference, i.e., plain-text or html mail. If
you want to be conservative, you can default to sending new users
only plain text unless they explicitly specify otherwise. If you default
new users to HTML content, make sure you have obvious instructions for
them to set their preferences to plain text content.
I would also encourage the publisher to add a link to a copy of the newsletter
content on their web site, so people whose email readers are hopelessly
inept can still view the content via a web browser.
Letting Someone Else Do The Hard Work
You can write all the code to build compliant MIME messages yourself,
or you can try to find code that is already written which helps take
care of the composition.
One option, following Jin Choi's Webmail example at http://www.arsdigita.com/asj/webmail/, is to use the JavaMail library
in the construction of a standards-compliant email message. The following
example Oracle/Java code constructs a multipart mime message containing plain text
and HTML parts. While the initial learning curve is somewhat steep (you need to
figure out how to load and call Java inside Oracle), it is
very nice to be able to offload the complexities of composing a message
onto a standard library. This way, if the MIME standard is enhanced or otherwise
changed, you will not have to rewrite much code.
The code below assumes that there is a database table spam_history
with a row containing the plain text and HTML versions of the message to be
sent. The code uses the JavaMail API to construct a MIME message and then
inserts the complete message back into the database. Since
this message is designed to go out to millions of users, it is actually
constructed as a template, with the To: and Reply-To: headers containing
placeholder values. This message template can then be passed to a
bulk mailer module that will efficiently send it to a large mailing-list.
The message parts are encoded using quoted-printable encoding, and the
entire message is given a multipart/alternative content type. It is at this
point that you would also add directives to set the content-type
parameters. For example, you might want to specify a charset if you were sending content that was not ASCII or ISO-8859-1
compatible, such as Japanese text.
Sending Email Directly From Java
Note: you could actually send this mail directly from Java, using the JavaMail
Transport API. Example code to do this is at http://www.arsdigita.com/asj/mime/java-send. It
is not clear that this is something you want to do for high volume mailings
using the default JavaMail transport code, however.
// SpamMessageComposer.sqlj
// originally part of the webmail ACS module
// written by Jin Choi
// hacked by hqm@arsdigita.com to generate SPAM MIME newsletter messages from
// the spam_history table
// 2000-03-27
// This class implements some static methods for composing MIME plaintext and
// mixed text/html messages for the spam system
package com.arsdigita.mail;

import oracle.sql.*;
import oracle.sqlj.runtime.Oracle;
import java.sql.*;
import java.io.*;
import javax.mail.*;
import javax.mail.internet.*;
import java.util.*;
import javax.activation.*;

public class SpamMessageComposer {
    protected static Session s = null;

    public static void composeHTMLMimeMessage(int msgId)
        throws MessagingException, IOException, SQLException {
        Vector parts = new Vector(); // vector of data handlers
        CLOB bodyPlainText = null;
        CLOB bodyHTMLText = null;
        String msgSubject = null;
        #sql { select body_plain, body_html, subject
               into :bodyPlainText, :bodyHTMLText, :msgSubject
               from spam_history where spam_id = :msgId };
        // Use Jin's ClobDataSource (from the webmail module, same package)
        // to read the message bodies from the database.
        if (bodyPlainText != null && bodyPlainText.length() > 0) {
            ClobDataSource cds = new ClobDataSource(bodyPlainText, "text/plain", null);
            parts.addElement(new DataHandler(cds));
        }
        if (bodyHTMLText != null && bodyHTMLText.length() > 0) {
            ClobDataSource cds = new ClobDataSource(bodyHTMLText, "text/html", "newsletter.html");
            parts.addElement(new DataHandler(cds));
        } else {
            System.err.println("SpamMessageComposer.composeHTMLMimeMessage: bodyHTMLText is null!");
        }
        // Create new MimeMessage.
        if (s == null) {
            Properties props = new Properties();
            s = Session.getDefaultInstance(props, null);
        }
        MimeMessage msg = new MimeMessage(s);
        String from = "newsletter@away.com";
        // Placeholders: the bulk mailer substitutes real addresses later.
        String sendTo = "%%_TO_ADDR_%%";
        String replyTo = "%%_REPLY_TO_ADDR_%%";
        String subject = msgSubject;
        // Add the headers.
        msg.setFrom(new InternetAddress(from));
        InternetAddress[] address = {new InternetAddress(sendTo)};
        msg.setRecipients(Message.RecipientType.TO, address);
        msg.setSubject(subject);
        msg.addHeader("Reply-To", replyTo);
        // Add the message parts (plain text first, then HTML).
        addParts(msg, parts);
        // Synchronize the headers to reflect the contents.
        msg.saveChanges();
        CLOB composedMessage = null;
        // Grab the CLOB we're going to stuff it in and write the composed message to it.
        #sql { update spam_history set mime_html = empty_clob() where spam_id = :msgId };
        #sql { select mime_html into :composedMessage from spam_history where spam_id = :msgId };
        OutputStream out = composedMessage.getAsciiOutputStream();
        try {
            msg.writeTo(out);
        } finally {
            out.close(); // close the stream so any buffered data reaches the CLOB
        }
    }

    protected static void addParts(MimeMessage msg, Vector parts)
        throws MessagingException, IOException {
        if (parts.size() == 0) {
            // This should never happen.
            return;
        }
        if (parts.size() > 1) {
            // Make this a multipart/alternative message.
            MimeMultipart msgMultiPart = new MimeMultipart("alternative");
            Enumeration e = parts.elements();
            while (e.hasMoreElements()) {
                DataHandler dh = (DataHandler) e.nextElement();
                String filename = dh.getName();
                MimeBodyPart bp = new MimeBodyPart();
                // Use quoted-printable encoding on the parts.
                bp.setDataHandler(dh);
                bp.setHeader("Content-Transfer-Encoding", "quoted-printable");
                if (filename != null) {
                    bp.setFileName(filename);
                }
                msgMultiPart.addBodyPart(bp);
            }
            msg.setContent(msgMultiPart);
        } else {
            // There is only one element.
            DataHandler dh = (DataHandler) parts.elementAt(0);
            String filename = dh.getName();
            if (filename != null) {
                msg.setFileName(filename);
            }
            msg.setHeader("Content-Transfer-Encoding", "quoted-printable");
            msg.setDataHandler(dh);
        }
    }
}
The Oracle PL/SQL wrapper for this looks like:
create or replace procedure spam_test_message (spam_id IN NUMBER)
as language java
name 'com.arsdigita.mail.SpamMessageComposer.composeHTMLMimeMessage(int)';
/
call spam_test_message(3677)
/
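Once the template is stored, a bulk mailer has to replace the %%_TO_ADDR_%% and %%_REPLY_TO_ADDR_%% placeholders for each recipient. The actual ACS bulk mailer is not shown in this article, so the following is only a sketch under the assumption that the placeholders appear verbatim in the stored message text; the class and method names are hypothetical.

```java
// Hypothetical sketch: fill in the per-recipient placeholders of a stored template.
class TemplateFill {

    static String fill(String template, String toAddr, String replyToAddr) {
        rejectLineBreaks(toAddr);
        rejectLineBreaks(replyToAddr);
        return template
            .replace("%%_TO_ADDR_%%", toAddr)
            .replace("%%_REPLY_TO_ADDR_%%", replyToAddr);
    }

    // A CR or LF inside an address would let it inject extra header lines.
    private static void rejectLineBreaks(String value) {
        if (value.indexOf('\r') >= 0 || value.indexOf('\n') >= 0) {
            throw new IllegalArgumentException("address contains a line break");
        }
    }

    public static void main(String[] args) {
        String template = "To: %%_TO_ADDR_%%\r\nReply-To: %%_REPLY_TO_ADDR_%%\r\n";
        System.out.print(fill(template, "Mary_Smith@foo.com", "info@bar.com"));
    }
}
```

Because the MIME structure is composed only once, the per-recipient work reduces to two string substitutions.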
International Character Set Encodings
Given that the Internet is a global community, you may want to send
email in character set encodings other than US-ASCII or ISO-8859-1.
For message body content, you should generally only need to add a charset
parameter to the
Content-Type header. However encoding of non-US charset info in the headers
can be somewhat more involved.
Consider this header, which encodes the subject field in two different
character sets. The MIME spec provides support for "encoded words"
for specifying character sets and encodings within strings in header
fields:
From: =?US-ASCII?Q?Keith_Moore?=
To: =?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?=
CC: =?ISO-8859-1?Q?Andr=E9?= Pirard
Subject: =?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?=
    =?ISO-8859-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?=
The headers above show examples of encoding strings in US-ASCII, ISO-8859-1, and ISO-8859-2,
using the Q (similar to quoted-printable) and B (base64) encodings.
For more information see [I18N-MAIL], i18n and Multilingual support in Internet mail, at http://www.terena.nl/multiling/ml-mua/mldoc-review.html.
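As an illustration of the B encoding, the sketch below builds an encoded word with the standard java.util.Base64 class. The class and method names are hypothetical, and the sketch does not split long text into several encoded words, although the MIME specification limits each encoded word to 75 characters.

```java
import java.nio.charset.Charset;
import java.util.Base64;

// Hypothetical sketch: build a "B" (base64) encoded word for a header field.
class EncodedWordSketch {

    static String encodeB(String text, String charsetName) {
        byte[] bytes = text.getBytes(Charset.forName(charsetName));
        return "=?" + charsetName + "?B?"
             + Base64.getEncoder().encodeToString(bytes) + "?=";
    }

    public static void main(String[] args) {
        // Reproduces the first half of the Subject header shown above.
        System.out.println("Subject: " + encodeB("If you can read this yo", "ISO-8859-1"));
    }
}
```

In production code, JavaMail's MimeUtility.encodeText performs this encoding and also handles the splitting of long headers.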
Analyzing What Went Wrong
When a user reports that the newsletter "is broken", it is often
remarkably difficult to figure out what is going on. Email readers
can do so much silent damage to a message when trying to display it
that it is often impossible to figure out what they are finally seeing
in their mail reader window. Many users have no idea how their email
works, and thus cannot describe to you a reasonable model of what may
be happening. They simply see something incomprehensible on their
screen. Other times the reports are somewhat succinct, and indicate
that the mail client refuses to launch a browser when a hyperlink is
clicked, indicating that at least the links are displaying, although
they may be corrupted in some way. At least with some of the webmail
services, it is easy to verify if they can correctly handle a MIME
HTML enclosure, whereas if someone is using Lotus Notes on a Windows
3.1 machine it is pretty hopeless trying to help them. The best thing
is to tell them to switch to the plain text version of the newsletter
(which you are providing, right?)
Perhaps not surprisingly, the greatest number of problems I have seen
have been on Microsoft Outlook and Exchange. This may be due to the fact that
the user base for these programs is larger than for other email clients, or it may be due to the non-robust
nature of Microsoft software, especially in relation to Internet standards.
To illustrate some of the difficulties of debugging email viewing
problems from users, here are some real-life examples of bug reports
you can expect to receive. These examples are the entire bug
report messages, not just excerpts. You can see how much debugging
information the typical user will include in their reports.
The users
often do forward a copy of the message with their mail, but it is
invariably so chewed up as to be practically unrecognizable. In
practically no cases have I ever gotten back a copy of the original
newsletter message that was viewable in its intended form. The implication
is that most email systems that cannot display the message will also
transform it in a destructive way if the user tries to forward it.
"I can never read your e-mails - is there some way to make them so I can
read them?"
"Why is the writing in the e-mail so small? Please enlarge the
articles printing."
"To whom it may concern, Unfortunately I'm unable to open your sites, that
you send me daily
Any assistance would be appreciated"
"Your email is coming out as HTML code.
Too bad because I was going to forward this to someone who may go to
Scotland this summer."
"For some reason, I'm not receiving this properly (see below)......."
"
Hi,
I tried to download the image & your links
don't work (any of them).
"
"Is this the way this is supposed to look?"
"Please advise....I've recieved your Daily Escape for months and months
through my email address at xxx@yyy.net and always recieved
beautiful and interesting photographs. HOWEVER, since I've switched to
ComuServe I re-registered with you for the Daily Escapes to be sent to my
new
email address at: xxx@xx.com and am not getting photos with the
Daiily Escape. Did I sign up incorrectly or ask for the wrong subscription?
Help please Thanks"
"I receive you e-mails with instructions to click on the underlined blue
highlighted words; however when I do, nothing happens. Are you aware of
this fact?
If you do not have the ability to transmit the appropriate communication
signals, please delete me from any further e-mails. Otherwise, I look
forward to your improved communications.
Thank you!"
"Hi. I have not been able to click on anything in the past few messages I
received from you. Certain things are underlined in blue or say click here
for more details but I can't. Is there anything you can do to help?
Thanks"
Often it
is next to impossible to figure out where the difficulty might be
arising. When trying to debug the situation, one approach is to ask the
user "Are there any other HTML newsletters which you receive correctly?"
The answer is often "no", in which case the problem is probably
their mail reader's inability to format HTML, rather than our MIME
encoding of the messages. Sometimes they say "yes", but it turns out they
are receiving mail that uses a subset of HTML with no images or no tables.
Some mail readers can format simple HTML, but not tables, inline
images, or other fancy features. This is an argument in favor of
using a simplified subset of HTML when composing your messages.
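One way to act on this advice is to check a message body against a conservative tag list before sending. The following is a minimal sketch in Python, not a tested recommendation: the SAFE_TAGS set and the unsupported_tags helper are hypothetical names chosen for illustration, and the particular tags listed are an assumption, since no authoritative list of "safe" tags exists.

```python
from html.parser import HTMLParser

# Hypothetical conservative subset, assumed for illustration only.
# Tables and inline images are deliberately left out.
SAFE_TAGS = {
    "html", "head", "title", "body", "p", "br", "hr",
    "a", "b", "i", "h1", "h2", "h3", "ul", "ol", "li",
}


class SubsetChecker(HTMLParser):
    """Collects every start tag that is not in SAFE_TAGS."""

    def __init__(self):
        super().__init__()
        self.unsupported = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        if tag not in SAFE_TAGS:
            self.unsupported.append(tag)


def unsupported_tags(html_text):
    """Return a sorted list of distinct tags outside the safe subset."""
    checker = SubsetChecker()
    checker.feed(html_text)
    checker.close()
    return sorted(set(checker.unsupported))
```

An empty result means the body stays within the assumed subset. A non-empty result lists the tags that some mail readers may fail to render.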
Examples
You can find some real-world examples of HTML format mail that I have
received at http://www.arsdigita.com/asj/mime/mime-examples/.
Note the wide range of MIME encoding features used (e.g.,
quoted-printable vs. 7bit, multipart vs. single part). It is hard to
say which of these formats is most likely to be readable on the
largest number of mail clients.
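To compare messages like these, it helps to list each MIME part with its content type and transfer encoding. The sketch below uses Python's standard email package. The describe_parts function and the SAMPLE message are made up for illustration and are not taken from the examples at the URL above.

```python
from email import message_from_string

# Made-up sample message, assumed for illustration only.
SAMPLE = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/alternative; boundary="xyz"\n'
    "\n"
    "--xyz\n"
    "Content-Type: text/plain; charset=us-ascii\n"
    "Content-Transfer-Encoding: 7bit\n"
    "\n"
    "Plain text version\n"
    "--xyz\n"
    "Content-Type: text/html; charset=iso-8859-1\n"
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "<p>HTML version =E9</p>\n"
    "--xyz--\n"
)


def describe_parts(raw_message):
    """Return (content type, transfer encoding) for every MIME part."""
    msg = message_from_string(raw_message)
    rows = []
    for part in msg.walk():
        # MIME treats a missing Content-Transfer-Encoding header as 7bit.
        encoding = part.get("Content-Transfer-Encoding", "7bit").lower()
        rows.append((part.get_content_type(), encoding))
    return rows


if __name__ == "__main__":
    for content_type, encoding in describe_parts(SAMPLE):
        print(content_type, encoding)
```

Running this over a saved copy of a newsletter that a user reports as readable shows at a glance whether it is multipart or single part, and which encoding each part uses.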
Final Notes
The use of HTML in email messages is not yet a universally
supported standard. Thus, you cannot hope to make something that uses
the latest whiz-bang HTML formatting and is reliably readable on every
mail client. So you have to ask, for a given feature set, what
percentage of messages arriving "unreadable" for customers is
acceptable: 1%? 0.1%? It is really a judgment call for the publisher. What
the world needs is a clearinghouse of client capabilities, so that
programmers can know which HTML subset is rendered acceptably on what
fraction of users' email clients. Without that knowledge, publishers
should be conservative about what they try to send as HTML.
References
[US-ASCII] Coded Character Set--7-Bit American Standard Code for
Information Interchange, ANSI X3.4-1986.
[ISO-2022] International Standard--Information Processing--ISO 7-bit
and 8-bit coded character sets--Code extension techniques, ISO
2022:1986.
[ISO-8859] Information Processing -- 8-bit Single-Byte Coded Graphic
Character Sets -- Part 1: Latin Alphabet No. 1, ISO 8859-1:1987. Part
2: Latin alphabet No. 2, ISO 8859-2, 1987. Part 3: Latin alphabet
No. 3, ISO 8859-3, 1988. Part 4: Latin alphabet No. 4, ISO 8859-4,
1988. Part 5: Latin/Cyrillic alphabet, ISO 8859-5, 1988. Part 6:
Latin/Arabic alphabet, ISO 8859-6, 1987. Part 7: Latin/Greek
alphabet, ISO 8859-7, 1987. Part 8: Latin/Hebrew alphabet, ISO
8859-8, 1988. Part 9: Latin alphabet No. 5, ISO 8859-9, 1990.
[ISO-646] International Standard--Information Processing--ISO 7-bit
coded character set for information interchange, ISO 646:1983.
[X400] Schicker, Pietro, "Message Handling Systems, X.400", Message
Handling Systems and Distributed Applications, E. Stefferud, O-j.
Jacobsen, and P. Schicker, eds., North-Holland, 1989, pp. 3-41.
[I18N-MAIL] (http://www.terena.nl/multiling/ml-mua/mldoc-review.html) Yuri Demchenko, TERENA,
"I18N and Multilingual support in Internet mail, Standards Overview",
Multilingual Mail Users Agents, TERENA Pilot Project Homepage: http://park.kiev.ua/multiling/ml-mua/
[RFC-821] (http://www.faqs.org/rfcs/rfc821.html) Postel, J., "Simple Mail Transfer Protocol", STD 10, RFC
821, USC/Information Sciences Institute, August 1982.
[RFC-822] (http://www.faqs.org/rfcs/rfc822.html) Crocker, D., "Standard for the Format of ARPA Internet Text
Messages", STD 11, RFC 822, UDEL, August 1982.
[RFC-934] (http://www.faqs.org/rfcs/rfc934.html) Rose, M., and E. Stefferud, "Proposed Standard for Message
Encapsulation", RFC 934, Delaware and NMA, January 1985.
[RFC-1049] (http://www.faqs.org/rfcs/rfc1049.html) Sirbu, M., "Content-Type Header Field for Internet
Messages", STD 11, RFC 1049, CMU, March 1988.
[RFC-1154] (http://www.faqs.org/rfcs/rfc1154.html) Robinson, D. and R. Ullmann, "Encoding Header Field for
Internet Messages", RFC 1154, Prime Computer, Inc., April 1990.
[RFC-1341] (http://www.faqs.org/rfcs/rfc1341.html) Borenstein, N., and N. Freed, "MIME (Multipurpose Internet
Mail Extensions): Mechanisms for Specifying and Describing the Format
of Internet Message Bodies", RFC 1341, Bellcore, Innosoft, June 1992.
[RFC-1342] (http://www.faqs.org/rfcs/rfc1342.html) Moore, K., "Representation of Non-Ascii Text in Internet
Message Headers", RFC 1342, University of Tennessee, June 1992.
[RFC-1343] (http://www.faqs.org/rfcs/rfc1343.html) Borenstein, N., "A User Agent Configuration Mechanism for
Multimedia Mail Format Information", RFC 1343, Bellcore, June 1992.
[RFC-1344] (http://www.faqs.org/rfcs/rfc1344.html) Borenstein, N., "Implications of MIME for Internet
Mail Gateways", RFC 1344, Bellcore, June 1992.
[RFC-1345] (http://www.faqs.org/rfcs/rfc1345.html) Simonsen, K., "Character Mnemonics & Character Sets",
RFC 1345, Rationel Almen Planlaegning, June 1992.
[RFC-1426] (http://www.faqs.org/rfcs/rfc1426.html) Klensin, J., (WG Chair), Freed, N., (Editor), Rose, M.,
Stefferud, E., and D. Crocker, "SMTP Service Extension for 8bit-MIME
transport", RFC 1426, United Nations University, Innosoft, Dover Beach
Consulting, Inc., Network Management Associates, Inc., The Branch
Office, February 1993.
[RFC-1521] (http://www.faqs.org/rfcs/rfc1521.html) Borenstein, N., and N. Freed, "MIME (Multipurpose Internet
Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format
of Internet Message Bodies", RFC 1521, Bellcore, Innosoft, September 1993.
[RFC-1522] (http://www.faqs.org/rfcs/rfc1522.html) Moore, K., "Representation of Non-Ascii Text in Internet
Message Headers", RFC 1522, University of Tennessee, September 1993.