Of the two methods of encoding 8-bit data as human-readable ASCII, uuencode was, for a time, the more popular. USENET 'binaries' groups were filled with uuencoded posts of whatever goodies were being shared. The format was quite robust and insensitive to line breaks (if your mail program reflowed the text, you could still decode a uuencoded file), and the uuencode/uudecode programs were quite user-friendly.
Base64 was not nearly so well liked. Some people would post base64-encoded binaries, arousing mild ire from those who didn't have decoders. It was sensitive to formatting and whitespace. I'm not entirely sure, but I think it generated slightly larger output too.
Then I was out of the loop for a time, and when I came back to the Unix and Linux world, uuencode was dead; wherever 7-bit encoding was still needed, Base64 ruled, and rules to this day.
What happened? What events led to base64 winning the format war?
asked Jun 1, 2017 at 14:40
I’m not sure about specific events, but I think the main reason Base64 “won” is that it’s one of the binary encodings supported by MIME, and MIME took over.
So perhaps the question then becomes two-fold:
- Why did MIME pick Base64 over uuencode? Possibly because Base64 is actually more resilient than uuencode: it only uses alphanumeric characters plus two others to encode content ("+" and "/" in MIME), and one character for padding ("="); see the sketch below the list.
- Why did MIME become the dominant mail/news content wrapper? I guess it boils down to convenience, especially once most MUAs and news agents supported it (ah, the days of slrn and Forte Agent…).
answered Jun 1, 2017 at 14:53
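For concreteness, here's a minimal sketch (Python standard library; the 1000-byte random payload is arbitrary) confirming that Base64 output never strays outside those 65 characters:

import base64, os

# Encode some random bytes and check every output character against the
# 64-character alphabet plus '=' padding.
encoded = base64.b64encode(os.urandom(1000)).decode('ascii')
alphabet = set('ABCDEFGHIJKLMNOPQRSTUVWXYZ'
               'abcdefghijklmnopqrstuvwxyz0123456789+/=')
assert set(encoded) <= alphabet
print(sorted(set(encoded)))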
answered Jun 1, 2017 at 14:53
Stephen Kitt
The problem with uuencode is that the format was not robust in the face of some of the really crufty mail software and gateways into and out of the proprietary non-SMTP and non-ASCII mail systems of the day. Just to liven things up further, there were multiple EBCDIC variants which had different code points for some ASCII characters used by uuencode, opening up another route for data corruption. For example, the character $ has code point 74 in code page 285 used in the UK, but code point 91 in code page 037 used in the USA.
This corruption would have been one of the driving forces behind the design of MIME, and its character set would have been carefully chosen to minimise problems with such gateways.
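That effect is easy to reproduce. Here's a minimal sketch (Python; using the cp037 and cp500 EBCDIC codecs as illustrative stand-ins for a mismatched gateway, since Python doesn't ship cp285) showing which uuencode-alphabet characters fail an ASCII-to-EBCDIC-to-ASCII round trip while the Base64 alphabet comes through intact:

# Simulate a gateway that encodes text with one EBCDIC code page but
# decodes it with another (cp037 vs cp500 here, purely for illustration).
def gateway(text, page_in='cp037', page_out='cp500'):
    return text.encode(page_in).decode(page_out)

uu_alphabet = ''.join(chr(c) for c in range(0x20, 0x60))  # uuencode uses ' '..'_'
b64_alphabet = ('ABCDEFGHIJKLMNOPQRSTUVWXYZ'
                'abcdefghijklmnopqrstuvwxyz0123456789+/=')

print([c for c in uu_alphabet if gateway(c) != c])   # ['!', '[', ']', '^']
print([c for c in b64_alphabet if gateway(c) != c])  # [] -- survives intact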
answered Jun 1, 2017 at 18:30
pndc
Base64 is slightly more compact as it does not use a character indicating line length at the beginning of each line:
% dd bs=1k count=1024 < /dev/urandom | uuencode /dev/stdout | wc -c
1444736
% dd bs=1k count=1024 < /dev/urandom | uuencode -m /dev/stdout | wc -c
1421440
Overall, Base64 output is about 1.6% more compact.
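The arithmetic behind that measurement, as a quick sketch (assuming the classic 45-input-bytes-per-line framing that both commands above use, and ignoring headers and the final short line):

data = 1024 * 1024                 # 1 MiB of input, as in the dd commands

# uuencode: 1 length character + 60 encoded characters + 1 newline per 45 bytes
uu_size = data / 45 * 62
# base64 (uuencode -m): 60 encoded characters + 1 newline per 45 bytes
b64_size = data / 45 * 61

print(uu_size, b64_size, uu_size / b64_size)  # ratio 62/61, about 1.6%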
answered Jun 1, 2017 at 15:03
Leo B.
One of the reasons base64 was disliked was that uuencode stored the original file name and file mode along with the encoded data, while base64 carried only the payload. Also, uuencode had been around longer and was more established, which meant that many people had a uudecode program available but did not have a base64 decoder. Keep in mind that at the time, many people were using systems that did not have a C compiler (the C compiler was often sold as an expensive add-on, if it was available at all, and this was before GCC was widely available), so acquiring and compiling their own base64 decoder was a significant effort.
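As a quick illustration of that framing difference, a minimal sketch (Python's binascii and base64 modules standing in for the uuencode and base64 tools; the file name and 644 mode are invented for the example):

import binascii, base64

payload = b'hello world\n'

# uuencode framing carries a mode and file name; each body line starts
# with a length character.
body = binascii.b2a_uu(payload).decode('ascii')
print('begin 644 hello.txt\n%s`\nend' % body)

# Base64 carries only the payload -- no name, no mode.
print(base64.b64encode(payload).decode('ascii'))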
But in certain contexts you didn't need a file name or mode (e.g. inline encoding of the body of an email message), and uuencoded data was particularly vulnerable to corruption because at that time it was not uncommon for a mail gateway somewhere along the line to insert an unwanted newline somewhere within your message, or for character set translation to corrupt something. The extra newlines were usually easy to fix, and the uuencode format made it easy to see where they had occurred, but corruption due to character set translation was much harder to fix (sometimes impossible without trial-and-error testing). Base64 encoding solved these problems and was therefore a better choice for use within the MIME email encoding standard.
The decline in popularity of terminal-mode access compared to GUI access is what really killed uuencode. Users of graphical email clients on a PC, workstation, or X terminal found base64-encoded MIME attachments more convenient than uuencoding, and web browsers allowed you to download files without needing any encoding at all (shifting the common method of binary file transfer away from mail and news, towards FTP and HTTP instead). Uuencoding is still an easy way to send a file when both the sender and receiver are using text-only terminals and can't use FTP, but today this is almost never the case.
answered Jun 2, 2017 at 13:52
Ken Gober
I can't definitively say whether it is cause or effect, so I am somewhat chancing my arm by promoting this to an answer, but: the only way of putting arbitrary binary data directly into a data URL (i.e. one that carries the data within itself†) is as base64.
Since virtually every moderately substantial application environment supports URLs, even environments with no explicit Base64 encoding or decoding support can at least decode, simply by forming the data URL. So it's really easy for developers to support.
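For illustration (a made-up example, not the icon mentioned below): pasting data:text/plain;base64,SGVsbG8sIFdvcmxkIQ== into a browser's address bar decodes to the text "Hello, World!" entirely client-side, with no remote resource involved.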
Therefore I think base64's usage in URLs may have contributed to its ascension, in the same way that its use in the IBM PC helped the x86 — it's not where the thing came from or why it was designed, but it led to a substantial propagation.
† e.g. this tiny document icon that I cribbed from this site, which doesn't identify a remote resource but itself contains a local resource. You might need to copy and paste it into your browser bar if yours is anything like mine, as trying to follow it like a link from here inevitably leads to the error that it's not a functioning link. Which is the point.
answered Jun 1, 2017 at 16:10
Tommy
This may also have a history in multi-byte versus Unicode characters. Multi-byte encodings were a stop-gap for what Unicode now provides, back when uuencode was invented. Supporting multi-byte is not really needed unless you have some really old backups.
answered Dec 29, 2020 at 20:59