1 \input texinfo @c -*-texinfo-*-
3 @setfilename emacs-mime
4 @settitle Emacs MIME Manual
9 @c * Emacs MIME: (emacs-mime). The MIME de/composition library.
14 @setchapternewpage odd
18 This file documents the Emacs MIME interface functionality.
20 Copyright (C) 1996 Free Software Foundation, Inc.
22 Permission is granted to make and distribute verbatim copies of
23 this manual provided the copyright notice and this permission notice
24 are preserved on all copies.
Permission is granted to process this file through TeX and print the
results, provided the printed document carries copying permission
notice identical to this one except for the removal of this paragraph
(this paragraph not being relevant to the printed manual).
33 Permission is granted to copy and distribute modified versions of this
34 manual under the conditions for verbatim copying, provided also that the
35 entire resulting derived work is distributed under the terms of a
36 permission notice identical to this one.
38 Permission is granted to copy and distribute translations of this manual
39 into another language, under the above conditions for modified versions.
45 @title Emacs MIME Manual
47 @author by Lars Magne Ingebrigtsen
50 @vskip 0pt plus 1filll
51 Copyright @copyright{} 1998 Free Software Foundation, Inc.
53 Permission is granted to make and distribute verbatim copies of
54 this manual provided the copyright notice and this permission notice
55 are preserved on all copies.
57 Permission is granted to copy and distribute modified versions of this
58 manual under the conditions for verbatim copying, provided that the
59 entire resulting derived work is distributed under the terms of a
60 permission notice identical to this one.
62 Permission is granted to copy and distribute translations of this manual
63 into another language, under the above conditions for modified versions.
This manual documents the libraries used to compose and display
@sc{mime} messages.

This is not a manual meant for users; it's a manual directed at people
who want to write functions and commands that manipulate @sc{mime}
messages.
@sc{mime} is short for @dfn{Multipurpose Internet Mail Extensions}.
This standard is documented in a number of RFCs; mainly RFC2045 (Format
of Internet Message Bodies), RFC2046 (Media Types), RFC2047 (Message
Header Extensions for Non-ASCII Text), RFC2048 (Registration
Procedures), RFC2049 (Conformance Criteria and Examples). It is highly
recommended that anyone who intends to write @sc{mime}-compliant software
read at least RFC2045 and RFC2047.
89 * Interface Functions:: An abstraction over the basic functions.
90 * Basic Functions:: Utility and basic parsing functions.
91 * Decoding and Viewing:: A framework for decoding and viewing.
92 * Standards:: A summary of RFCs and working documents used.
93 * Index:: Function and variable index.
97 @node Interface Functions
98 @chapter Interface Functions
99 @cindex interface functions
102 The @code{mail-parse} library is an abstraction over the actual
103 low-level libraries that are described in the next chapter.
105 Standards change, and so programs have to change to fit in the new
106 mold. For instance, RFC2045 describes a syntax for the
107 @code{Content-Type} header that only allows ASCII characters in the
108 parameter list. RFC2231 expands on RFC2045 syntax to provide a scheme
109 for continuation headers and non-ASCII characters.
The traditional way to deal with this is just to update the library
functions to parse the new syntax. However, this is sometimes the wrong
thing to do. In some instances it may be vital to be able to understand
both the old syntax and the new syntax, and if there is only one
library, one must choose between the old version of the library and the
new version of the library.
The Emacs MIME library takes a different tack. It defines a series of
low-level libraries (@file{rfc2047.el}, @file{rfc2231.el} and so on)
that parse strictly according to the corresponding standard. However,
normal programs would not use the functions provided by these libraries
directly, but instead use the functions provided by the
@code{mail-parse} library. The functions in this library are just
aliases to the corresponding functions in the latest low-level
libraries. Using this scheme, programs get a consistent interface they
can use, and library developers are free to write code that
handles new standards.
129 The following functions are defined by this library:
132 @item mail-header-parse-content-type
133 @findex mail-header-parse-content-type
Parse a @code{Content-Type} header and return a list in the following
format:
("type/subtype"
 (attribute1 . value1)
 (attribute2 . value2)
 ...)
147 (mail-header-parse-content-type
148 "image/gif; name=\"b980912.gif\"")
149 @result{} ("image/gif" (name . "b980912.gif"))
152 @item mail-header-parse-content-disposition
153 @findex mail-header-parse-content-disposition
Parse a @code{Content-Disposition} header and return a list in the same
format as the function above.
157 @item mail-content-type-get
158 @findex mail-content-type-get
Take two parameters, a list in the format above and an attribute, and
return the value of the attribute.
163 (mail-content-type-get
164 '("image/gif" (name . "b980912.gif")) 'name)
165 @result{} "b980912.gif"
168 @item mail-header-remove-comments
169 @findex mail-header-remove-comments
170 Return a comment-free version of a header.
173 (mail-header-remove-comments
174 "Gnus/5.070027 (Pterodactyl Gnus v0.27) (Finnish Landrace)")
175 @result{} "Gnus/5.070027 "
178 @item mail-header-remove-whitespace
179 @findex mail-header-remove-whitespace
180 Remove linear white space from a header. Space inside quoted strings
181 and comments is preserved.
184 (mail-header-remove-whitespace
185 "image/gif; name=\"Name with spaces\"")
186 @result{} "image/gif;name=\"Name with spaces\""
189 @item mail-header-get-comment
190 @findex mail-header-get-comment
191 Return the last comment in a header.
194 (mail-header-get-comment
195 "Gnus/5.070027 (Pterodactyl Gnus v0.27) (Finnish Landrace)")
196 @result{} "Finnish Landrace"
199 @item mail-header-parse-address
200 @findex mail-header-parse-address
Parse an address and return a list containing the mailbox and the
plain text name.
205 (mail-header-parse-address
206 "Hrvoje Niksic <hniksic@@srce.hr>")
207 @result{} ("hniksic@@srce.hr" . "Hrvoje Niksic")
210 @item mail-header-parse-addresses
211 @findex mail-header-parse-addresses
Parse a string with a list of addresses and return a list of elements like
the one described above.
216 (mail-header-parse-addresses
217 "Hrvoje Niksic <hniksic@@srce.hr>, Steinar Bang <sb@@metis.no>")
218 @result{} (("hniksic@@srce.hr" . "Hrvoje Niksic")
219 ("sb@@metis.no" . "Steinar Bang"))
222 @item mail-header-parse-date
223 @findex mail-header-parse-date
224 Parse a date string and return an Emacs time structure.
226 @item mail-narrow-to-head
227 @findex mail-narrow-to-head
228 Narrow the buffer to the header section of the buffer. Point is placed
229 at the beginning of the narrowed buffer.
231 @item mail-header-narrow-to-field
232 @findex mail-header-narrow-to-field
233 Narrow the buffer to the header under point.
235 @item mail-encode-encoded-word-region
236 @findex mail-encode-encoded-word-region
237 Encode the non-ASCII words in the region. For instance,
238 @samp{Naïve} is encoded as @samp{=?iso-8859-1?q?Na=EFve?=}.
240 @item mail-encode-encoded-word-buffer
241 @findex mail-encode-encoded-word-buffer
242 Encode the non-ASCII words in the current buffer. This function is
243 meant to be called narrowed to the headers of a message.
245 @item mail-encode-encoded-word-string
246 @findex mail-encode-encoded-word-string
247 Encode the words that need encoding in a string, and return the result.
250 (mail-encode-encoded-word-string
251 "This is naïve, baby")
252 @result{} "This is =?iso-8859-1?q?na=EFve,?= baby"
255 @item mail-decode-encoded-word-region
256 @findex mail-decode-encoded-word-region
257 Decode the encoded words in the region.
259 @item mail-decode-encoded-word-string
260 @findex mail-decode-encoded-word-string
261 Decode the encoded words in the string and return the result.
264 (mail-decode-encoded-word-string
265 "This is =?iso-8859-1?q?na=EFve,?= baby")
266 @result{} "This is naïve, baby"
Currently, @code{mail-parse} is an abstraction over @code{ietf-drums},
@code{rfc2047} and @code{rfc2231}. These are documented in the
next chapter.
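
As a minimal sketch of how these aliases fit together, the following
parses a @code{Content-Type} header and then looks up one of its
parameters. The header string is made up for illustration, and the
result shown assumes the parser returns the list format described
above.

```elisp
(require 'mail-parse)

;; Parse an illustrative Content-Type header into a list of the form
;; ("type/subtype" (attribute . value) ...).
(let ((ct (mail-header-parse-content-type
           "text/plain; charset=utf-8")))
  (list (car ct)                               ; the media type
        (mail-content-type-get ct 'charset)))  ; one parameter
;; => ("text/plain" "utf-8")
```

Because the calling code only names the @code{mail-parse} aliases, it
should not need to change if the underlying low-level library is
replaced.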
277 @node Basic Functions
278 @chapter Basic Functions
This chapter describes the basic, ground-level functions for parsing and
handling. Covered here are parsing @code{From} lines, removing comments
from header lines, decoding encoded words, parsing date headers and so
on. High-level functionality is dealt with in the next chapter
(@pxref{Decoding and Viewing}).
287 * rfc2231:: Parsing @code{Content-Type} headers.
288 * ietf-drums:: Handling mail headers defined by RFC822bis.
289 * rfc2047:: En/decoding encoded words in headers.
290 * time-date:: Functions for parsing dates and manipulating time.
291 * qp:: Quoted-Printable en/decoding.
292 * base64:: Base64 en/decoding.
293 * binhex:: Binhex decoding.
294 * uudecode:: Uuencode decoding.
295 * rfc1843:: Decoding HZ-encoded text.
296 * mailcap:: How parts are displayed is specified by the @file{.mailcap} file
RFC2231 defines a syntax for the @code{Content-Type} and
@code{Content-Disposition} headers. Its snappy name is @dfn{MIME
Parameter Value and Encoded Word Extensions: Character Sets, Languages,
and Continuations}.
308 In short, these headers look something like this:
Content-Type: application/x-stuff;
 title*0*=us-ascii'en'This%20is%20even%20more%20;
 title*1*=%2A%2A%2Afun%2A%2A%2A%20;
 title*2="isn't it!"
317 They usually aren't this bad, though.
319 The following functions are defined by this library:
322 @item rfc2231-parse-string
323 @findex rfc2231-parse-string
324 Parse a @code{Content-Type} header and return a list describing its
328 (rfc2231-parse-string
329 "application/x-stuff;
330 title*0*=us-ascii'en'This%20is%20even%20more%20;
331 title*1*=%2A%2A%2Afun%2A%2A%2A%20;
332 title*2=\"isn't it!\"")
333 @result{} ("application/x-stuff"
334 (title . "This is even more ***fun*** isn't it!"))
337 @item rfc2231-get-value
338 @findex rfc2231-get-value
Take one of the lists in the format above and return
the value of the specified attribute.
@dfn{drums} is an IETF working group that is working on the replacement
for RFC822.
351 The functions provided by this library include:
354 @item ietf-drums-remove-comments
355 @findex ietf-drums-remove-comments
356 Remove the comments from the argument and return the results.
358 @item ietf-drums-remove-whitespace
359 @findex ietf-drums-remove-whitespace
360 Remove linear white space from the string and return the results.
361 Spaces inside quoted strings and comments are left untouched.
363 @item ietf-drums-get-comment
364 @findex ietf-drums-get-comment
Return the last comment from the string.
@item ietf-drums-parse-address
@findex ietf-drums-parse-address
Parse an address string and return a list that contains the mailbox and
the plain text name.
372 @item ietf-drums-parse-addresses
373 @findex ietf-drums-parse-addresses
374 Parse a string that contains any number of comma-separated addresses and
375 return a list that contains mailbox/plain text pairs.
377 @item ietf-drums-parse-date
378 @findex ietf-drums-parse-date
379 Parse a date string and return an Emacs time structure.
381 @item ietf-drums-narrow-to-header
382 @findex ietf-drums-narrow-to-header
383 Narrow the buffer to the header section of the current buffer.
RFC2047 (Message Header Extensions for Non-ASCII Text) specifies how
non-ASCII text in headers is to be encoded. This is actually rather
complicated, so a number of variables are necessary to tweak what this
library does.
396 The following variables are tweakable:
399 @item rfc2047-default-charset
400 @vindex rfc2047-default-charset
401 Characters in this charset should not be decoded by this library.
402 This defaults to @code{iso-8859-1}.
@item rfc2047-header-encoding-alist
@vindex rfc2047-header-encoding-alist
406 This is an alist of header / encoding-type pairs. Its main purpose is
407 to prevent encoding of certain headers.
409 The keys can either be header regexps, or @code{t}.
The values can be either @code{nil}, in which case the header(s) in
question won't be encoded, or @code{mime}, which means that they will be
encoded.
415 @item rfc2047-charset-encoding-alist
416 @vindex rfc2047-charset-encoding-alist
417 RFC2047 specifies two forms of encoding---@code{Q} (a
418 Quoted-Printable-like encoding) and @code{B} (base64). This alist
419 specifies which charset should use which encoding.
421 @item rfc2047-encoding-function-alist
422 @vindex rfc2047-encoding-function-alist
423 This is an alist of encoding / function pairs. The encodings are
424 @code{Q}, @code{B} and @code{nil}.
426 @item rfc2047-q-encoding-alist
427 @vindex rfc2047-q-encoding-alist
The @code{Q} encoding isn't quite the same for all headers. Some
headers allow a narrower range of characters, and that is what this
variable is for. It's an alist of header regexps / allowable character
ranges.
433 @item rfc2047-encoded-word-regexp
434 @vindex rfc2047-encoded-word-regexp
435 When decoding words, this library looks for matches to this regexp.
Those were the variables, and these are the functions:
442 @item rfc2047-narrow-to-field
443 @findex rfc2047-narrow-to-field
444 Narrow the buffer to the header on the current line.
446 @item rfc2047-encode-message-header
447 @findex rfc2047-encode-message-header
448 Should be called narrowed to the header of a message. Encodes according
449 to @code{rfc2047-header-encoding-alist}.
451 @item rfc2047-encode-region
452 @findex rfc2047-encode-region
453 Encodes all encodable words in the region specified.
455 @item rfc2047-encode-string
456 @findex rfc2047-encode-string
457 Encode a string and return the results.
459 @item rfc2047-decode-region
460 @findex rfc2047-decode-region
461 Decode the encoded words in the region.
463 @item rfc2047-decode-string
464 @findex rfc2047-decode-string
465 Decode a string and return the results.
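
The decoding side can be sketched with the encoded word used earlier
in this manual. This is only an illustration; it assumes the
@code{iso-8859-1} charset is available, which should be the case in
any ordinary Emacs.

```elisp
(require 'rfc2047)

;; Decode a header value that consists of one Q-encoded word.
(rfc2047-decode-string "=?iso-8859-1?q?Na=EFve?=")
;; => "Naïve"

;; Plain ASCII text contains no encoded words and comes back unchanged.
(rfc2047-decode-string "Plain subject")
;; => "Plain subject"
```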
While not really a part of the @sc{mime} library, it is convenient to
document this library here. It deals with parsing @code{Date} headers
and manipulating time. (Not by using tesseracts, though, I'm sorry to
say.)
478 These functions convert between five formats: A date string, an Emacs
479 time structure, a decoded time list, a second number, and a day number.
481 The functions have quite self-explanatory names, so the following just
482 gives an overview of which functions are available.
485 (parse-time-string "Sat Sep 12 12:21:54 1998 +0200")
486 @result{} (54 21 12 12 9 1998 6 nil 7200)
488 (date-to-time "Sat Sep 12 12:21:54 1998 +0200")
489 @result{} (13818 19266)
491 (time-to-seconds '(13818 19266))
492 @result{} 905595714.0
494 (seconds-to-time 905595714.0)
495 @result{} (13818 19266 0)
497 (time-to-day '(13818 19266))
500 (days-to-time 729644)
501 @result{} (961933 65536)
503 (time-since '(13818 19266))
506 (time-less-p '(13818 19266) '(13818 19145))
509 (subtract-time '(13818 19266) '(13818 19145))
512 (days-between "Sat Sep 12 12:21:54 1998 +0200"
513 "Sat Sep 07 12:21:54 1998 +0200")
516 (date-leap-year-p 2000)
519 (time-to-day-in-year '(13818 19266))
524 And finally, we have @code{safe-date-to-time}, which does the same as
525 @code{date-to-time}, but returns a zero time if the date is
526 syntactically malformed.
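
As a small sketch of how these conversions chain together, the
following turns two date strings into Emacs time values, converts one
of them to seconds, and compares the two. The date strings are
illustrative; the second one is the same instant used in the overview
above.

```elisp
(require 'time-date)

(let ((earlier (date-to-time "Mon, 07 Sep 1998 12:21:54 +0200"))
      (later   (date-to-time "Sat, 12 Sep 1998 12:21:54 +0200")))
  (list (time-to-seconds later)        ; seconds since the epoch
        (time-less-p earlier later)))  ; is EARLIER before LATER?
;; => (905595714.0 t)
```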
533 This library deals with decoding and encoding Quoted-Printable text.
535 Very briefly explained, qp encoding means translating all 8-bit
536 characters (and lots of control characters) into things that look like
537 @samp{=EF}; that is, an equal sign followed by the byte encoded as a hex
540 The following functions are defined by the library:
543 @item quoted-printable-decode-region
544 @findex quoted-printable-decode-region
545 QP-decode all the encoded text in the specified region.
547 @item quoted-printable-decode-string
548 @findex quoted-printable-decode-string
549 Decode the QP-encoded text in a string and return the results.
551 @item quoted-printable-encode-region
552 @findex quoted-printable-encode-region
553 QP-encode all the encodable characters in the specified region. The third
554 optional parameter @var{fold} specifies whether to fold long lines.
555 (Long here means 72.)
557 @item quoted-printable-encode-string
558 @findex quoted-printable-encode-string
QP-encode all the encodable characters in a string and return the
results.
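
A minimal round trip through the string functions might look like
this. The input is chosen for illustration: the equal sign has to be
escaped because it introduces every encoded byte.

```elisp
(require 'qp)

;; "=" is itself encoded, since it starts every escape sequence.
(quoted-printable-encode-string "a=b")
;; => "a=3Db"

;; Decoding reverses the escape.
(quoted-printable-decode-string "a=3Db")
;; => "a=b"
```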
569 Base64 is an encoding that encodes three bytes into four characters,
570 thereby increasing the size by about 33%. The alphabet used for
571 encoding is very resistant to mangling during transit.
573 The following functions are defined by this library:
576 @item base64-encode-region
577 @findex base64-encode-region
578 base64 encode the selected region. Return the length of the encoded
579 text. Optional third argument @var{no-line-break} means do not break
580 long lines into shorter lines.
582 @item base64-encode-string
583 @findex base64-encode-string
584 base64 encode a string and return the result.
586 @item base64-decode-region
587 @findex base64-decode-region
588 base64 decode the selected region. Return the length of the decoded
589 text. If the region can't be decoded, return @code{nil} and don't
592 @item base64-decode-string
593 @findex base64-decode-string
594 base64 decode a string and return the result. If the string can't be
595 decoded, @code{nil} is returned.
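
The three-bytes-to-four-characters expansion is easy to see on short
strings. This sketch assumes a current Emacs, where the string
functions are built in and need no @code{require}.

```elisp
;; Five bytes do not divide evenly by three, so the output is padded
;; with "=".
(base64-encode-string "Hello")
;; => "SGVsbG8="

(base64-decode-string "SGVsbG8=")
;; => "Hello"

;; Six bytes become exactly eight characters.
(length (base64-encode-string "abcdef"))
;; => 8
```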
606 @code{binhex} is an encoding that originated in Macintosh environments.
607 The following function is supplied to deal with these:
610 @item binhex-decode-region
611 @findex binhex-decode-region
612 Decode the encoded text in the region. If given a third parameter, only
613 decode the @code{binhex} header and return the filename.
623 @code{uuencode} is probably still the most popular encoding of binaries
624 used on Usenet, although @code{base64} rules the mail world.
626 The following function is supplied by this package:
629 @item uudecode-decode-region
630 @findex uudecode-decode-region
631 Decode the text in the region.
641 RFC1843 deals with mixing Chinese and ASCII characters in messages. In
642 essence, RFC1843 switches between ASCII and Chinese by doing this:
645 This sentence is in ASCII.
646 The next sentence is in GB.~@{<:Ky2;S@{#,NpJ)l6HK!#~@}Bye.
649 Simple enough, and widely used in China.
651 The following functions are available to handle this encoding:
654 @item rfc1843-decode-region
655 Decode HZ-encoded text in the region.
657 @item rfc1843-decode-string
Decode an HZ-encoded string and return the result.
666 The @file{~/.mailcap} file is parsed by most @sc{mime}-aware message
667 handlers and describes how elements are supposed to be displayed.
668 Here's an example file:
672 audio/wav; wavplayer %s
This says that all image files should be displayed with @samp{xv}, and
that WAV audio files should be played by @samp{wavplayer}.
678 The @code{mailcap} library parses this file, and provides functions for
682 @item mailcap-mime-data
683 @vindex mailcap-mime-data
684 This variable is an alist of alists containing backup viewing rules.
691 @item mailcap-parse-mailcaps
692 @findex mailcap-parse-mailcaps
Parse the @file{~/.mailcap} file.
695 @item mailcap-mime-info
696 Takes a @sc{mime} type as its argument and returns the matching viewer.
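
Typical use is to parse the mailcap data once and then query it by
type. The following is only a sketch: the answer depends entirely on
the local configuration, so no result is shown.

```elisp
(require 'mailcap)

;; Read the mailcap files, then ask for a viewer for one MIME type.
;; Depending on the local setup the answer may be a shell command
;; template, a Lisp function, or nil if nothing matches.
(mailcap-parse-mailcaps)
(mailcap-mime-info "image/gif")
```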
703 @node Decoding and Viewing
704 @chapter Decoding and Viewing
706 This chapter deals with decoding and viewing @sc{mime} messages on a
709 The main idea is to first analyze a @sc{mime} article, and then allow
710 other programs to do things based on the list of @dfn{handles} that are
711 returned as a result of this analysis.
714 * Dissection:: Analyzing a @sc{mime} message.
715 * Handles:: Handle manipulations.
716 * Display:: Displaying handles.
@code{mm-dissect-buffer} is the function responsible for dissecting
a @sc{mime} article. If given a multipart message, it will recursively
descend the message, following the structure, and return a tree of
@sc{mime} handles that describes the structure of the message.
732 A @sc{mime} handle is a list that fully describes a @sc{mime}
735 The following macros can be used to access elements in a handle:
738 @item mm-handle-buffer
739 @findex mm-handle-buffer
Return the buffer that holds the contents of the undecoded @sc{mime}
part.
@item mm-handle-type
@findex mm-handle-type
745 Return the parsed @code{Content-Type} of the part.
747 @item mm-handle-encoding
748 @findex mm-handle-encoding
749 Return the @code{Content-Transfer-Encoding} of the part.
751 @item mm-handle-undisplayer
752 @findex mm-handle-undisplayer
Return the object that can be used to remove the displayed part (if it
has been displayed).
756 @item mm-handle-set-undisplayer
757 @findex mm-handle-set-undisplayer
758 Set the undisplayer object.
760 @item mm-handle-disposition
761 @findex mm-handle-disposition
762 Return the parsed @code{Content-Disposition} of the part.
@item mm-handle-description
@findex mm-handle-description
Return the description of the part.
@item mm-get-content-id
@findex mm-get-content-id
Return the handle(s) referred to by @code{Content-ID}.
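
Dissection and the handle accessors can be tried together on a small
message. This is a sketch under several assumptions: the message text
is made up, @code{example-dissect-html} is a hypothetical helper name,
@code{mm-destroy-parts} is used for cleanup as found in current Emacs,
and the behavior of @code{mm-dissect-buffer} on single-part messages
may differ between versions.

```elisp
(require 'mm-decode)

(defun example-dissect-html ()
  "Dissect a made-up single-part message and inspect its handle.
Hypothetical helper, for illustration only."
  (with-temp-buffer
    (insert "MIME-Version: 1.0\n"
            "Content-Type: text/html; charset=utf-8\n"
            "\n"
            "<p>Hello</p>\n")
    (let ((handle (mm-dissect-buffer)))
      (prog1 (list (car (mm-handle-type handle))        ; media type
                   (bufferp (mm-handle-buffer handle))) ; part contents
        ;; Release the buffer that holds the part.
        (mm-destroy-parts handle)))))

(example-dissect-html)
```

The first element of the returned list is the media type taken from
the parsed @code{Content-Type}; the second says whether the handle
carries a buffer with the undecoded part.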
777 Functions for displaying, removing and saving.
780 @item mm-display-part
781 @findex mm-display-part
@item mm-remove-part
@findex mm-remove-part
786 Remove the part (if it has been displayed).
@item mm-inlinable-p
@findex mm-inlinable-p
790 Say whether a @sc{mime} type can be displayed inline.
792 @item mm-automatic-display-p
793 @findex mm-automatic-display-p
794 Say whether a @sc{mime} type should be displayed automatically.
796 @item mm-destroy-part
797 @findex mm-destroy-part
798 Free all resources occupied by a part.
802 Offer to save the part in a file.
806 Offer to pipe the part to some process.
808 @item mm-interactively-view-part
809 @findex mm-interactively-view-part
810 Prompt for a mailcap method to use to view the part.
818 The Emacs @sc{mime} library implements handling of various elements
819 according to a (somewhat) large number of RFCs, drafts and standards
820 documents. This chapter lists the relevant ones. They can all be
821 fetched from @samp{http://www.stud.ifi.uio.no/~larsi/notes/}.
@item RFC822
Standard for the Format of ARPA Internet Text Messages.
@item RFC1036
Standard for Interchange of USENET Messages
@item RFC2045
Format of Internet Message Bodies
@item RFC2046
Media Types
@item RFC2047
Message Header Extensions for Non-ASCII Text
@item RFC2048
Registration Procedures
@item RFC2049
Conformance Criteria and Examples
@item RFC2231
MIME Parameter Value and Encoded Word Extensions: Character Sets,
Languages, and Continuations
@item RFC1843
HZ - A Data Format for Exchanging Files of Arbitrarily Mixed Chinese and
ASCII characters
@item draft-ietf-drums-msg-fmt-05.txt
Draft for the successor of RFC822
@item RFC1892
The Multipart/Report Content Type for the Reporting of Mail System
Administrative Messages