X-Git-Url: http://cgit.sxemacs.org/?a=blobdiff_plain;f=texi%2Femacs-mime.texi;h=7b5cce0c2660616c97944410980b5cf62f221fb2;hb=09aa9ca79e19fff4972a3e59dc18c7998873c65d;hp=c8a2f46439527d2b25fba30126bbc691dce36adc;hpb=fff2f025a5cd42a85ac87b86b6e0d74aaba47d68;p=gnus diff --git a/texi/emacs-mime.texi b/texi/emacs-mime.texi index c8a2f4643..7b5cce0c2 100644 --- a/texi/emacs-mime.texi +++ b/texi/emacs-mime.texi @@ -1,11 +1,11 @@ -\input texinfo @c -*-texinfo-*- +\input texinfo @setfilename emacs-mime @settitle Emacs MIME Manual @synindex fn cp @synindex vr cp @synindex pg cp -@dircategory Editors +@dircategory Emacs @direntry * Emacs MIME: (emacs-mime). The MIME de/composition library. @end direntry @@ -18,7 +18,8 @@ This file documents the Emacs MIME interface functionality. -Copyright (C) 1998,99,2000 Free Software Foundation, Inc. +Copyright (C) 1998, 1999, 2000, 2001, 2002, 2003 + Free Software Foundation, Inc. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or @@ -47,7 +48,8 @@ license to the document, as described in section 6 of the license. @page @vskip 0pt plus 1filll -Copyright @copyright{} 1998,99,2000 Free Software Foundation, Inc. +Copyright @copyright{} 1998, 1999, 2000, 2001, 2002, 2003 Free Software +Foundation, Inc. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or @@ -74,1190 +76,1640 @@ license to the document, as described in section 6 of the license. @top Emacs MIME This manual documents the libraries used to compose and display -@sc{mime} messages. +@acronym{MIME} messages. -This is not a manual meant for users; it's a manual directed at people -who want to write functions and commands that manipulate @sc{mime} -elements. 
+This manual is directed at users who want to modify the behaviour of +the @acronym{MIME} encoding/decoding process or want a more detailed +picture of how the Emacs @acronym{MIME} library works, and people who want +to write functions and commands that manipulate @acronym{MIME} elements. -@sc{mime} is short for @dfn{Multipurpose Internet Mail Extensions}. +@acronym{MIME} is short for @dfn{Multipurpose Internet Mail Extensions}. This standard is documented in a number of RFCs; mainly RFC2045 (Format of Internet Message Bodies), RFC2046 (Media Types), RFC2047 (Message -Header Extensions for Non-ASCII Text), RFC2048 (Registration +Header Extensions for Non-@acronym{ASCII} Text), RFC2048 (Registration Procedures), RFC2049 (Conformance Criteria and Examples). It is highly -recommended that anyone who intends writing @sc{mime}-compliant software +recommended that anyone who intends writing @acronym{MIME}-compliant software read at least RFC2045 and RFC2047. @menu +* Decoding and Viewing:: A framework for decoding and viewing. +* Composing:: @acronym{MML}; a language for describing @acronym{MIME} parts. * Interface Functions:: An abstraction over the basic functions. * Basic Functions:: Utility and basic parsing functions. -* Decoding and Viewing:: A framework for decoding and viewing. -* Composing:: MML; a language for describing MIME parts. * Standards:: A summary of RFCs and working documents used. * Index:: Function and variable index. @end menu -@node Interface Functions -@chapter Interface Functions -@cindex interface functions -@cindex mail-parse +@node Decoding and Viewing +@chapter Decoding and Viewing -The @code{mail-parse} library is an abstraction over the actual -low-level libraries that are described in the next chapter. +This chapter deals with decoding and viewing @acronym{MIME} messages on a +higher level. -Standards change, and so programs have to change to fit in the new -mold. 
For instance, RFC2045 describes a syntax for the -@code{Content-Type} header that only allows ASCII characters in the -parameter list. RFC2231 expands on RFC2045 syntax to provide a scheme -for continuation headers and non-ASCII characters. +The main idea is to first analyze a @acronym{MIME} article, and then allow +other programs to do things based on the list of @dfn{handles} that are +returned as a result of this analysis. -The traditional way to deal with this is just to update the library -functions to parse the new syntax. However, this is sometimes the wrong -thing to do. In some instances it may be vital to be able to understand -both the old syntax as well as the new syntax, and if there is only one -library, one must choose between the old version of the library and the -new version of the library. +@menu +* Dissection:: Analyzing a @acronym{MIME} message. +* Non-MIME:: Analyzing a non-@acronym{MIME} message. +* Handles:: Handle manipulations. +* Display:: Displaying handles. +* Display Customization:: Variables that affect display. +* Files and Directories:: Saving and naming attachments. +* New Viewers:: How to write your own viewers. +@end menu -The Emacs MIME library takes a different tack. It defines a series of -low-level libraries (@file{rfc2047.el}, @file{rfc2231.el} and so on) -that parses strictly according to the corresponding standard. However, -normal programs would not use the functions provided by these libraries -directly, but instead use the functions provided by the -@code{mail-parse} library. The functions in this library are just -aliases to the corresponding functions in the latest low-level -libraries. Using this scheme, programs get a consistent interface they -can use, and library developers are free to create write code that -handles new standards. 
-The following functions are defined by this library: +@node Dissection +@section Dissection -@table @code -@item mail-header-parse-content-type -@findex mail-header-parse-content-type -Parse a @code{Content-Type} header and return a list on the following -format: +@code{mm-dissect-buffer} is the function responsible for dissecting +a @acronym{MIME} article. If given a multipart message, it will recursively +descend the message, following the structure, and return a tree of +@acronym{MIME} handles that describes the structure of the message. + +@node Non-MIME +@section Non-MIME +@vindex mm-uu-configure-list + +Gnus also understands some non-@acronym{MIME} attachments, such as +postscript, uuencode, binhex, yenc, shar, forward, gnatsweb, pgp, +diff. Each of these features can be disabled by adding an item to +@code{mm-uu-configure-list}. For example, @lisp -("type/subtype" - (attribute1 . value1) - (attribute2 . value2) - ...) +(require 'mm-uu) +(add-to-list 'mm-uu-configure-list '(pgp-signed . disabled)) @end lisp -Here's an example: - -@example -(mail-header-parse-content-type - "image/gif; name=\"b980912.gif\"") -@result{} ("image/gif" (name . "b980912.gif")) -@end example +@table @code +@item postscript +@findex postscript +Postscript file. + +@item uu +@findex uu +Uuencoded file. + +@item binhex +@findex binhex +Binhex encoded file. + +@item yenc +@findex yenc +Yenc encoded file. + +@item shar +@findex shar +Shar archive file. + +@item forward +@findex forward +Non-@acronym{MIME} forwarded message. + +@item gnatsweb +@findex gnatsweb +Gnatsweb attachment. + +@item pgp-signed +@findex pgp-signed +@acronym{PGP} signed clear text. + +@item pgp-encrypted +@findex pgp-encrypted +@acronym{PGP} encrypted clear text. + +@item pgp-key +@findex pgp-key +@acronym{PGP} public keys. + +@item emacs-sources +@findex emacs-sources +@vindex mm-uu-emacs-sources-regexp +Emacs source code. This item works only in the groups matching +@code{mm-uu-emacs-sources-regexp}.
+ +@item diff +@vindex diff +@vindex mm-uu-diff-groups-regexp +Patches. This is intended for groups where diffs of committed files +are automatically sent to. It only works in groups matching +@code{mm-uu-diff-groups-regexp}. -@item mail-header-parse-content-disposition -@findex mail-header-parse-content-disposition -Parse a @code{Content-Disposition} header and return a list on the same -format as the function above. +@end table -@item mail-content-type-get -@findex mail-content-type-get -Takes two parameters---a list on the format above, and an attribute. -Returns the value of the attribute. +@node Handles +@section Handles -@example -(mail-content-type-get - '("image/gif" (name . "b980912.gif")) 'name) -@result{} "b980912.gif" -@end example +A @acronym{MIME} handle is a list that fully describes a @acronym{MIME} +component. -@item mail-header-encode-parameter -@findex mail-header-encode-parameter -Takes a parameter string and returns an encoded version of the string. -This is used for parameters in headers like @code{Content-Type} and -@code{Content-Disposition}. +The following macros can be used to access elements in a handle: -@item mail-header-remove-comments -@findex mail-header-remove-comments -Return a comment-free version of a header. +@table @code +@item mm-handle-buffer +@findex mm-handle-buffer +Return the buffer that holds the contents of the undecoded @acronym{MIME} +part. -@example -(mail-header-remove-comments - "Gnus/5.070027 (Pterodactyl Gnus v0.27) (Finnish Landrace)") -@result{} "Gnus/5.070027 " -@end example +@item mm-handle-type +@findex mm-handle-type +Return the parsed @code{Content-Type} of the part. -@item mail-header-remove-whitespace -@findex mail-header-remove-whitespace -Remove linear white space from a header. Space inside quoted strings -and comments is preserved. +@item mm-handle-encoding +@findex mm-handle-encoding +Return the @code{Content-Transfer-Encoding} of the part. 
-@example -(mail-header-remove-whitespace - "image/gif; name=\"Name with spaces\"") -@result{} "image/gif;name=\"Name with spaces\"" -@end example +@item mm-handle-undisplayer +@findex mm-handle-undisplayer +Return the object that can be used to remove the displayed part (if it +has been displayed). -@item mail-header-get-comment -@findex mail-header-get-comment -Return the last comment in a header. +@item mm-handle-set-undisplayer +@findex mm-handle-set-undisplayer +Set the undisplayer object. -@example -(mail-header-get-comment - "Gnus/5.070027 (Pterodactyl Gnus v0.27) (Finnish Landrace)") -@result{} "Finnish Landrace" -@end example +@item mm-handle-disposition +@findex mm-handle-disposition +Return the parsed @code{Content-Disposition} of the part. -@item mail-header-parse-address -@findex mail-header-parse-address -Parse an address and return a list containing the mailbox and the -plaintext name. +@item mm-handle-description +@findex mm-handle-description +Return the description of the part. -@example -(mail-header-parse-address - "Hrvoje Niksic <hniksic@@srce.hr>") -@result{} ("hniksic@@srce.hr" . "Hrvoje Niksic") -@end example +@item mm-get-content-id +Returns the handle(s) referred to by @code{Content-ID}. -@item mail-header-parse-addresses -@findex mail-header-parse-addresses -Parse a string with list of addresses and return a list of elements like -the one described above. +@end table -@example -(mail-header-parse-addresses - "Hrvoje Niksic <hniksic@@srce.hr>, Steinar Bang <sb@@metis.no>") -@result{} (("hniksic@@srce.hr" . "Hrvoje Niksic") - ("sb@@metis.no" . "Steinar Bang")) -@end example -@item mail-header-parse-date -@findex mail-header-parse-date -Parse a date string and return an Emacs time structure. +@node Display +@section Display -@item mail-narrow-to-head -@findex mail-narrow-to-head -Narrow the buffer to the header section of the buffer. Point is placed -at the beginning of the narrowed buffer. +Functions for displaying, removing and saving.
-@item mail-header-narrow-to-field -@findex mail-header-narrow-to-field -Narrow the buffer to the header under point. +@table @code +@item mm-display-part +@findex mm-display-part +Display the part. -@item mail-encode-encoded-word-region -@findex mail-encode-encoded-word-region -Encode the non-ASCII words in the region. For instance, -@samp{Naïve} is encoded as @samp{=?iso-8859-1?q?Na=EFve?=}. +@item mm-remove-part +@findex mm-remove-part +Remove the part (if it has been displayed). -@item mail-encode-encoded-word-buffer -@findex mail-encode-encoded-word-buffer -Encode the non-ASCII words in the current buffer. This function is -meant to be called narrowed to the headers of a message. +@item mm-inlinable-p +@findex mm-inlinable-p +Say whether a @acronym{MIME} type can be displayed inline. -@item mail-encode-encoded-word-string -@findex mail-encode-encoded-word-string -Encode the words that need encoding in a string, and return the result. +@item mm-automatic-display-p +@findex mm-automatic-display-p +Say whether a @acronym{MIME} type should be displayed automatically. -@example -(mail-encode-encoded-word-string - "This is naïve, baby") -@result{} "This is =?iso-8859-1?q?na=EFve,?= baby" -@end example +@item mm-destroy-part +@findex mm-destroy-part +Free all resources occupied by a part. -@item mail-decode-encoded-word-region -@findex mail-decode-encoded-word-region -Decode the encoded words in the region. +@item mm-save-part +@findex mm-save-part +Offer to save the part in a file. -@item mail-decode-encoded-word-string -@findex mail-decode-encoded-word-string -Decode the encoded words in the string and return the result. +@item mm-pipe-part +@findex mm-pipe-part +Offer to pipe the part to some process. -@example -(mail-decode-encoded-word-string - "This is =?iso-8859-1?q?na=EFve,?= baby") -@result{} "This is naïve, baby" -@end example +@item mm-interactively-view-part +@findex mm-interactively-view-part +Prompt for a mailcap method to use to view the part. 
@end table -Currently, @code{mail-parse} is an abstraction over @code{ietf-drums}, -@code{rfc2047}, @code{rfc2045} and @code{rfc2231}. These are documented -in the subsequent sections. - +@node Display Customization +@section Display Customization -@node Basic Functions -@chapter Basic Functions - -This chapter describes the basic, ground-level functions for parsing and -handling. Covered here is parsing @code{From} lines, removing comments -from header lines, decoding encoded words, parsing date headers and so -on. High-level functionality is dealt with in the next chapter -(@pxref{Decoding and Viewing}). +@table @code -@menu -* rfc2045:: Encoding @code{Content-Type} headers. -* rfc2231:: Parsing @code{Content-Type} headers. -* ietf-drums:: Handling mail headers defined by RFC822bis. -* rfc2047:: En/decoding encoded words in headers. -* time-date:: Functions for parsing dates and manipulating time. -* qp:: Quoted-Printable en/decoding. -* base64:: Base64 en/decoding. -* binhex:: Binhex decoding. -* uudecode:: Uuencode decoding. -* rfc1843:: Decoding HZ-encoded text. -* mailcap:: How parts are displayed is specified by the @file{.mailcap} file -@end menu +@item mm-inline-media-tests +@vindex mm-inline-media-tests +This is an alist where the key is a @acronym{MIME} type, the second element +is a function to display the part @dfn{inline} (i.e., inside Emacs), and +the third element is a form to be @code{eval}ed to say whether the part +can be displayed inline. +This variable specifies whether a part @emph{can} be displayed inline, +and, if so, how to do it. It does not say whether parts are +@emph{actually} displayed inline. -@node rfc2045 -@section rfc2045 +@item mm-inlined-types +@vindex mm-inlined-types +This, on the other hand, says what types are to be displayed inline, if +they satisfy the conditions set by the variable above. It's a list of +@acronym{MIME} media types. 
-RFC2045 is the ``main'' @sc{mime} document, and as such, one would -imagine that there would be a lot to implement. But there isn't, since -most of the implementation details are delegated to the subsequent -RFCs. +@item mm-automatic-display +@vindex mm-automatic-display +This is a list of types that are to be displayed ``automatically'', but +only if the above variable allows it. That is, only inlinable parts can +be displayed automatically. -So @file{rfc2045.el} has only a single function: +@item mm-automatic-external-display +@vindex mm-automatic-external-display +This is a list of types that will be displayed automatically in an +external viewer. -@table @code -@item rfc2045-encode-string -@findex rfc2045-encode-string -Takes a parameter and a value and returns a @samp{PARAM=VALUE} string. -@var{value} will be quoted if there are non-safe characters in it. -@end table +@item mm-keep-viewer-alive-types +@vindex mm-keep-viewer-alive-types +This is a list of media types for which the external viewer will not +be killed when selecting a different article. +@item mm-attachment-override-types +@vindex mm-attachment-override-types +Some @acronym{MIME} agents create parts that have a content-disposition of +@samp{attachment}. This variable allows overriding that disposition and +displaying the part inline. (Note that the disposition is only +overridden if we are able to, and want to, display the part inline.) -@node rfc2231 -@section rfc2231 +@item mm-discouraged-alternatives +@vindex mm-discouraged-alternatives +List of @acronym{MIME} types that are discouraged when viewing +@samp{multipart/alternative}. Viewing agents are supposed to view the +last possible part of a message, as that is supposed to be the richest. +However, users may prefer other types instead, and this list says what +types are most unwanted. 
If, for instance, @samp{text/html} parts are +very unwanted, and @samp{text/richtext} parts are somewhat unwanted, +you could say something like: -RFC2231 defines a syntax for the @code{Content-Type} and -@code{Content-Disposition} headers. Its snappy name is @dfn{MIME -Parameter Value and Encoded Word Extensions: Character Sets, Languages, -and Continuations}. +@lisp +(setq mm-discouraged-alternatives + '("text/html" "text/richtext") + mm-automatic-display + (remove "text/html" mm-automatic-display)) +@end lisp -In short, these headers look something like this: +@item mm-inline-large-images +@vindex mm-inline-large-images +When displaying inline images that are larger than the window, XEmacs +does not enable scrolling, which means that you cannot see the whole +image. To prevent this, the library tries to determine the image size +before displaying it inline, and if it doesn't fit the window, the +library will display it externally (e.g. with @samp{ImageMagick} or +@samp{xv}). Setting this variable to @code{t} disables this check and +makes the library display all inline images as inline, regardless of +their size. -@example -Content-Type: application/x-stuff; - title*0*=us-ascii'en'This%20is%20even%20more%20; - title*1*=%2A%2A%2Afun%2A%2A%2A%20; - title*2="isn't it!" -@end example +@item mm-inline-override-types +@vindex mm-inline-override-types +@code{mm-inlined-types} may include regular expressions, for example to +specify that all @samp{text/.*} parts be displayed inline. If a user +prefers to have a type that matches such a regular expression be treated +as an attachment, that can be accomplished by setting this variable to a +list containing that type. For example assuming @code{mm-inlined-types} +includes @samp{text/.*}, then including @samp{text/html} in this +variable will cause @samp{text/html} parts to be treated as attachments. -They usually aren't this bad, though. 
+@item mm-text-html-renderer +@vindex mm-text-html-renderer +This selects the function used to render @acronym{HTML}. The predefined +renderers are selected by the symbols @code{w3}, +@code{w3m}@footnote{See @uref{http://emacs-w3m.namazu.org/} for more +information about emacs-w3m}, @code{links}, @code{lynx}, +@code{w3m-standalone} or @code{html2text}. If @code{nil} use an +external viewer. You can also specify a function, which will be +called with a @acronym{MIME} handle as the argument. + +@item mm-inline-text-html-with-images +@vindex mm-inline-text-html-with-images +Some @acronym{HTML} mails use the spammers' trick of embedding +@samp{<img>} tags, most likely to verify whether you +have read the mail. You can prevent your personal information from +leaking by setting this option to @code{nil} (which is the default). +It is currently ignored by Emacs/w3. For emacs-w3m, you may use the +command @kbd{t} on the image anchor to show an image even if it is +@code{nil}.@footnote{The command @kbd{T} will load all images. If you +have set the option @code{w3m-key-binding} to @code{info}, use @kbd{i} +or @kbd{I} instead.} + +@item mm-w3m-safe-url-regexp +@vindex mm-w3m-safe-url-regexp +A regular expression that matches safe URL names, i.e. URLs that are +unlikely to leak personal information when rendering @acronym{HTML} +email (the default value is @samp{\\`cid:}). If @code{nil} consider +all URLs safe. + +@item mm-inline-text-html-with-w3m-keymap +@vindex mm-inline-text-html-with-w3m-keymap +You can use emacs-w3m command keys in the inlined text/html part by +setting this option to non-@code{nil}. The default value is @code{t}. + +@item mm-external-terminal-program +@vindex mm-external-terminal-program +The program used to start an external terminal.
-The following functions are defined by this library: +@end table + +@node Files and Directories +@section Files and Directories @table @code -@item rfc2231-parse-string -@findex rfc2231-parse-string -Parse a @code{Content-Type} header and return a list describing its -elements. -@example -(rfc2231-parse-string - "application/x-stuff; - title*0*=us-ascii'en'This%20is%20even%20more%20; - title*1*=%2A%2A%2Afun%2A%2A%2A%20; - title*2=\"isn't it!\"") -@result{} ("application/x-stuff" - (title . "This is even more ***fun*** isn't it!")) -@end example +@item mm-default-directory +@vindex mm-default-directory +The default directory for saving attachments. If @code{nil} use +@code{default-directory}. -@item rfc2231-get-value -@findex rfc2231-get-value -Takes one of the lists on the format above and returns -the value of the specified attribute. +@item mm-tmp-directory +@vindex mm-tmp-directory +Directory for storing temporary files. -@item rfc2231-encode-string -@findex rfc2231-encode-string -Encode a parameter in headers likes @code{Content-Type} and -@code{Content-Disposition}. +@item mm-file-name-rewrite-functions +@vindex mm-file-name-rewrite-functions +A list of functions used for rewriting file names of @acronym{MIME} +parts. Each function is applied successively to the file name. +Ready-made functions include +@table @code +@item mm-file-name-delete-control +@findex mm-file-name-delete-control +Delete all control characters. + +@item mm-file-name-delete-gotchas +@findex mm-file-name-delete-gotchas +Delete characters that could have unintended consequences when used +with flawed shell scripts, i.e. @samp{|}, @samp{>} and @samp{<}; and +@samp{-}, @samp{.} as the first character. + +@item mm-file-name-delete-whitespace +@findex mm-file-name-delete-whitespace +Remove all whitespace. + +@item mm-file-name-trim-whitespace +@findex mm-file-name-trim-whitespace +Remove leading and trailing whitespace. 
+ +@item mm-file-name-collapse-whitespace +@findex mm-file-name-collapse-whitespace +Collapse multiple whitespace characters. + +@item mm-file-name-replace-whitespace +@findex mm-file-name-replace-whitespace +@vindex mm-file-name-replace-whitespace +Replace whitespace with underscores. Set the variable +@code{mm-file-name-replace-whitespace} to any other string if you do +not like underscores. @end table +The standard Emacs functions @code{capitalize}, @code{downcase}, +@code{upcase} and @code{upcase-initials} might also prove useful. -@node ietf-drums -@section ietf-drums +@item mm-path-name-rewrite-functions +@vindex mm-path-name-rewrite-functions +List of functions used for rewriting the full file names of @acronym{MIME} +parts. This is used when viewing parts externally, and is meant for +transforming the absolute name so that non-compliant programs can find +the file where it's saved. -@dfn{drums} is an IETF working group that is working on the replacement -for RFC822. +@end table -The functions provided by this library include: +@node New Viewers +@section New Viewers -@table @code -@item ietf-drums-remove-comments -@findex ietf-drums-remove-comments -Remove the comments from the argument and return the results. +Here's an example viewer for displaying @code{text/enriched} inline: -@item ietf-drums-remove-whitespace -@findex ietf-drums-remove-whitespace -Remove linear white space from the string and return the results. -Spaces inside quoted strings and comments are left untouched. +@lisp +(defun mm-display-enriched-inline (handle) + (let (text) + (with-temp-buffer + (mm-insert-part handle) + (save-window-excursion + (enriched-decode (point-min) (point-max)) + (setq text (buffer-string)))) + (mm-insert-inline handle text))) +@end lisp -@item ietf-drums-get-comment -@findex ietf-drums-get-comment -Return the last most comment from the string. +We see that the function takes a @acronym{MIME} handle as its parameter. 
It +then goes to a temporary buffer, inserts the text of the part, does some +work on the text, stores the result, goes back to the buffer it was +called from and inserts the result. -@item ietf-drums-parse-address -@findex ietf-drums-parse-address -Parse an address string and return a list that contains the mailbox and -the plain text name. +The two important helper functions here are @code{mm-insert-part} and +@code{mm-insert-inline}. The first function inserts the text of the +handle in the current buffer. It handles charset and/or content +transfer decoding. The second function just inserts whatever text you +tell it to insert, but it also sets things up so that the text can be +``undisplayed'' in a convenient manner. -@item ietf-drums-parse-addresses -@findex ietf-drums-parse-addresses -Parse a string that contains any number of comma-separated addresses and -return a list that contains mailbox/plain text pairs. -@item ietf-drums-parse-date -@findex ietf-drums-parse-date -Parse a date string and return an Emacs time structure. +@node Composing +@chapter Composing +@cindex Composing +@cindex MIME Composing +@cindex MML +@cindex MIME Meta Language -@item ietf-drums-narrow-to-header -@findex ietf-drums-narrow-to-header -Narrow the buffer to the header section of the current buffer. +Creating a @acronym{MIME} message is boring and non-trivial. Therefore, +a library called @code{mml} has been defined that parses a language +called @acronym{MML} (@acronym{MIME} Meta Language) and generates +@acronym{MIME} messages. -@end table +@findex mml-generate-mime +The main interface function is @code{mml-generate-mime}. It will +examine the contents of the current (narrowed-to) buffer and return a +string containing the @acronym{MIME} message. +@menu +* Simple MML Example:: An example @acronym{MML} document. +* MML Definition:: All valid @acronym{MML} elements. +* Advanced MML Example:: Another example @acronym{MML} document. 
+* Encoding Customization:: Variables that affect encoding. +* Charset Translation:: How charsets are mapped from @sc{mule} to @acronym{MIME}. +* Conversion:: Going from @acronym{MIME} to @acronym{MML} and vice versa. +* Flowed text:: Soft and hard newlines. +@end menu -@node rfc2047 -@section rfc2047 -RFC2047 (Message Header Extensions for Non-ASCII Text) specifies how -non-ASCII text in headers are to be encoded. This is actually rather -complicated, so a number of variables are necessary to tweak what this -library does. +@node Simple MML Example +@section Simple MML Example -The following variables are tweakable: +Here's a simple @samp{multipart/alternative}: -@table @code -@item rfc2047-default-charset -@vindex rfc2047-default-charset -Characters in this charset should not be decoded by this library. -This defaults to @code{iso-8859-1}. +@example +<#multipart type=alternative> +This is a plain text part. +<#part type=text/enriched> +
<center>This is a centered enriched part</center>
+<#/multipart> +@end example -@item rfc2047-header-encoding-list -@vindex rfc2047-header-encoding-list -This is an alist of header / encoding-type pairs. Its main purpose is -to prevent encoding of certain headers. +After running this through @code{mml-generate-mime}, we get this: -The keys can either be header regexps, or @code{t}. +@example +Content-Type: multipart/alternative; boundary="=-=-=" -The values can be either @code{nil}, in which case the header(s) in -question won't be encoded, or @code{mime}, which means that they will be -encoded. -@item rfc2047-charset-encoding-alist -@vindex rfc2047-charset-encoding-alist -RFC2047 specifies two forms of encoding---@code{Q} (a -Quoted-Printable-like encoding) and @code{B} (base64). This alist -specifies which charset should use which encoding. +--=-=-= -@item rfc2047-encoding-function-alist -@vindex rfc2047-encoding-function-alist -This is an alist of encoding / function pairs. The encodings are -@code{Q}, @code{B} and @code{nil}. -@item rfc2047-q-encoding-alist -@vindex rfc2047-q-encoding-alist -The @code{Q} encoding isn't quite the same for all headers. Some -headers allow a narrower range of characters, and that is what this -variable is for. It's an alist of header regexps / allowable character -ranges. +This is a plain text part. -@item rfc2047-encoded-word-regexp -@vindex rfc2047-encoded-word-regexp -When decoding words, this library looks for matches to this regexp. +--=-=-= +Content-Type: text/enriched -@end table -Those were the variables, and these are this functions: +
<center>This is a centered enriched part</center>
-@table @code -@item rfc2047-narrow-to-field -@findex rfc2047-narrow-to-field -Narrow the buffer to the header on the current line. +--=-=-=-- +@end example -@item rfc2047-encode-message-header -@findex rfc2047-encode-message-header -Should be called narrowed to the header of a message. Encodes according -to @code{rfc2047-header-encoding-alist}. -@item rfc2047-encode-region -@findex rfc2047-encode-region -Encodes all encodable words in the region specified. +@node MML Definition +@section MML Definition -@item rfc2047-encode-string -@findex rfc2047-encode-string -Encode a string and return the results. +The @acronym{MML} language is very simple. It looks a bit like an SGML +application, but it's not. -@item rfc2047-decode-region -@findex rfc2047-decode-region -Decode the encoded words in the region. +The main concept of @acronym{MML} is the @dfn{part}. Each part can be of a +different type or use a different charset. The way to delineate a part +is with a @samp{<#part ...>} tag. Multipart parts can be introduced +with the @samp{<#multipart ...>} tag. Parts are ended by the +@samp{<#/part>} or @samp{<#/multipart>} tags. Parts started with the +@samp{<#part ...>} tags are also closed by the next open tag. -@item rfc2047-decode-string -@findex rfc2047-decode-string -Decode a string and return the results. +There's also the @samp{<#external ...>} tag, which introduces +@samp{message/external-body} parts. -@end table +Each tag can contain zero or more parameters of the form +@samp{parameter=value}. The values may be enclosed in quotation marks, +but that's not necessary unless the value contains white space. So +@samp{filename=/home/user/#hello$^yes} is perfectly valid. +The following parameters have meaning in @acronym{MML}; parameters that have no +meaning are ignored. The @acronym{MML} parameter names are the same as the +@acronym{MIME} parameter names; the name in parentheses says which +header each parameter is used in.
-@node time-date -@section time-date +@table @samp +@item type +The @acronym{MIME} type of the part (@code{Content-Type}). -While not really a part of the @sc{mime} library, it is convenient to -document this library here. It deals with parsing @code{Date} headers -and manipulating time. (Not by using tesseracts, though, I'm sorry to -say.) +@item filename +Use the contents of the file in the body of the part +(@code{Content-Disposition}). -These functions convert between five formats: A date string, an Emacs -time structure, a decoded time list, a second number, and a day number. +@item charset +The contents of the body of the part are to be encoded in the character +set specified (@code{Content-Type}). @xref{Charset Translation}. -The functions have quite self-explanatory names, so the following just -gives an overview of which functions are available. +@item name +Might be used to suggest a file name if the part is to be saved +to a file (@code{Content-Type}). -@example -(parse-time-string "Sat Sep 12 12:21:54 1998 +0200") -@result{} (54 21 12 12 9 1998 6 nil 7200) +@item disposition +Valid values are @samp{inline} and @samp{attachment} +(@code{Content-Disposition}). -(date-to-time "Sat Sep 12 12:21:54 1998 +0200") -@result{} (13818 19266) +@item encoding +Valid values are @samp{7bit}, @samp{8bit}, @samp{quoted-printable} and +@samp{base64} (@code{Content-Transfer-Encoding}). @xref{Charset +Translation}. -(time-to-seconds '(13818 19266)) -@result{} 905595714.0 +@item description +A description of the part (@code{Content-Description}). -(seconds-to-time 905595714.0) -@result{} (13818 19266 0) +@item creation-date +RFC822 date when the part was created (@code{Content-Disposition}). -(time-to-day '(13818 19266)) -@result{} 729644 +@item modification-date +RFC822 date when the part was modified (@code{Content-Disposition}). -(days-to-time 729644) -@result{} (961933 65536) +@item read-date +RFC822 date when the part was read (@code{Content-Disposition}). 
-(time-since '(13818 19266)) -@result{} (0 430) +@item recipients +Who to encrypt/sign the part to. This field is used to override any +auto-detection based on the To/CC headers. -(time-less-p '(13818 19266) '(13818 19145)) -@result{} nil +@item sender +Identity used to sign the part. This field is used to override the +default key used. -(subtract-time '(13818 19266) '(13818 19145)) -@result{} (0 121) +@item size +The size (in octets) of the part (@code{Content-Disposition}). -(days-between "Sat Sep 12 12:21:54 1998 +0200" - "Sat Sep 07 12:21:54 1998 +0200") -@result{} 5 +@item sign +What technology to sign this @acronym{MML} part with (@code{smime}, @code{pgp} +or @code{pgpmime}) -(date-leap-year-p 2000) -@result{} t +@item encrypt +What technology to encrypt this @acronym{MML} part with (@code{smime}, +@code{pgp} or @code{pgpmime}) -(time-to-day-in-year '(13818 19266)) -@result{} 255 +@end table -@end example +Parameters for @samp{text/plain}: -And finally, we have @code{safe-date-to-time}, which does the same as -@code{date-to-time}, but returns a zero time if the date is -syntactically malformed. +@table @samp +@item format +Formatting parameter for the text, valid values include @samp{fixed} +(the default) and @samp{flowed}. Normally you do not specify this +manually, since it requires the textual body to be formatted in a +special way described in RFC 2646. @xref{Flowed text}. +@end table +Parameters for @samp{application/octet-stream}: +@table @samp +@item type +Type of the part; informal---meant for human readers +(@code{Content-Type}). +@end table -@node qp -@section qp +Parameters for @samp{message/external-body}: -This library deals with decoding and encoding Quoted-Printable text. +@table @samp +@item access-type +A word indicating the supported access mechanism by which the file may +be obtained. Values include @samp{ftp}, @samp{anon-ftp}, @samp{tftp}, +@samp{localfile}, and @samp{mailserver}. (@code{Content-Type}.) 
-Very briefly explained, qp encoding means translating all 8-bit -characters (and lots of control characters) into things that look like -@samp{=EF}; that is, an equal sign followed by the byte encoded as a hex -string. +@item expiration +The RFC822 date after which the file may no longer be fetched. +(@code{Content-Type}.) -The following functions are defined by the library: +@item size +The size (in octets) of the file. (@code{Content-Type}.) -@table @code -@item quoted-printable-decode-region -@findex quoted-printable-decode-region -QP-decode all the encoded text in the specified region. +@item permission +Valid values are @samp{read} and @samp{read-write} +(@code{Content-Type}). -@item quoted-printable-decode-string -@findex quoted-printable-decode-string -Decode the QP-encoded text in a string and return the results. +@end table -@item quoted-printable-encode-region -@findex quoted-printable-encode-region -QP-encode all the encodable characters in the specified region. The third -optional parameter @var{fold} specifies whether to fold long lines. -(Long here means 72.) +Parameters for @samp{sign=smime}: -@item quoted-printable-encode-string -@findex quoted-printable-encode-string -QP-encode all the encodable characters in a string and return the -results. +@table @samp -@end table +@item keyfile +File containing key and certificate for signer. +@end table -@node base64 -@section base64 -@cindex base64 +Parameters for @samp{encrypt=smime}: -Base64 is an encoding that encodes three bytes into four characters, -thereby increasing the size by about 33%. The alphabet used for -encoding is very resistant to mangling during transit. +@table @samp -The following functions are defined by this library: +@item certfile +File containing certificate for recipient. -@table @code -@item base64-encode-region -@findex base64-encode-region -base64 encode the selected region. Return the length of the encoded -text. 
Optional third argument @var{no-line-break} means do not break -long lines into shorter lines. +@end table -@item base64-encode-string -@findex base64-encode-string -base64 encode a string and return the result. -@item base64-decode-region -@findex base64-decode-region -base64 decode the selected region. Return the length of the decoded -text. If the region can't be decoded, return @code{nil} and don't -modify the buffer. +@node Advanced MML Example +@section Advanced MML Example -@item base64-decode-string -@findex base64-decode-string -base64 decode a string and return the result. If the string can't be -decoded, @code{nil} is returned. +Here's a complex multipart message. It's a @samp{multipart/mixed} that +contains many parts, one of which is a @samp{multipart/alternative}. -@end table +@example +<#multipart type=mixed> +<#part type=image/jpeg filename=~/rms.jpg disposition=inline> +<#multipart type=alternative> +This is a plain text part. +<#part type=text/enriched name=enriched.txt> +
<center>This is a centered enriched part</center>
+<#/multipart> +This is a new plain text part. +<#part disposition=attachment> +This plain text part is an attachment. +<#/multipart> +@end example +And this is the resulting @acronym{MIME} message: -@node binhex -@section binhex -@cindex binhex -@cindex Apple -@cindex Macintosh +@example +Content-Type: multipart/mixed; boundary="=-=-=" -@code{binhex} is an encoding that originated in Macintosh environments. -The following function is supplied to deal with these: -@table @code -@item binhex-decode-region -@findex binhex-decode-region -Decode the encoded text in the region. If given a third parameter, only -decode the @code{binhex} header and return the filename. +--=-=-= -@end table -@node uudecode -@section uudecode -@cindex uuencode -@cindex uudecode +--=-=-= +Content-Type: image/jpeg; + filename="~/rms.jpg" +Content-Disposition: inline; + filename="~/rms.jpg" +Content-Transfer-Encoding: base64 -@code{uuencode} is probably still the most popular encoding of binaries -used on Usenet, although @code{base64} rules the mail world. 
+/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRof +Hh0aHBwgJC4nICIsIxwcKDcpLDAxNDQ0Hyc5PTgyPC4zNDL/wAALCAAwADABAREA/8QAHwAA +AQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQR +BRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RF +RkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ip +qrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/9oACAEB +AAA/AO/rifFHjldNuGsrDa0qcSSHkA+gHrXKw+LtWLrMb+RgTyhbr+HSug07xNqV9fQtZrNI +AyiaE/NuBPOOOP0rvRNE880KOC8TbXXGCv1FPqjrF4LDR7u5L7SkTFT/ALWOP1xXgTuXfc7E +sx6nua6rwp4IvvEM8chCxWxOdzn7wz6V9AaB4S07w9p5itow0rDLSY5Pt9K43xO66P4xs71m +2QXiGCbA4yOVJ9+1aYORkdK434lyNH4ahCnG66VT9Nj15JFbPdX0MS43M4VQf5/yr2vSpLnw +5ZW8dlCZ8KFXjOPX0/mK6rSPEGt3Angu44fNEReHYNvIH3TzXDeKNO8RX+kSX2ouZkicTIOc +L+g7E810ulFjpVtv3bwgB3HJyK5L4quY/C9sVxk3ij/xx6850u7t1mtp/wDlpEw3An3Jr3Dw +34gsbWza4nBlhC5LDsaW6+IFgupQyCF3iHH7gA7c9R9ay7zx6t7aX9jHC4smhfBkGCvHGfrm +tLQ7hbnRrV1GPkAP1x1/Hr+Ncr8Vzjwrbf8AX6v/AKA9eQRyYlQk8Yx9K6XTNbkgia2ciSIn +7p5Ga9Atte0LTLKO6it4i7dVRFJDcZ4PvXN+JvEMF9bILVGXJLSZ4zkjivRPDaeX4b08HOTC +pOffmua+KkbS+GLVUGT9tT/0B68eeIpIFYjB70+OOVXyoOM9+M1eaWeCLzHPyHGO/NVWvJJm +jQ8KGH1NfQWhXSXmh2c8eArRLwO3HSv/2Q== -The following function is supplied by this package: +--=-=-= +Content-Type: multipart/alternative; boundary="==-=-=" -@table @code -@item uudecode-decode-region -@findex uudecode-decode-region -Decode the text in the region. -@end table +--==-=-= -@node rfc1843 -@section rfc1843 -@cindex rfc1843 -@cindex HZ -@cindex Chinese -RFC1843 deals with mixing Chinese and ASCII characters in messages. In -essence, RFC1843 switches between ASCII and Chinese by doing this: +This is a plain text part. -@example -This sentence is in ASCII. -The next sentence is in GB.~@{<:Ky2;S@{#,NpJ)l6HK!#~@}Bye. -@end example +--==-=-= +Content-Type: text/enriched; + name="enriched.txt" -Simple enough, and widely used in China. -The following functions are available to handle this encoding: +
<center>This is a centered enriched part</center>
-@table @code -@item rfc1843-decode-region -Decode HZ-encoded text in the region. +--==-=-=-- -@item rfc1843-decode-string -Decode a HZ-encoded string and return the result. +--=-=-= -@end table +This is a new plain text part. +--=-=-= +Content-Disposition: attachment -@node mailcap -@section mailcap -The @file{~/.mailcap} file is parsed by most @sc{mime}-aware message -handlers and describes how elements are supposed to be displayed. -Here's an example file: +This plain text part is an attachment. -@example -image/*; gimp -8 %s -audio/wav; wavplayer %s +--=-=-=-- @end example -This says that all image files should be displayed with @code{gimp}, and -that realaudio files should be played by @code{rvplayer}. - -The @code{mailcap} library parses this file, and provides functions for -matching types. +@node Encoding Customization +@section Encoding Customization @table @code -@item mailcap-mime-data -@vindex mailcap-mime-data -This variable is an alist of alists containing backup viewing rules. -@end table - -Interface functions: +@item mm-body-charset-encoding-alist +@vindex mm-body-charset-encoding-alist +Mapping from @acronym{MIME} charset to encoding to use. This variable is +usually used except, e.g., when other requirements force a specific +encoding (digitally signed messages require 7bit encodings). The +default is -@table @code -@item mailcap-parse-mailcaps -@findex mailcap-parse-mailcaps -Parse the @code{~/.mailcap} file. +@lisp +((iso-2022-jp . 7bit) + (iso-2022-jp-2 . 7bit) + (utf-16 . base64) + (utf-16be . base64) + (utf-16le . base64)) +@end lisp -@item mailcap-mime-info -Takes a @sc{mime} type as its argument and returns the matching viewer. +As an example, if you do not want to have ISO-8859-1 characters +quoted-printable encoded, you may add @code{(iso-8859-1 . 8bit)} to +this variable. You can override this setting on a per-message basis +by using the @code{encoding} @acronym{MML} tag (@pxref{MML Definition}). 
+ +@item mm-coding-system-priorities +@vindex mm-coding-system-priorities +Prioritize coding systems to use for outgoing messages. The default +is @code{nil}, which means to use the defaults in Emacs. It is a list of +coding system symbols (aliases of coding systems do not work; use +@kbd{M-x describe-coding-system} to make sure you are not specifying +an alias in this variable). For example, if you have configured Emacs +to prefer UTF-8, but wish that outgoing messages should be sent in +ISO-8859-1 if possible, you can set this variable to +@code{(iso-latin-1)}. You can override this setting on a per-message +basis by using the @code{charset} @acronym{MML} tag (@pxref{MML Definition}). + +@item mm-content-transfer-encoding-defaults +@vindex mm-content-transfer-encoding-defaults +Mapping from @acronym{MIME} types to encoding to use. This variable is usually +used except, e.g., when other requirements force a safer encoding +(digitally signed messages require 7bit encoding). Besides the normal +@acronym{MIME} encodings, @code{qp-or-base64} may be used to indicate that for +each case the most efficient of quoted-printable and base64 should be +used. You can override this setting on a per-message basis by using +the @code{encoding} @acronym{MML} tag (@pxref{MML Definition}). + +@item mm-use-ultra-safe-encoding +@vindex mm-use-ultra-safe-encoding +When this is non-@code{nil}, it means that textual parts are encoded as +quoted-printable if they contain lines longer than 76 characters or +starting with "From " in the body. Non-7bit encodings (8bit, binary) +are generally disallowed. This reduces the probability that a non-8bit +clean MTA or MDA changes the message. This should never be set +directly, but bound by other functions when necessary (e.g., when +encoding messages that are to be digitally signed).
@end table +@node Charset Translation +@section Charset Translation +@cindex charsets +During translation from @acronym{MML} to @acronym{MIME}, for each +@acronym{MIME} part which has been composed inside Emacs, an appropriate +charset has to be chosen. +@vindex mail-parse-charset +If you are running a non-@sc{mule} Emacs, this process is simple: If the +part contains any non-@acronym{ASCII} (8-bit) characters, the @acronym{MIME} charset +given by @code{mail-parse-charset} (a symbol) is used. (Never set this +variable directly, though. If you want to change the default charset, +please consult the documentation of the package which you use to process +@acronym{MIME} messages. +@xref{Various Message Variables, , Various Message Variables, message, + Message Manual}, for example.) +If there are only @acronym{ASCII} characters, the @acronym{MIME} charset US-ASCII is +used, of course. -@node Decoding and Viewing -@chapter Decoding and Viewing +@cindex MULE +@cindex UTF-8 +@cindex Unicode +@vindex mm-mime-mule-charset-alist +Things are slightly more complicated when running Emacs with @sc{mule} +support. In this case, a list of the @sc{mule} charsets used in the +part is obtained, and the @sc{mule} charsets are translated to @acronym{MIME} +charsets by consulting the variable @code{mm-mime-mule-charset-alist}. +If this results in a single @acronym{MIME} charset, this is used to encode +the part. But if the resulting list of @acronym{MIME} charsets contains more +than one element, two things can happen: If it is possible to encode the +part via UTF-8, this charset is used. (For this, Emacs must support +the @code{utf-8} coding system, and the part must consist entirely of +characters which have Unicode counterparts.) If UTF-8 is not available +for some reason, the part is split into several ones, so that each one +can be encoded with a single @acronym{MIME} charset. 
The part can only be +split at line boundaries, though---if more than one @acronym{MIME} charset is +required to encode a single line, it is not possible to encode the part. -This chapter deals with decoding and viewing @sc{mime} messages on a -higher level. +When running Emacs with @sc{mule} support, the preference for which +coding system to use is inherited from Emacs itself. This means that +if Emacs is set up to prefer UTF-8, it will be used when encoding +messages. You can modify this by altering the +@code{mm-coding-system-priorities} variable though (@pxref{Encoding +Customization}). -The main idea is to first analyze a @sc{mime} article, and then allow -other programs to do things based on the list of @dfn{handles} that are -returned as a result of this analysis. +The charset to be used can be overridden by setting the @code{charset} +@acronym{MML} tag (@pxref{MML Definition}) when composing the message. -@menu -* Dissection:: Analyzing a @sc{mime} message. -* Handles:: Handle manipulations. -* Display:: Displaying handles. -* Customization:: Variables that affect display. -* New Viewers:: How to write your own viewers. -@end menu +The encoding of characters (quoted-printable, 8bit, etc.) is orthogonal +to the discussion here, and is controlled by the variables +@code{mm-body-charset-encoding-alist} and +@code{mm-content-transfer-encoding-defaults} (@pxref{Encoding +Customization}). +@node Conversion +@section Conversion -@node Dissection -@section Dissection +@findex mime-to-mml +A (multipart) @acronym{MIME} message can be converted to @acronym{MML} +with the @code{mime-to-mml} function. It works on the message in the +current buffer, and substitutes @acronym{MML} markup for @acronym{MIME} +boundaries. Non-textual parts do not have their contents in the buffer, +but instead have the contents in separate buffers that are referred to +from the @acronym{MML} tags. -The @code{mm-dissect-buffer} is the function responsible for dissecting -a @sc{mime} article.
If given a multipart message, it will recursively -descend the message, following the structure, and return a tree of -@sc{mime} handles that describes the structure of the message. +@findex mml-to-mime +An @acronym{MML} message can be converted back to @acronym{MIME} by the +@code{mml-to-mime} function. +These functions are in certain senses ``lossy''---you will not get back +an identical message if you run @code{mime-to-mml} and then +@code{mml-to-mime}. Not only will trivial things like the order of the +headers differ, but the contents of the headers may also be different. +For instance, the original message may use base64 encoding on text, +while @code{mml-to-mime} may decide to use quoted-printable encoding, and +so on. -@node Handles -@section Handles +In essence, however, these two functions should be the inverse of each +other. The resulting contents of the message should remain equivalent, +if not identical. -A @sc{mime} handle is a list that fully describes a @sc{mime} -component. -The following macros can be used to access elements in a handle: +@node Flowed text +@section Flowed text +@cindex format=flowed + +The Emacs @acronym{MIME} library will respect the @code{use-hard-newlines} +variable (@pxref{Hard and Soft Newlines, ,Hard and Soft Newlines, +emacs, Emacs Manual}) when encoding a message, and the +``format=flowed'' Content-Type parameter when decoding a message. + +On encoding text, regardless of @code{use-hard-newlines}, lines +terminated by soft newline characters are filled together and wrapped +after the column decided by @code{fill-flowed-encode-column}. +Quotation marks (matching @samp{^>* ?}) are respected. The variable +controls how the text will look in a client that does not support +flowed text, the default is to wrap after 66 characters. If hard +newline characters are not present in the buffer, no flow encoding +occurs. 
+ +On decoding flowed text, lines with soft newline characters are filled +together and wrapped after the column decided by +@code{fill-flowed-display-column}. The default is to wrap after +@code{fill-column}. + + + + +@node Interface Functions +@chapter Interface Functions +@cindex interface functions +@cindex mail-parse + +The @code{mail-parse} library is an abstraction over the actual +low-level libraries that are described in the next chapter. + +Standards change, and so programs have to change to fit in the new +mold. For instance, RFC2045 describes a syntax for the +@code{Content-Type} header that only allows @acronym{ASCII} characters in the +parameter list. RFC2231 expands on RFC2045 syntax to provide a scheme +for continuation headers and non-@acronym{ASCII} characters. + +The traditional way to deal with this is just to update the library +functions to parse the new syntax. However, this is sometimes the wrong +thing to do. In some instances it may be vital to be able to understand +both the old syntax and the new syntax, and if there is only one +library, one must choose between the old version of the library and the +new version of the library. + +The Emacs @acronym{MIME} library takes a different tack. It defines a +series of low-level libraries (@file{rfc2047.el}, @file{rfc2231.el} +and so on) that parse strictly according to the corresponding +standard. However, normal programs would not use the functions +provided by these libraries directly, but instead use the functions +provided by the @code{mail-parse} library. The functions in this +library are just aliases to the corresponding functions in the latest +low-level libraries. Using this scheme, programs get a consistent +interface they can use, and library developers are free to +write code that handles new standards.
+ +The following functions are defined by this library: @table @code -@item mm-handle-buffer -@findex mm-handle-buffer -Return the buffer that holds the contents of the undecoded @sc{mime} -part. +@item mail-header-parse-content-type +@findex mail-header-parse-content-type +Parse a @code{Content-Type} header and return a list on the following +format: -@item mm-handle-type -@findex mm-handle-type -Return the parsed @code{Content-Type} of the part. +@lisp +("type/subtype" + (attribute1 . value1) + (attribute2 . value2) + ...) +@end lisp -@item mm-handle-encoding -@findex mm-handle-encoding -Return the @code{Content-Transfer-Encoding} of the part. +Here's an example: -@item mm-handle-undisplayer -@findex mm-handle-undisplayer -Return the object that can be used to remove the displayed part (if it -has been displayed). +@example +(mail-header-parse-content-type + "image/gif; name=\"b980912.gif\"") +@result{} ("image/gif" (name . "b980912.gif")) +@end example + +@item mail-header-parse-content-disposition +@findex mail-header-parse-content-disposition +Parse a @code{Content-Disposition} header and return a list on the same +format as the function above. + +@item mail-content-type-get +@findex mail-content-type-get +Takes two parameters---a list on the format above, and an attribute. +Returns the value of the attribute. + +@example +(mail-content-type-get + '("image/gif" (name . "b980912.gif")) 'name) +@result{} "b980912.gif" +@end example + +@item mail-header-encode-parameter +@findex mail-header-encode-parameter +Takes a parameter string and returns an encoded version of the string. +This is used for parameters in headers like @code{Content-Type} and +@code{Content-Disposition}. + +@item mail-header-remove-comments +@findex mail-header-remove-comments +Return a comment-free version of a header. 
+ +@example +(mail-header-remove-comments + "Gnus/5.070027 (Pterodactyl Gnus v0.27) (Finnish Landrace)") +@result{} "Gnus/5.070027 " +@end example + +@item mail-header-remove-whitespace +@findex mail-header-remove-whitespace +Remove linear white space from a header. Space inside quoted strings +and comments is preserved. + +@example +(mail-header-remove-whitespace + "image/gif; name=\"Name with spaces\"") +@result{} "image/gif;name=\"Name with spaces\"" +@end example + +@item mail-header-get-comment +@findex mail-header-get-comment +Return the last comment in a header. + +@example +(mail-header-get-comment + "Gnus/5.070027 (Pterodactyl Gnus v0.27) (Finnish Landrace)") +@result{} "Finnish Landrace" +@end example + +@item mail-header-parse-address +@findex mail-header-parse-address +Parse an address and return a list containing the mailbox and the +plaintext name. + +@example +(mail-header-parse-address + "Hrvoje Niksic <hniksic@@srce.hr>") +@result{} ("hniksic@@srce.hr" . "Hrvoje Niksic") +@end example + +@item mail-header-parse-addresses +@findex mail-header-parse-addresses +Parse a string with list of addresses and return a list of elements like +the one described above. + +@example +(mail-header-parse-addresses + "Hrvoje Niksic <hniksic@@srce.hr>, Steinar Bang <sb@@metis.no>") +@result{} (("hniksic@@srce.hr" . "Hrvoje Niksic") + ("sb@@metis.no" . "Steinar Bang")) +@end example + +@item mail-header-parse-date +@findex mail-header-parse-date +Parse a date string and return an Emacs time structure. + +@item mail-narrow-to-head +@findex mail-narrow-to-head +Narrow the buffer to the header section of the buffer. Point is placed +at the beginning of the narrowed buffer. + +@item mail-header-narrow-to-field +@findex mail-header-narrow-to-field +Narrow the buffer to the header under point. Understands continuation +headers. + +@item mail-header-fold-field +@findex mail-header-fold-field +Fold the header under point. + +@item mail-header-unfold-field +@findex mail-header-unfold-field +Unfold the header under point.
+ +@item mail-header-field-value +@findex mail-header-field-value +Return the value of the field under point. + +@item mail-encode-encoded-word-region +@findex mail-encode-encoded-word-region +Encode the non-@acronym{ASCII} words in the region. For instance, +@samp{Naïve} is encoded as @samp{=?iso-8859-1?q?Na=EFve?=}. + +@item mail-encode-encoded-word-buffer +@findex mail-encode-encoded-word-buffer +Encode the non-@acronym{ASCII} words in the current buffer. This function is +meant to be called narrowed to the headers of a message. + +@item mail-encode-encoded-word-string +@findex mail-encode-encoded-word-string +Encode the words that need encoding in a string, and return the result. + +@example +(mail-encode-encoded-word-string + "This is naïve, baby") +@result{} "This is =?iso-8859-1?q?na=EFve,?= baby" +@end example + +@item mail-decode-encoded-word-region +@findex mail-decode-encoded-word-region +Decode the encoded words in the region. + +@item mail-decode-encoded-word-string +@findex mail-decode-encoded-word-string +Decode the encoded words in the string and return the result. + +@example +(mail-decode-encoded-word-string + "This is =?iso-8859-1?q?na=EFve,?= baby") +@result{} "This is naïve, baby" +@end example + +@end table + +Currently, @code{mail-parse} is an abstraction over @code{ietf-drums}, +@code{rfc2047}, @code{rfc2045} and @code{rfc2231}. These are documented +in the subsequent sections. + + + +@node Basic Functions +@chapter Basic Functions + +This chapter describes the basic, ground-level functions for parsing and +handling. Covered here is parsing @code{From} lines, removing comments +from header lines, decoding encoded words, parsing date headers and so +on. High-level functionality is dealt with in the next chapter +(@pxref{Decoding and Viewing}). + +@menu +* rfc2045:: Encoding @code{Content-Type} headers. +* rfc2231:: Parsing @code{Content-Type} headers. +* ietf-drums:: Handling mail headers defined by RFC822bis. 
+* rfc2047:: En/decoding encoded words in headers. +* time-date:: Functions for parsing dates and manipulating time. +* qp:: Quoted-Printable en/decoding. +* base64:: Base64 en/decoding. +* binhex:: Binhex decoding. +* uudecode:: Uuencode decoding. +* yenc:: Yenc decoding. +* rfc1843:: Decoding HZ-encoded text. +* mailcap:: How parts are displayed is specified by the @file{.mailcap} file +@end menu + + +@node rfc2045 +@section rfc2045 + +RFC2045 is the ``main'' @acronym{MIME} document, and as such, one would +imagine that there would be a lot to implement. But there isn't, since +most of the implementation details are delegated to the subsequent +RFCs. + +So @file{rfc2045.el} has only a single function: + +@table @code +@item rfc2045-encode-string +@findex rfc2045-encode-string +Takes a parameter and a value and returns a @samp{PARAM=VALUE} string. +@var{value} will be quoted if there are non-safe characters in it. +@end table + + +@node rfc2231 +@section rfc2231 + +RFC2231 defines a syntax for the @code{Content-Type} and +@code{Content-Disposition} headers. Its snappy name is @dfn{MIME +Parameter Value and Encoded Word Extensions: Character Sets, Languages, +and Continuations}. + +In short, these headers look something like this: + +@example +Content-Type: application/x-stuff; + title*0*=us-ascii'en'This%20is%20even%20more%20; + title*1*=%2A%2A%2Afun%2A%2A%2A%20; + title*2="isn't it!" +@end example + +They usually aren't this bad, though. + +The following functions are defined by this library: + +@table @code +@item rfc2231-parse-string +@findex rfc2231-parse-string +Parse a @code{Content-Type} header and return a list describing its +elements. + +@example +(rfc2231-parse-string + "application/x-stuff; + title*0*=us-ascii'en'This%20is%20even%20more%20; + title*1*=%2A%2A%2Afun%2A%2A%2A%20; + title*2=\"isn't it!\"") +@result{} ("application/x-stuff" + (title . 
"This is even more ***fun*** isn't it!")) +@end example + +@item rfc2231-get-value +@findex rfc2231-get-value +Takes one of the lists on the format above and returns +the value of the specified attribute. + +@item rfc2231-encode-string +@findex rfc2231-encode-string +Encode a parameter in headers likes @code{Content-Type} and +@code{Content-Disposition}. + +@end table + + +@node ietf-drums +@section ietf-drums + +@dfn{drums} is an IETF working group that is working on the replacement +for RFC822. + +The functions provided by this library include: + +@table @code +@item ietf-drums-remove-comments +@findex ietf-drums-remove-comments +Remove the comments from the argument and return the results. + +@item ietf-drums-remove-whitespace +@findex ietf-drums-remove-whitespace +Remove linear white space from the string and return the results. +Spaces inside quoted strings and comments are left untouched. -@item mm-handle-set-undisplayer -@findex mm-handle-set-undisplayer -Set the undisplayer object. +@item ietf-drums-get-comment +@findex ietf-drums-get-comment +Return the last most comment from the string. -@item mm-handle-disposition -@findex mm-handle-disposition -Return the parsed @code{Content-Disposition} of the part. +@item ietf-drums-parse-address +@findex ietf-drums-parse-address +Parse an address string and return a list that contains the mailbox and +the plain text name. -@item mm-handle-disposition -@findex mm-handle-disposition -Return the description of the part. +@item ietf-drums-parse-addresses +@findex ietf-drums-parse-addresses +Parse a string that contains any number of comma-separated addresses and +return a list that contains mailbox/plain text pairs. -@item mm-get-content-id -Returns the handle(s) referred to by @code{Content-ID}. +@item ietf-drums-parse-date +@findex ietf-drums-parse-date +Parse a date string and return an Emacs time structure. 
+ +@item ietf-drums-narrow-to-header +@findex ietf-drums-narrow-to-header +Narrow the buffer to the header section of the current buffer. @end table -@node Display -@section Display +@node rfc2047 +@section rfc2047 -Functions for displaying, removing and saving. +RFC2047 (Message Header Extensions for Non-@acronym{ASCII} Text) specifies how +non-@acronym{ASCII} text in headers is to be encoded. This is actually rather +complicated, so a number of variables are necessary to tweak what this +library does. + +The following variables are tweakable: @table @code -@item mm-display-part -@findex mm-display-part -Display the part. +@item rfc2047-default-charset +@vindex rfc2047-default-charset +Characters in this charset should not be decoded by this library. +This defaults to @code{iso-8859-1}. -@item mm-remove-part -@findex mm-remove-part -Remove the part (if it has been displayed). +@item rfc2047-header-encoding-alist +@vindex rfc2047-header-encoding-alist +This is an alist of header / encoding-type pairs. Its main purpose is +to prevent encoding of certain headers. -@item mm-inlinable-p -@findex mm-inlinable-p -Say whether a @sc{mime} type can be displayed inline. +The keys can either be header regexps, or @code{t}. -@item mm-automatic-display-p -@findex mm-automatic-display-p -Say whether a @sc{mime} type should be displayed automatically. +The values can be either @code{nil}, in which case the header(s) in +question won't be encoded, or @code{mime}, which means that they will be +encoded. -@item mm-destroy-part -@findex mm-destroy-part -Free all resources occupied by a part. +@item rfc2047-charset-encoding-alist +@vindex rfc2047-charset-encoding-alist +RFC2047 specifies two forms of encoding---@code{Q} (a +Quoted-Printable-like encoding) and @code{B} (base64). This alist +specifies which charset should use which encoding. -@item mm-save-part -@findex mm-save-part -Offer to save the part in a file.
+@item rfc2047-encoding-function-alist +@vindex rfc2047-encoding-function-alist +This is an alist of encoding / function pairs. The encodings are +@code{Q}, @code{B} and @code{nil}. -@item mm-pipe-part -@findex mm-pipe-part -Offer to pipe the part to some process. +@item rfc2047-q-encoding-alist +@vindex rfc2047-q-encoding-alist +The @code{Q} encoding isn't quite the same for all headers. Some +headers allow a narrower range of characters, and that is what this +variable is for. It's an alist of header regexps / allowable character +ranges. -@item mm-interactively-view-part -@findex mm-interactively-view-part -Prompt for a mailcap method to use to view the part. +@item rfc2047-encoded-word-regexp +@vindex rfc2047-encoded-word-regexp +When decoding words, this library looks for matches to this regexp. @end table - -@node Customization -@section Customization +Those were the variables, and these are the functions: @table @code +@item rfc2047-narrow-to-field +@findex rfc2047-narrow-to-field +Narrow the buffer to the header on the current line. -@item mm-inline-media-tests -This is an alist where the key is a @sc{mime} type, the second element -is a function to display the part @dfn{inline} (i.e., inside Emacs), and -the third element is a form to be @code{eval}ed to say whether the part -can be displayed inline. - -This variable specifies whether a part @emph{can} be displayed inline, -and, if so, how to do it. It does not say whether parts are -@emph{actually} displayed inline. +@item rfc2047-encode-message-header +@findex rfc2047-encode-message-header +Should be called narrowed to the header of a message. Encodes according +to @code{rfc2047-header-encoding-alist}. -@item mm-inlined-types -This, on the other hand, says what types are to be displayed inline, if -they satisfy the conditions set by the variable above. It's a list of -@sc{mime} media types. +@item rfc2047-encode-region +@findex rfc2047-encode-region +Encodes all encodable words in the region specified.
-@item mm-automatic-display -This is a list of types that are to be displayed ``automatically'', but -only if the above variable allows it. That is, only inlinable parts can -be displayed automatically. +@item rfc2047-encode-string +@findex rfc2047-encode-string +Encode a string and return the results. -@item mm-attachment-override-types -Some @sc{mime} agents create parts that have a content-disposition of -@samp{attachment}. This variable allows overriding that disposition and -displaying the part inline. (Note that the disposition is only -overridden if we are able to, and want to, display the part inline.) +@item rfc2047-decode-region +@findex rfc2047-decode-region +Decode the encoded words in the region. -@item mm-discouraged-alternatives -List of @sc{mime} types that are discouraged when viewing -@samp{multipart/alternative}. Viewing agents are supposed to view the -last possible part of a message, as that is supposed to be the richest. -However, users may prefer other types instead, and this list says what -types are most unwanted. If, for instance, @samp{text/html} parts are -very unwanted, and @samp{text/richtech} parts are somewhat unwanted, -then the value of this variable should be set to: +@item rfc2047-decode-string +@findex rfc2047-decode-string +Decode a string and return the results. -@lisp -("text/html" "text/richtext") -@end lisp +@end table -@item mm-inline-large-images-p -When displaying inline images that are larger than the window, XEmacs -does not enable scrolling, which means that you cannot see the whole -image. To prevent this, the library tries to determine the image size -before displaying it inline, and if it doesn't fit the window, the -library will display it externally (e.g. with @samp{ImageMagick} or -@samp{xv}). Setting this variable to @code{t} disables this check and -makes the library display all inline images as inline, regardless of -their size. 
-@item mm-inline-override-p -@code{mm-inlined-types} may include regular expressions, for example to -specify that all @samp{text/.*} parts be displayed inline. If a user -prefers to have a type that matches such a regular expression be treated -as an attachment, that can be accomplished by setting this variable to a -list containing that type. For example assuming @code{mm-inlined-types} -includes @samp{text/.*}, then including @samp{text/html} in this -variable will cause @samp{text/html} parts to be treated as attachments. +@node time-date +@section time-date -@end table +While not really a part of the @acronym{MIME} library, it is convenient to +document this library here. It deals with parsing @code{Date} headers +and manipulating time. (Not by using tesseracts, though, I'm sorry to +say.) +These functions convert between five formats: A date string, an Emacs +time structure, a decoded time list, a second number, and a day number. -@node New Viewers -@section New Viewers +Here's a bunch of time/date/second/day examples: -Here's an example viewer for displaying @code{text/enriched} inline: +@example +(parse-time-string "Sat Sep 12 12:21:54 1998 +0200") +@result{} (54 21 12 12 9 1998 6 nil 7200) -@lisp -(defun mm-display-enriched-inline (handle) - (let (text) - (with-temp-buffer - (mm-insert-part handle) - (save-window-excursion - (enriched-decode (point-min) (point-max)) - (setq text (buffer-string)))) - (mm-insert-inline handle text))) -@end lisp +(date-to-time "Sat Sep 12 12:21:54 1998 +0200") +@result{} (13818 19266) -We see that the function takes a @sc{mime} handle as its parameter. It -then goes to a temporary buffer, inserts the text of the part, does some -work on the text, stores the result, goes back to the buffer it was -called from and inserts the result. +(time-to-seconds '(13818 19266)) +@result{} 905595714.0 -The two important helper functions here are @code{mm-insert-part} and -@code{mm-insert-inline}. 
The first function inserts the text of the -handle in the current buffer. It handles charset and/or content -transfer decoding. The second function just inserts whatever text you -tell it to insert, but it also sets things up so that the text can be -``undisplayed' in a convenient manner. +(seconds-to-time 905595714.0) +@result{} (13818 19266 0) +(time-to-days '(13818 19266)) +@result{} 729644 -@node Composing -@chapter Composing -@cindex Composing -@cindex MIME Composing -@cindex MML -@cindex MIME Meta Language +(days-to-time 729644) +@result{} (961933 65536) -Creating a @sc{mime} message is boring and non-trivial. Therefore, a -library called @code{mml} has been defined that parses a language called -MML (@sc{mime} Meta Language) and generates @sc{mime} messages. +(time-since '(13818 19266)) +@result{} (0 430) -@findex mml-generate-mime -The main interface function is @code{mml-generate-mime}. It will -examine the contents of the current (narrowed-to) buffer and return a -string containing the @sc{mime} message. +(time-less-p '(13818 19266) '(13818 19145)) +@result{} nil -@menu -* Simple MML Example:: An example MML document. -* MML Definition:: All valid MML elements. -* Advanced MML Example:: Another example MML document. -* Charset Translation:: How charsets are mapped from @sc{mule} to MIME. -* Conversion:: Going from @sc{mime} to MML and vice versa. -@end menu +(subtract-time '(13818 19266) '(13818 19145)) +@result{} (0 121) +(days-between "Sat Sep 12 12:21:54 1998 +0200" + "Sat Sep 07 12:21:54 1998 +0200") +@result{} 5 -@node Simple MML Example -@section Simple MML Example +(date-leap-year-p 2000) +@result{} t -Here's a simple @samp{multipart/alternative}: +(time-to-day-in-year '(13818 19266)) +@result{} 255 -@example -<#multipart type=alternative> -This is a plain text part. -<#part type=text/enriched> -
This is a centered enriched part
-<#/multipart> +(time-to-number-of-days + (time-since + (date-to-time "Mon, 01 Jan 2001 02:22:26 GMT"))) +@result{} 4.146122685185185 @end example -After running this through @code{mml-generate-mime}, we get this: +And finally, we have @code{safe-date-to-time}, which does the same as +@code{date-to-time}, but returns a zero time if the date is +syntactically malformed. -@example -Content-Type: multipart/alternative; boundary="=-=-=" +The five data representations used are the following: +@table @var +@item date +An RFC822 (or similar) date string. For instance: @code{"Sat Sep 12 +12:21:54 1998 +0200"}. ---=-=-= +@item time +An internal Emacs time. For instance: @code{(13818 19266)}. +@item seconds +A floating point representation of the internal Emacs time. For +instance: @code{905595714.0}. -This is a plain text part. +@item days +An integer number representing the number of days since 00000101. For +instance: @code{729644}. ---=-=-= -Content-Type: text/enriched +@item decoded time +A list of decoded time. For instance: @code{(54 21 12 12 9 1998 6 t +7200)}. +@end table +All the examples above represent the same moment. -
This is a centered enriched part
+These are the functions available: ---=-=-=-- -@end example +@table @code +@item date-to-time +Take a date and return a time. +@item time-to-seconds +Take a time and return seconds. -@node MML Definition -@section MML Definition +@item seconds-to-time +Take seconds and return a time. -The MML language is very simple. It looks a bit like an SGML -application, but it's not. +@item time-to-days +Take a time and return days. -The main concept of MML is the @dfn{part}. Each part can be of a -different type or use a different charset. The way to delineate a part -is with a @samp{<#part ...>} tag. Multipart parts can be introduced -with the @samp{<#multipart ...>} tag. Parts are ended by the -@samp{<#/part>} or @samp{<#/multipart>} tags. Parts started with the -@samp{<#part ...>} tags are also closed by the next open tag. +@item days-to-time +Take days and return a time. -There's also the @samp{<#external ...>} tag. These introduce -@samp{external/message-body} parts. +@item date-to-day +Take a date and return days. -Each tag can contain zero or more parameters on the form -@samp{parameter=value}. The values may be enclosed in quotation marks, -but that's not necessary unless the value contains white space. So -@samp{filename=/home/user/#hello$^yes} is perfectly valid. +@item time-to-number-of-days +Take a time and return the number of days that it represents. + +@item safe-date-to-time +Take a date and return a time. If the date is not syntactically valid, +return a ``zero'' time. + +@item time-less-p +Take two times and say whether the first time is less (i.e., earlier) +than the second time. + +@item time-since +Take a time and return a time saying how long it was since that time. + +@item subtract-time +Take two times and subtract the second from the first. I.e., return +the time between the two times. -The following parameters have meaning in MML; parameters that have no -meaning are ignored.
The MML parameter names are the same as the -@sc{mime} parameter names; the things in the parentheses say which -header it will be used in. +@item days-between +Take two dates and return the number of days between those two dates. -@table @samp -@item type -The @sc{mime} type of the part (@code{Content-Type}). +@item date-leap-year-p +Take a year number and say whether it's a leap year. -@item filename -Use the contents of the file in the body of the part -(@code{Content-Disposition}). +@item time-to-day-in-year +Take a time and return the day number within the year that the time is +in. -@item charset -The contents of the body of the part are to be encoded in the character -set speficied (@code{Content-Type}). +@end table -@item name -Might be used to suggest a file name if the part is to be saved -to a file (@code{Content-Type}). -@item disposition -Valid values are @samp{inline} and @samp{attachment} -(@code{Content-Disposition}). +@node qp +@section qp -@item encoding -Valid values are @samp{7bit}, @samp{8bit}, @samp{quoted-printable} and -@samp{base64} (@code{Content-Transfer-Encoding}). +This library deals with decoding and encoding Quoted-Printable text. -@item description -A description of the part (@code{Content-Description}). +Very briefly explained, qp encoding means translating all 8-bit +characters (and lots of control characters) into things that look like +@samp{=EF}; that is, an equal sign followed by the byte encoded as a hex +string. -@item creation-date -RFC822 date when the part was created (@code{Content-Disposition}). +The following functions are defined by the library: -@item modification-date -RFC822 date when the part was modified (@code{Content-Disposition}). +@table @code +@item quoted-printable-decode-region +@findex quoted-printable-decode-region +QP-decode all the encoded text in the specified region. -@item read-date -RFC822 date when the part was read (@code{Content-Disposition}).
+@item quoted-printable-decode-string +@findex quoted-printable-decode-string +Decode the QP-encoded text in a string and return the results. -@item size -The size (in octets) of the part (@code{Content-Disposition}). +@item quoted-printable-encode-region +@findex quoted-printable-encode-region +QP-encode all the encodable characters in the specified region. The third +optional parameter @var{fold} specifies whether to fold long lines. +(Long here means 72.) + +@item quoted-printable-encode-string +@findex quoted-printable-encode-string +QP-encode all the encodable characters in a string and return the +results. @end table -Parameters for @samp{application/octet-stream}: -@table @samp -@item type -Type of the part; informal---meant for human readers -(@code{Content-Type}). -@end table +@node base64 +@section base64 +@cindex base64 -Parameters for @samp{message/external-body}: +Base64 is an encoding that encodes three bytes into four characters, +thereby increasing the size by about 33%. The alphabet used for +encoding is very resistant to mangling during transit. -@table @samp -@item access-type -A word indicating the supported access mechanism by which the file may -be obtained. Values include @samp{ftp}, @samp{anon-ftp}, @samp{tftp}, -@samp{localfile}, and @samp{mailserver}. (@code{Content-Type}.) +The following functions are defined by this library: -@item expiration -The RFC822 date after which the file may no longer be fetched. -(@code{Content-Type}.) +@table @code +@item base64-encode-region +@findex base64-encode-region +base64 encode the selected region. Return the length of the encoded +text. Optional third argument @var{no-line-break} means do not break +long lines into shorter lines. -@item size -The size (in octets) of the file. (@code{Content-Type}.) +@item base64-encode-string +@findex base64-encode-string +base64 encode a string and return the result. -@item permission -Valid values are @samp{read} and @samp{read-write} -(@code{Content-Type}). 
+@item base64-decode-region +@findex base64-decode-region +base64 decode the selected region. Return the length of the decoded +text. If the region can't be decoded, return @code{nil} and don't +modify the buffer. + +@item base64-decode-string +@findex base64-decode-string +base64 decode a string and return the result. If the string can't be +decoded, @code{nil} is returned. @end table -@node Advanced MML Example -@section Advanced MML Example +@node binhex +@section binhex +@cindex binhex +@cindex Apple +@cindex Macintosh -Here's a complex multipart message. It's a @samp{multipart/mixed} that -contains many parts, one of which is a @samp{multipart/alternative}. +@code{binhex} is an encoding that originated in Macintosh environments. +The following function is supplied to deal with these: -@example -<#multipart type=mixed> -<#part type=image/jpeg filename=~/rms.jpg disposition=inline> -<#multipart type=alternative> -This is a plain text part. -<#part type=text/enriched name=enriched.txt> -
This is a centered enriched part
-<#/multipart> -This is a new plain text part. -<#part disposition=attachment> -This plain text part is an attachment. -<#/multipart> -@end example +@table @code +@item binhex-decode-region +@findex binhex-decode-region +Decode the encoded text in the region. If given a third parameter, only +decode the @code{binhex} header and return the filename. -And this is the resulting @sc{mime} message: +@end table -@example -Content-Type: multipart/mixed; boundary="=-=-=" +@node uudecode +@section uudecode +@cindex uuencode +@cindex uudecode +@code{uuencode} is probably still the most popular encoding of binaries +used on Usenet, although @code{base64} rules the mail world. ---=-=-= +The following function is supplied by this package: +@table @code +@item uudecode-decode-region +@findex uudecode-decode-region +Decode the text in the region. +@end table ---=-=-= -Content-Type: image/jpeg; - filename="~/rms.jpg" -Content-Disposition: inline; - filename="~/rms.jpg" -Content-Transfer-Encoding: base64 +@node yenc +@section yenc +@cindex yenc -/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRof -Hh0aHBwgJC4nICIsIxwcKDcpLDAxNDQ0Hyc5PTgyPC4zNDL/wAALCAAwADABAREA/8QAHwAA -AQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQR -BRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RF -RkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ip -qrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/9oACAEB -AAA/AO/rifFHjldNuGsrDa0qcSSHkA+gHrXKw+LtWLrMb+RgTyhbr+HSug07xNqV9fQtZrNI -AyiaE/NuBPOOOP0rvRNE880KOC8TbXXGCv1FPqjrF4LDR7u5L7SkTFT/ALWOP1xXgTuXfc7E -sx6nua6rwp4IvvEM8chCxWxOdzn7wz6V9AaB4S07w9p5itow0rDLSY5Pt9K43xO66P4xs71m -2QXiGCbA4yOVJ9+1aYORkdK434lyNH4ahCnG66VT9Nj15JFbPdX0MS43M4VQf5/yr2vSpLnw -5ZW8dlCZ8KFXjOPX0/mK6rSPEGt3Angu44fNEReHYNvIH3TzXDeKNO8RX+kSX2ouZkicTIOc -L+g7E810ulFjpVtv3bwgB3HJyK5L4quY/C9sVxk3ij/xx6850u7t1mtp/wDlpEw3An3Jr3Dw 
-34gsbWza4nBlhC5LDsaW6+IFgupQyCF3iHH7gA7c9R9ay7zx6t7aX9jHC4smhfBkGCvHGfrm -tLQ7hbnRrV1GPkAP1x1/Hr+Ncr8Vzjwrbf8AX6v/AKA9eQRyYlQk8Yx9K6XTNbkgia2ciSIn -7p5Ga9Atte0LTLKO6it4i7dVRFJDcZ4PvXN+JvEMF9bILVGXJLSZ4zkjivRPDaeX4b08HOTC -pOffmua+KkbS+GLVUGT9tT/0B68eeIpIFYjB70+OOVXyoOM9+M1eaWeCLzHPyHGO/NVWvJJm -jQ8KGH1NfQWhXSXmh2c8eArRLwO3HSv/2Q== +@code{yenc} is used for encoding binaries on Usenet. The following +function is supplied by this package: ---=-=-= -Content-Type: multipart/alternative; boundary="==-=-=" +@table @code +@item yenc-decode-region +@findex yenc-decode-region +Decode the encoded text in the region. +@end table ---==-=-= +@node rfc1843 +@section rfc1843 +@cindex rfc1843 +@cindex HZ +@cindex Chinese -This is a plain text part. +RFC1843 deals with mixing Chinese and @acronym{ASCII} characters in messages. In +essence, RFC1843 switches between @acronym{ASCII} and Chinese by doing this: ---==-=-= -Content-Type: text/enriched; - name="enriched.txt" +@example +This sentence is in @acronym{ASCII}. +The next sentence is in GB.~@{<:Ky2;S@{#,NpJ)l6HK!#~@}Bye. +@end example +Simple enough, and widely used in China. -
This is a centered enriched part
+The following functions are available to handle this encoding: ---==-=-=-- +@table @code +@item rfc1843-decode-region +Decode HZ-encoded text in the region. ---=-=-= +@item rfc1843-decode-string +Decode a HZ-encoded string and return the result. -This is a new plain text part. +@end table ---=-=-= -Content-Disposition: attachment +@node mailcap +@section mailcap -This plain text part is an attachment. +The @file{~/.mailcap} file is parsed by most @acronym{MIME}-aware message +handlers and describes how elements are supposed to be displayed. +Here's an example file: ---=-=-=-- +@example +image/*; gimp -8 %s +audio/wav; wavplayer %s +application/msword; catdoc %s ; copiousoutput ; nametemplate=%s.doc @end example -@node Charset Translation -@section Charset Translation -@cindex charsets +This says that all image files should be displayed with @code{gimp}, +that WAVE audio files should be played by @code{wavplayer}, and that +MS-WORD files should be inlined by @code{catdoc}. -During translation from MML to @sc{mime}, for each @sc{mime} part which -has been composed inside Emacs, an appropriate charset has to be chosen. +The @code{mailcap} library parses this file, and provides functions for +matching types. -@vindex mail-parse-charset -If you are running a non-@sc{mule} Emacs, this process is simple: If the -part contains any non-ASCII (8-bit) characters, the @sc{mime} charset -given by @code{mail-parse-charset} (a symbol) is used. (Never set this -variable directly, though. If you want to change the default charset, -please consult the documentation of the package which you use to process -@sc{mime} messages. -@xref{Various Message Variables, , Various Message Variables, message, - Message Manual}, for example.) -If there are only ASCII characters, the @sc{mime} charset US-ASCII is -used, of course. +@table @code +@item mailcap-mime-data +@vindex mailcap-mime-data +This variable is an alist of alists containing backup viewing rules. 
-@cindex MULE -@cindex UTF-8 -@cindex Unicode -@vindex mm-mime-mule-charset-alist -Things are slightly more complicated when running Emacs with @sc{mule} -support. In this case, a list of the @sc{mule} charsets used in the -part is obtained, and the @sc{mule} charsets are translated to @sc{mime} -charsets by consulting the variable @code{mm-mime-mule-charset-alist}. -If this results in a single @sc{mime} charset, this is used to encode -the part. But if the resulting list of @sc{mime} charsets contains more -than one element, two things can happen: If it is possible to encode the -part via UTF-8, this charset is used. (For this, Emacs must support -the @code{utf-8} coding system, and the part must consist entirely of -characters which have Unicode counterparts.) If UTF-8 is not available -for some reason, the part is split into several ones, so that each one -can be encoded with a single @sc{mime} charset. The part can only be -split at line boundaries, though---if more than one @sc{mime} charset is -required to encode a single line, it is not possible to encode the part. +@end table -@node Conversion -@section Conversion +Interface functions: -@findex mime-to-mml -A (multipart) @sc{mime} message can be converted to MML with the -@code{mime-to-mml} function. It works on the message in the current -buffer, and substitutes MML markup for @sc{mime} boundaries. -Non-textual parts do not have their contents in the buffer, but instead -have the contents in separate buffers that are referred to from the MML -tags. +@table @code +@item mailcap-parse-mailcaps +@findex mailcap-parse-mailcaps +Parse the @file{~/.mailcap} file. -@findex mml-to-mime -An MML message can be converted back to @sc{mime} by the -@code{mml-to-mime} function. +@item mailcap-mime-info +Takes a @acronym{MIME} type as its argument and returns the matching viewer. 
+ +@end table -These functions are in certain senses ``lossy''---you will not get back -an identical message if you run @sc{mime-to-mml} and then -@sc{mml-to-mime}. Not only will trivial things like the order of the -headers differ, but the contents of the headers may also be different. -For instance, the original message may use base64 encoding on text, -while @sc{mml-to-mime} may decide to use quoted-printable encoding, and -so on. -In essence, however, these two functions should be the inverse of each -other. The resulting contents of the message should remain equivalent, -if not identical. @node Standards @chapter Standards -The Emacs @sc{mime} library implements handling of various elements +The Emacs @acronym{MIME} library implements handling of various elements according to a (somewhat) large number of RFCs, drafts and standards documents. This chapter lists the relevant ones. They can all be -fetched from @samp{http://quimby.gnus.org/notes/}. +fetched from @uref{http://quimby.gnus.org/notes/}. 
@table @dfn @item RFC822 @@ -1274,7 +1726,7 @@ Format of Internet Message Bodies Media Types @item RFC2047 -Message Header Extensions for Non-ASCII Text +Message Header Extensions for Non-@acronym{ASCII} Text @item RFC2048 Registration Procedures @@ -1283,18 +1735,18 @@ Registration Procedures Conformance Criteria and Examples @item RFC2231 -MIME Parameter Value and Encoded Word Extensions: Character Sets, +@acronym{MIME} Parameter Value and Encoded Word Extensions: Character Sets, Languages, and Continuations @item RFC1843 HZ - A Data Format for Exchanging Files of Arbitrarily Mixed Chinese and -ASCII characters +@acronym{ASCII} characters @item draft-ietf-drums-msg-fmt-05.txt Draft for the successor of RFC822 @item RFC2112 -The MIME Multipart/Related Content-type +The @acronym{MIME} Multipart/Related Content-type @item RFC1892 The Multipart/Report Content Type for the Reporting of Mail System @@ -1304,6 +1756,9 @@ Administrative Messages Communicating Presentation Information in Internet Messages: The Content-Disposition Header Field +@item RFC2646 +Documentation of the text/plain format parameter for flowed text. + @end table @@ -1315,4 +1770,8 @@ Content-Disposition Header Field @contents @bye + +@c Local Variables: +@c mode: texinfo +@c coding: iso-8859-1 @c End: