+@node Filtering Spam Using spam.el
+@subsection Filtering Spam Using spam.el
+@cindex spam filtering
+@cindex spam.el
+
+The idea behind @code{spam.el} is to have a control center for spam detection
+and filtering in Gnus. To that end, @code{spam.el} does two things: it
+filters incoming mail, and it analyzes mail known to be spam or ham.
+@emph{Ham} is the name used throughout @code{spam.el} to indicate
+non-spam messages.
+
+So, what happens when you load @code{spam.el}? First of all, you get
+the following keyboard commands:
+
+@table @kbd
+
+@item M-d
+@itemx M s x
+@itemx S x
+@kindex M-d
+@kindex S x
+@kindex M s x
+@findex gnus-summary-mark-as-spam
+@code{gnus-summary-mark-as-spam}.
+
+Mark current article as spam, showing it with the @samp{H} mark.
+Whenever you see a spam article, make sure to mark its summary line
+with @kbd{M-d} before leaving the group. This is done automatically
+for unread articles in @emph{spam} groups.
+
+@item M s t
+@itemx S t
+@kindex M s t
+@kindex S t
+@findex spam-bogofilter-score
+@code{spam-bogofilter-score}.
+
+You must have bogofilter processing enabled for that command to work
+properly.
+
+@xref{Bogofilter}.
+
+@end table
+
+Also, when you load @code{spam.el}, you will be able to customize its
+variables. Try @code{customize-group} on the @samp{spam} variable
+group.
+
+The concepts of ham processors and spam processors are very important.
+Ham processors and spam processors for a group can be set with the
+@code{spam-process} group parameter, or the
+@code{gnus-spam-process-newsgroups} variable. Ham processors take
+mail known to be non-spam (@emph{ham}) and process it in some way so
+that later similar mail will also be considered non-spam. Spam
+processors take mail known to be spam and process it so similar spam
+will be detected later.
+
+Gnus learns from the spam you get. You have to collect your spam in
+one or more spam groups, and set or customize the variable
+@code{spam-junk-mailgroups} as appropriate. You can also declare
+groups to contain spam by setting their group parameter
+@code{spam-contents} to @code{gnus-group-spam-classification-spam}, or
+by customizing the corresponding variable
+@code{gnus-spam-newsgroup-contents}. The @code{spam-contents} group
+parameter and the @code{gnus-spam-newsgroup-contents} variable can
+also be used to declare groups as @emph{ham} groups if you set their
+classification to @code{gnus-group-spam-classification-ham}. If
+groups are not classified by means of @code{spam-junk-mailgroups},
+@code{spam-contents}, or @code{gnus-spam-newsgroup-contents}, they are
+considered @emph{unclassified}. All groups are unclassified by
+default.
+
+In spam groups, all messages are considered to be spam by default:
+they get the @samp{H} mark when you enter the group. You must review
+these messages from time to time and remove the @samp{H} mark for
+every message that is not spam after all. To remove the @samp{H}
+mark, you can use @kbd{M-u} to "unread" the article, or @kbd{d} for
+declaring it read the non-spam way. When you leave a group, all
+spam-marked (@samp{H}) articles are sent to a spam processor which
+will study them as spam samples.
+
+Messages may also be deleted in various other ways, and unless
+@code{spam-ham-marks} gets overridden below, marks @samp{R} and
+@samp{r} for default read or explicit delete, marks @samp{X} and
+@samp{K} for automatic or explicit kills, as well as mark @samp{Y} for
+low scores, are all considered to be associated with articles which
+are not spam. This assumption might be false, in particular if you
+use kill files or score files as means for detecting genuine spam, you
+should then adjust the @code{spam-ham-marks} variable.
+
+@defvar spam-ham-marks
+You can customize this variable to be the list of marks you want to
+consider ham. By default, the list contains the deleted, read,
+killed, kill-filed, and low-score marks.
+@end defvar
+
+@defvar spam-spam-marks
+You can customize this variable to be the list of marks you want to
+consider spam. By default, the list contains only the spam mark.
+@end defvar
+
+When you leave @emph{any} group, regardless of its
+@code{spam-contents} classification, all spam-marked articles are sent
+to a spam processor, which will study these as spam samples. If you
+explicit kill a lot, you might sometimes end up with articles marked
+@samp{K} which you never saw, and which might accidentally contain
+spam. Best is to make sure that real spam is marked with @samp{H},
+and nothing else.
+
+When you leave a @emph{spam} group, all spam-marked articles are
+marked as expired after processing with the spam processor. This is
+not done for @emph{unclassified} or @emph{ham} groups. Also, any
+@strong{ham} articles in a spam group will be moved to a location
+determined by either the @code{ham-process-destination} group
+parameter or the @code{gnus-ham-process-destinations} variable. The
+location is a group name. If the @code{ham-process-destination}
+parameter is not set, spam articles are only expired.
+
+When you leave a @emph{ham} group, all ham-marked articles are sent to
+a ham processor, which will study these as non-spam samples.
+
+When you leave a @emph{ham} or @emph{unclassified} group, all
+@strong{spam} articles are moved to a location determined by either
+the @code{spam-process-destination} group parameter or the
+@code{gnus-spam-process-destinations} variable. The location is a
+group name. If the @code{spam-process-destination} parameter is not
+set, the spam articles are only expired.
+
+To use the @code{spam.el} facilities for incoming mail filtering, you
+must add the following to your fancy split list
+@code{nnmail-split-fancy} or @code{nnimap-split-fancy}:
+
+@example
+(: spam-split)
+@end example
+
+Note that the fancy split may be called @code{nnmail-split-fancy} or
+@code{nnimap-split-fancy}, depending on whether you use the nnmail or
+nnimap back ends to retrieve your mail.
+
+The @code{spam-split} function will process incoming mail and send the
+mail considered to be spam into the group name given by the variable
+@code{spam-split-group}. By default that group name is @samp{spam},
+but you can customize it.
+
+@emph{TODO: Currently, spam.el only supports insertion of articles
+into a backend. There is no way to tell spam.el that an article is no
+longer spam or ham.}
+
+@emph{TODO: spam.el needs to provide a uniform way of training all the
+statistical databases. Some have that functionality built-in, others
+don't.}
+
+The following are the methods you can use to control the behavior of
+@code{spam-split} and their corresponding spam and ham processors:
+
+@menu
+* Blacklists and Whitelists::
+* BBDB Whitelists::
+* Blackholes::
+* Bogofilter::
+* ifile spam filtering::
+* spam-stat spam filtering::
+* Extending spam.el::
+@end menu
+
+@node Blacklists and Whitelists
+@subsubsection Blacklists and Whitelists
+@cindex spam filtering
+@cindex whitelists, spam filtering
+@cindex blacklists, spam filtering
+@cindex spam.el
+
+@defvar spam-use-blacklist
+Set this variable to t if you want to use blacklists when splitting
+incoming mail. Messages whose senders are in the blacklist will be
+sent to the @code{spam-split-group}. This is an explicit filter,
+meaning that it acts only on mail senders @emph{declared} to be
+spammers.
+@end defvar
+
+@defvar spam-use-whitelist
+Set this variable to t if you want to use whitelists when splitting
+incoming mail. Messages whose senders are not in the whitelist will
+be sent to the @code{spam-split-group}. This is an implicit filter,
+meaning it believes everyone to be a spammer unless told otherwise.
+Use with care.
+@end defvar
+
+@defvar gnus-group-spam-exit-processor-blacklist
+Add this symbol to a group's @code{spam-process} parameter by
+customizing the group parameters or the
+@code{gnus-spam-process-newsgroups} variable. When this symbol is
+added to a group's @code{spam-process} parameter, the senders of
+spam-marked articles will be added to the blacklist.
+@end defvar
+
+@defvar gnus-group-ham-exit-processor-whitelist
+Add this symbol to a group's @code{spam-process} parameter by
+customizing the group parameters or the
+@code{gnus-spam-process-newsgroups} variable. When this symbol is
+added to a group's @code{spam-process} parameter, the senders of
+ham-marked articles in @emph{ham} groups will be added to the
+whitelist. Note that this ham processor has no effect in @emph{spam}
+or @emph{unclassified} groups.
+@end defvar
+
+Blacklists are lists of regular expressions matching addresses you
+consider to be spam senders. For instance, to block mail from any
+sender at @samp{vmadmin.com}, you can put @samp{vmadmin.com} in your
+blacklist. You start out with an empty blacklist. Blacklist entries
+use the Emacs regular expression syntax.
+
+Conversely, whitelists tell Gnus what addresses are considered
+legitimate. All non-whitelisted addresses are considered spammers.
+This option is probably not useful for most Gnus users unless the
+whitelists is very comprehensive or permissive. Also see @ref{BBDB
+Whitelists}. Whitelist entries use the Emacs regular expression
+syntax.
+
+The blacklist and whitelist file locations can be customized with the
+@code{spam-directory} variable (@file{~/News/spam} by default), or
+the @code{spam-whitelist} and @code{spam-blacklist} variables
+directly. The whitelist and blacklist files will by default be in the
+@code{spam-directory} directory, named @file{whitelist} and
+@file{blacklist} respectively.
+
+@node BBDB Whitelists
+@subsubsection BBDB Whitelists
+@cindex spam filtering
+@cindex BBDB whitelists, spam filtering
+@cindex BBDB, spam filtering
+@cindex spam.el
+
+@defvar spam-use-BBDB
+
+Analogous to @code{spam-use-whitelist} (@pxref{Blacklists and
+Whitelists}), but uses the BBDB as the source of whitelisted addresses,
+without regular expressions. You must have the BBDB loaded for
+@code{spam-use-BBDB} to work properly. Only addresses in the BBDB
+will be allowed through; all others will be classified as spam.
+
+@end defvar
+
+@defvar gnus-group-ham-exit-processor-BBDB
+Add this symbol to a group's @code{spam-process} parameter by
+customizing the group parameters or the
+@code{gnus-spam-process-newsgroups} variable. When this symbol is
+added to a group's @code{spam-process} parameter, the senders of
+ham-marked articles in @emph{ham} groups will be added to the
+BBDB. Note that this ham processor has no effect in @emph{spam}
+or @emph{unclassified} groups.
+@end defvar
+
+@node Blackholes
+@subsubsection Blackholes
+@cindex spam filtering
+@cindex blackholes, spam filtering
+@cindex spam.el
+
+@defvar spam-use-blackholes
+
+This option is disabled by default. You can let Gnus consult the
+blackhole-type distributed spam processing systems (DCC, for instance)
+when you set this option. The variable @code{spam-blackhole-servers}
+holds the list of blackhole servers Gnus will consult. The current
+list is fairly comprehensive, but make sure to let us know if it
+contains outdated servers.
+
+The blackhole check uses the @code{dig.el} package, but you can tell
+@code{spam.el} to use @code{dns.el} instead for better performance if
+you set @code{spam-use-dig} to nil. It is not recommended at this
+time to set @code{spam-use-dig} to nil despite the possible
+performance improvements, because some users may be unable to use it,
+but you can try it and see if it works for you.
+
+@end defvar
+
+@defvar spam-blackhole-servers
+
+The list of servers to consult for blackhole checks.
+
+@end defvar
+
+@defvar spam-use-dig
+
+Use the @code{dig.el} package instead of the @code{dns.el} package.
+The default setting of t is recommended.
+
+@end defvar
+
+Blackhole checks are done only on incoming mail. There is no spam or
+ham processor for blackholes.
+
+@node Bogofilter
+@subsubsection Bogofilter
+@cindex spam filtering
+@cindex bogofilter, spam filtering
+@cindex spam.el
+
+@defvar spam-use-bogofilter
+
+Set this variable if you want @code{spam-split} to use Eric Raymond's
+speedy Bogofilter. This has been tested with a locally patched copy
+of version 0.4. Make sure to read the installation comments in
+@code{spam.el}.
+
+With a minimum of care for associating the @samp{H} mark for spam
+articles only, Bogofilter training all gets fairly automatic. You
+should do this until you get a few hundreds of articles in each
+category, spam or not. The shell command @command{head -1
+~/.bogofilter/*} shows both article counts. The command @kbd{S t} in
+summary mode, either for debugging or for curiosity, triggers
+Bogofilter into displaying in another buffer the @emph{spamicity}
+score of the current article (between 0.0 and 1.0), together with the
+article words which most significantly contribute to the score.
+
+If the @code{bogofilter} executable is not in your path, Bogofilter
+processing will be turned off.
+
+@end defvar
+
+
+@defvar gnus-group-spam-exit-processor-bogofilter
+Add this symbol to a group's @code{spam-process} parameter by
+customizing the group parameters or the
+@code{gnus-spam-process-newsgroups} variable. When this symbol is
+added to a group's @code{spam-process} parameter, spam-marked articles
+will be added to the bogofilter spam database, and ham-marked articles
+will be added to the bogofilter ham database. @strong{Note that the
+Bogofilter spam processor is the only spam processor to also do ham
+processing.}
+@end defvar
+
+@node ifile spam filtering
+@subsubsection ifile spam filtering
+@cindex spam filtering
+@cindex ifile, spam filtering
+@cindex spam.el
+
+@defvar spam-use-ifile
+
+Enable this variable if you want @code{spam-split} to use ifile, a
+statistical analyzer similar to Bogofilter.
+
+@end defvar
+
+@defvar spam-ifile-all-categories
+
+Enable this variable if you want @code{spam-use-ifile} to give you all
+the ifile categories, not just spam/non-spam. If you use this, make
+sure you train ifile as described in its documentation.
+
+@end defvar
+
+@defvar spam-ifile-spam-category
+
+This is the category of spam messages as far as ifile is concerned.
+The actual string used is irrelevant, but you probably want to leave
+the default value of @samp{spam}.
+@end defvar
+
+@defvar spam-ifile-database-path
+
+This is the filename for the ifile database. It is not specified by
+default, so ifile will use its own default database name.
+
+@end defvar
+
+The ifile mail classifier is similar to Bogofilter in intent and
+purpose. A ham and a spam processor are provided, plus the
+@code{spam-use-ifile} variable to indicate to spam-split that ifile
+should be used. The 1.2.1 version of ifile was used to test this
+functionality.
+
+@node spam-stat spam filtering
+@subsubsection spam-stat spam filtering
+@cindex spam filtering
+@cindex spam-stat, spam filtering
+@cindex spam-stat.el
+@cindex spam.el
+
+@xref{Filtering Spam Using Statistics (spam-stat.el)}.
+
+@defvar spam-use-stat
+
+Enable this variable if you want @code{spam-split} to use
+spam-stat.el, an Emacs Lisp statistical analyzer.
+
+@end defvar
+
+@defvar gnus-group-spam-exit-processor-stat
+Add this symbol to a group's @code{spam-process} parameter by
+customizing the group parameters or the
+@code{gnus-spam-process-newsgroups} variable. When this symbol is
+added to a group's @code{spam-process} parameter, the spam-marked
+articles will be added to the spam-stat database of spam messages.
+@end defvar
+
+@defvar gnus-group-ham-exit-processor-stat
+Add this symbol to a group's @code{spam-process} parameter by
+customizing the group parameters or the
+@code{gnus-spam-process-newsgroups} variable. When this symbol is
+added to a group's @code{spam-process} parameter, the ham-marked
+articles in @emph{ham} groups will be added to the spam-stat database
+of non-spam messages. Note that this ham processor has no effect in
+@emph{spam} or @emph{unclassified} groups.
+@end defvar
+
+This enables spam.el to cooperate with spam-stat.el. spam-stat.el
+provides an internal (Lisp-only) spam database, which unlike ifile or
+Bogofilter does not require external programs. A spam and a ham
+processor, and the @code{spam-use-stat} variable for @code{spam-split}
+are provided.
+
+@node Extending spam.el
+@subsubsection Extending spam.el
+@cindex spam filtering
+@cindex spam.el, extending
+@cindex extending spam.el
+
+Say you want to add a new back end called blackbox. For filtering
+incoming mail, provide the following:
+
+@enumerate
+
+@item
+code
+
+@example
+(defvar spam-use-blackbox nil
+ "True if blackbox should be used.")
+@end example
+
+Add
+@example
+ (spam-use-blackbox . spam-check-blackbox)
+@end example
+to @code{spam-list-of-checks}.
+
+@item
+functionality
+
+Write the @code{spam-check-blackbox} function. It should return
+@samp{nil} or @code{spam-split-group}. See the existing
+@code{spam-check-*} functions for examples of what you can do.
+@end enumerate
+
+For processing spam and ham messages, provide the following:
+
+@enumerate
+
+@item
+code
+
+Note you don't have to provide a spam or a ham processor. Only
+provide them if Blackbox supports spam or ham processing.
+
+@example
+(defvar gnus-group-spam-exit-processor-blackbox "blackbox"
+ "The Blackbox summary exit spam processor.
+Only applicable to spam groups.")
+
+(defvar gnus-group-ham-exit-processor-blackbox "blackbox"
+ "The whitelist summary exit ham processor.
+Only applicable to non-spam (unclassified and ham) groups.")
+
+@end example
+
+@item
+functionality
+
+@example
+(defun spam-blackbox-register-spam-routine ()
+ (spam-generic-register-routine
+ ;; the spam function
+ (lambda (article)
+ (let ((from (spam-fetch-field-from-fast article)))
+ (when (stringp from)
+ (blackbox-do-something-with-this-spammer from))))
+ ;; the ham function
+ nil))
+
+(defun spam-blackbox-register-ham-routine ()
+ (spam-generic-register-routine
+ ;; the spam function
+ nil
+ ;; the ham function
+ (lambda (article)
+ (let ((from (spam-fetch-field-from-fast article)))
+ (when (stringp from)
+ (blackbox-do-something-with-this-ham-sender from))))))
+@end example
+
+Write the @code{blackbox-do-something-with-this-ham-sender} and
+@code{blackbox-do-something-with-this-spammer} functions. You can add
+more complex code than fetching the message sender, but keep in mind
+that retrieving the whole message takes significantly longer than the
+sender through @code{spam-fetch-field-from-fast}, because the message
+senders are kept in memory by Gnus.
+
+@end enumerate
+
+
+@node Filtering Spam Using Statistics (spam-stat.el)
+@subsection Filtering Spam Using Statistics (spam-stat.el)
+@cindex Paul Graham
+@cindex Graham, Paul
+@cindex naive Bayesian spam filtering
+@cindex Bayesian spam filtering, naive
+@cindex spam filtering, naive Bayesian
+
+Paul Graham has written an excellent essay about spam filtering using
+statistics: @uref{http://www.paulgraham.com/spam.html,A Plan for
+Spam}. In it he describes the inherent deficiency of rule-based
+filtering as used by SpamAssassin, for example: Somebody has to write
+the rules, and everybody else has to install these rules. You are
+always late. It would be much better, he argues, to filter mail based
+on whether it somehow resembles spam or non-spam. One way to measure
+this is word distribution. He then goes on to describe a solution
+that checks whether a new mail resembles any of your other spam mails
+or not.
+
+The basic idea is this: Create a two collections of your mail, one
+with spam, one with non-spam. Count how often each word appears in
+either collection, weight this by the total number of mails in the
+collections, and store this information in a dictionary. For every
+word in a new mail, determine its probability to belong to a spam or a
+non-spam mail. Use the 15 most conspicuous words, compute the total
+probability of the mail being spam. If this probability is higher
+than a certain threshold, the mail is considered to be spam.
+
+Gnus supports this kind of filtering. But it needs some setting up.
+First, you need two collections of your mail, one with spam, one with
+non-spam. Then you need to create a dictionary using these two
+collections, and save it. And last but not least, you need to use
+this dictionary in your fancy mail splitting rules.
+
+@menu
+* Creating a spam-stat dictionary::
+* Splitting mail using spam-stat::
+* Low-level interface to the spam-stat dictionary::
+@end menu
+
+@node Creating a spam-stat dictionary
+@subsubsection Creating a spam-stat dictionary
+
+Before you can begin to filter spam based on statistics, you must
+create these statistics based on two mail collections, one with spam,
+one with non-spam. These statistics are then stored in a dictionary
+for later use. In order for these statistics to be meaningful, you
+need several hundred emails in both collections.
+
+Gnus currently supports only the nnml back end for automated dictionary
+creation. The nnml back end stores all mails in a directory, one file
+per mail. Use the following:
+
+@defun spam-stat-process-spam-directory
+Create spam statistics for every file in this directory. Every file
+is treated as one spam mail.
+@end defun
+
+@defun spam-stat-process-non-spam-directory
+Create non-spam statistics for every file in this directory. Every
+file is treated as one non-spam mail.
+@end defun
+
+Usually you would call @code{spam-stat-process-spam-directory} on a
+directory such as @file{~/Mail/mail/spam} (this usually corresponds
+the the group @samp{nnml:mail.spam}), and you would call
+@code{spam-stat-process-non-spam-directory} on a directory such as
+@file{~/Mail/mail/misc} (this usually corresponds the the group
+@samp{nnml:mail.misc}).
+
+When you are using IMAP, you won't have the mails available locally,
+so that will not work. One solution is to use the Gnus Agent to cache
+the articles. Then you can use directories such as
+@file{"~/News/agent/nnimap/mail.yourisp.com/personal_spam"} for
+@code{spam-stat-process-spam-directory}. @xref{Agent as Cache}.
+
+@defvar spam-stat
+This variable holds the hash-table with all the statistics -- the
+dictionary we have been talking about. For every word in either
+collection, this hash-table stores a vector describing how often the
+word appeared in spam and often it appeared in non-spam mails.
+@end defvar
+
+If you want to regenerate the statistics from scratch, you need to
+reset the dictionary.
+
+@defun spam-stat-reset
+Reset the @code{spam-stat} hash-table, deleting all the statistics.
+@end defun
+
+When you are done, you must save the dictionary. The dictionary may
+be rather large. If you will not update the dictionary incrementally
+(instead, you will recreate it once a month, for example), then you
+can reduce the size of the dictionary by deleting all words that did
+not appear often enough or that do not clearly belong to only spam or
+only non-spam mails.
+
+@defun spam-stat-reduce-size
+Reduce the size of the dictionary. Use this only if you do not want
+to update the dictionary incrementally.
+@end defun
+
+@defun spam-stat-save
+Save the dictionary.
+@end defun
+
+@defvar spam-stat-file
+The filename used to store the dictionary. This defaults to
+@file{~/.spam-stat.el}.
+@end defvar
+
+@node Splitting mail using spam-stat
+@subsubsection Splitting mail using spam-stat
+
+In order to use @code{spam-stat} to split your mail, you need to add the
+following to your @file{~/.gnus} file:
+
+@example
+(require 'spam-stat)
+(spam-stat-load)
+@end example
+
+This will load the necessary Gnus code, and the dictionary you
+created.
+
+Next, you need to adapt your fancy splitting rules: You need to
+determine how to use @code{spam-stat}. The following examples are for
+the nnml back end. Using the nnimap back end works just as well. Just
+use @code{nnimap-split-fancy} instead of @code{nnmail-split-fancy}.
+
+In the simplest case, you only have two groups, @samp{mail.misc} and
+@samp{mail.spam}. The following expression says that mail is either
+spam or it should go into @samp{mail.misc}. If it is spam, then
+@code{spam-stat-split-fancy} will return @samp{mail.spam}.
+
+@example
+(setq nnmail-split-fancy
+ `(| (: spam-stat-split-fancy)
+ "mail.misc"))
+@end example
+
+@defvar spam-stat-split-fancy-spam-group
+The group to use for spam. Default is @samp{mail.spam}.
+@end defvar
+
+If you also filter mail with specific subjects into other groups, use
+the following expression. Only mails not matching the regular
+expression are considered potential spam.
+
+@example
+(setq nnmail-split-fancy
+ `(| ("Subject" "\\bspam-stat\\b" "mail.emacs")
+ (: spam-stat-split-fancy)
+ "mail.misc"))
+@end example
+
+If you want to filter for spam first, then you must be careful when
+creating the dictionary. Note that @code{spam-stat-split-fancy} must
+consider both mails in @samp{mail.emacs} and in @samp{mail.misc} as
+non-spam, therefore both should be in your collection of non-spam
+mails, when creating the dictionary!
+
+@example
+(setq nnmail-split-fancy
+ `(| (: spam-stat-split-fancy)
+ ("Subject" "\\bspam-stat\\b" "mail.emacs")
+ "mail.misc"))
+@end example
+
+You can combine this with traditional filtering. Here, we move all
+HTML-only mails into the @samp{mail.spam.filtered} group. Note that since
+@code{spam-stat-split-fancy} will never see them, the mails in
+@samp{mail.spam.filtered} should be neither in your collection of spam mails,
+nor in your collection of non-spam mails, when creating the
+dictionary!
+
+@example
+(setq nnmail-split-fancy
+ `(| ("Content-Type" "text/html" "mail.spam.filtered")
+ (: spam-stat-split-fancy)
+ ("Subject" "\\bspam-stat\\b" "mail.emacs")
+ "mail.misc"))
+@end example
+
+
+@node Low-level interface to the spam-stat dictionary
+@subsubsection Low-level interface to the spam-stat dictionary
+
+The main interface to using @code{spam-stat}, are the following functions:
+
+@defun spam-stat-buffer-is-spam
+Called in a buffer, that buffer is considered to be a new spam mail.
+Use this for new mail that has not been processed before.
+@end defun
+
+@defun spam-stat-buffer-is-no-spam
+Called in a buffer, that buffer is considered to be a new non-spam
+mail. Use this for new mail that has not been processed before.
+@end defun
+
+@defun spam-stat-buffer-change-to-spam
+Called in a buffer, that buffer is no longer considered to be normal
+mail but spam. Use this to change the status of a mail that has
+already been processed as non-spam.
+@end defun
+
+@defun spam-stat-buffer-change-to-non-spam
+Called in a buffer, that buffer is no longer considered to be spam but
+normal mail. Use this to change the status of a mail that has already
+been processed as spam.
+@end defun
+
+@defun spam-stat-save
+Save the hash table to the file. The filename used is stored in the
+variable @code{spam-stat-file}.
+@end defun
+
+@defun spam-stat-load
+Load the hash table from a file. The filename used is stored in the
+variable @code{spam-stat-file}.
+@end defun
+
+@defun spam-stat-score-word
+Return the spam score for a word.
+@end defun
+
+@defun spam-stat-score-buffer
+Return the spam score for a buffer.
+@end defun
+
+@defun spam-stat-split-fancy
+Use this function for fancy mail splitting. Add the rule @samp{(:
+spam-stat-split-fancy)} to @code{nnmail-split-fancy}
+@end defun
+
+Make sure you load the dictionary before using it. This requires the
+following in your @file{~/.gnus} file:
+
+@example
+(require 'spam-stat)
+(spam-stat-load)
+@end example
+
+Typical test will involve calls to the following functions:
+
+@example
+Reset: (setq spam-stat (make-hash-table :test 'equal))
+Learn spam: (spam-stat-process-spam-directory "~/Mail/mail/spam")
+Learn non-spam: (spam-stat-process-non-spam-directory "~/Mail/mail/misc")
+Save table: (spam-stat-save)
+File size: (nth 7 (file-attributes spam-stat-file))
+Number of words: (hash-table-count spam-stat)
+Test spam: (spam-stat-test-directory "~/Mail/mail/spam")
+Test non-spam: (spam-stat-test-directory "~/Mail/mail/misc")
+Reduce table size: (spam-stat-reduce-size)
+Save table: (spam-stat-save)
+File size: (nth 7 (file-attributes spam-stat-file))
+Number of words: (hash-table-count spam-stat)
+Test spam: (spam-stat-test-directory "~/Mail/mail/spam")
+Test non-spam: (spam-stat-test-directory "~/Mail/mail/misc")
+@end example
+
+Here is how you would create your dictionary:
+
+@example
+Reset: (setq spam-stat (make-hash-table :test 'equal))
+Learn spam: (spam-stat-process-spam-directory "~/Mail/mail/spam")
+Learn non-spam: (spam-stat-process-non-spam-directory "~/Mail/mail/misc")
+Repeat for any other non-spam group you need...
+Reduce table size: (spam-stat-reduce-size)
+Save table: (spam-stat-save)
+@end example
+