-;;;; Bogofilter
-
-;;; See Paul Graham's article at `http://www.paulgraham.com/spam.html'.
-
-;;; This page is for those wanting to control spam with the help of
-;;; Eric Raymond's speedy Bogofilter, see
-;;; http://www.tuxedo.org/~esr/bogofilter. This has been tested with
-;;; a locally patched copy of version 0.4.
-
-;;; Make sure Bogofilter is installed. Bogofilter internally uses
-;;; Judy fast associative arrays, so you need to install Judy first,
-;;; and Bogofilter next. Fetch both distributions by visiting the
-;;; following links and downloading the latest version of each:
-;;;
-;;; http://sourceforge.net/projects/judy/
-;;; http://www.tuxedo.org/~esr/bogofilter/
-;;;
-;;; Unpack the Judy distribution and enter its main directory. Then do:
-;;;
-;;;     ./configure
-;;;     make
-;;;     make install
-;;;
-;;; You will likely need to become super-user for the last step.
-;;; Then, unpack the Bogofilter distribution and enter its main
-;;; directory:
-;;;
-;;;     make
-;;;     make install
-;;;
-;;; Here as well, you need to become super-user for the last step.
-;;; Now, initialize your word lists by doing, under your own identity:
-;;;
-;;;     mkdir ~/.bogofilter
-;;;     touch ~/.bogofilter/badlist
-;;;     touch ~/.bogofilter/goodlist
-;;;
-;;; These two files are text files you may edit, but you normally don't!
-
-;;; The `M-d' command gets added to Gnus summary mode; it marks the
-;;; current article as spam, showing it with the `H' mark. Whenever you see a
-;;; spam article, make sure to mark its summary line with `M-d' before
-;;; leaving the group. Some groups, as per variable
-;;; `spam-junk-mailgroups' below, receive articles from Gnus splitting
-;;; on clues added by spam recognisers, so for these groups, we tack
-;;; an `H' mark at group entry for all summary lines which would
-;;; otherwise have no other mark. Make sure to _remove_ `H' marks for
-;;; any article which is _not_ genuine spam, before leaving such
-;;; groups: you may use `M-u' to "unread" the article, or `d' for
-;;; declaring it read the non-spam way. When you leave a group, all
-;;; `H' marked articles, saved or unsaved, are sent to Bogofilter
-;;; which will study them as spam samples.
-
-;;; Messages may also be deleted in various other ways, and unless
-;;; `spam-ham-marks-form' gets overridden below, marks `R' and `r' for
-;;; default read or explicit delete, marks `X' and `K' for automatic
-;;; or explicit kills, as well as mark `Y' for low scores, are all
-;;; considered to be associated with articles which are not spam.
-;;; This assumption might be false, in particular if you use kill
-;;; files or score files as a means of detecting genuine spam; in
-;;; that case, adjust `spam-ham-marks-form'. When you leave a group,
-;;; all _unsaved_ articles bearing any of the above marks are sent to
-;;; Bogofilter, which will study them as not-spam samples. If you
-;;; explicitly kill a lot, you might sometimes end up with articles
-;;; marked `K' which you never saw, and which might accidentally
-;;; contain spam. It is best to make sure that real spam is marked
-;;; with `H', and nothing else.
-
-;;; All other marks do not contribute to Bogofilter pre-conditioning.
-;;; In particular, ticked, dormant or souped articles are likely to
-;;; contribute later, when they will get deleted for real, so there is
-;;; no need to use them prematurely. Explicitly expired articles do
-;;; not contribute: command `E' is a way to get rid of an article
-;;; without Bogofilter ever seeing it.
-
-;;; In a word, with a minimum of care for associating the `H' mark with
-;;; spam articles only, Bogofilter training becomes fairly automatic.
-;;; You should do this until you get a few hundred articles in
-;;; each category, spam or not. The shell command `head -1
-;;; ~/.bogofilter/*' shows both article counts. The command `S S' in
-;;; summary mode, either for debugging or for curiosity, triggers
-;;; Bogofilter into displaying in another buffer the "spamicity" score
-;;; of the current article (between 0.0 and 1.0), together with the
-;;; article words which most significantly contribute to the score.
-
-;;; The real way of using Bogofilter, however, is to have some
-;;; tool like `procmail' invoke it on message reception, then
-;;; add some recognisable header in case of detected spam. Gnus
-;;; splitting rules might later trip on these added headers and react
-;;; by sorting such articles into specific junk folders as per
-;;; `spam-junk-mailgroups'. Here are possible `.procmailrc' contents
-;;; (still untested -- please tell me how it goes):
-;;;
-;;; :0HBf:
-;;; * ? bogofilter
-;;; | formail -bfI "X-Spam-Status: Yes"
-
-(defun spam-check-bogofilter ()
-  ;; Dynamic spam check. I do not know how to check the exit status,
-  ;; so instead, read `bogofilter -v' output.
-  (when (and spam-use-bogofilter spam-bogofilter-path)
-    (spam-bogofilter-articles nil "-v" (list (gnus-summary-article-number)))
-    (when (save-excursion
-            (set-buffer spam-bogofilter-output-buffer-name)
-            (goto-char (point-min))
-            (re-search-forward "Spamicity: \\(0\\.9\\|1\\.0\\)" nil t))
-      spam-split-group)))
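-
-;; A minimal sketch, not part of the original code, of how the exit
-;; status mentioned in `spam-check-bogofilter' might be checked:
-;; `call-process-region' returns the exit status of the program it
-;; runs (or a string if the program was killed by a signal).  This
-;; assumes that `bogofilter' reads the message on standard input and
-;; exits with status 0 for spam, which is also what the `.procmailrc'
-;; recipe above relies on.  The function name
-;; `spam-bogofilter-exit-status-spam-p' is hypothetical.
-(defun spam-bogofilter-exit-status-spam-p (buffer)
-  "Return non-nil if bogofilter exits with status 0 on BUFFER's contents."
-  (with-current-buffer buffer
-    ;; DELETE, BUFFER and DISPLAY are all nil: keep the text, discard
-    ;; any output, and test only the exit status.
-    (eq 0 (call-process-region (point-min) (point-max)
-                               spam-bogofilter-path nil nil nil))))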