Copyright (C) 1991, 1992 Per Hammarlund (perham@nada.kth.se)

This is documentation for some emacs lisp code that looks for
translations of English and Japanese using the EDICTJ Public Domain
Japanese/English dictionary.

Written by Per Hammarlund <perham@nada.kth.se>.
Morphology and private dictionary handling/editing by Bob Kerns.
Helpful remarks from Ken-Ichi Handa <handa@etl.go.jp>.
The EDICTJ PD dictionary is maintained by Jim Breen
<jwb@monu6.cc.monash.edu.au>.

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 1, or (at your option)
any later version.

This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.

This software, called edict, helps nemacs/mule users to use the public
domain Japanese-English dictionary EDICT. (It is sometimes called
EDICTJ, but it is the same thing.)

Edict is a set of functions that help you to:

* Search for an English word. With one key command, ie "<ESC>*", the
English word under or in front of the point is used as the key in a
search of EDICT. The matching words are shown in a window that is not
selected; edict also tries to make the window fit snugly around the
matches.

* Search for a kanji/kana sequence. Here you mark a region, and then
with one key command, ie "<ESC>_", that region is used as the key in a
search of EDICTJ. The matches are presented as above. Edict will
attempt to transform the character sequence in the region to a "ground
form", ie verbs will be transformed to their plain present form, eg
"使って" to "使う", and postfixes such as "様" will be stripped from
the sequence. Then it will search for the transformed sequence in
the dictionary.

* Insert one of the matches of a search into the text in the buffer
from which the search was initiated. This is also a key command, ie
"<ESC>+". If you perform the command again, the next word in the list
of matches will be inserted instead. Edict keeps track of whether the
search was done for an English or a Japanese word, and inserts
accordingly.

* Update a private edict file. If you give a numeric argument, ie
C-u, to either of the two search commands above, edict will help you
insert this key in a private edict file. The updating is done in an
electric mode that tries to ensure that the syntax of the file is
correct. Right now the input methods EGG and SKK are supported in the
electric mode.

Edict is entirely written in emacs lisp. It has been tested and works
in Nemacs 3.3.2 and testing has begun in Mule 0.9.2, soon 0.9.3.

Short Getting Started Guide

The best way to get started using the software is to install it using
the install.edict script. If that fails, or if you are not too keen to
use installation scripts with unknown effects, you can install edict
yourself; this is the harder, and more "error prone", way of doing it.
This text is a more talkative version of the getting started guide in
the edict.el file.

** Installing with install.edict.

Make a new directory to keep the edict software. Move all the files
there. Cd there and run the installation script, eg:

        cd /usr/local/src/edict

** Installing edict yourself

(Indented text includes more information that you might find useful.)

1. Make sure that you have placed edict.el in a directory that is
included in nemacs's search path; look at the variable "load-path"
to make sure that the directory is in that list.

     One way to see the load-path is to type
     "<ESC><ESC>load-path<RETURN>"; this will print the value in
     the mini buffer. Another way is to type
     "load-path<LINEFEED>" in a buffer that is in lisp interaction
     mode, like *scratch* when you start nemacs.

2. As mentioned above you will, for convenience, want to define what
keys to use when activating the commands. To do that you will have to
add something like this to your .emacs (or .nemacs) file:

---------------- 8< ----------------
(autoload 'edict-search-english "edict" "Search for a translation of an English word" t)
(global-set-key "\e*" 'edict-search-english)
(autoload 'edict-search-kanji "edict" "Search for a translation of a Kanji sequence" t)
(global-set-key "\e_" 'edict-search-kanji)
(autoload 'edict-insert "edict" "Insert the last translation" t)
(global-set-key "\e+" 'edict-insert)
---------------- 8< ----------------

     The autoload function tells nemacs what program file to load
     when a certain function is referenced. This way the program
     file does not have to be loaded when nemacs is started;
     instead it is loaded when you first (if at all) use the
     commands. The final "t" marks the function as an interactive
     command, which it has to be if you want to bind it to a key.

     The global-set-key function maps a sequence of key strokes to a
     function. Global means that it will be valid for all modes.

Note that you can change the key bindings to whatever you like; these
are only "examples". In your personalized nemacs these three key
sequences may be taken or you may prefer something else.

3. You have to tell edict where it can find the edict dictionary.
Preferably, place the edict dictionary in the standard emacs lisp
library directory. If you don't have write access there, put it in
your ~/emacs directory and make sure that this directory is in the
load-path.

     You can add a local emacs directory to load-path by, for
     instance:

        (setq load-path (cons (concat (getenv "HOME") "/emacs")
                              load-path))

     Note that nemacs searches the load path in a "left to right"
     order; if you put a file in a directory that appears early in
     the load-path list, it will be loaded in preference to a file
     of the same name in a later directory.

The variable *edict-files* should be a list of filenames of edict
dictionary files that you want edict to load and search in. The real
dictionary EDICTJ should be one of these files. You may also have
some local file(s) there, like your friends' private edict files.
Something like this *may* be appropriate:

        (setq *edict-files* '("edictj"
                              "~my-friend-the-user/.edict"
                              "~my-other-friend-the-user/.edict"))

By default, each of these files is looked for along the load-path (the
same directories that are searched when you do
M-x load-library<return>edict<return>).

4. Set the name of your *own* local edictj file. (Note that this file
should not be included in the list above!) Edict will include the
additions that you make in this file. The variable *edict-private-file*
defaults to "~/.edict"; if you want something else, do a:

        (setq *edict-private-file* "~/somewhere/somethingelse")

        (setq *edict-private-file* "~/emacs/private-edict")

     In UNIX, filenames that begin with a "." are "invisible" if you
     do a plain "ls" command. If you want to see them you have to
     do a "ls -a", "-a" for "all".

Don't forget to submit your useful words to Jim Breen once in a while!
His address is <jwb@monu6.cc.monash.edu.au>.

You are done. Please report errors and comments to
<perham@nada.kth.se>.

Here we will try to give some examples of how it all works. These
examples assume that you are reasonably familiar with nemacs and/or
mule and that you are familiar with one input method, like EGG or SKK.

In these examples, I will use the default key mappings as described
above; I hope it is clear what I mean.

* Searching for an English word.

Just some idle suggestions: Either you are a Japanese speaker and you
want to find what an English word means, or you aren't and you want to
find out what the Japanese equivalent might be.

Note that you can use M-_ just as well for searching for English text,
if you want to search for a multi word string. M-* is just usually
more convenient for a single word.

When you issue the command, M-*, edict will try to find an English
word at or in front of the point, much like ispell does. So, for the
example below, edict will find the word "dictionary" if the point is
at any of the "^" positions.

     Why would I like to search that dictionary file?
                                     ^^^^^^^^^^

     If you are not looking at an English character, edict will scan
     backwards until it finds the first English character.

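The scanning rule just described can be sketched in emacs lisp roughly
as follows. This is only an illustration, not the code in edict.el:
the name my-english-word-near-point is made up, and an "English
character" is assumed to be one of the ASCII letters a-z and A-Z.

```elisp
;; Hypothetical helper, NOT part of edict.el.
;; Assumption: an "English character" is an ASCII letter.
(defun my-english-word-near-point ()
  "Return the English word at point, or the nearest one before it, or nil."
  (save-excursion
    ;; Not looking at a letter: scan backwards to just after the nearest one.
    (if (not (looking-at "[a-zA-Z]"))
        (skip-chars-backward "^a-zA-Z"))
    (let (start end)
      ;; Extend over the letters in both directions to get the whole word.
      (skip-chars-forward "a-zA-Z")
      (setq end (point))
      (skip-chars-backward "a-zA-Z")
      (setq start (point))
      (if (< start end)
          (buffer-substring start end)))))
```

With the point anywhere on "dictionary" in the example sentence, or
on the space right after it, this sketch returns "dictionary".
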
If you place the point somewhere on "dictionary", and press M-*, edict
will ask you this in the mini buffer:

     Translate word (default "dictionary"):

If you hit RETURN, the default, "dictionary", will be used. If you
don't like the default, you can type something else in.

Just hit RETURN, then edict will say:

     Searching for word "dictionary"...

and then after a short while it will say, "Found it!", and also
display an unselected window called "*edict matches*", looking
like this:

---------------- 8< ----------------
辞書 [じしょ] /dictionary/
ディクショナリ /dictionary/
字引 [じびき] /dictionary/
辞典 [じてん] /dictionary/
英和 [えいわ] /English-Japanese (e.g. dictionary)/
広辞苑 [こうじえん] /Kojien (pn) (Japanese Dictionary)/
電子辞書 [でんしじしょ] /electronic dictionary/
---------------- 8< ----------------

The matches are sorted so that "clear matches", like the first 4
above, are at the top. This is to aid you when you try to find the
"correct" match. English verbs that are entered in the dictionary
as "/to something/" are also considered to be "clear matches". So if
you search for "use", you will get:

---------------- 8< ----------------
用いる [もちいる] /to use/to make use of/
役 [やく] /use/service/role/position/
採用 [さいよう] /use/adapt/
行使 [こうし] /use/exercise/
使う [つかう] /to use/
用 [よう] /task/business/use/
用途 [ようと] /use/usefulness/
仍て [よって] /accordingly/because of/
両用 [りょうよう] /dual use/
利用 [りよう] /use (vs)/utilization/
要因 [よういん] /primary factor/main cause/
洋館 [ようかん] /western-style house/
有用 [ゆうよう] /useful (an)/helpful/
憂さを晴らしに /for amusement/by way of diversion (distraction from grief)/
---------------- 8< ----------------

and a lot of other matches. "Clear matches" are at the top, and not
so clear matches are at the bottom.

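One way to picture this ordering is the following emacs lisp sketch.
It is not the code in edict.el: both function names are made up, each
entry is represented simply as a list of its English phrases, and a
match is assumed to be "clear" when one of its phrases is exactly the
key or "to " followed by the key.

```elisp
;; Hypothetical helpers, NOT part of edict.el.
;; Assumption: a match is "clear" when one of its English phrases is
;; exactly KEY or "to KEY".
(defun my-clear-match-p (key phrases)
  "Return non-nil if PHRASES (a list of strings) clearly matches KEY."
  (or (member key phrases)
      (member (concat "to " key) phrases)))

(defun my-clear-matches-first (key entries)
  "Return ENTRIES with the clear matches for KEY moved to the top.
Each entry is a list of English phrases.  The order within the clear
and the not so clear matches is kept."
  (let (clear unclear)
    (while entries
      (if (my-clear-match-p key (car entries))
          (setq clear (cons (car entries) clear))
        (setq unclear (cons (car entries) unclear)))
      (setq entries (cdr entries)))
    (append (nreverse clear) (nreverse unclear))))
```

Under this assumption an entry like /accordingly/because of/ is found
by a search for "use" (it contains "use" inside "because") but is not
a clear match, which agrees with where it shows up in the list above.
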
** What is an English character? What is romaji?

When edict tries to find an English word, it will look for something
that *looks* like an English word. This means that even strings that
are in JIS/EUC will be considered to be English text; these will be
remapped to ASCII before they are used as keys in a search. For
example, the word "string" written with two-byte JIS/EUC roman letters
will be remapped to "string" before searching.

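As an illustration of such a remapping, here is a small emacs lisp
sketch. It is not taken from edict.el, which works on JIS/EUC text in
nemacs; the name my-fullwidth-to-ascii is made up, and the sketch
assumes a Unicode-based Emacs, where the two-byte roman letters are
the "fullwidth" characters U+FF01 to U+FF5E.

```elisp
;; Hypothetical helper, NOT part of edict.el.
;; Assumption: a Unicode-based Emacs, where the two-byte roman letters
;; are the "fullwidth" characters U+FF01..U+FF5E.
(defun my-fullwidth-to-ascii (string)
  "Return STRING with fullwidth ASCII variants replaced by plain ASCII."
  (concat
   (mapcar (lambda (ch)
             (if (and (>= ch #xFF01) (<= ch #xFF5E))
                 (- ch #xFEE0)   ; U+FF01 maps to "!", U+FF41 to "a", etc.
               ch))
           string)))
```

Characters outside that range, such as kana and kanji, are passed
through unchanged.
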
* Searching for a Japanese string.

Searching for a Japanese string is currently slightly more complicated.
Edict can currently not find the word boundaries in Japanese text.
(This will change soon; edict will in the future try to make an
educated guess based on the grammar of the sentence under the point.)

     私はedictを使っている。

Say that you want to search for "使って". What you have to do is to
move to the starting char, "使", and press C-<SPACE>; nemacs will
then say, "Mark set", in the mini buffer. Then you move to the first
char after the string you want to search for, the char "い".
Now you have marked a region, and you can do the command M-_. Edict
will say 'Searching for word "使って"' and then in rapid sequence show
the remappings; don't bother about those. It will find the word
"使う" in the dictionary and in a separate window display:

---------------- 8< ----------------
使う [つかう] /to use/
---------------- 8< ----------------

This example showed that edict tries to map verbs and adjectives back
to their ground form.

Edict can also clean up a string from "alien" chars, for instance when
the sequence that you want to search for has been split across lines
in a quoted news article.

If you now put the mark at the same chars as before, 使 and い, edict
will first clean up the string, ie remove the newline and the ">"
chars and leading white space. Then it will apply the transformation
rules. The chars that edict will remove in strings are currently:

     "¡¡-¡º¡½-¢à \n\t>;!:#?,.\"/@¨¡-¨À",

these are specified in the *edict-kanji-whitespace* variable. If you
want other chars, please add to this string, and also tell us what you
prefer so that it can be incorporated into future releases of edict.

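The clean up step can be pictured with the following emacs lisp
sketch. It is not the code in edict.el: the name my-strip-alien-chars
is made up, only the ASCII chars from the string above are handled
(the Japanese ranges are left out), and replace-regexp-in-string is
assumed to be available, as it is in a reasonably recent Emacs.

```elisp
;; Hypothetical helper, NOT part of edict.el.
;; Assumption: only the ASCII part of the character set is handled here;
;; the Japanese ranges in *edict-kanji-whitespace* are left out.
(defun my-strip-alien-chars (string)
  "Return STRING without newlines, white space, \">\" and similar chars."
  (replace-regexp-in-string "[ \n\t>;!:#?,.\"/@]+" "" string))
```

For a region containing "使っ", a newline, "> " and "て", this gives
"使って", which can then be handed to the transformation rules.
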
Edict also tries to remove postfixes that carry "no" information, or
that do carry information but are unlikely to be in the dictionary
together with the word they are attached to. An example of such a
postfix is "様". Searching for a string with such a postfix attached
gives the matches for the bare string, for instance:

---------------- 8< ----------------
田中 [たなか] /Tanaka (pn)/
上小田中 [かみこたなか] /Kamikotanaka (pl)/
下小田中 [しもこたなか] /Shimokotanaka (pl)/
---------------- 8< ----------------

* Inserting from the list of matches.

What do you do with a match when you have found it? Obviously, you
may be reading and using edict to find words that you don't
understand. One might also use edict to find words when writing; we
believe that it should be convenient for writing both Japanese and
English. Again, searching for the word "search" will give you:

---------------- 8< ----------------
探す [さがす] /to search/to seek/to look for/
探索 [たんさく] /search/
捜査 [そうさ] /search (vs)/investigation/
査読 [さどく] /investigative reading/research/
研究所 [けんきゅうしょ] /research lab/
研究会 [けんきゅうかい] /research society/
研究 [けんきゅう] /study (vs)/research/investigation/
客員研究員 [きゃくいんけんきゅういん] /visiting researcher/
研究員 [けんきゅういん] /researcher/
研究開発 [けんきゅうかいはつ] /R&D/research & development/
研究生 [けんきゅうせい] /research student/
捜す [さがす] /to seek/to search for/to look for/
探索木 [たんさくぎ] /search tree/
---------------- 8< ----------------

If you now hit M-+, edict will insert the first match (探す) at point,
and then if you hit M-+ again, it will replace the first one it
inserted with the second. When it comes to the end of the list it
wraps. You can also use this command with a numerical argument,
getting the nth match in the list, starting with row 1. So C-u 3 M-+
will give you: "探索".

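The numbering and the wrap around can be summed up in a few lines of
emacs lisp. This is a sketch only, not the code in edict.el, and the
name my-nth-match is made up.

```elisp
;; Hypothetical helper, NOT part of edict.el.
(defun my-nth-match (matches n)
  "Return match number N from the list MATCHES, counting from 1.
Numbers past the end of the list wrap around to the beginning."
  (if matches
      (nth (mod (1- n) (length matches)) matches)))
```

With a list of three matches, N = 1 gives the first match, N = 3 the
last one, and N = 4 wraps around to the first one again.
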
Edict works similarly for inserting English strings. Searching for a
Japanese string will give you, for example:

---------------- 8< ----------------
探索木 [たんさくぎ] /search tree/
探求 [たんきゅう] /quest/pursuit/
手探り [てさぐり] /fumbling (vs)/groping/
探す [さがす] /to search/to seek/to look for/
探索 [たんさく] /search/
探偵 [たんてい] /detective work/
---------------- 8< ----------------

Doing the insert command will then give you in sequence: "search
tree", "quest", "pursuit", "fumbling (vs)", etc. Note that all the
matching English phrases are used.

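To see where that sequence comes from, here is an emacs lisp sketch
that pulls the English phrases out of one line of the matches. It is
an illustration, not the code in edict.el: the name
my-edict-english-phrases is made up, and the English part of an entry
is assumed to start at the first "/" on the line.

```elisp
;; Hypothetical helper, NOT part of edict.el.
;; Assumption: the English part of an entry starts at the first "/".
(defun my-edict-english-phrases (line)
  "Return the English phrases of the edict entry LINE as a list of strings."
  (let ((slash (string-match "/" line)))
    (if slash
        ;; The t drops the empty strings before the first and after
        ;; the last "/".
        (split-string (substring line slash) "/" t))))
```

For the line "探求 [たんきゅう] /quest/pursuit/" this returns the list
("quest" "pursuit"); joining such lists for all the lines gives the
sequence of insertions described above.
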
* Inserting an entry in the private edict file.

OK, so what do you do if you cannot find a match? Or what do you do
if you have a large set of words that you would like to insert into
your private edict file?

** Searching for a word edict cannot find.

Say that you search for "gazillion", then edict will tell you 'No
matches for key "gazillion"'. Now you can redo the command, but with a
numerical argument, ie C-u M-*, then you wind up in a buffer with
edict electric mode with a newly created entry with the missing word
at the correct place. The file in the buffer will be your private
edict file. So, C-u M-* will give you:

---------------- 8< ----------------
---------------- 8< ----------------

Note now that you are in an "electric" environment, ie some keys do
specialized things. TAB will move to the next slot, RETURN will
create a new entry. When you move from slot to slot with TAB, edict
will make sure that the correct input mode is active, ie you can
insert Japanese in the Japanese slots and English in the English
slots. You stop editing your private edict file by doing a save.

It works similarly for an unknown Japanese string.

You can also start these commands by doing M-x edict-add-word, M-x
edict-add-english, or M-x edict-add-kanji. The last command has to be
given a region, as usual with Japanese.

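Judging from the listings in this text, an entry in an edict file is
one line of the form "WORD [READING] /phrase/phrase/", where the
reading is left out for a word written in kana only. The following
emacs lisp sketch builds such a line. It is an illustration under that
assumption, not the code used by the electric mode, and the name
my-edict-format-entry is made up.

```elisp
;; Hypothetical helper, NOT part of edict.el.
;; Assumption: entries look like the lines shown in the listings,
;; "WORD [READING] /phrase/phrase/".
(defun my-edict-format-entry (word reading phrases)
  "Return an edict line for WORD with READING and English PHRASES.
READING may be nil, as for a word written in kana only."
  (concat word
          (if reading (concat " [" reading "]") "")
          " /" (mapconcat 'identity phrases "/") "/"))
```

For instance, the word "探す" with reading "さがす" and the phrases
"to search" and "to seek" gives "探す [さがす] /to search/to seek/".
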
If you have any suggestions, please state them! Send them to
<perham@nada.kth.se>; sending both text and an example of what
functionality you want is probably best. If you are thinking about
contributing code, please make sure that you have the most recent
version of edict.el before you start to hack around in it; this
minimizes wasted effort and difficult merging sessions. Apart from
that, please contribute code!

* Edict will (quite) soon make an educated guess at what it is that
you want to translate, or search for, when you are looking at
Kanji/Kana characters. It will basically improve the forward-word and
backward-word functionality, since that does not work on Japanese
text. When this starts to work, most searches will be performed with
"M-*". This will simplify the user interface.

* Edict will have (more) functionality for "intext replacement" of
what one translated. This is convenient when writing for both
speakers of Japanese and English.

Phrases that you might be wondering about.

* More edict dictionaries, Short Getting Started Guide.

* Input methods, EGG and SKK, and where to find them.