Configuring Linux for Japanese Kana and Kanji input

This is not an all-inclusive guide to editing Japanese in Linux. It covers only the steps that were needed to create and view documents in Japanese on my system. I will not even mention how long it took to figure them out. Basically, you need to run everything from a kterm in X with the LANG and JSERVER environment variables set. A conversion server, which can be either canna or wnn (for normal editors), or sj3server or skkserver (for emacs), is started first. This server translates kana into kanji when necessary. Next, kinput2 is started. It intercepts keystrokes, translates combinations of ASCII characters into kana, and calls the kanji server when necessary. Kinput2 also has a resource file (~/.kinput2rc) that must correctly specify which kanji server is running. Finally, kterm is started. There is also a similar program for Chinese, cxterm, which is much easier to set up. Apart from one configuration example, it is not discussed here.
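The startup order described above can be sketched as a small shell function. This is only an illustration: the function name jp_startup_commands is made up for this example, and the command lines are simply the ones used in the steps below for canna and wnn. It prints the commands instead of running them, so the order can be checked before anything is started.

```shell
# Print the startup commands, in order, for the chosen kanji server.
# jp_startup_commands is a hypothetical helper, not part of PJE.
# Usage: jp_startup_commands canna|wnn
jp_startup_commands() {
    case "$1" in
        canna)
            echo "/usr/local/canna/bin/cannaserver"   # 1. kana-to-kanji server
            echo "kinput2 &"                          # 2. romaji-to-kana front end
            echo "kterm &"                            # 3. terminal used for typing
            ;;
        wnn)
            echo "/usr/local/bin/Wnn4/jserver"
            echo "kinput2 -wnn -jserver localhost &"
            echo "kterm -wnn &"
            ;;
        *)
            echo "usage: jp_startup_commands canna|wnn" >&2
            return 1
            ;;
    esac
}
```

Once the printed commands look right, they could be pasted into a shell inside X one at a time, which also makes it easier to see which step fails.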

See Craig Oda's Linux-Nihongo manual for more details.

  1. Download the PJE (Project JE) files from ftp.linux.or.jp. These consist of several directories:
    base          canna 
    mule          develop
    net           doc 
    font          ptex 
    gs            sj3 
    wnn           xclt    
    Of these, base, canna, wnn, font, and xclt are necessary for basic editing. The files in mule are necessary if your local mule doesn't work. On my system, it seemed necessary to install wnn even though I was using canna. Altogether, the files take up about 90 MB - just enough to fit on a zip disk. Unfortunately, the sj3 directory was empty, which may explain why I couldn't get xemacs to work in sj3 mode.
  2. Exit from X. This is a necessary step, since fvwm, rxvt, and xterm are replaced. Actually, it might not be a bad idea to back these up before starting, along with emacs, xemacs, and mule if they are present.
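A minimal sketch of the backup suggested in step 2, assuming the programs are found on your PATH. The function name backup_programs and the destination directory are my own choices for this example, not something PJE provides.

```shell
# Copy each named program, if it is found on PATH, into a backup directory.
# backup_programs is a hypothetical helper.
# Usage: backup_programs DESTDIR name...
backup_programs() {
    dest=$1
    shift
    mkdir -p "$dest" || return 1
    for name in "$@"; do
        # Skip programs that are not installed.
        path=$(command -v "$name") || continue
        case "$path" in
            /*) cp -p "$path" "$dest/" ;;   # copy only real files, not builtins
        esac
    done
}
```

For example, backup_programs /root/pje-backup fvwm rxvt xterm emacs xemacs mule would copy whichever of those exist (the directory name is only an example).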
  3. Install all the files in the essential directories using the command
     tar -zxvf whatever.tgz -C /  
    if you downloaded tgz's, or use your favorite package manager (i.e., the one you can get to work :-) if you downloaded rpms. Note that you should install only one of the following three files in mule: mulewnn4, mulewnn6, or mulecan. I had the most luck with mulewnn4, which is to say not much, as I could never get it to convert anything to Kanji. The other two didn't work at all. Note also that many of the .tgz's install a script, install/install.sh, which must be run afterwards to create the appropriate links. You must be in the / directory for install/install.sh to run properly.
  4. Warning: A later security scan revealed that the PJE installation adds a new user named pje, with a home directory under /home, and copies a number of files there. Before starting, make sure you don't already have a user named pje.
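One way to check for an existing pje user before installing is sketched below. It assumes the standard id command is available; the helper name user_exists is made up for this example.

```shell
# Succeed if a user with the given login name already exists.
# user_exists is a hypothetical helper built on the standard id command.
user_exists() {
    id "$1" >/dev/null 2>&1
}

if user_exists pje; then
    echo "a user named pje already exists; look at its home directory before installing PJE"
else
    echo "no user named pje was found"
fi
```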
  5. Start up X.
  6. Type
    kinput2 
    This will cause kinput2 to start and immediately exit with an error message. It also creates its .kinput2rc in your home directory, which must be edited as described below.
  7. If you wish to use canna (which I prefer), do the following:
    1. Type
      /usr/local/canna/bin/cannaserver 
    2. Edit ~/.kinput2rc to indicate SERVER=canna and add the line
       -cannaserver localhost 
    3. Edit your ~/.emacs to contain the following lines:
       
      (if (and (boundp 'CANNA) CANNA)
          (progn
            (load-library "canna")
            (canna)))
      (set-input-mode (car (current-input-mode))
                      (nth 1 (current-input-mode))
                      0)
      (setq kanji-flag t)
      (setq kanji-process-code 1)
      (cond ((boundp 'NEMACS) (setq kanji-display-code 3))
            ((boundp 'MULE) (set-display-coding-system *euc-japan*)))
      (cond ((boundp 'NEMACS) (setq kanji-fileio-code 3))
            ((boundp 'MULE) (set-default-file-coding-system *iso-2022-jp*)))
      (cond ((boundp 'NEMACS) (setq kanji-input-code 3))
            ((boundp 'MULE) (set-keyboard-coding-system *euc-japan*)))
  8. If you wish to use wnn instead of cannaserver, do the following:
    1. Type
      /usr/local/bin/Wnn4/jserver 
      It should say something like "Nihongo multi client-server Reading... blah blah... Finished reading files".
    2. Edit ~/.kinput2rc to indicate SERVER=wnn.
    3. Edit your ~/.emacs to contain the following lines:
       
      (when (featurep 'wnn)
        (require 'egg-wnn)
        (setq egg-default-startup-file "eggrc-wnn"))
      (setenv "JSERVER" "localhost") 
      (set-input-mode (car (current-input-mode))
                      (nth 1 (current-input-mode))
                      0)
      (setq kanji-flag t)
      (setq kanji-process-code 1)
      (cond ((boundp 'NEMACS) (setq kanji-display-code 3))
            ((boundp 'MULE) (set-display-coding-system *euc-japan*)))
      (cond ((boundp 'NEMACS) (setq kanji-fileio-code 3))
            ((boundp 'MULE) (set-default-file-coding-system *iso-2022-jp*)))
      (cond ((boundp 'NEMACS) (setq kanji-input-code 3))
            ((boundp 'MULE) (set-keyboard-coding-system *euc-japan*)))
    4. I also had to create a link manually in /usr/local/lib:
      ln -s libwnn.so.1.0  libwnn.so.1.0.0 
  9. Run chmod a+rw /tmp/jd* (the permissions on these files are set incorrectly).
  10. Contrary to what the FAQ states, it is not necessary to reboot after installing cannaserver. It may be necessary, however, to exit and restart X.
  11. Start kinput2 in the background. At this point, ps -aux should show both kinput2 and cannaserver. The wnn jserver does not show up in ps, but if it is running, 'top' will show 'jserver' using CPU time when you type a character in a kterm, and trying to start jserver again will print an error message like ": can't bind inet-socket". If you are using wnn, the command line should be:
    kinput2 -wnn -jserver localhost & 
    otherwise,
    kinput2 & 
    If you get a message "Can't connect to jserver" then your ~/.kinput2rc file is wrong.
  12. Add the following to ~/.Xdefaults:
       ! japanese - Kanji mode = ctrl+right shift
       *vt100*translations: #override Ctrl <Key>Shift_R: begin-conversion() 
    This allows Ctrl-right shift to pop up a small box allowing you to convert into Kanji. Note, this step is easier than for cxterm, which requires the following:
    ! cxterm
    OpenWindows.Beep:       never
    OpenWindows.DragRightDistance:  100
    OpenWindows.PopupJumpCursor:    True
    OpenWindows.SetInput:   select
    OpenWindows.WorkspaceColor:     #40a0c0
    OpenWindows.ScrollbarPlacement: right
    OpenWindows.WindowColor:        #cccccc
    OpenWindows.MultiClickTimeout:  4
    OpenWindows.IconLocation:       bottom
    OpenWindows.SelectDisplaysMenu: True
    Scrollbar.JumpCursor:   True
    *numeric:       C
    *displayLang:   C
    *basicLocale:   C
    *timeFormat:    C
    *inputLang:     C
    cxterm*loginShell: True
    cxterm*HanziEncoding:   GB
    cxterm*hanziInputDir:  /var/X11R6/lib/cxterm.dic/gb
    cxterm*HanziFont:      hanzigb16st
    
    !cxterm*Font:            <Default english font's name>
    cxterm*VT100.Translations: #override \
                          <KeyPress> F1:     set-HZ-parameter(input-conv=toggle) \n\
                          <KeyPress> F2:     switch-HZ-mode(IC) \n\
                          <KeyPress> F3:     popup-panel(config) \n\
                   ~Shift <KeyPress> F4:     switch-HZ-mode(TONEPY) \n\
                    Shift <KeyPress> F4:     switch-HZ-mode(PY) \n\
                   ~Shift <KeyPress> F5:     switch-HZ-mode(WuBi) \n\
                    Shift <KeyPress> F5:     switch-HZ-mode(CangJie) \n\
                ~Meta <KeyPress> Escape:     insert() set-HZ-parameter(input-conv=off)  
  13. Start kterm using
    kterm & 
    or
     kterm -fn 10x20 & 
    (for canna) or
    kterm -wnn & 
    for wnn. If your .Xdefaults is set up correctly, pressing Ctrl-right shift should bring up a small box with the hiragana character for 'a'. Typing something in Romaji such as "watashi" should automatically convert it into hiragana. Pressing the space bar should then convert it into the corresponding Kanji. If you press space again, a small menu will appear allowing you to select other Kanji or katakana with the arrow keys. Pressing Shift-space will toggle back into ASCII mode.

    If the kterm freezes instead of allowing you to type, this means either your .kinput2rc file is pointing to the wnn jserver instead of cannaserver or vice versa, or else your server is not running.

    With this setup, pico, vi, and cat should work. If you only want to edit a few documents, this may be sufficient.

  14. Set the two environment variables:
    export LANG=ja_JP.ujis
    export JSERVER=localhost  
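Since kterm and emacs only behave when these variables are present, a quick check before launching them can save some head-scratching. The following is a sketch only. The function name check_jp_env is invented for this example, and it tests only that the variables are non-empty, not that their values are right for your system.

```shell
# Return 0 only if both LANG and JSERVER are set to non-empty values.
# check_jp_env is a hypothetical helper.
check_jp_env() {
    rc=0
    if [ -z "${LANG:-}" ]; then
        echo "LANG is not set (this guide uses ja_JP.ujis)" >&2
        rc=1
    fi
    if [ -z "${JSERVER:-}" ]; then
        echo "JSERVER is not set (this guide uses localhost)" >&2
        rc=1
    fi
    return $rc
}
```

It could be used as check_jp_env && kterm & so that kterm is started only when both variables are present.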
  15. Ctrl+middle mouse button allows you to switch between EUC and Shift-JIS mode. These are two encoding systems used for storing Japanese characters in files. Typing 'cat filename' on a Japanese file should display correctly. If not, try switching to the other encoding system.
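If a file is stored in the "wrong" encoding, it can also be converted instead of switching the terminal. This is a sketch that assumes the standard iconv utility is installed and knows the encoding names SHIFT_JIS and EUC-JP, as it does on the systems I know of. The two function names are made up for this example.

```shell
# Convert Shift-JIS text on stdin to EUC-JP on stdout.
# sjis_to_euc and euc_to_sjis are hypothetical wrappers around iconv.
sjis_to_euc() {
    iconv -f SHIFT_JIS -t EUC-JP
}

# Convert EUC-JP text on stdin to Shift-JIS on stdout.
euc_to_sjis() {
    iconv -f EUC-JP -t SHIFT_JIS
}
```

For example, sjis_to_euc < letter.sjis > letter.euc (the file names are placeholders) produces a copy that should display in a kterm left in EUC mode.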
  16. There are 3 main versions of emacs (emacs, xemacs, and mule), as well as multiple sub-versions of each, all compiled with different options. Thus, there is no way to know in advance whether any particular emacs will work in Japanese. On my system, regular emacs displays only a square in Japanese mode instead of a character, regardless of which input mode is set. In xemacs, which is the nicest-looking version, typing C-\ (i.e., ctrl-backslash) sets the input mode (but only when xemacs is started from a kterm with the environment variables set). Type japan? for a list. For example, if 'japanese-egg-sj3' is entered, it should say (in English) "Loading its kana..loading its zenkaku...done". You can then type 'watashi' as before, and it should automatically change to hiragana. However, pressing the space bar will print the message:
    EGG: Network service (sj3) ga mitsukarimasen 
    or "sj3 network service not found".

    Although it is printed in Japanese, with Kanji characters, this message means that emacs is not going to work in Japanese with your setup. The other two input methods (japanese-skk and japanese-skk-auto-fill) did not seem to work at all.

    In mule, typing C-\ should immediately put you in hiragana mode. The characters are displayed between vertical bars called a "fence". Pressing the space bar in this case prints a more interesting error message:
    Saaba to setsuzoku dekimasen deshita 
    which means "server connection could not", i.e., could not connect to the server.

    If it says
    Kana kanji henkan saaba tsuushin dekimasen  
    i.e., "kana kanji conversion server communicate can't", this means either your .kinput2rc file is not set up correctly or the server isn't running.

    Sometimes it also says,
    Hosuto localhostno wnn wo kidoshimashita.
       Hindo fairu "usr/root/kihon.h" ga naiyo. Tsukuru? (y or n)  
    "Host localhost's wnn was started. Frequency file "usr/root/kihon ['fundamental'] .h" is not [present]. Create?" If one types 'y' (as root), it says,
    Fairu ga sakusei dekimasen   
    "Can't create file". On the other hand, if one types 'n', it says,
    Fairu ga sonzai shimasen  
    "File doesn't exist". Obviously, something fundamentally important with regard to wnn is happening here. Unfortunately, I have no clue what it might be.

    These are the only error messages I could get emacs to produce. However, I am sure there must be many more.

    As mentioned above, it is likely that some files in the sj3 directory are necessary to get xemacs and mule to convert text to Kanji. Alternatively, it might be necessary to recompile mule with --canna or --wnn options. However, attempts to get mule-2.3 to compile were unsuccessful (numerous function parameter mismatches, object files not being found, etc.). It appears that this program was designed for an extremely old version of Linux.

    It is tricky to get 'kinput2' to insert the correct Kanji character, as it uses a very non-standard Romanization that follows the kana spelling rather than the pronunciation. For instance, in order to type the word 'senkoo' (specialization), it is necessary to separately type and convert 'sen' and 'kou'. To get 'chome' you must type 'choume'. Moreover, pico frequently gets confused by double-byte characters, making a total mess of a line. These corrupted lines can be tricky to get rid of, as they contain control characters such as form feed, backspace, etc. And of course, there is no easy way to print the result. Thus, there is still a dire need for a better way of typing Asian languages in Linux. One program, mtscript, looked promising, but doesn't work for Asian languages (supposedly it can handle Arabic and several Western languages simultaneously), and the source is not available. Attempting to run the precompiled version of mtscript resulted in the following:
    $ mtscript
    mtscript: can't find library 'libX11.so.6'
    $ strings /etc/ld.so.cache | grep libX11.so.6
    libX11.so.6
    /usr/X11R6/lib/libX11.so.6
    libX11.so.6
    /usr/i486-linux-libc5/lib/libX11.so.6
    $ ldd mtscript
    not a dynamic executable
    $ file mtscript
    mtscript: Linux/i386 demand-paged executable (QMAGIC)    
    QMAGIC is a variant of the obsolete a.out executable format, which doesn't run on modern Linux systems. The latest version on their website is from 1996, so it would appear that development of mtscript has been abandoned.
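The output of file above reports a QMAGIC binary rather than ELF, the format that current Linux executables use. When the file command is not available, a rough check is to look at the first four bytes of the program, which for ELF are 7f 45 4c 46. This is a sketch, and the function name is_elf is made up for this example.

```shell
# Succeed if the file starts with the 4-byte ELF magic number (7f 45 4c 46).
# is_elf is a hypothetical helper using only head, od and tr.
is_elf() {
    magic=$(head -c 4 "$1" 2>/dev/null | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "7f454c46" ]
}
```

Running is_elf mtscript || echo "not an ELF binary" would flag old binaries like this one before any time is wasted on library paths.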

    Solution: Download NJSTAR Japanese and Chinese word processors and run them on a (gag) Windows machine.

  17. After removing all traces of wnn, PJE, and kterm, it was found that the $TERM variable in all xterms is still being set to kterm. Thus far I have not been able to figure out where this is being set. Removing /sbin/init.d/rc3.d/S3canna and a number of similar files, editing ~/.xinitrc, and rebooting didn't help. It is proving very difficult to get rid of this software.
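One way to hunt for whatever is still setting TERM is to search the likely startup and resource files for the string "kterm". The sketch below assumes a grep that supports -r. The function name find_kterm_refs and the list of places to search are only guesses on my part.

```shell
# List the files under the given paths that mention "kterm".
# find_kterm_refs is a hypothetical helper around grep -rl.
# Usage: find_kterm_refs PATH...
find_kterm_refs() {
    grep -rl -- 'kterm' "$@" 2>/dev/null
}
```

A possible starting point would be find_kterm_refs /etc ~/.xinitrc ~/.Xdefaults ~/.profile. Note that xterm also takes its TERM value from its termName resource, so X resource files are worth including in the search.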
