Japanese in LaTeX documents in Unicode with MacTeX May 10, 2008

Since I discovered LaTeX, I have become a real fan of this typesetting system. Yet something bothered me very much: I could not easily use Japanese with LaTeX in a normal encoding, that is to say UTF-8. Using CJK is not too difficult, but getting CJKutf8 to work is much harder. I just couldn’t resign myself to using some strange encoding popular in Japan, mostly among Windows addicts.

But I just found out how to do it. Since it is not so easy, I want to write a very simple tutorial explaining precisely how to install UTF-8 Japanese support in MacTeX. So, just follow the guide…

Important Note

Even though this tutorial worked just fine a while back (with Mac OS X Leopard 10.5.2, I believe), it stopped working for some reason with later updates (including the latest software updates for Leopard and Snow Leopard). The problem is that at the end of the whole process, the PDF generated using pdflatex uses crazy glyphs instead of Japanese ones.

However, I know that copying the generated files from a machine on which this procedure worked also makes it work on other machines. So I may end up publishing those files someday.

Updated

  • Friday, November 14, 2008: updated for MacTeX 2008 support on Leopard; small “bugfix” of the tutorial.
  • Monday, December 1, 2008: replaced wget by curl -O.

Of course, installing MacTeX is a prerequisite. This tutorial has been tested on Mac OS X 10.5.2 (Leopard) with both the 2007 and 2008 versions of MacTeX, and it is just an adaptation of a procedure described in a Debian README file (/usr/share/doc/latex-cjk-common/README.Debian.gz in Debian Etch)…

The problem when we want to type Japanese directly in UTF-8 in a LaTeX document is that we have to use the CJKutf8 package instead of CJK, and by default there is no font with Japanese glyphs set up for UTF-8 typesetting in LaTeX. Therefore, the main goal of this tutorial is to install the Cyberbit font into your LaTeX distribution. Please be sure to read and accept the font’s license before proceeding with this tutorial.

First, in a terminal, download and unzip the font file:

curl -O ftp://ftp.netscape.com/pub/communicator/extras/fonts/windows/Cyberbit.ZIP
unzip Cyberbit.ZIP
rm -f Cyberbit.ZIP

Then, we need to become root in order to prepare some directories and files:

sudo -s   # Enter your password if needed
mkdir -p /usr/local/share/fonts/truetype/bitstream/
mv Cyberbit.ttf /usr/local/share/fonts/truetype/bitstream/cyberbit.ttf
mkdir -p /usr/src/cyberbit-fonts/
exit      # After this you should be back in a shell without superuser privileges
sudo chown "$USER":staff /usr/src/cyberbit-fonts/
cd /usr/src/cyberbit-fonts/

/usr/src/cyberbit-fonts/ will be our build directory. We must now prepare some more files in that directory:

ln -s /usr/local/share/fonts/truetype/bitstream/cyberbit.ttf
cp /Library/TeX/Root/texmf-dist/source/latex/CJK/utils/subfonts/subfonts.pe .

You mustn’t forget the final “.”, since it means the working directory. Then, you have to run the following command:

fontforge -script subfonts.pe cyberbit.ttf cyberbit \
    /Library/TeX/Root/texmf/fonts/sfd/Unicode.sfd
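Since the run is long, it may be worth checking the inputs before starting it. The following is only a sketch, not part of the original procedure: `missing_inputs` is a hypothetical helper name, and the sketch assumes that fontforge has been installed separately and is on your PATH.

```shell
#!/bin/sh
# Sketch: pre-flight check before the long fontforge run.
# missing_inputs is a hypothetical helper, not part of MacTeX or fontforge.
missing_inputs() {
    # Prints every argument that is not a readable file; prints nothing if all exist.
    for f in "$@"; do
        [ -r "$f" ] || echo "$f"
    done
}

# Assumption: fontforge was installed separately and should be on PATH.
if ! command -v fontforge >/dev/null 2>&1; then
    echo "fontforge not found on PATH"
fi

# The three files the fontforge command reads, relative to the build directory.
missing_inputs subfonts.pe cyberbit.ttf /Library/TeX/Root/texmf/fonts/sfd/Unicode.sfd
```

If the check prints nothing, the fontforge command above has everything it reads and can be started.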

This will take a very long time, probably several hours, so if you have something else to do, go for it! Then you must generate a file called cyberbit.map:

touch cyberbit.map
for i in *.pfb
do
    echo "$(basename $i .pfb) $(basename $i .pfb)" >> cyberbit.map
done
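To see what the loop produces, here is a small sketch that runs the same logic in a scratch directory. The file names (cyberbit00.pfb and so on) are made-up placeholders, and `build_map` and `count_lines` are hypothetical helper names. The map should end up with one line per .pfb file.

```shell
#!/bin/sh
# Sketch: rebuild a map file from *.pfb files in a scratch directory and count its entries.
build_map() {
    # $1: directory containing .pfb files; $2: name of the map file to write there
    dir=$1
    map=$2
    : > "$dir/$map"                      # start from an empty map file
    for f in "$dir"/*.pfb; do
        [ -e "$f" ] || continue          # skip if the glob matched nothing
        name=$(basename "$f" .pfb)
        echo "$name $name" >> "$dir/$map"
    done
}

count_lines() {
    # Number of lines in a file, without the padding some wc versions add.
    wc -l < "$1" | tr -d ' '
}

scratch=$(mktemp -d)
# Placeholder font files, only there to exercise the loop.
touch "$scratch/cyberbit00.pfb" "$scratch/cyberbit01.pfb" "$scratch/cyberbit02.pfb"
build_map "$scratch" cyberbit.map
echo "entries: $(count_lines "$scratch/cyberbit.map")"
```

Unlike the loop above, this sketch empties the map file first. The original loop appends with `>>`, so running it twice in the same directory would duplicate every entry.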

Then, we need root’s privileges for pretty much all the commands we need to run, so type the following command:

sudo -s

in order to become root, and type your password. Then do the following:

mkdir -p /usr/local/texlive/texmf-local/fonts/map/dvips/cyberbit/
mkdir -p /usr/local/texlive/texmf-local/fonts/{afm,type1,tfm}/cyberbit/
cp cyberbit.map /usr/local/texlive/texmf-local/fonts/map/dvips/cyberbit/
cp *.afm /usr/local/texlive/texmf-local/fonts/afm/cyberbit/
cp *.pfb /usr/local/texlive/texmf-local/fonts/type1/cyberbit/
cp *.tfm /usr/local/texlive/texmf-local/fonts/tfm/cyberbit/
echo "Map cyberbit.map" >> /Library/TeX/Root/texmf-var/web2c/updmap.cfg
cd ..
export PATH="/usr/texbin:$PATH"
texhash
updmap
updmap-sys
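If updmap complains that cyberbit.map cannot be found, one way to check whether TeX can see the file is to ask kpsewhich, which ships with TeX Live and MacTeX. This is a minimal sketch: `report_tex_file` and the `LOOKUP` override are hypothetical names introduced here for illustration.

```shell
#!/bin/sh
# Sketch: ask kpathsea whether TeX can actually see a file.
# Assumption: kpsewhich (shipped with TeX Live / MacTeX) is on PATH.
# LOOKUP is a hypothetical override, so the function can be tried without TeX.
LOOKUP=${LOOKUP:-kpsewhich}

report_tex_file() {
    # Prints "found: <path>" or "missing: <name>"; returns non-zero when missing.
    if path=$("$LOOKUP" "$1" 2>/dev/null) && [ -n "$path" ]; then
        echo "found: $path"
    else
        echo "missing: $1"
        return 1
    fi
}

if command -v "$LOOKUP" >/dev/null 2>&1; then
    report_tex_file cyberbit.map || echo "try running texhash again, then updmap-sys"
else
    echo "$LOOKUP is not on PATH; check the export PATH line above"
fi
```

A "missing" answer usually means the file name database is out of date, which is why texhash has to run before updmap.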

Then you have to replace a file of the MacTeX distribution by another one:

rm -f /Library/TeX/Root/texmf-dist/tex/latex/CJK/UTF8/c70song.fd
mkdir -p /usr/local/texlive/texmf-local/tex/latex/CJK/UTF-8/
cd /usr/local/texlive/texmf-local/tex/latex/CJK/UTF-8/
touch c70song.fd && chmod -R go+w .
open -a TextEdit c70song.fd

Now, a new TextEdit window should have opened, with an empty file. Copy-paste the following text into the empty file:

/usr/local/texlive/texmf-local/tex/latex/CJK/UTF-8/c70song.fd

%%%%%%
% This is the file c70song.fd of the CJK package
% for using Asian logographs (Chinese/Japanese/Korean) with LaTeX2e
%
% created by Werner Lemberg
%
% Version 4.6.0 (11-Aug-2005)

\def\fileversion{4.6.0}
\def\filedate{2005/08/11}
\ProvidesFile{c70song.fd}[\filedate\space\fileversion]

% character set: Unicode U+0080 - U+FFFD
% font encoding: Unicode

\DeclareFontFamily{C70}{song}{\hyphenchar \font\m@ne}

\DeclareFontShape{C70}{song}{m}{n}{<-> CJK * cyberbit}{}
\DeclareFontShape{C70}{song}{bx}{n}{<-> CJKb * cyberbit}{\CJKbold}

\endinput
%%%%%%

Then save the file. Lastly, run:

chown -R root:wheel . && chmod -R go-w .
texhash
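As an alternative to pasting into TextEdit, the same file can be written straight from the shell with a here-document, which avoids the permission juggling around a graphical editor. This is only a sketch: `FD_DIR` and `write_c70song_fd` are hypothetical names, and by default the file goes to a scratch directory rather than the TeX tree.

```shell
#!/bin/sh
# Sketch: write c70song.fd with a here-document instead of a graphical editor.
# FD_DIR is a hypothetical variable; it defaults to a scratch directory so the
# sketch can be tried without touching the TeX tree.
FD_DIR=${FD_DIR:-$(mktemp -d)}

write_c70song_fd() {
    # $1: target directory. The quoted 'EOF' stops the shell from expanding
    # backslashes or dollar signs inside the TeX code.
    cat > "$1/c70song.fd" <<'EOF'
% c70song.fd: map the CJK "song" family to the cyberbit subfonts
\def\fileversion{4.6.0}
\def\filedate{2005/08/11}
\ProvidesFile{c70song.fd}[\filedate\space\fileversion]

\DeclareFontFamily{C70}{song}{\hyphenchar \font\m@ne}

\DeclareFontShape{C70}{song}{m}{n}{<-> CJK * cyberbit}{}
\DeclareFontShape{C70}{song}{bx}{n}{<-> CJKb * cyberbit}{\CJKbold}

\endinput
EOF
}

write_c70song_fd "$FD_DIR"
echo "wrote $FD_DIR/c70song.fd"
```

To write into the real tree, you would run it as root with `FD_DIR` set to /usr/local/texlive/texmf-local/tex/latex/CJK/UTF-8 and then run texhash again.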

That’s it! Now you can enter the following LaTeX code as UTF-8 (double check the preferences of your LaTeX editor) in a new LaTeX file and compile it with pdflatex:

Example.tex

\documentclass[a4paper,12pt]{article}

%%%%%%%%%%%%%%%%%%%%%%%
% Encodings, fonts... %
%%%%%%%%%%%%%%%%%%%%%%%

\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Oriental languages specific options %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\usepackage[T1]{CJKutf8}

\usepackage[overlap,CJK]{ruby}    % Furigana support
\renewcommand{\rubysize}{0.5}     % Furigana size
\renewcommand{\rubysep}{-0.3ex}   % Spacing between Furigana and Kanji

%%%%%%%%%%%%%%%%%
% Title, author %
%%%%%%%%%%%%%%%%%

\title{Japanese in UTF-8 with \LaTeX}
\author{Joel \textsc{Lopes Da Silva}}

\begin{document}

    \begin{CJK}{UTF8}{song}

        \maketitle
        \thispagestyle{empty}

        \section{タイトル}

            おめでとうございます。\ruby{自分}{じぶん}でできましたね。

            \bigskip

            これから、日本語で \LaTeX\ruby{資料}{しりょう}\ruby{書}{か}けます。

        \section{Another title}

            You can even use English words within a
            日本語の\ruby{資料}{しりょう}~(document in Japanese).

            \bigskip

            Have fun!

    \end{CJK}

\end{document}

Once compiled with pdflatex, you should get the expected result: a nice PDF document generated by LaTeX with Japanese text. Enjoy!

13 Comments
Yoichi Takayama July 6th, 2008

Great

Thanks for the detailed note. The fonts are being created at the moment, but I expect that it will work. One thing I’m not certain about, though, is how to do vertical writing of Japanese in LaTeX. Is that a well-known problem? Thanks, Yoichi

Chaz October 15th, 2008

Great guide!

Thank you for the great guide. My computer is processing the fonts at the moment. I was wondering if there was any chance you could give a brief explanation of how other Japanese fonts, such as those freely found on the net, can be incorporated into LaTeX?

Joel October 15th, 2008

Re: Great guide!

I was wondering if there was any chance you could give a brief explanation of how other Japanese fonts, such as those freely found on the net, can be incorporated into LaTeX?

Honestly, I don’t know how to do it with just LaTeX and packages such as CJK. However, a friend of mine told me about XeTeX, a fantastic alternative which could allow using any normal font installed on the system, as we would in other text-processing programs. When I saw it, I thought it was extraordinarily simple, and that I would switch from LaTeX to XeTeX sometime, but I still haven’t had the time to learn how to use XeTeX properly. But I would be glad if you could try it and explain it for us, Japanese, LaTeX, and Apple fans!

Chaz October 15th, 2008

updmap-sys produced an error…

When I ran updmap-sys I got the following error after quite a bit of output that indicated things were going well:

The map file ‘cyberbit.map’ has not been found at all.

Either put the file into the right place or remove the reference from the configuration file. An automatic way to disable unavailable map files is to call
updmap --syncwithtrees

For manual editing, call
updmap --edit

Any ideas?

Chaz October 15th, 2008

As far as using XeTeX goes..

Do you know if XeTeX supports writing vertically? I have heard that there is currently a Windows front end that supports this… but I don’t want to use virtualization.

I am currently trying to get Japanese, English and IPA to work under Lyx…I am having some trouble getting Japanese to work, and I am wondering if it has something to do with installing Basic Tex before installing the full Mac Tex…if you have any ideas let me know.

I am pretty happy with Lyx for English and IPA. With Lyx 1.6.0rc3 I can input IPA using the IPA Unicode 5.0c IME with full character support (for English and Japanese IPA symbols at least, except for some of the less common symbols for changes in pitch, and syllabic consonants); it’s a lot easier than using TIPA.

Joel October 15th, 2008

Re: As far as using XeTeX goes..

First, concerning “updmap-sys produced an error…”, I have some questions: do you use Mac OS X Leopard? Did you properly install MacTeX 2008 on it? (It does not require any “Basic Tex”.) Anyway, please try the following:

sudo -s   # Enter your password if needed
/usr/libexec/locate.updatedb
locate -i cyberbit.map

Then paste me the output of this last command.

Concerning Lyx, XeTeX and all, honestly, I never tried any of these: I am the kind of guy who writes his LaTeX code himself, and I never used a WYSIWYG editor such as Lyx for LaTeX. What I know is that writing Japanese vertically with LaTeX using UTF-8 might be impossible right now: if I remember correctly, the CJK and CJKvert packages use other encodings less universal than Unicode, but CJKutf8 doesn’t allow vertical writing. Therefore, to write Japanese vertically with LaTeX, you have two possibilities:
– you can choose to forget Unicode and use CJKvert with LaTeX and a more specific encoding;
– you can try and see if XeTeX is a good solution.

As you must have understood, I still don’t know anything about XeTeX except what I told you yesterday. I don’t know whether XeTeX integrates well with Lyx, nor whether you can write Japanese vertically with XeTeX. But in my opinion, XeTeX might be the best solution if vertical writing is crucial to you. However, I honestly don’t have the time to look for the right answers, so you will have to look for them yourself. But please, if you learn anything interesting, share it with the community by leaving a comment here. Good luck, and see ya!

Chaz October 16th, 2008

Here’s the output…

I executed those commands, here’s the output:

/usr/local/texlive/texmf-local/fonts/map/dvips/cyberbit.map
/src/cyberbit-fonts/cyberbit.map

Chaz October 16th, 2008

Sorry I forgot to add…

I forgot to mention that I am using OS X 10.5.2. Basic Tex was installed before the full Mac Tex install because I wanted to test out Lyx and I had a slow connection at the time. I had no issues installing Mac Tex.

Ramos October 26th, 2008

Thanks!

many thanks!

Mauro November 10th, 2008

copyright

Thank you for the great guide, and thanks for sharing.
I am an Italian user, so I was wondering if you could allow me to translate your post into Italian. I would like to post it on a friend’s blog.
Of course we would link your post here and give you all the credit, just saying that we are translating from your post.
I think it is very useful and complete, translating it would make it more useful to many friends of mine.
Please let me know if there is any problem with that, writing me at [e-mail address removed by moderator].

Thanks a lot
Mauro

Joel November 14th, 2008

Re: Here’s the output…

For Chaz, and for other people who might have used this tutorial and ended up without having it working: my bad, I was mistaken. I just received my new MacBook a few days ago, so I tried my own tutorial once more on a brand new Leopard install, but this time with MacTeX 2008. Indeed, I also experienced your issue: “The map file ‘cyberbit.map’ has not been found at all.” To solve that, you just need to execute `texhash` before using `updmap`. I also experienced another problem with permissions when using a graphical editor to input the content of the file c70song.fd, so I also changed those lines a little bit so that this works properly. Very, very sorry, and thank you all for your feedback!

Joel November 14th, 2008

Re: copyright

No problem, Mauro, you can translate my post, and I’m very happy about it! But I would like you to give me the link to the translated tutorial when it is posted, so that I can also add a link to your translation on this page: it will help the Page Rank to grow! Thanks, and see ya!

Mike "Pomax" Kamermans November 26th, 2009

Why not use XeTeX for proper unicode handling?

Why MacTeX, rather than XeTeX?

MacTeX is a non-unicode LaTeX made utf8 aware via package hacking, while XeTeX is basically the only properly unicode aware TeX flavour out there (save LuaTeX, but that’s more beta than done).

It expects input in UTF8, and doesn’t care what languages you use, as long as you make sure your fonts support that language. It even comes with a mechanism to automatically insert specific code between latin, spacing and CJK characters so you can effect the kind of automatic fontswitching that an office application does.

http://grammar.nihongoresources.com/doku.php?id=technical:details#latex
