Message-ID: <386938dc-6290-239c-4b4f-c6153f3d98c5@gmail.com>
Date:   Thu, 24 Jun 2021 21:06:59 +0900
From:   Akira Yokosawa <akiyks@...il.com>
To:     Jonathan Corbet <corbet@....net>,
        Mauro Carvalho Chehab <mchehab@...nel.org>
Cc:     "Wu X.C." <bobwxc@...il.cn>, SeongJae Park <sj38.park@...il.com>,
        linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
        Akira Yokosawa <akiyks@...il.com>
Subject: [RFC PATCH 0/3] docs: pdfdocs: Improve alignment of CJK ascii-art

Hi all,

This is another attempt to improve the PDF output of the translations.
There is a mismatch in the font choice for CJK documents, which
causes poor-looking ascii-art where CJK characters and Latin letters
are mixed.

One noticeable example of such ascii-art can be found in the
Korean translation of memory-barriers.txt.

Hence the author of that translation is on the CC list.

At first, I thought the issue could be fixed by simply selecting
"Noto Sans Mono CJK SC" as both monofont and CJKmonofont.
It fixed the mis-alignment in the Chinese translation, but failed
in the Korean translation.

It turns out that Hangul characters in "Noto Sans Mono CJK SC"
are slightly narrower than their Chinese and Japanese counterparts.
I have no idea why the so-called "mono" font has non-uniform
character widths.
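
To illustrate why the width matters: ascii-art only lines up when every
wide (CJK) character occupies exactly two Latin cells. The sketch below
is not part of the patch set. It uses Python's unicodedata module to
count the columns a mixed string is expected to take, and the helper
names (display_columns, pad_to) are made up for this example.

```python
import unicodedata


def display_columns(text: str) -> int:
    """Count monospace columns: wide/fullwidth characters take 2, others 1."""
    cols = 0
    for ch in text:
        if unicodedata.combining(ch):
            # Combining marks are drawn on top of the previous character.
            continue
        cols += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return cols


def pad_to(text: str, width: int) -> str:
    """Pad with spaces so that the text fills `width` columns."""
    return text + " " * max(0, width - display_columns(text))


if __name__ == "__main__":
    # Each row should end at the same column if the font honors 2:1 widths.
    for label in ["CPU 1", "메모리", "内存"]:
        print("| " + pad_to(label, 8) + " |")
```

If the font renders Hangul narrower than two Latin cells, the Korean row
above ends short of the others even though the column count is the same.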

GNU Unifont is an alternative monospace font which covers
almost all Unicode codepoints.
However, due to its bitmap-font nature, the resulting document
might not be acceptable to Korean readers, I guess.

As a compromise, Patch 2/3 enables Unifont only when it is available.

A comparison of some of ascii-art figures before and after this change
can be found in the attached PDF.

Patch 1/3 is a preparation for Patch 2/3.
It converts the font-availability check from Python to LaTeX and makes
the resulting LaTeX code portable across systems with different sets of
installed fonts.
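
As a rough sketch of the idea (not the actual diff in Patch 1/3), the
conf.py side can emit a preamble that tests for the font at LaTeX run
time with fontspec's \IfFontExistsTF, instead of probing the build host
from Python. The template, the @SANS@/@MONO@ placeholders and the
cjk_preamble() helper are assumptions for illustration. It also assumes
fontspec is already loaded when the preamble is processed, as is
normally the case with Sphinx's xelatex engine.

```python
# Hypothetical template; the @...@ placeholders avoid clashing with LaTeX braces.
PREAMBLE_TEMPLATE = r"""
\IfFontExistsTF{@SANS@}{
    % Font found at LaTeX run time: enable CJK support.
    \usepackage{xeCJK}
    \setCJKmainfont{@SANS@}
    \setCJKsansfont{@SANS@}
    \setCJKmonofont{@MONO@}
}{
    % Font missing: keep the default fonts so the build still works.
}
"""


def cjk_preamble(sans="Noto Sans CJK SC", mono="Noto Sans Mono CJK SC"):
    """Return a LaTeX fragment that enables xeCJK only if `sans` exists."""
    return PREAMBLE_TEMPLATE.replace("@SANS@", sans).replace("@MONO@", mono)


# In a Sphinx conf.py the fragment would be appended to the LaTeX preamble.
latex_elements = {"preamble": ""}
latex_elements["preamble"] += cjk_preamble()
```

Because the check runs inside LaTeX, the same generated .tex file can be
built on machines with or without the CJK fonts installed.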

Patch 3/3 is an independent white space fix (or a workaround for Sphinx's
mishandling of tabs behind CJK characters) in the Korean translation
of memory-barriers.txt.

Any feedback is welcome!

Side note:

In the Korean translation's PDF, I see another issue: white spaces
between Hangul "phrase groups" are missing in normal text.
It looks like the pair of xelatex + xeCJK just ignores white spaces
between CJK characters.

There is a package named "xetexko", which might (or might not) be
a reasonable choice for the Korean translation.

It should be possible to use a language-specific preamble once
we figure out the way to load per-directory Sphinx configuration
and move translation docs into per-language subdirectories.  

As I am not familiar with Korean LaTeX typesetting, I must defer to
those who are well aware of such conventions.

        Thanks, Akira
--
Akira Yokosawa (3):
  docs: pdfdocs: Refactor config for CJK document
  docs: pdfdocs: Add font settings for CJK ascii-art
  docs: ko_KR: Use white spaces behind CJK characters in ascii-art

 Documentation/conf.py                         | 26 +++++++++++--------
 .../translations/ko_KR/memory-barriers.txt    | 14 +++++-----
 2 files changed, 22 insertions(+), 18 deletions(-)

-- 
2.17.1

Download attachment "ascii-art-alignment.pdf" of type "application/pdf" (197513 bytes)