Message-ID: <pp270717-111q-8746-4r1o-2srp04r4roo7@syhkavp.arg>
Date: Wed, 7 May 2025 10:11:08 -0400 (EDT)
From: Nicolas Pitre <nico@...xnic.net>
To: Jiri Slaby <jirislaby@...nel.org>
cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
linux-serial@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 4/8] vt: introduce gen_ucs_fallback_table.py to create ucs_fallback_table.h

On Tue, 6 May 2025, Jiri Slaby wrote:
> On 05. 05. 25, 18:55, Nicolas Pitre wrote:
> > From: Nicolas Pitre <npitre@...libre.com>
> >
> > The generated table maps complex characters to their simpler fallback
> > forms for a terminal display when corresponding glyphs are unavailable.
> > This includes diacritics, symbols as well as many drawing characters.
> > Fallback characters aren't perfect replacements, obviously. But they are
> > still far more useful than a bunch of squared question marks.
> >
> > Signed-off-by: Nicolas Pitre <npitre@...libre.com>
> > ---
> > drivers/tty/vt/gen_ucs_fallback_table.py | 882 +++++++++++++++++++++++
> > 1 file changed, 882 insertions(+)
> > create mode 100755 drivers/tty/vt/gen_ucs_fallback_table.py
> >
> > diff --git a/drivers/tty/vt/gen_ucs_fallback_table.py
> > b/drivers/tty/vt/gen_ucs_fallback_table.py
> > new file mode 100755
> > index 000000000000..cb4e75b454fe
> > --- /dev/null
> > +++ b/drivers/tty/vt/gen_ucs_fallback_table.py
> > @@ -0,0 +1,882 @@
> > + fallback_map[0x00D9] = ord('U') # Ù LATIN CAPITAL LETTER U WITH GRAVE
> > + fallback_map[0x00DA] = ord('U') # Ú LATIN CAPITAL LETTER U WITH ACUTE
> > + fallback_map[0x00DB] = ord('U') # Û LATIN CAPITAL LETTER U WITH CIRCUMFLEX
> > + fallback_map[0x00DC] = ord('U') # Ü LATIN CAPITAL LETTER U WITH DIAERESIS
> > + fallback_map[0x00DD] = ord('Y') # Ý LATIN CAPITAL LETTER Y WITH ACUTE
>
>
> So you are in fact doing iconv's utf-8 -> ascii//translit conversion. Does
> python not have an iconv lib?
>
> > perl -e 'use Text::Iconv; print Text::Iconv->new("UTF8",
> "ASCII//TRANSLIT")->convert("áąà"), "\n";'
> aaa
>
> /me digging
>
> Ah, unidecode:
> > python3 -c 'from unidecode import unidecode; print(unidecode("áąà"))'
> aaa
>
> Perhaps use that instead of manual table?

Good idea! Go figure why I didn't think of that.

Some overrides are still needed, but the script is much smaller now
(although the generated table is somewhat bigger).
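
Roughly, the idea could be sketched as below. This is illustrative only
and not the actual script: OVERRIDES and build_fallback_map() are made-up
names, the override entries are just examples, and the unicodedata
stand-in is only there so the sketch still runs when unidecode is not
installed. Only single-character ASCII results are kept, on the assumption
that each table entry maps one code point to one fallback code point.

```python
import unicodedata

try:
    # Third-party module, as suggested above.
    from unidecode import unidecode
except ImportError:
    def unidecode(s):
        # Stand-in (assumption): decompose, then drop combining marks.
        decomposed = unicodedata.normalize("NFKD", s)
        return "".join(c for c in decomposed if not unicodedata.combining(c))

# Hypothetical manual overrides; they take precedence over transliteration.
OVERRIDES = {
    0x2500: ord('-'),  # BOX DRAWINGS LIGHT HORIZONTAL
    0x2502: ord('|'),  # BOX DRAWINGS LIGHT VERTICAL
}

def build_fallback_map(first, last):
    """Map code points in [first, last] to a single ASCII fallback."""
    fallback_map = {}
    for cp in range(first, last + 1):
        if cp in OVERRIDES:
            fallback_map[cp] = OVERRIDES[cp]
            continue
        t = unidecode(chr(cp))
        # Keep only one-character printable ASCII results that differ
        # from the original code point.
        if len(t) == 1 and t.isascii() and t.isprintable() and ord(t) != cp:
            fallback_map[cp] = ord(t)
    return fallback_map

if __name__ == "__main__":
    for cp, fb in sorted(build_fallback_map(0x00D9, 0x00DD).items()):
        print(f"0x{cp:04X} -> {chr(fb)!r}")
```

Characters that transliterate to more than one character are simply
skipped here; they are the kind of case where an override would be needed.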
Nicolas