Message-ID: <20070412145822.GA12310@uhulinux.hu>
Date:	Thu, 12 Apr 2007 16:58:22 +0200
From:	Egmont Koblinger <egmont@...linux.hu>
To:	Roman Zippel <zippel@...ux-m68k.org>
Cc:	"H. Peter Anvin" <hpa@...or.com>,
	Alan Cox <alan@...rguk.ukuu.org.uk>,
	Jan Engelhardt <jengelh@...ux01.gwdg.de>,
	Pavel Machek <pavel@....cz>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] console UTF-8 fixes

On Thu, Apr 12, 2007 at 04:38:54PM +0200, Roman Zippel wrote:

> Considering this possible volatility I'm not certain we really need this 
> in the kernel.
> The other point is that I have problems imagining, that this should be 
> enough to edit random text files with a random editor without problems. 

No, this would not be enough for all the special corner cases. It would just
be better than it currently is. That's my goal.

> OTOH if the editor has to all this parsing anyway, the whole thing could 
> be pushed to userspace and the Linux terminal could be marked as handling 
> all characters equally (a good hint would be if the terminal doesn't even 
> support wide characters). The terminfo database exists for a good reason 

Currently not even applications and terminal emulators agree on the width. I
don't think terminfo has anything to do with it. Width information should
come from glibc, but unfortunately its database is quite old, so
applications tend to implement their own version. (Example: according to
glibc 2.5, U+0221 is not printable. Still, it's present in many fonts, and at
least vim and joe display it.)
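
To make the glibc point concrete, here is a minimal sketch of how an
application can ask the C library for a character's width and cope with a
"not printable" answer. The helper names cells_with_fallback() and
report_width() are hypothetical, and the one-cell fallback is only an
assumption about what an application might do; it is not taken from vim, joe
or any patch. It also assumes wchar_t holds Unicode code points, as it does
on glibc.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* glibc declares wcwidth() only under XOPEN */
#endif
#include <stdio.h>
#include <wchar.h>

/* Repeated explicitly so the sketch still compiles if the feature-test
 * macro above is seen too late; it matches the POSIX prototype. */
int wcwidth(wchar_t wc);

/* Hypothetical helper: turn wcwidth()'s answer into a cell count.
 * wcwidth() returns -1 for characters the locale database considers
 * non-printable; this sketch then assumes one cell instead of refusing
 * to display the character. */
static int cells_with_fallback(int libc_width)
{
    return libc_width < 0 ? 1 : libc_width;
}

/* Hypothetical helper: print what libc reports for one code point and
 * what the fallback above would use. */
static void report_width(wchar_t wc)
{
    int w = wcwidth(wc);

    printf("U+%04lX: wcwidth=%d, cells used=%d\n",
           (unsigned long)wc, w, cells_with_fallback(w));
}
```

Calling setlocale(LC_ALL, "") (from <locale.h>) and then report_width(0x0221)
in a UTF-8 locale shows what the installed libc thinks of that character; the
answer depends on the libc version, which is exactly the problem.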

So, there are several (maybe a few hundred or a few thousand) characters that
are handled differently by different applications/libraries. On the other
hand, there are approximately 42,000 characters in the BMP that are
double-width according to every width specification or implementation I've
seen so far. That's roughly 2/3 of all the code points in the BMP. Why is it
a problem if the kernel knows to jump 2 character cells on them?
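
As a rough illustration of "jumping 2 character cells", here is a minimal
sketch of a range-table lookup. The table is only a small illustrative subset
of blocks that are commonly treated as double-width; it is not the table from
the patch, and the names cp_range, wide_ranges, is_double_width() and
advance_cursor() are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

struct cp_range {
    uint32_t first;
    uint32_t last;
};

/* Illustrative subset only, sorted by code point so that a binary
 * search works. A real table would list more ranges. */
static const struct cp_range wide_ranges[] = {
    { 0x3400, 0x4DBF },   /* CJK Unified Ideographs Extension A */
    { 0x4E00, 0x9FFF },   /* CJK Unified Ideographs */
    { 0xAC00, 0xD7A3 },   /* Hangul Syllables */
};

/* Return 1 if cp falls inside one of the ranges above, else 0. */
static int is_double_width(uint32_t cp)
{
    size_t lo = 0;
    size_t hi = sizeof(wide_ranges) / sizeof(wide_ranges[0]);

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (cp < wide_ranges[mid].first)
            hi = mid;
        else if (cp > wide_ranges[mid].last)
            lo = mid + 1;
        else
            return 1;
    }
    return 0;
}

/* Advance the cursor column by 2 cells for a double-width character
 * and by 1 cell for everything else. */
static unsigned int advance_cursor(unsigned int x, uint32_t cp)
{
    return x + (is_double_width(cp) ? 2u : 1u);
}
```

With this table, advance_cursor(0, 0x4E2D) gives column 2, while an ASCII
letter moves the cursor by a single cell. Characters outside the table,
including the disputed ones, simply keep today's one-cell behaviour.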

I'm not seeking a perfect solution. (Looking at the current state of
specs/libs/apps, a perfect solution doesn't even exist.) I'm seeking
something that is just way better than the current situation.


-- 
Egmont