lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:	Tue, 19 Jun 2007 14:13:02 +0200
From:	Egmont Koblinger <egmont@...linux.hu>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	"H. Peter Anvin" <hpa@...or.com>,
	Jan Engelhardt <jengelh@...ux01.gwdg.de>,
	Alan Cox <alan@...rguk.ukuu.org.uk>,
	Behdad Esfahbod <behdad@...dad.org>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] console UTF-8 fixes

Hi folks,

Recently my console UTF-8 patch went mainline. Here I send a very small
additinal patch that fixes two nasty issues and improves a third one,
namely:

1. My patch changed the behavior if a glyph is not found in the Unicode
   mapping table. Previously for Unicode values less than 256 or 512 the
   kernel tried to display the glyph from that position of the glyph table,
   which could lead to a different accented letter being displayed. I
   removed this fallback possibility and changed it to display the
   replacement symbol.

   As Behdad pointed out, some fonts (e.g. sun12x22 from the kbd package)
   lack Unicode mapping information, hence all you get is lots of question
   marks. Though theoretically it's actually a user-space bug (the font
   should be fixed), Behdad and I both believe that it'd be good to work
   around in the kernel by re-introducing the fallback solution for ASCII
   characters only. This sounds a quite reasonable decision, since all fonts
   ship the ASCII characters in the first 128 positions. This way users
   won't be surprised by lots of question marks just because s/he issued a
   not-so-perfectly parameterized setfont command. As this fallback is only
   re-introduced for code points below 128, you still won't see an accented
   letter replaced by another, but at least you'll always get the English
   letters right.

2. My patch introduced "question mark with inverted color attributes" as a
   last resort fallback glyph. Though it perfectly works on VGA console, on
   framebuffer you may end up with question marks that are highlighed but
   shouldn't be, and normal characters that are accidentally highlighed.
   This is caused by missing FLUSHes when changing the color attribute.

3. I've updated the table of double-width character based on Markus's
   updated version. Only ten new code poings (one interval) is added.


Please review and commit this patch... I'm afraid it might be too late, but
it would be nice to see it in 2.6.22 -- as this will be the first stable
release that introduces my work, and the first two issues addressed now are
regressions since 2.6.21.

Thanks,

Signed-off-by: Egmont Koblinger <egmont@...linux.hu>

diff -Naur linux-2.6.22-rc5.orig/drivers/char/vt.c linux-2.6.22-rc5/drivers/char/vt.c
--- linux-2.6.22-rc5.orig/drivers/char/vt.c	2007-06-17 04:09:12.000000000 +0200
+++ linux-2.6.22-rc5/drivers/char/vt.c	2007-06-19 13:49:55.000000000 +0200
@@ -1956,7 +1956,7 @@
 DEFINE_MUTEX(con_buf_mtx);
 
 /* is_double_width() is based on the wcwidth() implementation by
- * Markus Kuhn -- 2003-05-20 (Unicode 4.0)
+ * Markus Kuhn -- 2007-05-26 (Unicode 5.0)
  * Latest version: http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c
  */
 struct interval {
@@ -1988,8 +1988,8 @@
 	static const struct interval double_width[] = {
 		{ 0x1100, 0x115F }, { 0x2329, 0x232A }, { 0x2E80, 0x303E },
 		{ 0x3040, 0xA4CF }, { 0xAC00, 0xD7A3 }, { 0xF900, 0xFAFF },
-		{ 0xFE30, 0xFE6F }, { 0xFF00, 0xFF60 }, { 0xFFE0, 0xFFE6 },
-		{ 0x20000, 0x2FFFD }, { 0x30000, 0x3FFFD }
+		{ 0xFE10, 0xFE19 }, { 0xFE30, 0xFE6F }, { 0xFF00, 0xFF60 },
+		{ 0xFFE0, 0xFFE6 }, { 0x20000, 0x2FFFD }, { 0x30000, 0x3FFFD }
 	};
 	return bisearch(ucs, double_width,
 		sizeof(double_width) / sizeof(*double_width) - 1);
@@ -2187,9 +2187,12 @@
 				    continue; /* nothing to display */
 				}
 				/* Glyph not found */
-				if (!(vc->vc_utf && !vc->vc_disp_ctrl) && !(c & ~charmask)) {
+				if ((!(vc->vc_utf && !vc->vc_disp_ctrl) || c < 128) && !(c & ~charmask)) {
 				    /* In legacy mode use the glyph we get by a 1:1 mapping.
-				       This would make absolutely no sense with Unicode in mind. */
+				       This would make absolutely no sense with Unicode in mind,
+				       but do this for ASCII characters since a font may lack
+				       Unicode mapping info and we don't want to end up with
+				       having question marks only. */
 				    tc = c;
 				} else {
 				    /* Display U+FFFD. If it's not found, display an inverse question mark. */
@@ -2213,6 +2216,7 @@
 				} else {
 					vc_attr = ((vc->vc_attr) & 0x88) | (((vc->vc_attr) & 0x70) >> 4) | (((vc->vc_attr) & 0x07) << 4);
 				}
+				FLUSH
 			}
 
 			while (1) {
@@ -2246,6 +2250,10 @@
 				if (tc < 0) tc = ' ';
 			}
 
+			if (inverse) {
+				FLUSH
+			}
+
 			if (rescan) {
 				rescan = 0;
 				inverse = 0;
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ