[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20070108015958.GT20714@stusta.de>
Date: Mon, 8 Jan 2007 02:59:58 +0100
From: Adrian Bunk <bunk@...sta.de>
To: Tilman Schmidt <tilman@...p.cc>
Cc: Willy Tarreau <w@....eu>, Jan Engelhardt <jengelh@...ux01.gwdg.de>,
Russell King <rmk+lkml@....linux.org.uk>,
David Woodhouse <dwmw2@...radead.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: OT: character encodings
On Mon, Jan 08, 2007 at 02:32:42AM +0100, Tilman Schmidt wrote:
> Am 08.01.2007 01:38 schrieb Willy Tarreau:
>...
> > And I'm not even
> > discussing the stupidity which requires that you read a whole text to get
> > its number of characters !
>
> Personally I find the requirement to know the number of characters in a text
> rather unusual, so I wouldn't base a decision for an encoding on that. In
> fact, I cannot remember ever really wanting to know the actual number of
> characters in a text. The number of bytes occupied on storage, ok. The
> number of letters, of words, of lines, perhaps even the number of printable
> characters, all potentially interesting, depending on the application.
> But the raw number of characters? I don't know what that might serve for.
Also note that the UTF-32 Unicode encoding would offer this property,
but with the following disadvantages compared to the UTF-8 Unicode
encoding:
- 7bit ASCII is not a subset of UTF-32 losing a lot of compatibility
(code 7bit ASCII with some UTF-8 in the comments is no problem
for not-Unicode aware systems except for slight misdisplayments
of the comments)
- UTF-32 has up to 4 times the size of UTF-8
There's also the point that you can use e.g. "wc" or your editor for
counting the characters.
> HTH
> Tilman
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists