Date:	Mon, 8 Jan 2007 02:59:58 +0100
From:	Adrian Bunk <bunk@...sta.de>
To:	Tilman Schmidt <tilman@...p.cc>
Cc:	Willy Tarreau <w@....eu>, Jan Engelhardt <jengelh@...ux01.gwdg.de>,
	Russell King <rmk+lkml@....linux.org.uk>,
	David Woodhouse <dwmw2@...radead.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: OT: character encodings

On Mon, Jan 08, 2007 at 02:32:42AM +0100, Tilman Schmidt wrote:
> Am 08.01.2007 01:38 schrieb Willy Tarreau:
>...
> > And I'm not even
> > discussing the stupidity which requires that you read a whole text to get
> > its number of characters !
> 
> Personally I find the requirement to know the number of characters in a text
> rather unusual, so I wouldn't base a decision for an encoding on that. In
> fact, I cannot remember ever really wanting to know the actual number of
> characters in a text. The number of bytes occupied on storage, ok. The
> number of letters, of words, of lines, perhaps even the number of printable
> characters, all potentially interesting, depending on the application.
> But the raw number of characters? I don't know what that might serve for.

Also note that the UTF-32 Unicode encoding would offer this property, 
but with the following disadvantages compared to the UTF-8 Unicode 
encoding:
- 7-bit ASCII is not a subset of UTF-32, which loses a lot of compatibility
  (code in 7-bit ASCII with some UTF-8 in the comments is no problem
   for non-Unicode-aware systems, apart from the comments being
   displayed slightly wrong)
- UTF-32 text can be up to 4 times the size of the same text in UTF-8
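
To make the two points above concrete, here is a minimal Python sketch.
The sample strings and variable names are made up for illustration, and
"utf-32-le" is chosen only to avoid the byte order mark that Python's
plain "utf-32" codec prepends.

```python
# Sketch: compare how UTF-8 and UTF-32 encode the same text.
# The sample strings are illustrative assumptions, not from the thread.
ascii_text = "int main(void)"
mixed_text = "int x; /* Gr\u00f6\u00dfe */"  # ASCII code, non-ASCII comment

# Pure 7-bit ASCII text is byte-for-byte identical in UTF-8 ...
ascii_same_in_utf8 = ascii_text.encode("utf-8") == ascii_text.encode("ascii")
# ... but not in UTF-32, where every character occupies four bytes.
ascii_same_in_utf32 = ascii_text.encode("utf-32-le") == ascii_text.encode("ascii")

utf8_size = len(ascii_text.encode("utf-8"))
utf32_size = len(ascii_text.encode("utf-32-le"))

print("ASCII unchanged in UTF-8: ", ascii_same_in_utf8)
print("ASCII unchanged in UTF-32:", ascii_same_in_utf32)
print("bytes in UTF-8 / UTF-32:  ", utf8_size, "/", utf32_size)

# The property UTF-32 does offer: character count is byte count / 4.
print("characters from UTF-32 size:", len(mixed_text.encode("utf-32-le")) // 4)
```

The factor of 4 is the worst case and is reached for pure ASCII text;
text with many non-ASCII characters grows by less.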

There's also the point that you can use e.g. "wc" or your editor for 
counting the characters.
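
As a rough sketch of what such a counter has to do for UTF-8 (this is
not how any particular "wc" or editor implements it, and it assumes
well-formed input): every byte of the form 10xxxxxx is a continuation
byte, so counting all other bytes gives the number of code points. The
function names and the sample string are hypothetical.

```python
def count_utf8_chars(data: bytes) -> int:
    """Count code points in well-formed UTF-8.

    Continuation bytes have the bit pattern 10xxxxxx; every other byte
    starts a new code point. The whole buffer has to be scanned.
    """
    return sum(1 for byte in data if (byte & 0xC0) != 0x80)


def count_utf32_chars(data: bytes) -> int:
    """Count code points in UTF-32 (no BOM): just the size divided by 4."""
    return len(data) // 4


# Illustrative sample: 7 code points, 10 bytes in UTF-8.
text = "na\u00efve \u20ac"
utf8_bytes = text.encode("utf-8")
utf32_bytes = text.encode("utf-32-le")

print(len(utf8_bytes), count_utf8_chars(utf8_bytes))
print(len(utf32_bytes), count_utf32_chars(utf32_bytes))
```

On the command line, POSIX "wc -m" reports characters rather than
bytes, interpreted according to the current locale.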

> HTH
> Tilman

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

