Message-ID: <DF4PR8401MB13058AE34765190922714B2F852C0@DF4PR8401MB1305.NAMPRD84.PROD.OUTLOOK.COM>
Date: Mon, 8 Apr 2019 12:02:49 +0000
From: "Weber, Olaf (HPC Data Management & Storage)" <olaf.weber@....com>
To: Theodore Ts'o <tytso@....edu>, Gabriel Krisman Bertazi <krisman@...labora.com>
CC: "linux-ext4@...r.kernel.org" <linux-ext4@...r.kernel.org>,
	"sfrench@...ba.org" <sfrench@...ba.org>,
	"darrick.wong@...cle.com" <darrick.wong@...cle.com>,
	"jlayton@...nel.org" <jlayton@...nel.org>,
	"bfields@...ldses.org" <bfields@...ldses.org>,
	"paulus@...ba.org" <paulus@...ba.org>,
	"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
	Olaf Weber <olaf@....com>,
	"Gabriel Krisman Bertazi" <krisman@...labora.co.uk>
Subject: RE: [PATCH RFC v6 04/11] unicode: reduce the size of utf8data[]

From: Theodore Ts'o
> On Mon, Mar 18, 2019 at 04:27:38PM -0400, Gabriel Krisman Bertazi wrote:
> > From: Olaf Weber <olaf@....com>
> >
> > Remove the Hangul decompositions from the utf8data trie, and do
> > algorithmic decomposition to calculate them on the fly. To store
> > the decomposition the caller of utf8lookup()/utf8nlookup() must
> > provide a 12-byte buffer, which is used to synthesize a leaf with
> > the decomposition. Trie size is reduced from 245kB to 90kB.
>
> I'm seeing sizes much smaller; the actual utf8data[] array is 63,584.
> And size utf8-norm.o reports:
>
>    text    data     bss     dec     hex filename
>   68752      96       0   68848   10cf0 fs/unicode/utf8-norm.o
>
> Were you measuring the size of the utf8-norm.o file? That will vary
> in size depending on whether debugging symbols are enabled, etc.
>
> - Ted

These numbers came from the size of the array reported in utf8data.h,
and were correct for the NFKDI + NFKDICF normalizations for Unicode 9.
The switch to NFDI + NFDICF reduced the size, and it looks like the
commit message was not updated to account for this.

Olaf
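
For reference, the algorithmic Hangul decomposition mentioned in the quoted
commit message can be sketched as below. This is a minimal user-space
illustration of the arithmetic given in the Unicode Standard, not the code
in fs/unicode. The names hangul_decompose_utf8() and put_utf8_3() are
hypothetical, and the sketch writes only the decomposed UTF-8 string; it
does not reproduce the synthesized trie leaf that the patch describes.

```c
#include <stddef.h>
#include <stdint.h>

/* Constants from the Unicode Standard's Hangul syllable decomposition. */
#define SBASE  0xAC00u                 /* first precomposed syllable */
#define LBASE  0x1100u                 /* first leading consonant    */
#define VBASE  0x1161u                 /* first vowel                */
#define TBASE  0x11A7u                 /* one before first trailing  */
#define LCOUNT 19u
#define VCOUNT 21u
#define TCOUNT 28u
#define NCOUNT (VCOUNT * TCOUNT)       /* 588   */
#define SCOUNT (LCOUNT * NCOUNT)       /* 11172 */

/* Encode a code point in U+0800..U+FFFF as three UTF-8 bytes. */
static size_t put_utf8_3(unsigned char *out, uint32_t cp)
{
	out[0] = (unsigned char)(0xE0 | (cp >> 12));
	out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
	out[2] = (unsigned char)(0x80 | (cp & 0x3F));
	return 3;
}

/*
 * Hypothetical helper: decompose a precomposed Hangul syllable into its
 * conjoining jamo and write them as NUL-terminated UTF-8 into buf.
 * Returns the number of bytes written (excluding the NUL), or 0 if cp
 * is not a precomposed Hangul syllable.
 * Worst case is three jamo of three bytes each, plus the NUL: 10 bytes.
 */
size_t hangul_decompose_utf8(uint32_t cp, unsigned char buf[10])
{
	uint32_t si, l, v, t;
	size_t n = 0;

	if (cp < SBASE || cp >= SBASE + SCOUNT)
		return 0;

	si = cp - SBASE;
	l = LBASE + si / NCOUNT;
	v = VBASE + (si % NCOUNT) / TCOUNT;
	t = TBASE + si % TCOUNT;

	n += put_utf8_3(buf + n, l);
	n += put_utf8_3(buf + n, v);
	if (t != TBASE)                /* no trailing consonant otherwise */
		n += put_utf8_3(buf + n, t);
	buf[n] = '\0';
	return n;
}
```

Because all 11,172 syllables are covered by this arithmetic, none of
their decompositions has to be stored in a table.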