Open Source and information security mailing list archives
Date: Fri, 13 Oct 2017 15:40:43 +0000
From: Matthew Wilcox <mawilcox@...rosoft.com>
To: Wei Yang <richard.weiyang@...il.com>, Andrew Morton <akpm@...ux-foundation.org>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH] radix-tree: get_slot_offset() returns invalid offset when parent is NULL

From: Wei Yang [mailto:richard.weiyang@...il.com]
> On Wed, Oct 11, 2017 at 04:39:40PM -0700, Andrew Morton wrote:
> >On Wed, 11 Oct 2017 10:33:39 +0800 Wei Yang wrote:
> >> BTW, I have another question on the implementation of the radix tree.
> >>
> >> Question: why is 6 one of the choices for RADIX_TREE_MAP_SHIFT?
> >>
> >> Currently, the shift of the tree is defined as below:
> >>
> >> #define RADIX_TREE_MAP_SHIFT (CONFIG_BASE_SMALL ? 4 : 6)
> >>
> >> The index in the tree has type unsigned long, which means its size is
> >> 32 or 64 bits.
> >>
> >> 32 % 4 = 0, 64 % 4 = 0
> >>
> >> 32 % 6 = 2, 64 % 6 = 4
> >>
> >> This means that when the tree shift is 6 and the tree represents the max
> >> index, there will be some unused slots at the top level.
> >>
> >> So why did we choose 6 instead of a power of 2?
> >>
> >> The original commit is 139e561660fe11e0fc35e142a800df3dd7d03e9d, in which
> >> I don't see a reason for this choice.
> >>
> >> If possible, I suggest changing this value to a power of 2. Or maybe I
> >> missed some critical reason behind it.
> >>
> >
> >I'm not sure I really see the problem. 1<<6 is still a power of 2.
> >radix_tree_split_preload() is doing mod and div on shift distances,
> >which hurts my brain a bit. But CONFIG_BASE_SMALL=0 is what most
> >people will be using, so presumably the current code tests out OK,
> >although simple memory wastage would be hard to detect. Matthew has
> >been here recently and perhaps can comment?
>
> Hmm, I may not have stated it clearly.
>
> What I mean is to make RADIX_TREE_MAP_SHIFT itself a power of 2, instead of
> the result of (1 << RADIX_TREE_MAP_SHIFT).
>
> Let me explain with a simplified example.
> Assume the max index is 16 bits.
>
> Case 1, shift is 4:
>
> Every 4 bits form a group, and there are exactly 4 groups.
>
> |0123|4567|0123|4567|
>
> Case 2, shift is 6:
>
> Every 6 bits form a group, leaving 3 groups with 2 empty bit positions at
> the top.
>
> |  0123|456701|234567|
>
> The radix tree is a little bit like a page table: it is divided into
> several levels, and each level is indexed by a "slot" of bits.
>
> When the shift is a power of 2, the digits fit the "slots" exactly. This
> means there is no waste.
>
> When the shift is not, we have several unused digits at the most
> significant part. This means several slots will never be used.
>
> In theory, in a tree with shift 6
>
> |  0123|456701|234567|
>
> we could store an index with value
>
> ffffff ffffff ffffff
>
> while the largest value a 16-bit index can actually represent is
>
> __ffff ffffff ffffff
>
> Since the radix tree does not allocate memory when there is no
> corresponding index, this just leaves several unused slots at the top
> level.

Yes, if we have a tree with 2^64 pointers in it, we'll have 48 unused
pointers in the top-level node. That's not a huge percentage of waste.

> While I would still prefer to fully utilize the whole tree instead of
> leaving some never-touched slots. And this can be achieved by setting
> RADIX_TREE_MAP_SHIFT to a power of 2.
>
> So if there is no specific reason to use 6, I would like to put together a
> patch to change it to 8.

The cost of that is in the slab allocator. With 64-bit pointers and a
64-entry node, we use 576 bytes per node, fitting 7 nodes per page. With 256
entries per node, the node balloons to just over 2kB, and we get 3 nodes per
pair of pages. That is massively less efficient than the inefficiency you're
complaining about.