Message-ID: <c5d81c10-8b92-0a5d-93c2-8b8377b1396f@gmail.com>
Date:   Wed, 28 Feb 2018 11:44:51 -0800
From:   Frank Rowand <frowand.list@...il.com>
To:     Andy Shevchenko <andy.shevchenko@...il.com>
Cc:     Rob Herring <robh+dt@...nel.org>, cpandya@...eaurora.org,
        devicetree <devicetree@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v4 1/2] of: cache phandle nodes to reduce cost of
 of_find_node_by_phandle()

On 02/28/18 11:31, Andy Shevchenko wrote:
> On Wed, Feb 28, 2018 at 9:04 PM,  <frowand.list@...il.com> wrote:
> 
>> Create a cache of the nodes that contain a phandle property.  Use this
>> cache to find the node for a given phandle value instead of scanning
>> the devicetree to find the node.  If the phandle value is not found
>> in the cache, of_find_node_by_phandle() will fall back to the tree
>> scan algorithm.
>>
>> The cache is initialized in of_core_init().
>>
>> The cache is freed via a late_initcall_sync() if modules are not
>> enabled.
>>
>> If the devicetree is created by the dtc compiler, with all phandle
>> property values auto generated, then the size required by the cache
>> could be 4 * (1 + number of phandles) bytes.  This results in an O(1)
>> node lookup cost for a given phandle value.  Due to a concern that the
>> phandle property values might not be consistent with what is generated
>> by the dtc compiler, a mask has been added to the cache lookup algorithm.
>> To maintain the O(1) node lookup cost, the size of the cache has been
>> increased by rounding the number of entries up to the next power of
>> two.
>>
>> The overhead of finding the devicetree node containing a given phandle
>> value has been noted by several people in the recent past, in some cases
>> with a patch to add a hashed index of devicetree nodes, based on the
>> phandle value of the node.  One concern with this approach is the extra
>> space added to each node.  This patch takes advantage of the phandle
>> property values auto generated by the dtc compiler, which begin with
>> one and monotonically increase by one, resulting in a range of 1..n
>> for n phandle values.  This implementation should also provide a good
>> reduction of overhead for any range of phandle values that are mostly
>> in a monotonic range.
>>
>> Performance measurements by Chintan Pandya <cpandya@...eaurora.org>
>> of several implementations of patches that are similar to this one
> >> suggest an expected reduction of boot time by ~400 ms for his test
> >> system.  If the cache size was decreased to 64 entries, the boot
> >> time was reduced by ~340 ms.  The measurements were on a 4.9.73 kernel
> >> for arch/arm64/boot/dts/qcom/sda670-mtp.dts, which contains 2371 nodes
> >> and 814 phandle values.
> 
> The question is why O(1) is so important? O(log(n)) wouldn't work?

O(1) is not critical.  It was just a nice side result.
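
To make the lookup described in the patch text concrete, here is a minimal
user-space sketch of a masked phandle cache with a fall back to a full scan.
It is not the patch code: struct dt_node, struct phandle_cache, cache_init()
and find_by_phandle() are hypothetical names, and a flat linked list stands
in for the devicetree walk.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a devicetree node (not struct device_node). */
struct dt_node {
	uint32_t phandle;       /* 0 means "no phandle property" */
	struct dt_node *next;   /* flat list standing in for the tree walk */
};

struct phandle_cache {
	struct dt_node **slots; /* one node pointer per slot */
	uint32_t mask;          /* slot count - 1; slot count is a power of two */
};

/* Round n up to the next power of two (assumes 1 <= n <= 2^31). */
static uint32_t roundup_pow2(uint32_t n)
{
	uint64_t p = 1;

	while (p < n)
		p <<= 1;
	return (uint32_t)p;
}

/* Count the nodes that have a phandle, size the cache, then populate it. */
static int cache_init(struct phandle_cache *c, struct dt_node *all)
{
	uint32_t count = 0, entries;
	struct dt_node *n;

	for (n = all; n; n = n->next)
		if (n->phandle)
			count++;

	entries = roundup_pow2(count ? count : 1);
	c->slots = calloc(entries, sizeof(*c->slots));
	if (!c->slots)
		return -1;
	c->mask = entries - 1;

	/* On a slot collision the later node simply overwrites the earlier one. */
	for (n = all; n; n = n->next)
		if (n->phandle)
			c->slots[n->phandle & c->mask] = n;
	return 0;
}

/* Masked cache probe first; full scan if the slot is empty or mismatched. */
static struct dt_node *find_by_phandle(const struct phandle_cache *c,
				       struct dt_node *all, uint32_t phandle)
{
	struct dt_node *n;

	if (!phandle)
		return NULL;

	if (c->slots) {
		n = c->slots[phandle & c->mask];
		if (n && n->phandle == phandle)
			return n;
	}

	for (n = all; n; n = n->next)
		if (n->phandle == phandle)
			return n;
	return NULL;
}
```

With dtc-generated phandle values in the range 1..n every lookup is a single
array probe.  Values outside that range can share a slot, in which case the
node that lost the slot is still found, just by the slower scan.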


> Using radix_tree() I suppose allows to dynamically extend or shrink
> the cache which would work with DT overlays.

The memory usage of the phandle cache in this patch is fairly small.
The memory overhead of a radix_tree() would not be justified.
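
As a rough illustration of "fairly small", the sketch below computes the size
of a cache whose entry count is the number of phandles rounded up to the next
power of two.  The helper names are hypothetical, and the per-entry size is
left as a parameter because it depends on the pointer width of the target.

```c
#include <stddef.h>
#include <stdint.h>

/* Round n up to the next power of two (assumes 1 <= n <= 2^31). */
static uint32_t cache_roundup_pow2(uint32_t n)
{
	uint64_t p = 1;

	while (p < n)
		p <<= 1;
	return (uint32_t)p;
}

/* Bytes used by a cache holding one entry_size-byte entry per slot. */
static size_t phandle_cache_bytes(uint32_t phandles, size_t entry_size)
{
	return (size_t)cache_roundup_pow2(phandles) * entry_size;
}
```

For the 814 phandle values quoted above this gives 1024 entries, which is
4096 bytes with 4-byte entries or 8192 bytes with 8-byte pointers.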
