Message-ID: <CAHp75VfD6Hi0NScdZrYozp9dAqmUUYjcLuPPyRwM31ezxNO2fw@mail.gmail.com>
Date: Wed, 28 Feb 2018 21:31:29 +0200
From: Andy Shevchenko <andy.shevchenko@...il.com>
To: Frank Rowand <frowand.list@...il.com>
Cc: Rob Herring <robh+dt@...nel.org>, cpandya@...eaurora.org,
devicetree <devicetree@...r.kernel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v4 1/2] of: cache phandle nodes to reduce cost of of_find_node_by_phandle()
On Wed, Feb 28, 2018 at 9:04 PM, <frowand.list@...il.com> wrote:
> Create a cache of the nodes that contain a phandle property. Use this
> cache to find the node for a given phandle value instead of scanning
> the devicetree to find the node. If the phandle value is not found
> in the cache, of_find_node_by_phandle() will fall back to the tree
> scan algorithm.
>
> The cache is initialized in of_core_init().
>
> The cache is freed via a late_initcall_sync() if modules are not
> enabled.
>
> If the devicetree is created by the dtc compiler, with all phandle
> property values auto generated, then the size required by the cache
> could be 4 * (1 + number of phandles) bytes. This results in an O(1)
> node lookup cost for a given phandle value. Due to a concern that the
> phandle property values might not be consistent with what is generated
> by the dtc compiler, a mask has been added to the cache lookup algorithm.
> To maintain the O(1) node lookup cost, the size of the cache has been
> increased by rounding the number of entries up to the next power of
> two.
>
> The overhead of finding the devicetree node containing a given phandle
> value has been noted by several people in the recent past, in some cases
> with a patch to add a hashed index of devicetree nodes, based on the
> phandle value of the node. One concern with this approach is the extra
> space added to each node. This patch takes advantage of the phandle
> property values auto generated by the dtc compiler, which begin with
> one and monotonically increase by one, resulting in a range of 1..n
> for n phandle values. This implementation should also provide a good
> reduction of overhead for any range of phandle values that are mostly
> in a monotonic range.
>
> Performance measurements by Chintan Pandya <cpandya@...eaurora.org>
> of several implementations of patches that are similar to this one
> suggest an expected reduction of boot time by ~400ms for his test
> system. If the cache size was decreased to 64 entries, the boot
> time was reduced by ~340 ms. The measurements were on a 4.9.73 kernel
> for arch/arm64/boot/dts/qcom/sda670-mtp.dts, which contains 2371 nodes and
> 814 phandle values.
The question is: why is O(1) so important? Wouldn't O(log(n)) work?
Using radix_tree(), I suppose, would allow the cache to be extended or
shrunk dynamically, which would work with DT overlays.
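
For reference, here is a rough userspace sketch of the masked lookup as
I read it from the description above. It is not the code from the
patch: struct node, struct phandle_cache, roundup_pow2(), cache_init()
and find_by_phandle() are hypothetical names, and a flat array stands in
for the devicetree. It only illustrates the power-of-two sizing, the
mask, and the fall back to a full scan.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct device_node; only the phandle matters. */
struct node {
	uint32_t phandle;	/* 0 means "no phandle property" */
};

/* Hypothetical cache layout: one pointer per slot, indexed by phandle & mask. */
struct phandle_cache {
	struct node **slots;
	uint32_t mask;		/* number of entries - 1; entries is a power of two */
};

/* Round n up to the next power of two (n >= 1). */
static uint32_t roundup_pow2(uint32_t n)
{
	uint32_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* Count the phandles, size the cache as (1 + count) rounded up, then fill it. */
static int cache_init(struct phandle_cache *c, struct node *nodes, size_t n_nodes)
{
	size_t i, phandles = 0;
	uint32_t entries;

	assert(c && nodes);

	for (i = 0; i < n_nodes; i++)
		if (nodes[i].phandle)
			phandles++;

	entries = roundup_pow2((uint32_t)phandles + 1);
	c->slots = calloc(entries, sizeof(*c->slots));
	if (!c->slots)
		return -1;
	c->mask = entries - 1;

	/* On a slot collision the later node simply overwrites the earlier one. */
	for (i = 0; i < n_nodes; i++)
		if (nodes[i].phandle)
			c->slots[nodes[i].phandle & c->mask] = &nodes[i];
	return 0;
}

/* O(1) masked lookup; on a miss, fall back to scanning every node. */
static struct node *find_by_phandle(const struct phandle_cache *c,
				    struct node *nodes, size_t n_nodes,
				    uint32_t phandle)
{
	struct node *np;
	size_t i;

	if (!phandle)
		return NULL;

	if (c->slots) {
		np = c->slots[phandle & c->mask];
		if (np && np->phandle == phandle)
			return np;
	}

	for (i = 0; i < n_nodes; i++)
		if (nodes[i].phandle == phandle)
			return &nodes[i];
	return NULL;
}
```

With dtc-generated phandles in the range 1..n every lookup hits its own
slot. A phandle outside that range may collide with another one, and the
loser of the collision then costs a full scan. A fixed array like this
cannot grow when an overlay adds phandles, which is the point of the
question above.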
--
With Best Regards,
Andy Shevchenko