Date: Thu, 12 Dec 2019 21:08:04 +0000
From: Robin Murphy <robin.murphy@....com>
To: Dmitry Torokhov <dmitry.torokhov@...il.com>
Cc: Marc Gonzalez <marc.w.gonzalez@...e.fr>,
Bjorn Andersson <bjorn.andersson@...aro.org>,
Kuninori Morimoto <kuninori.morimoto.gx@...esas.com>,
Stephen Boyd <sboyd@...nel.org>,
Michael Turquette <mturquette@...libre.com>,
LKML <linux-kernel@...r.kernel.org>,
Sudip Mukherjee <sudipm.mukherjee@...il.com>,
Russell King <rmk+kernel@...linux.org.uk>,
Guenter Roeck <linux@...ck-us.net>,
linux-clk <linux-clk@...r.kernel.org>,
Linux ARM <linux-arm-kernel@...ts.infradead.org>,
x86 <x86@...nel.org>
Subject: Re: [PATCH v1] clk: Convert managed get functions to devm_add_action API

On 2019-12-12 7:10 pm, Dmitry Torokhov wrote:
> On Thu, Dec 12, 2019 at 06:15:16PM +0000, Robin Murphy wrote:
>> On 12/12/2019 4:59 pm, Marc Gonzalez wrote:
>>> On 12/12/2019 15:47, Robin Murphy wrote:
>>>
>>>> On 12/12/2019 1:53 pm, Marc Gonzalez wrote:
>>>>
>>>>> On 11/12/2019 23:28, Dmitry Torokhov wrote:
>>>>>
>>>>>> On Wed, Dec 11, 2019 at 05:17:28PM +0100, Marc Gonzalez wrote:
>>>>>>
>>>>>>> What is the rationale for the devm_add_action API?
>>>>>>
>>>>>> For one-off and maybe complex unwind actions in drivers that wish to use
>>>>>> devm API (as mixing devm and manual release is verboten). Also is often
>>>>>> used when some core subsystem does not provide enough devm APIs.
>>>>>
>>>>> Thanks for the insight, Dmitry. Thanks to Robin too.
>>>>>
>>>>> This is what I understand so far:
>>>>>
>>>>> devm_add_action() is nice because it hides/factorizes the complexity
>>>>> of the devres API, but it incurs a small storage overhead of one
>>>>> pointer per call, which makes it unfit for frequently used actions,
>>>>> such as clk_get.
>>>>>
>>>>> Is that correct?
>>>>>
>>>>> My question is: why not design the API without the small overhead?
>>>>
>>>> Probably because on most architectures, ARCH_KMALLOC_MINALIGN is at
>>>> least as big as two pointers anyway, so this "overhead" should mostly be
>>>> free in practice. Plus the devres API is almost entirely about being
>>>> able to write simple robust code, rather than absolute efficiency - I
>>>> mean, struct devres itself is already 5 pointers large at the absolute
>>>> minimum ;)
>>>
>>> (3 pointers: 1 list_head + 1 function pointer)
>>
>> Ah yes, I failed to mentally preprocess the debug config :)
>>
>>> I'm confused. The first patch was criticized for potentially adding
>>> an extra pointer for every devm_clk_get (e.g. 800 bytes on a 64-bit
>>> platform with 100 clocks).
>>
>> I'm not sure it was a criticism so much as an observation of an aspect that
>> deserved consideration (certainly it was on my part, and I read Dmitry's "It
>> might still, ..." as implying the same). I'd say by this point it has been
>> thoroughly considered, and personally I'm now happy with the conclusion that
>> the kind of embedded platforms that will have many dozens of clocks are also
>> the kind that will tend to have enough padding to make it moot, and thus the
>> code simplification probably is worthwhile overall.
>
> I wonder if we could actually avoid allocating the data with
> ARCH_KMALLOC_MINALIGN in all the cases. It is definitely needed for the
> devm_k*alloc() group of functions as they are direct replacement for
> k*alloc() APIs that give users aligned memory, but for other data
> structures (clocks, regulators, etc, etc) it is not required.

That's a very good point - perhaps something like this (only done properly)?

Robin.

diff --git a/drivers/base/devres.c b/drivers/base/devres.c
index 0bbb328bd17f..2382f963abbe 100644
--- a/drivers/base/devres.c
+++ b/drivers/base/devres.c
@@ -26,14 +26,7 @@ struct devres_node {
struct devres {
struct devres_node node;
- /*
- * Some archs want to perform DMA into kmalloc caches
- * and need a guaranteed alignment larger than
- * the alignment of a 64-bit integer.
- * Thus we use ARCH_KMALLOC_MINALIGN here and get exactly the same
- * buffer alignment as if it was allocated by plain kmalloc().
- */
- u8 __aligned(ARCH_KMALLOC_MINALIGN) data[];
+ u8 data[];
};
struct devres_group {
@@ -810,6 +803,17 @@ static int devm_kmalloc_match(struct device *dev, void *res, void *data)
void * devm_kmalloc(struct device *dev, size_t size, gfp_t gfp)
{
struct devres *dr;
+ size_t align;
+
+ /*
+ * Some archs want to perform DMA into kmalloc caches
+ * and need a guaranteed alignment larger than
+ * the alignment of a 64-bit integer.
+ * Thus we use ARCH_KMALLOC_MINALIGN here and get exactly the same
+ * buffer alignment as if it was allocated by plain kmalloc().
+ */
+ align = (ARCH_KMALLOC_MINALIGN - sizeof(*dr)) % ARCH_KMALLOC_MINALIGN;
+ size += align;
/* use raw alloc_dr for kmalloc caller tracing */
dr = alloc_dr(devm_kmalloc_release, size, gfp, dev_to_node(dev));
@@ -822,7 +826,7 @@ void * devm_kmalloc(struct device *dev, size_t size, gfp_t gfp)
*/
set_node_dbginfo(&dr->node, "devm_kzalloc_release", size);
devres_add(dev, dr->data);
- return dr->data;
+ return dr->data + align;
}
EXPORT_SYMBOL_GPL(devm_kmalloc);
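
For anyone wanting to poke at the padding arithmetic outside the kernel, here is a rough userspace sketch. It is not kernel code: MINALIGN stands in for ARCH_KMALLOC_MINALIGN (128 is an assumed value), struct node is a hypothetical stand-in for a non-debug struct devres_node, and pad_for() and print_layout() are illustrative names only. The aligned attribute is the GCC/Clang extension.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumed stand-in for ARCH_KMALLOC_MINALIGN; must be a power of two. */
#define MINALIGN 128

/*
 * Hypothetical stand-in for struct devres_node without the debug
 * fields: a two-pointer list head plus one release callback.
 */
struct node {
	void *next, *prev;
	void (*release)(void *);
};

/* Layout before the patch: the attribute pads the header to MINALIGN. */
struct hdr_aligned {
	struct node node;
	unsigned char data[] __attribute__((aligned(MINALIGN)));
};

/* Layout after the patch: data[] starts right after the node. */
struct hdr_plain {
	struct node node;
	unsigned char data[];
};

/*
 * Bytes to skip after a header of hdr_size so that the payload lands on
 * a minalign boundary, given that the allocation itself is aligned.
 * The subtraction is done in size_t, so it wraps when hdr_size exceeds
 * minalign; because minalign is a power of two, the modulo still gives
 * the right padding.
 */
static size_t pad_for(size_t hdr_size, size_t minalign)
{
	return (minalign - hdr_size) % minalign;
}

/* Print both layouts and the padding only the kmalloc-like path pays. */
void print_layout(void)
{
	size_t pad = pad_for(sizeof(struct hdr_plain), MINALIGN);

	printf("aligned header: %zu bytes\n", sizeof(struct hdr_aligned));
	printf("plain header:   %zu bytes\n", sizeof(struct hdr_plain));
	printf("pad for kmalloc-like users: %zu bytes\n", pad);
}
```

Calling print_layout() from a small main() shows the trade: with the plain layout, only callers that need kmalloc-style alignment pay for the padding, while other resources (clocks, regulators and so on) get the bare header.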