Message-ID: <bff4c4ad-de8f-7229-1d16-7ea67e066f65@free.fr>
Date:   Wed, 18 Dec 2019 16:40:22 +0100
From:   Marc Gonzalez <marc.w.gonzalez@...e.fr>
To:     Alexey Brodkin <Alexey.Brodkin@...opsys.com>
Cc:     Robin Murphy <robin.murphy@....com>,
        Dmitry Torokhov <dmitry.torokhov@...il.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Will Deacon <will@...nel.org>,
        Russell King <rmk+kernel@...linux.org.uk>,
        Bjorn Andersson <bjorn.andersson@...aro.org>,
        Tejun Heo <tj@...nel.org>, Mark Brown <broonie@...nel.org>,
        Linux ARM <linux-arm-kernel@...ts.infradead.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Rafael Wysocki <rjw@...ysocki.net>,
        LKML <linux-kernel@...r.kernel.org>,
        arcml <linux-snps-arc@...ts.infradead.org>,
        Vineet Gupta <Vineet.Gupta1@...opsys.com>,
        Eugeniy Paltsev <Eugeniy.Paltsev@...opsys.com>
Subject: Re: [RFC PATCH v1] devres: align devres.data strictly only for
 devm_kmalloc()

On 18/12/2019 15:20, Alexey Brodkin wrote:

> On 17/12/2019 16:30, Marc Gonzalez wrote:
> 
>> Commit a66d972465d15 ("devres: Align data[] to ARCH_KMALLOC_MINALIGN")
>> increased the alignment of devres.data unconditionally.
>>
>> Some platforms have very strict alignment requirements for DMA-safe
>> addresses, e.g. 128 bytes on arm64. There, struct devres amounts to:
>> 	3 pointers + pad_to_128 + data + pad_to_256
>> i.e. ~220 bytes of padding.
> 
> Could you please elaborate a bit on mentioned paddings?
> I may understand the first one for 128 bytes but where does the
> second one for 256 bytes come from?

Sure thing.

struct devres {
	struct devres_node node;
	u8 __aligned(ARCH_KMALLOC_MINALIGN) data[];
};

struct devres_node = 3 pointers
kmalloc dishes out memory in multiples of ARCH_KMALLOC_MINALIGN bytes.
On arm64, ARCH_KMALLOC_MINALIGN = 128
(Everything written below assumes ARCH_KMALLOC_MINALIGN = 128)

In alloc_dr() we request sizeof(struct devres) + the payload size from kmalloc.
sizeof(struct devres) = 128 because of the alignment directive.
I.e. the compiler automatically pads the struct so that 'data' starts at offset 128.

For most devm allocs (non-devm_kmalloc allocs), data is just 1 or 2 pointers.
So kmalloc(128 + 16) allocates 256 bytes.
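
To make the arithmetic above concrete, here is a minimal userspace sketch.
It assumes ARCH_KMALLOC_MINALIGN = 128 and models kmalloc as simply rounding
up to a multiple of that value, as described above (the real slab size
classes are more involved). The devres_node stand-in and the helper names
model_kmalloc_size() and devres_alloc_footprint() are made up for
illustration; they are not kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the arm64 value discussed in this thread. */
#define ARCH_KMALLOC_MINALIGN 128

/* Userspace stand-in for the kernel's struct devres_node: 3 pointers. */
struct devres_node {
	void *next;
	void *prev;
	void (*release)(void *dev, void *res);
};

/* Same shape as the current struct devres: data[] carries the alignment. */
struct devres {
	struct devres_node node;
	unsigned char __attribute__((aligned(ARCH_KMALLOC_MINALIGN))) data[];
};

/* The alignment directive pads the header up to a full 128 bytes. */
static_assert(sizeof(struct devres) == ARCH_KMALLOC_MINALIGN,
	      "header is padded to ARCH_KMALLOC_MINALIGN");

/* Hypothetical model: round a request up to a multiple of the minimum alignment. */
static inline size_t model_kmalloc_size(size_t size)
{
	return (size + ARCH_KMALLOC_MINALIGN - 1) / ARCH_KMALLOC_MINALIGN
		* ARCH_KMALLOC_MINALIGN;
}

/* Memory consumed by one devres entry carrying data_size bytes of payload. */
static inline size_t devres_alloc_footprint(size_t data_size)
{
	return model_kmalloc_size(sizeof(struct devres) + data_size);
}
```

Under these assumptions, devres_alloc_footprint(16) comes out at 256 for a
payload of two 8-byte pointers.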

>> Let's enforce the alignment only for devm_kmalloc().
> 
> Ok so for devm_kmalloc() we don't change anything, right?
> We still add the same padding before real data array.

(My commit message probably requires improvement & refining.)

Yes, the objective of my patch is to keep the same behavior for
devm_kmalloc() while reverting to the old behavior for all other
uses of struct devres.


>> I had not been aware that dynamic allocation granularity on arm64 was
>> 128 bytes. This means there's a lot of waste on small allocations.
> 
> Now probably I'm missing something but when do you expect to save something?
> If those smaller allocations are done with devm_kmalloc() you aren't
> saving anything.

With my patch, a non-devm_kmalloc struct devres (e.g. one allocated through
devres_alloc() or devm_add_action()) would take 128 bytes, instead of 256.

>> I suppose there's no easy solution, though.
> 
> Right! It took a while till I was able to propose something
> people [almost silently] agreed with.

I meant the wider subject of dynamic allocation granularity.

The 128-byte requirement is only for DMA. Some (most?) uses of kmalloc
are not for DMA. If the user could provide a flag ("this is to be used
for DMA") we could save lots of memory for small non-DMA allocs.


>> +#define DEVM_KMALLOC_PADDING_SIZE \
>> +	(ARCH_KMALLOC_MINALIGN - sizeof(struct devres) % ARCH_KMALLOC_MINALIGN)
> 
> Even given your update with:
> ------------------------------->8--------------------------------
> #define DEVM_KMALLOC_PADDING_SIZE \
>   ((ARCH_KMALLOC_MINALIGN - sizeof(struct devres)) % ARCH_KMALLOC_MINALIGN)
> ------------------------------->8--------------------------------
> I don't think I understand why do you need that "% ARCH_KMALLOC_MINALIGN" part?

To handle the case where sizeof(struct devres) > ARCH_KMALLOC_MINALIGN,

e.g. ARCH_KMALLOC_MINALIGN = 8 and sizeof(struct devres) = 12.
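
A small sketch of the two variants of the macro, written as functions so the
values can be checked. The names padding_v1() and padding_v2() are made up
for illustration. The second form relies on unsigned (size_t) wraparound:
when the struct is larger than the alignment, the subtraction wraps, and the
modulo still gives the right answer provided the alignment is a power of two.

```c
#include <assert.h>
#include <stddef.h>

/* First version from the patch: align - (size % align). */
static inline size_t padding_v1(size_t struct_size, size_t align)
{
	return align - struct_size % align;
}

/* Updated version: (align - size) % align, computed in unsigned arithmetic. */
static inline size_t padding_v2(size_t struct_size, size_t align)
{
	/* Wraparound only yields the right remainder for power-of-two alignments. */
	assert(align != 0 && (align & (align - 1)) == 0);
	return (align - struct_size) % align;
}
```

With align = 8 and struct_size = 12, padding_v2() gives 4, so the data starts
at offset 16. With struct_size = 24 and align = 128, both versions give 104.
They differ when struct_size is already a multiple of align: padding_v1()
adds a full extra align bytes, padding_v2() adds none.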


>> +	/* Add enough padding to provide a DMA-safe address */
>> +	size += DEVM_KMALLOC_PADDING_SIZE;
> 
> This implementation gets ugly and potentially will lead to problems later
> when people will start changing code here. Compared to that initially aligned by
> the compiler dr->data looks much more foolproof.

Yes, it's better to let the compiler handle the padding... But, we don't
want any padding in the non-devm_kmalloc use-case.

We could add a struct field pointing to the data, but arches with a small
ARCH_KMALLOC_MINALIGN (x86, amd64) would have to pay for the size increase,
which doesn't seem fair to them.


>> @@ -822,7 +825,7 @@ void * devm_kmalloc(struct device *dev, size_t size, gfp_t gfp)
>>  	 */
>>  	set_node_dbginfo(&dr->node, "devm_kzalloc_release", size);
>>  	devres_add(dev, dr->data);
>> -	return dr->data;
>> +	return dr->data + DEVM_KMALLOC_PADDING_SIZE;
> 
> Ditto. But first I'd like to understand what are you trying to really do
> with your change and then we'll see if there could be any better implementation.

Basically, every call to devres_alloc() or devm_add_action() allocates
256 bytes instead of 128. A typical arm64 system will call these
thousands of times during driver probe.
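
For illustration, a userspace sketch of the scheme in the patch: struct
devres without the alignment directive, with the padding added by hand only
on the devm_kmalloc() path. It assumes ARCH_KMALLOC_MINALIGN = 128 and uses
C11 aligned_alloc() as a stand-in for kmalloc returning suitably aligned
memory. The names model_devm_kmalloc() and model_devm_kfree() are made up;
this is not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: the arm64 value discussed in this thread. */
#define ARCH_KMALLOC_MINALIGN 128

static_assert((ARCH_KMALLOC_MINALIGN & (ARCH_KMALLOC_MINALIGN - 1)) == 0,
	      "alignment must be a power of two");

/* Userspace stand-in for the kernel's struct devres_node: 3 pointers. */
struct devres_node {
	void *next;
	void *prev;
	void (*release)(void *dev, void *res);
};

/* Patched layout: no alignment directive on data[]. */
struct devres {
	struct devres_node node;
	unsigned char data[];
};

#define DEVM_KMALLOC_PADDING_SIZE \
	((ARCH_KMALLOC_MINALIGN - sizeof(struct devres)) % ARCH_KMALLOC_MINALIGN)

/* Round up to a multiple of the alignment, as aligned_alloc() expects. */
static inline size_t round_up_minalign(size_t n)
{
	return (n + ARCH_KMALLOC_MINALIGN - 1) / ARCH_KMALLOC_MINALIGN
		* ARCH_KMALLOC_MINALIGN;
}

/* Model of the devm_kmalloc() path: pad by hand, return the aligned address. */
static inline void *model_devm_kmalloc(size_t size)
{
	size_t total = sizeof(struct devres) + DEVM_KMALLOC_PADDING_SIZE + size;
	struct devres *dr = aligned_alloc(ARCH_KMALLOC_MINALIGN,
					  round_up_minalign(total));

	if (!dr)
		return NULL;
	return dr->data + DEVM_KMALLOC_PADDING_SIZE;
}

/* Recover the start of the allocation from the pointer handed to the user. */
static inline void model_devm_kfree(void *p)
{
	if (p)
		free((unsigned char *)p - DEVM_KMALLOC_PADDING_SIZE
		     - offsetof(struct devres, data));
}
```

The pointer returned by model_devm_kmalloc() is a multiple of 128, while a
devres entry that skips the padding needs only sizeof(struct devres) plus
its payload, which fits in a single 128-byte allocation.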

Regards.
