[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2024110903-litmus-stir-0956@gregkh>
Date: Sat, 9 Nov 2024 10:29:55 +0100
From: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
To: Matteo Martelli <matteomartelli3@...il.com>
Cc: Jonathan Cameron <jic23@...nel.org>, Joe Perches <joe@...ches.com>,
Jens Axboe <axboe@...nel.dk>, Peter Zijlstra <peterz@...radead.org>,
Marc Gonzalez <marc.w.gonzalez@...e.fr>,
Peter Rosin <peda@...ntia.se>,
"Rafael J. Wysocki" <rafael@...nel.org>,
linux-kernel@...r.kernel.org, linux-iio@...r.kernel.org,
linux-block@...r.kernel.org
Subject: Re: iio, syfs, devres: devm_kmalloc not aligned to pow2 size argument
On Fri, Nov 08, 2024 at 10:04:27AM +0100, Matteo Martelli wrote:
> On Mon, 28 Oct 2024 13:04:10 +0100, matteomartelli3@...il.com wrote:
> > Hi everyone,
> >
> > I found an issue that might interest iio, sysfs and devres, about a
> > particular usage of devm_kmalloc() for buffers that later pass through
> > sysfs_emit() or sysfs_emit_at(). These sysfs helpers require the output
> > buffer to be PAGE_SIZE aligned since commit 2efc459d06f1 ("sysfs: Add
> > sysfs_emit and sysfs_emit_at to format sysfs output"). Such requirement
> > is satisfied when kmalloc(PAGE_SIZE, ...) is used but not when
> > devm_kmalloc(PAGE_SIZE,...) is used as it actually returns a pointer to
> > a buffer located after the devres metadata and thus aligned to
> > PAGE_SIZE+sizeof(struct devres).
> >
> > Specifically, I came across this issue during some testing of the
> > pac1921 iio driver together with the iio-mux iio consumer driver, which
> > allocates a page sized buffer to copy the ext_info of the producer
> > pac1921 iio producer driver. To fill the buffer, the latter calls
> > iio_format_value(), and so sysfs_emit_at() which fails due to the buffer
> > not being page aligned. This pattern seems common for many iio drivers
> > which fill the ext_info attributes through sysfs_emit*() helpers, likely
> > necessary as they are exposed on sysfs.
> >
> > I could reproduce the same error behavior with a minimal dummy char
> > device driver completely unrelated to iio. I will share the entire dummy
> > driver code if needed but essentially this is the only interesting part:
> >
> > data->info_buf = devm_kzalloc(data->dev, PAGE_SIZE, GFP_KERNEL);
> > if (!data->info_buf)
> > return -ENOMEM;
> >
> > if (offset_in_page(data->info_buf))
> > pr_err("dummy_test: buf not page algined\n");
> >
> > When running this, the error message is printed out for the reason above.
> >
> > I am not sure whether this should be addressed in the users of
> > devm_kmalloc() or in the devres implementation itself. I would say that
> > it would be more clear if devm_kmalloc() would return the pointer to the
> > size aligned buffer, as it would also comply to the following kmalloc
> > requirement (introduced in [1]):
> >
> > The address of a chunk allocated with `kmalloc` is aligned to at least
> > ARCH_KMALLOC_MINALIGN bytes. For sizes of power of two bytes, the
> > alignment is also guaranteed to be at least to the respective size.
> >
> > To do so I was thinking to try to move the devres metadata after the
> > data buffer, so that the latter would directly correspond to pointer
> > returned by kmalloc. I then found out that it had been already suggested
> > previously to address a memory optimization [2]. Thus I am reporting the
> > issue before submitting any patch as some discussions might be helpful
> > first.
> >
> > I am sending this to who I think might be interested based on previous
> > related activity. Feel free to extend the cc list if needed.
>
> Adding some more context to better understand the impact of this.
>
> With a trivial grep it looks like there are only few instances where
> devm_k*alloc() is used to allocate a PAGE_SIZE buffer:
>
> $ git grep -n 'devm_.*alloc.*(.*PAGE_SIZE'
> block/badblocks.c:1584: bb->page = devm_kzalloc(dev, PAGE_SIZE, GFP_KERNEL);
> drivers/iio/multiplexer/iio-mux.c:287: page = devm_kzalloc(dev, PAGE_SIZE, GFP_KERNEL);
> drivers/mtd/nand/raw/mxc_nand.c:1702: host->data_buf = devm_kzalloc(&pdev->dev, PAGE_SIZE, GFP_KERNEL);
> drivers/usb/gadget/udc/gr_udc.c:1987: buf = devm_kzalloc(dev->dev, PAGE_SIZE, GFP_DMA | GFP_ATOMIC);
> sound/soc/sof/debug.c:277: dfse->buf = devm_kmalloc(sdev->dev, PAGE_SIZE, GFP_KERNEL);
>
> What takes my attention is the bb->page in blocks/badblocks.c, being the
> buffer named "page" maybe it is supposed to be page aligned?
>
> Also in [3] it was suggested to add the page alignment check for
> sysfs_emit() and sysfs_emit_at(), but I haven't found why that's
> necessary. My guess is for optimizations to avoid the buffer to spread
> in more than one page. Is this correct? Are there other reasons? Can
> anyone add more details? I think it would help to understand whether
> page alignment is necessary in the other instances of devm_k*alloc().
sysfs_emit* functions should only be operating on the buffer that was
passed to the show function callback, which is allocated by the sysfs
core, so should not have any of these issues. So why would it need to
be checked?
> Beside page alignment, there are plenty of devm_k*alloc() around the
> code base, is there any way to spot whether any of those instances
> expect the allocated buffer to be aligned to the provided size?
That's a good question, and a worry about the devm_* calls. I know many
busses (i.e. USB) require that the data passed to them are allocated
from kmalloc buffers, but I don't know about the alignment issues
required, as that is usually very hardware-specific.
> If this is a limited use-case it can be worked around with just regular
> k*alloc() + devm_add_action_or_reset() as Jonathan suggested. However, I
> still think it can be easy to introduce some alignment related bug,
> especially when transitioning from k*alloc() to devm_k*alloc() in an old
> implementation since it can be assumed that they have the same alignment
> guarantees. Maybe some comment in the devres APIs or documentation would
> help in this case?
I think the general statement of "don't migrate drivers to devm_* calls
unless you have the hardware and can test the changes" is good to follow
here. That should resolve the problem going forward as new drivers are
expected to be at least tested by the submitter :)
thanks,
greg k-h
Powered by blists - more mailing lists