lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJZ91LC610AsBZ8X3u8ZxAUhc6QT0FHeffQT0ARmnMgsGrdZQQ@mail.gmail.com>
Date: Wed, 2 Jul 2025 03:23:09 +0530
From: Abinash <abinashlalotra@...il.com>
To: Tu Dinh <ngoc-tu.dinh@...es.tech>
Cc: jgross@...e.com, sstabellini@...nel.org, oleksandr_tyshchenko@...m.com, 
	xen-devel@...ts.xenproject.org, linux-kernel@...r.kernel.org, 
	Abinash Singh <abinashsinghlalotra@...il.com>
Subject: Re: [RFC PATCH] xen/gntdev: reduce stack usage by dynamically
 allocating gntdev_copy_batch

Hi ,

Thanks for pointing that out.

I haven’t measured the performance impact yet — my main focus was on
getting rid of the stack usage warning triggered by LLVM due to
inlining. But you're right, gntdev_ioctl_grant_copy() is on a hot
path, and calling kmalloc() there could definitely slow things down,
especially under memory pressure.

I’ll run some benchmarks to compare the current approach with the
dynamic allocation, and also look into alternatives — maybe
pre-allocating the struct or limiting inlining instead. If you have
any ideas or suggestions on how best to approach this, I’d be happy to
hear them.

Do you have any suggestions on how to test the performance?

Best,
Abinash


On Mon, 30 Jun 2025 at 16:05, Tu Dinh <ngoc-tu.dinh@...es.tech> wrote:
>
> Hi,
>
> On 30/06/2025 06:54, Abinash Singh wrote:
> > While building the kernel with LLVM, a warning was reported due to
> > excessive stack usage in `gntdev_ioctl`:
> >
> >       drivers/xen/gntdev.c:991: warning: stack frame size (1160) exceeds limit (1024) in function 'gntdev_ioctl'
> >
> > Further analysis revealed that the large stack frame was caused by
> > `struct gntdev_copy_batch`, which was declared on the stack inside
> > `gntdev_ioctl_grant_copy()`. Since this function was inlined into
> > `gntdev_ioctl`, the stack usage was attributed to the latter.
> >
> > This patch fixes the issue by dynamically allocating `gntdev_copy_batch`
> > using `kmalloc()`, which significantly reduces the stack footprint and
> > eliminates the warning.
> >
> > This approach is consistent with similar fixes upstream, such as:
> >
> > commit fa26198d30f3 ("iommu/io-pgtable-arm: dynamically allocate selftest device struct")
> >
> > Fixes: a4cdb556cae0 ("xen/gntdev: add ioctl for grant copy")
> > Signed-off-by: Abinash Singh <abinashsinghlalotra@...il.com>
> > ---
> > The stack usage was confirmed using the `-fstack-usage`  flag and mannual analysis, which showed:
> >
> >    drivers/xen/gntdev.c:953: gntdev_ioctl_grant_copy.isra   1048 bytes
> >    drivers/xen/gntdev.c:826: gntdev_copy                     56 bytes
> >
> > Since `gntdev_ioctl` was calling the inlined `gntdev_ioctl_grant_copy`, the total
> > frame size exceeded 1024 bytes, triggering the warning.
> >
> > This patch addresses the warning and keeps stack usage within acceptable limits.
> > Is this patch fine or kmalloc may affect performance ?
> > ---
>
> Have you measured the performance impact? gntdev_ioctl_grant_copy is
> called quite often especially by the backend. I'm afraid calling the
> allocator here may cause even more slowdown than there already is,
> especially when memory is tight.
>
> >   drivers/xen/gntdev.c | 24 +++++++++++++++---------
> >   1 file changed, 15 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
> > index 61faea1f0663..9811f3d7c87c 100644
> > --- a/drivers/xen/gntdev.c
> > +++ b/drivers/xen/gntdev.c
> > @@ -953,15 +953,19 @@ static int gntdev_grant_copy_seg(struct gntdev_copy_batch *batch,
> >   static long gntdev_ioctl_grant_copy(struct gntdev_priv *priv, void __user *u)
> >   {
> >       struct ioctl_gntdev_grant_copy copy;
> > -     struct gntdev_copy_batch batch;
> > +     struct gntdev_copy_batch *batch;
> >       unsigned int i;
> >       int ret = 0;
> >
> >       if (copy_from_user(&copy, u, sizeof(copy)))
> >               return -EFAULT;
> > -
> > -     batch.nr_ops = 0;
> > -     batch.nr_pages = 0;
> > +
> > +     batch = kmalloc(sizeof(*batch), GFP_KERNEL);
> > +     if (!batch)
> > +             return -ENOMEM;
> > +
> > +     batch->nr_ops = 0;
> > +     batch->nr_pages = 0;
> >
> >       for (i = 0; i < copy.count; i++) {
> >               struct gntdev_grant_copy_segment seg;
> > @@ -971,18 +975,20 @@ static long gntdev_ioctl_grant_copy(struct gntdev_priv *priv, void __user *u)
> >                       goto out;
> >               }
> >
> > -             ret = gntdev_grant_copy_seg(&batch, &seg, &copy.segments[i].status);
> > +             ret = gntdev_grant_copy_seg(batch, &seg, &copy.segments[i].status);
> >               if (ret < 0)
> >                       goto out;
> >
> >               cond_resched();
> >       }
> > -     if (batch.nr_ops)
> > -             ret = gntdev_copy(&batch);
> > -     return ret;
> > +     if (batch->nr_ops)
> > +             ret = gntdev_copy(batch);
> > +     goto free_batch;
> >
> >     out:
> > -     gntdev_put_pages(&batch);
> > +     gntdev_put_pages(batch);
> > +  free_batch:
> > +     kfree(batch);
> >       return ret;
> >   }
> >
>
>
>
> Ngoc Tu Dinh | Vates XCP-ng Developer
>
> XCP-ng & Xen Orchestra - Vates solutions
>
> web: https://vates.tech
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ