Message-ID: <0eee9b08bf1f1889b3455099a68f9eed7f71c50e.camel@9elements.com>
Date: Mon, 24 Jun 2024 12:25:27 +0200
From: Marcello Sylvester Bauer <marcello.bauer@...ements.com>
To: linux-kernel@...r.kernel.org, linux-usb@...r.kernel.org, x86@...nel.org,
syzbot+c793a7eca38803212c61@...kaller.appspotmail.com,
syzbot+5127feb52165f8ab165b@...kaller.appspotmail.com,
oe-lkp@...ts.linux.dev, bp@...en8.de, dave.hansen@...ux.intel.com,
syzkaller-bugs@...glegroups.com
Cc: Anna-Maria Behnsen <anna-maria@...utronix.de>, Frederic Weisbecker
<frederic@...nel.org>, Thomas Gleixner <tglx@...utronix.de>, Uwe
Kleine-Koenig <u.kleine-koenig@...gutronix.de>, gregkh@...uxfoundation.org,
hpa@...or.com, mingo@...hat.com, stern@...land.harvard.edu, Alan Stern
<stern@...land.harvard.edu>, Matthias Stoeckl
<matthias.stoeckl@...unet.com>
Subject: Needed help: dummy_hcd: Fix stalls/inconsistent_lock_state due to
hrtimer migration
Hi everyone,
I need some help evaluating and fixing a regression caused by the
migration of dummy_hcd to the hrtimer transfer scheduler.
About two months ago I was investigating poor performance for the mass
storage gadget (g_mass_storage) due to slow timings in the loopback hcd
driver (dummy_hcd). One of the reasons was that dummy_hcd used the old
timer API, whose resolution is tied to the kernel's internal timer
frequency (HZ). So I submitted a patch to migrate to the hrtimer API[^1],
which was quickly approved.
Since then, syzbot[^2][^3] and Intel's kernel test robot[^4] have been
reporting RCU stalls and inconsistent_lock_state warnings attributed to
my patch, and I'm trying to figure out how to fix them.
Both bots indicate that the problem is around the usb_hcd_giveback_urb
function call and its locking mechanism.
My patch just replaces the timer API calls without changing anything
else in the code, so I'm not sure whether my patch is actually the root
cause here. Even when I follow the instructions for reproducing syzbot
bugs[^5] with the provided assets (bzImage, disk image, repro.c), the
stall triggers only inconsistently. I have also tried to follow Alan
Stern's advice, but have not been able to cause a stall manually.
So I don't know what to do next. Can someone with more expertise in
timers look into this?
Any hints or help in investigating or fixing this regression would be
greatly appreciated.
Thanks
Marcello
[1]: https://lore.kernel.org/all/57a1c2180ff74661600e010c234d1dbaba1d0d46.1712843963.git.sylv@sylv.io/
[2]: https://syzkaller.appspot.com/bug?id=e2befc3f5c24e08345751880365468ef18fd8dc5
[3]: https://syzkaller.appspot.com/bug?extid=5127feb52165f8ab165b
[4]: https://lore.kernel.org/oe-lkp/202406141323.413a90d2-lkp@intel.com/
[5]: https://github.com/google/syzkaller/blob/master/docs/syzbot_assets.md
On Tue, 2024-06-04 at 14:05 +0200, Marcello Sylvester Bauer wrote:
> Greetings,
>
> I'm currently investigating this regression to properly fix it. My
> patch only replaces the corresponding timer API calls without
> actually changing the code. I'm trying to get it to work properly
> with the hrtimer API.
>
> Any hints on how to accomplish this are welcome.
>
> Thanks
> Marcello
>
> On Thu, 2024-05-16 at 15:01 -0700, syzbot wrote:
> > syzbot has bisected this issue to:
> >
> > commit a7f3813e589fd8e2834720829a47b5eb914a9afe
> > Author: Marcello Sylvester Bauer <sylv@...v.io>
> > Date: Thu Apr 11 14:51:28 2024 +0000
> >
> > usb: gadget: dummy_hcd: Switch to hrtimer transfer scheduler
> >
> > bisection log:
> > https://syzkaller.appspot.com/x/bisect.txt?x=119318d0980000
> > start commit: 75fa778d74b7 Add linux-next specific files for 20240510
> > git tree: linux-next
> > final oops:
> > https://syzkaller.appspot.com/x/report.txt?x=139318d0980000
> > console output:
> > https://syzkaller.appspot.com/x/log.txt?x=159318d0980000
> > kernel config:
> > https://syzkaller.appspot.com/x/.config?x=ccdd3ebd6715749a
> > dashboard link:
> > https://syzkaller.appspot.com/bug?extid=c793a7eca38803212c61
> > syz repro:
> > https://syzkaller.appspot.com/x/repro.syz?x=16dcd598980000
> > C reproducer:
> > https://syzkaller.appspot.com/x/repro.c?x=151d9c78980000
> >
> > Reported-by: syzbot+c793a7eca38803212c61@...kaller.appspotmail.com
> > Fixes: a7f3813e589f ("usb: gadget: dummy_hcd: Switch to hrtimer transfer scheduler")
> >
> > For information about bisection process see:
> > https://goo.gl/tpsmEJ#bisection
>