Message-ID: <CANDhNCo+G4_t8jYU-QNPz42uZsKdMgEmTnr8pYSKbgm26NJUCg@mail.gmail.com>
Date: Wed, 23 Jul 2025 15:42:35 -0700
From: John Stultz <jstultz@...gle.com>
To: Juri Lelli <juri.lelli@...hat.com>
Cc: LKML <linux-kernel@...r.kernel.org>, Joel Fernandes <joelagnelf@...dia.com>, 
	Qais Yousef <qyousef@...alina.io>, Ingo Molnar <mingo@...hat.com>, 
	Peter Zijlstra <peterz@...radead.org>, Vincent Guittot <vincent.guittot@...aro.org>, 
	Dietmar Eggemann <dietmar.eggemann@....com>, Valentin Schneider <vschneid@...hat.com>, 
	Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>, 
	Zimuzo Ezeozue <zezeozue@...gle.com>, Mel Gorman <mgorman@...e.de>, Will Deacon <will@...nel.org>, 
	Waiman Long <longman@...hat.com>, Boqun Feng <boqun.feng@...il.com>, 
	"Paul E. McKenney" <paulmck@...nel.org>, Metin Kaya <Metin.Kaya@....com>, 
	Xuewen Yan <xuewen.yan94@...il.com>, K Prateek Nayak <kprateek.nayak@....com>, 
	Thomas Gleixner <tglx@...utronix.de>, Daniel Lezcano <daniel.lezcano@...aro.org>, 
	Suleiman Souhlal <suleiman@...gle.com>, kuyo chang <kuyo.chang@...iatek.com>, hupu <hupu.gm@...il.com>, 
	kernel-team@...roid.com
Subject: Re: [RFC][PATCH v20 0/6] Donor Migration for Proxy Execution (v20)

On Wed, Jul 23, 2025 at 7:44 AM Juri Lelli <juri.lelli@...hat.com> wrote:
> On 22/07/25 07:05, John Stultz wrote:
> > Issues still to address with the full series:
> > * There’s a new quirk from recent changes for dl_server that
> >   is causing the ksched_football test in the full series to hang
> >   at boot. I’ve bisected and reverted the change for now, but I
> >   need to better understand what’s going wrong.
>
> After our quick chat on IRC, I remembered that there were additional two
> fixes for dl-server posted, but still not on tip.
>
> https://lore.kernel.org/lkml/20250615131129.954975-1-kuyo.chang@mediatek.com/
> https://lore.kernel.org/lkml/20250627035420.37712-1-yangyicong@huawei.com/
>
> So I went ahead and pushed them to
>
> git@...hub.com:jlelli/linux.git upstream/fix-dlserver
>
> Could you please check if any (or both together) of the two topmost
> changes do any good to the issue you are seeing?

Thanks for sharing these! Unfortunately they don't seem to help. :/

I'm still digging into the behavior. I'm not 100% sure the problem
isn't just my test logic starving itself (after creating NR_CPU RT
spinners, it's not surprising that creating new threads might be tough
if the non-RT kthreadd can't get scheduled), but I don't quite see how
the dl_server patch cccb45d7c429 ("sched/deadline: Less agressive
dl_server handling") would cause such a dramatic behavioral change,
especially as this test was also functional prior to the dl_server
logic landing. It's also odd that just re-adding the dl_server_stop()
call removed from dequeue_entities() seems to make it work again. So I
clearly need to dig more to understand the behavior.

Thanks again for your suggestions! I'm going to dig further and let
folks know when I figure this detail out.

thanks
-john
