lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANDhNCro84C8uxTo5MJ1xX0wGnTXE7_REm4nqp7jGQ+mN0wfkQ@mail.gmail.com>
Date: Wed, 20 Mar 2024 15:42:31 -0700
From: John Stultz <jstultz@...gle.com>
To: Vishal Chourasia <vishalc@...ux.ibm.com>
Cc: LKML <linux-kernel@...r.kernel.org>, Ingo Molnar <mingo@...hat.com>, 
	Peter Zijlstra <peterz@...radead.org>, Juri Lelli <juri.lelli@...hat.com>, 
	Vincent Guittot <vincent.guittot@...aro.org>, Dietmar Eggemann <dietmar.eggemann@....com>, 
	Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>, 
	Daniel Bristot de Oliveira <bristot@...hat.com>, Phil Auld <pauld@...hat.com>, 
	Valentin Schneider <vschneid@...hat.com>, kernel-team@...roid.com, 
	Zimuzo Ezeozue <zezeozue@...gle.com>
Subject: Re: [RESEND][PATCH v2] sched: Add trace_sched_waking() tracepoint to sched_ttwu_pending()

On Tue, Mar 19, 2024 at 10:24 PM Vishal Chourasia <vishalc@...ux.ibm.com> wrote:
> On Mon, Mar 18, 2024 at 12:28:33PM -0700, John Stultz wrote:
> > Zimuzo reported seeing occasional cases in perfetto traces where
> > tasks went from sleeping directly to trace_sched_wakeup()
> > without always seeing a trace_sched_waking().
> >
> > Looking at the code, trace_sched_wakeup() is only called in
> > ttwu_do_wakeup()
> >
> > The call paths that get you to ttwu_do_wakeup() are:
> > try_to_wake_up() -> ttwu_do_wakeup()
> > try_to_wake_up() -> ttwu_runnable() -> ttwu_do_wakeup()
> > try_to_wake_up() -> ttwu_queue() -> ttwu_do_activate() -> ttwu_do_wakeup()
> > sched_ttwu_pending() -> ttwu_do_activate() -> ttwu_do_wakeup()
> Notably, sched_ttwu_pending() is invoked for remote wakeups.
>
> Given this, I anticipate a scenario similar to the following
> occurred: When a process (P) is to be awakened on a remote CPU,
> the scheduler adds P to the remote CPU's wakelist,a per-CPU queue,
> and sends an IPI to the remote CPU. This action triggers
> sched_ttwu_pending() on the remote CPU, which then processes the
> wakelist and wakes up the queued processes.

Hey! Thanks for taking a look here and sharing this feedback.

Indeed, looking closer here I do have a hard time seeing how we can
get to sched_ttwu_pending() without having gone through the paths in
try_to_wake_up() that would call trace_sched_wakeup():
  try_to_wake_up()->[ttwu_queue ->]
ttwu_queue_wakelist()->sched_ttwu_pending()->...

> In this scenario, the "waking trace" of P, signifying the initiation
> of the wake-up, is recorded on the CPU where try_to_wake_up was executed.
> Meanwhile, the "wakeup trace," denoting the completion of the wake-up,
> is observed on the remote CPU where sched_ttwu_pending() is executed.
>
> Is there a possibility that something other than the above occurred
> in your case?

I suspect that the case reported may be that the task went to sleep
right after the trace_sched_waking(). Which could result in the
transition from sleeping to trace_sched_wakeup() without the
trace_sched_waking() inbetween.  The added latency from the remote
wakeup probably increases the chance for this race to occur.

So I'll withdraw this change for now and sync back up to check if my
suspicions are correct or not.

Thanks so much again for pointing this out.
-john

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ