lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <b7ad9cb9-4fed-4035-a544-a4d827662c92@amd.com>
Date: Fri, 27 Dec 2024 10:24:26 +0530
From: K Prateek Nayak <kprateek.nayak@....com>
To: Chengming Zhou <chengming.zhou@...ux.dev>, Johannes Weiner
	<hannes@...xchg.org>, Suren Baghdasaryan <surenb@...gle.com>, Ingo Molnar
	<mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>, Juri Lelli
	<juri.lelli@...hat.com>, Vincent Guittot <vincent.guittot@...aro.org>,
	<linux-kernel@...r.kernel.org>
CC: Dietmar Eggemann <dietmar.eggemann@....com>, Steven Rostedt
	<rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>, Mel Gorman
	<mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>, Chengming Zhou
	<zhouchengming@...edance.com>, Muchun Song <muchun.song@...ux.dev>, "Gautham
 R. Shenoy" <gautham.shenoy@....com>, Chuyi Zhou <zhouchuyi@...edance.com>
Subject: Re: [PATCH] psi: Fix race when task wakes up before
 psi_sched_switch() adjusts flags


On 12/27/2024 10:10 AM, Chengming Zhou wrote:
> On 2024/12/27 12:10, K Prateek Nayak wrote:
>> Hello there,
>>
> [...]
>>>
>>> Just made a quick fix and tested passed using your script.
>>
>> Thank you! The diff seems to be malformed as a result of whitespaces but
>> I was able to test if by recreating the diff. Feel free to add:
>>
>> Reported-by: K Prateek Nayak <kprateek.nayak@....com>
>> Closes: https://lore.kernel.org/lkml/20241226053441.1110-1- kprateek.nayak@....com/
>> Tested-by: K Prateek Nayak <kprateek.nayak@....com>
>>
>> If you can give your sign off, I could add a commit message and send it on
>> your behalf too.
> 
> Great, thanks for your time!
> 
> Signed-off-by: Chengming Zhou <chengming.zhou@...ux.dev>

Thank you!

> 
>>
>>>
>>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>>> index 3e5a6bf587f9..065ac76c47f9 100644
>>> --- a/kernel/sched/core.c
>>> +++ b/kernel/sched/core.c
>>> @@ -6641,7 +6641,6 @@ static void __sched notrace __schedule(int sched_mode)
>>>           * as a preemption by schedule_debug() and RCU.
>>>           */
>>>          bool preempt = sched_mode > SM_NONE;
>>> -       bool block = false;
>>>          unsigned long *switch_count;
>>>          unsigned long prev_state;
>>>          struct rq_flags rf;
>>> @@ -6702,7 +6701,7 @@ static void __sched notrace __schedule(int sched_mode)
>>>                          goto picked;
>>>                  }
>>>          } else if (!preempt && prev_state) {
>>> -               block = try_to_block_task(rq, prev, prev_state);
>>> +               try_to_block_task(rq, prev, prev_state);
>>>                  switch_count = &prev->nvcsw;
>>>          }
>>>
>>> @@ -6748,7 +6747,8 @@ static void __sched notrace __schedule(int sched_mode)
>>>
>>>                  migrate_disable_switch(rq, prev);
>>>                  psi_account_irqtime(rq, prev, next);
>>> -               psi_sched_switch(prev, next, block);
>>> +               psi_sched_switch(prev, next, !task_on_rq_queued(prev) ||
>>> +                                               prev->se.sched_delayed);
>>>
>>>                  trace_sched_switch(preempt, prev, next, prev_state);
>>>
>>> diff --git a/kernel/sched/stats.h b/kernel/sched/stats.h
>>> index 8ee0add5a48a..65efe45fcc77 100644
>>> --- a/kernel/sched/stats.h
>>> +++ b/kernel/sched/stats.h
>>> @@ -150,7 +150,7 @@ static inline void psi_enqueue(struct task_struct *p, int flags)
>>>                  set = TSK_RUNNING;
>>>                  if (p->in_memstall)
>>>                          set |= TSK_MEMSTALL | TSK_MEMSTALL_RUNNING;
>>> -       } else {
>>> +       } else if (!task_on_cpu(task_rq(p), p)) {
>>
>> One small nit. here
>>
>> If the task is on CPU at this point, both set and clear are 0 but
>> psi_task_change() is still called and I don't see it bailing out if it
>> doesn't have to adjust any flags.
> 
> Yes.
> 
>>
>> Can we instead just do an early return if task_on_cpu(task_rq(p), p)
>> returns true? I've tested that version too and I haven't seen any
>> splats.
> 
> I thought it's good to preserve the current flow that:
> 
> if (restore)
>      return;
> 
> if (migrate)
>      ...
> else if (wakeup)
>      ...
> 
> As for early return when `task_on_cpu()`, it looks right to me.

Thank you for your feedback. I agree the current approach fits the flow
well but the call to psi_task_change() which does a whole lot although
the flags remain same (like seq_count updates, etc.) seems unnecessary.

> Anyway, it's not a migrate or wakeup from PSI POV.
> 
> Thanks!
> 
>>
>>>                  /* Wakeup of new or sleeping task */
>>>                  if (p->in_iowait)
>>>                          clear |= TSK_IOWAIT;
>>

-- 
Thanks and Regards,
Prateek


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ