Message-ID: <CADHxFxS+qpmD8r1uxru+VWLj=K616=jLKbBgUR3Ed7ZBY1gidg@mail.gmail.com>
Date: Thu, 10 Apr 2025 17:51:24 +0800
From: hupu <hupu.gm@...il.com>
To: John Stultz <jstultz@...gle.com>
Cc: linux-kernel@...r.kernel.org, juri.lelli@...hat.com, peterz@...radead.org, 
	vschneid@...hat.com, mingo@...hat.com, vincent.guittot@...aro.org, 
	dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com, 
	mgorman@...e.de, hupu@...nssion.com
Subject: Re: [RFC 1/1] sched: Skip redundant operations for proxy tasks
 needing return migration

Hi John:
Thank you for your feedback.

On Thu, Apr 10, 2025 at 10:41 AM John Stultz <jstultz@...gle.com> wrote:
>
> Unfortunately this patch crashes pretty quickly in my testing. The
> first issue was proxy_needs_return() calls deactivate_task() w/
> DEQUEUE_NOCLOCK, which causes warnings when the update_rq_clock()
> hasn't been called. Preserving the update_rq_clock() line before
> checking proxy_needs_return() avoided that issue, but then I saw hangs
> during bootup, which I suspect is due to us shortcutting over the
> sched_delayed case.
>
> Moving the proxy_needs_return above the if(task_on_cpu())
> wakeup_preempt() logic booted ok, but I'm still a little hesitant of
> what side-effects that might cause.

I’m sorry for the confusion caused by this patch. Here is the
rationale behind my approach:

To ensure that donor tasks can get a suitable CPU, and to avoid
Proxy Execution having a negative impact on load balancing,
`proxy_needs_return()` in `ttwu_runnable()` should return false for
all donor tasks. This allows `try_to_wake_up()` to call
`set_task_cpu()` and reselect a CPU for the donor task, unless the
donor is already running on a CPU.
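
As a rough illustration of the intended behaviour only, here is a
minimal user-space sketch. It is not kernel code and not the patch
itself: `donor_model`, `pick_cpu_model()` and `wake_donor_model()` are
hypothetical names, and the least-loaded-CPU choice is an assumption
standing in for the scheduler's real CPU selection.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of a donor task; not a kernel structure. */
struct donor_model {
	bool on_cpu;	/* donor is currently running on a CPU */
	int cpu;	/* CPU the donor is currently assigned to */
};

/* Pick the least-loaded CPU; a toy stand-in for real CPU selection. */
static int pick_cpu_model(const int *load, size_t ncpus)
{
	size_t best = 0;

	assert(load != NULL && ncpus > 0);
	for (size_t i = 1; i < ncpus; i++)
		if (load[i] < load[best])
			best = i;
	return (int)best;
}

/*
 * Wake a donor. A donor that is already running stays where it is;
 * any other donor gets a freshly selected CPU (the role that
 * set_task_cpu() plays in the description above).
 * Returns true if a CPU was reselected.
 */
static bool wake_donor_model(struct donor_model *p, const int *load,
			     size_t ncpus)
{
	if (p->on_cpu)
		return false;
	p->cpu = pick_cpu_model(load, ncpus);
	return true;
}
```

In this model, a donor that is not running moves to whichever CPU has
the lowest load value, while a running donor keeps its current CPU.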

This patch worked correctly on my QEMU-based test platform, so it
seems our testing methods differ. Could you please share the details
of your testing environment and methodology? I’ll try to reproduce the
issue using the same approach.

In the meantime, I will carefully revisit the logic in this patch to
ensure its correctness and consistency. Once I’ve completed the
review, I look forward to further discussing the details with you.

Thank you again for your valuable feedback!

Best regards,
hupu
