Message-ID: <Z49LV4I63Qeh3oSz@gmail.com>
Date: Tue, 21 Jan 2025 08:23:03 +0100
From: Ingo Molnar <mingo@...nel.org>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
	linux-kernel@...r.kernel.org, Peter Zijlstra <peterz@...radead.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Juri Lelli <juri.lelli@...hat.com>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
	Valentin Schneider <vschneid@...hat.com>,
	Shrikanth Hegde <sshegde@...ux.ibm.com>, Tejun Heo <tj@...nel.org>
Subject: [GIT PULL v2] Scheduler enhancements for v6.14


* Mathieu Desnoyers <mathieu.desnoyers@...icios.com> wrote:

> On 20-Jan-2025 12:07:41 PM, Ingo Molnar wrote:
> > 
> > Linus,
> > 
> > Please pull the latest sched/core Git tree from:
> > 
> >    git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched-core-2025-01-20
> > 
> >    # HEAD: 7d9da040575b343085287686fa902a5b2d43c7ca psi: Fix race when task wakes up before psi_sched_switch() adjusts flags
> > 
> > Scheduler enhancements for v6.14:
> 
> [...]
> 
> >  - RSEQ enhancements:
> > 
> >    - Validate read-only fields under DEBUG_RSEQ config
> >      (Mathieu Desnoyers)
> 
> FYI, a regression introduced by this commit was reported by s390x
> glibc developers testing against linux-next:
> 
> https://sourceware.org/pipermail/libc-alpha/2025-January/163993.html
> 
> I've sent a fix here:
> 
> https://lore.kernel.org/lkml/20250116205956.836074-1-mathieu.desnoyers@efficios.com/
> 
> The commit introducing the issue is in this PR, but not the fix.

Indeed - with the bug, RSEQ_FLAG_UNREGISTER would fail with an incorrect 
-EFAULT return.
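
For readers who want to exercise this path from userspace, the following is a minimal sketch of a register/unregister round trip through the raw rseq() syscall. It is an illustration, not the glibc testcase from the report: the helper names (sys_rseq, rseq_roundtrip) and the signature value are made up for this example. It assumes a Linux system whose kernel headers provide <linux/rseq.h>. If libc has already registered rseq for the thread (recent glibc does so by default; running with GLIBC_TUNABLES=glibc.pthread.rseq=0 disables that), the registration step fails and the sketch reports "skipped" instead of guessing.

```c
#define _GNU_SOURCE
#include <linux/rseq.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Arbitrary signature chosen for this demo (assumption, not a kernel constant). */
#define DEMO_RSEQ_SIG 0x53053053u

/* struct rseq carries its own alignment attribute in the uapi header. */
static struct rseq demo_area;

/* Thin wrapper around the raw syscall; hypothetical helper name. */
long sys_rseq(struct rseq *area, uint32_t len, int flags, uint32_t sig)
{
	return syscall(__NR_rseq, area, len, flags, sig);
}

/*
 * Returns:
 *    0  registered and unregistered successfully
 *    1  skipped: registration was refused (for example because libc
 *       already owns the rseq registration, or rseq is unavailable)
 *   -1  registration worked but RSEQ_FLAG_UNREGISTER failed; errno
 *       holds the reason (the regression discussed above showed up
 *       as EFAULT at this step)
 */
int rseq_roundtrip(void)
{
	if (sys_rseq(&demo_area, sizeof(demo_area), 0, DEMO_RSEQ_SIG) != 0)
		return 1;

	/* Unregistration must pass the same address, length and signature. */
	if (sys_rseq(&demo_area, sizeof(demo_area), RSEQ_FLAG_UNREGISTER,
		     DEMO_RSEQ_SIG) != 0)
		return -1;

	return 0;
}
```

Calling rseq_roundtrip() from main() and printing errno when it returns -1 is enough to tell a healthy kernel from an affected one, provided the registration step was not skipped.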

I've applied your fix, and updated the pull request for Linus further 
below. If Linus has already pulled I'll send a fixes pull request 
separately, or Linus can apply the fix from email directly:

  Acked-by: Ingo Molnar <mingo@...nel.org>

Or he can pull the sched-core-2025-01-21 tag below safely on top of 
sched-core-2025-01-20, which will result in a diffstat of:

  Mathieu Desnoyers (1):
      rseq: Fix rseq unregistration regression

  kernel/rseq.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

Since I booted the scheduler tree on generic desktops, it was tested 
on other systems as well, and nothing appeared to be broken, I presume 
RSEQ_FLAG_UNREGISTER is used only in libc syscall testcases and in 
specific applications?

Thanks,

	Ingo

===================================>
Linus,

Please pull the latest sched/core Git tree from:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched-core-2025-01-21

   # HEAD: 40724ecafccb1fb62b66264854e8c3ad394c8f3d rseq: Fix rseq unregistration regression

Scheduler enhancements for v6.14:

 - Fair scheduler (SCHED_FAIR) enhancements:

   - Behavioral improvements:
     - Untangle NEXT_BUDDY and pick_next_task() (Peter Zijlstra)

   - Delayed-dequeue enhancements & fixes: (Vincent Guittot)

     - Rename h_nr_running into h_nr_queued
     - Add new cfs_rq.h_nr_runnable
     - Use the new cfs_rq.h_nr_runnable
     - Remove unused cfs_rq.h_nr_delayed
     - Rename cfs_rq.idle_h_nr_running into h_nr_idle
     - Remove unused cfs_rq.idle_nr_running
     - Rename cfs_rq.nr_running into nr_queued
     - Do not try to migrate delayed dequeue task
     - Fix variable declaration position
     - Encapsulate set custom slice in a __setparam_fair() function

   - Fixes:
     - Fix race between yield_to() and try_to_wake_up() (Tianchen Ding)
     - Fix CPU bandwidth limit bypass during CPU hotplug (Vishal Chourasia)

   - Cleanups:
     - Clean up in migrate_degrades_locality() to improve
       readability (Peter Zijlstra)
     - Mark m*_vruntime() with __maybe_unused (Andy Shevchenko)
     - Update comments after sched_tick() rename (Sebastian Andrzej Siewior)
     - Remove CONFIG_CFS_BANDWIDTH=n definition of cfs_bandwidth_used()
       (Valentin Schneider)

 - Deadline scheduler (SCHED_DL) enhancements:

   - Restore dl_server bandwidth on non-destructive root domain
     changes (Juri Lelli)

   - Correctly account for allocated bandwidth during
     hotplug (Juri Lelli)

   - Check bandwidth overflow earlier for hotplug (Juri Lelli)

   - Clean up goto label in pick_earliest_pushable_dl_task()
     (John Stultz)

   - Consolidate timer cancellation (Wander Lairson Costa)

 - Load-balancer enhancements:

   - Improve performance by prioritizing migrating eligible
     tasks in sched_balance_rq() (Hao Jia)

   - Do not compute NUMA Balancing stats unnecessarily during
     load-balancing (K Prateek Nayak)

   - Do not compute overloaded status unnecessarily during
     load-balancing (K Prateek Nayak)

 - Generic scheduling code enhancements:

   - Use READ_ONCE() in task_on_rq_queued(), to consistently use
     the WRITE_ONCE() updated ->on_rq field (Harshit Agarwal)

 - Isolated CPUs support enhancements: (Waiman Long)

   - Make "isolcpus=nohz" equivalent to "nohz_full"
   - Consolidate housekeeping cpumasks that are always identical
   - Remove HK_TYPE_SCHED
   - Unify HK_TYPE_{TIMER|TICK|MISC} to HK_TYPE_KERNEL_NOISE

 - RSEQ enhancements:

   - Validate read-only fields under DEBUG_RSEQ config
     (Mathieu Desnoyers)

 - PSI enhancements:

   - Fix race when task wakes up before psi_sched_switch()
     adjusts flags (Chengming Zhou)

 - IRQ time accounting performance enhancements: (Yafang Shao)

   - Define sched_clock_irqtime as static key
   - Don't account irq time if sched_clock_irqtime is disabled

 - Virtual machine scheduling enhancements:

   - Don't try to catch up excess steal time (Suleiman Souhlal)

 - Heterogeneous x86 CPU scheduling enhancements: (K Prateek Nayak)

   - Convert "sysctl_sched_itmt_enabled" to boolean
   - Use guard() for itmt_update_mutex
   - Move the "sched_itmt_enabled" sysctl to debugfs
   - Remove x86_smt_flags and use cpu_smt_flags directly
   - Use x86_sched_itmt_flags for PKG domain unconditionally

 - Debugging code & instrumentation enhancements:

   - Change need_resched warnings to pr_err() (David Rientjes)
   - Print domain name in /proc/schedstat (K Prateek Nayak)
   - Fix value reported by hot tasks pulled in /proc/schedstat (Peter Zijlstra)
   - Report the different kinds of imbalances in /proc/schedstat (Swapnil Sapkal)
   - Move sched domain name out of CONFIG_SCHED_DEBUG (Swapnil Sapkal)
   - Update Schedstat version to 17 (Swapnil Sapkal)
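
Because the schedstat version is bumped to 17, tools that parse /proc/schedstat may want to check the version before interpreting the remaining fields. The following is a minimal sketch of such a check. It relies only on the file starting with a "version N" line; the function names are hypothetical and nothing here is taken from an existing tool.

```c
#include <stdio.h>

/*
 * Parse a "version N" line as found at the top of /proc/schedstat.
 * Returns N, or -1 if the line does not have that shape.
 */
int schedstat_version_from_line(const char *line)
{
	int version;

	if (sscanf(line, "version %d", &version) != 1)
		return -1;
	return version;
}

/*
 * Read the version from the first line of a schedstat file.
 * Returns -1 if the file cannot be opened or the line is unexpected.
 */
int schedstat_version(const char *path)
{
	char line[64];
	int version = -1;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (fgets(line, sizeof(line), f))
		version = schedstat_version_from_line(line);
	fclose(f);
	return version;
}
```

A parser could call schedstat_version("/proc/schedstat") once at startup and refuse to continue, or switch field layouts, when the result is not a version it knows.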

Thanks,

	Ingo

------------------>
Andy Shevchenko (1):
      sched/fair: Mark m*_vruntime() with __maybe_unused

Chengming Zhou (1):
      psi: Fix race when task wakes up before psi_sched_switch() adjusts flags

David Rientjes (1):
      sched/debug: Change need_resched warnings to pr_err

Hao Jia (1):
      sched/core: Prioritize migrating eligible tasks in sched_balance_rq()

Harshit Agarwal (1):
      sched: add READ_ONCE to task_on_rq_queued

John Stultz (1):
      sched: deadline: Cleanup goto label in pick_earliest_pushable_dl_task

Juri Lelli (3):
      sched/deadline: Restore dl_server bandwidth on non-destructive root domain changes
      sched/deadline: Correctly account for allocated bandwidth during hotplug
      sched/deadline: Check bandwidth overflow earlier for hotplug

K Prateek Nayak (8):
      sched/stats: Print domain name in /proc/schedstat
      x86/itmt: Convert "sysctl_sched_itmt_enabled" to boolean
      x86/itmt: Use guard() for itmt_update_mutex
      x86/itmt: Move the "sched_itmt_enabled" sysctl to debugfs
      x86/topology: Remove x86_smt_flags and use cpu_smt_flags directly
      x86/topology: Use x86_sched_itmt_flags for PKG domain unconditionally
      sched/fair: Do not compute NUMA Balancing stats unnecessarily during lb
      sched/fair: Do not compute overloaded status unnecessarily during lb

Mathieu Desnoyers (2):
      rseq: Validate read-only fields under DEBUG_RSEQ config
      rseq: Fix rseq unregistration regression

Peter Zijlstra (3):
      sched/fair: Untangle NEXT_BUDDY and pick_next_task()
      sched/fair: Fix value reported by hot tasks pulled in /proc/schedstat
      sched/fair: Cleanup in migrate_degrades_locality() to improve readability

Sebastian Andrzej Siewior (1):
      sched/fair: Update comments after sched_tick() rename.

Suleiman Souhlal (1):
      sched: Don't try to catch up excess steal time.

Swapnil Sapkal (3):
      sched: Report the different kinds of imbalances in /proc/schedstat
      sched: Move sched domain name out of CONFIG_SCHED_DEBUG
      docs: Update Schedstat version to 17

Tianchen Ding (1):
      sched: Fix race between yield_to() and try_to_wake_up()

Valentin Schneider (1):
      sched/fair: Remove CONFIG_CFS_BANDWIDTH=n definition of cfs_bandwidth_used()

Vincent Guittot (10):
      sched/fair: Rename h_nr_running into h_nr_queued
      sched/fair: Add new cfs_rq.h_nr_runnable
      sched/fair: Use the new cfs_rq.h_nr_runnable
      sched/fair: Removed unsued cfs_rq.h_nr_delayed
      sched/fair: Rename cfs_rq.idle_h_nr_running into h_nr_idle
      sched/fair: Remove unused cfs_rq.idle_nr_running
      sched/fair: Rename cfs_rq.nr_running into nr_queued
      sched/fair: Do not try to migrate delayed dequeue task
      sched/fair: Fix variable declaration position
      sched/fair: Encapsulate set custom slice in a __setparam_fair() function

Vishal Chourasia (1):
      sched/fair: Fix CPU bandwidth limit bypass during CPU hotplug

Waiman Long (4):
      sched/core: Remove HK_TYPE_SCHED
      sched/isolation: Make "isolcpus=nohz" equivalent to "nohz_full"
      sched/isolation: Consolidate housekeeping cpumasks that are always identical
      sched: Unify HK_TYPE_{TIMER|TICK|MISC} to HK_TYPE_KERNEL_NOISE

Wander Lairson Costa (1):
      sched/deadline: Consolidate Timer Cancellation

Yafang Shao (3):
      sched: Define sched_clock_irqtime as static key
      sched: Don't account irq time if sched_clock_irqtime is disabled
      sched, psi: Don't account irq time if sched_clock_irqtime is disabled
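
As an illustration of the "add READ_ONCE to task_on_rq_queued" entry above: when a field is updated with WRITE_ONCE(), reading it with READ_ONCE() keeps the compiler from tearing or refetching the load. The following is a userspace sketch only. The macros are simplified stand-ins for the kernel's (whose real definitions are more elaborate), struct demo_task is a hypothetical reduction of task_struct, and the constant values are included for illustration.

```c
/* Simplified userspace stand-ins for the kernel macros (assumption). */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Illustrative state values for the ->on_rq field. */
#define TASK_ON_RQ_QUEUED    1
#define TASK_ON_RQ_MIGRATING 2

/* Hypothetical stand-in for task_struct, reduced to the one field. */
struct demo_task {
	int on_rq;
};

/* Writer side: the field is always updated through WRITE_ONCE(). */
void demo_set_on_rq(struct demo_task *p, int state)
{
	WRITE_ONCE(p->on_rq, state);
}

/* Reader side: a single, untorn load that pairs with the writer above. */
int task_on_rq_queued(struct demo_task *p)
{
	return READ_ONCE(p->on_rq) == TASK_ON_RQ_QUEUED;
}
```

The volatile access only constrains the compiler; it does not by itself order the load against other memory operations.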


 Documentation/admin-guide/kernel-parameters.txt |   4 +-
 Documentation/scheduler/sched-stats.rst         | 126 ++++---
 arch/x86/include/asm/topology.h                 |   4 +-
 arch/x86/kernel/itmt.c                          |  81 ++---
 arch/x86/kernel/smpboot.c                       |  19 +-
 include/linux/sched.h                           |  10 +
 include/linux/sched/isolation.h                 |  21 +-
 include/linux/sched/topology.h                  |  13 +-
 kernel/rseq.c                                   |  98 ++++++
 kernel/sched/core.c                             |  94 +++--
 kernel/sched/cputime.c                          |  16 +-
 kernel/sched/deadline.c                         | 119 +++++--
 kernel/sched/debug.c                            |  25 +-
 kernel/sched/fair.c                             | 444 ++++++++++++++----------
 kernel/sched/features.h                         |   9 +
 kernel/sched/isolation.c                        |  22 +-
 kernel/sched/pelt.c                             |   4 +-
 kernel/sched/psi.c                              |   7 +-
 kernel/sched/sched.h                            |  37 +-
 kernel/sched/stats.c                            |  11 +-
 kernel/sched/stats.h                            |   4 +
 kernel/sched/syscalls.c                         |  18 +-
 kernel/sched/topology.c                         |  12 +-
 23 files changed, 720 insertions(+), 478 deletions(-)
