Message-ID: <20250728135022.255578-1-gmonaco@redhat.com>
Date: Mon, 28 Jul 2025 15:50:12 +0200
From: Gabriele Monaco <gmonaco@...hat.com>
To: linux-kernel@...r.kernel.org,
	Ingo Molnar <mingo@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Nam Cao <namcao@...utronix.de>
Cc: Gabriele Monaco <gmonaco@...hat.com>,
	Tomas Glozar <tglozar@...hat.com>,
	Juri Lelli <jlelli@...hat.com>,
	Clark Williams <williams@...hat.com>,
	John Kacur <jkacur@...hat.com>
Subject: [PATCH v5 0/9] rv: Add monitors to validate task switch

This series adds three monitors to the sched collection, extends and
replaces previously existing monitors:

{tss,sncid} => sts:
Prove not only that switches occur in scheduling context and that
scheduling requires interrupts disabled, but also that each call to the
scheduler disables interrupts to (optionally) switch.

nrp (NEW):
* preemption requires need_resched, which is cleared by any switch
  (includes a non-optimal workaround for /nested/ preemptions)

sssw (NEW):
* suspension requires setting the task to sleepable and, after the
  switch occurs, the task requires a wakeup to come back to runnable

opid (NEW):
* waking and need-resched operations occur with interrupts and
  preemption disabled or in IRQ without explicitly disabling preemption

Also include some minor cleanup patches (1-4), tracepoints (6) and
preparatory fixes (5) covering some corner cases.

The series is currently based on the tracing rv/for-next tree.

Patch 1 adds da_handle_start_run_event_ also to per-task monitors

Patch 2 removes a trailing whitespace from the rv tracepoint string

Patch 3 fixes an out-of-bounds memory access in DA tracepoints

Patch 4 adjusts monitors to have minimised Kconfig dependencies

Patch 5 detects race conditions when rv monitors run concurrently and
retries applying the events

Patch 6 adds the need_resched tracepoint and removes unused arguments
from schedule entry/exit tracepoints

Patch 7 adds the sts monitor to replace tss and sncid

Patch 8 adds the nrp and sssw monitors

Patch 9 adds the opid monitor
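
For illustration only, the retry idea behind patch 5 can be sketched in
userspace C11 as a compare-and-swap loop over the automaton state. This is
not the code in include/rv/da_monitor.h: the toy states and events,
MAX_RETRIES, handle_event() and the result enum are all hypothetical names
chosen for this sketch. The separate "retries exhausted" result mirrors the
changelog note about tracking errors when DAs run out of retries.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_RETRIES	3	/* hypothetical bound, not taken from the series */
#define INVALID_STATE	(-1)

/* Toy two-state automaton, unrelated to the real sched monitors. */
enum state { S_A, S_B, STATE_MAX };
enum event { E_GO, E_BACK, EVENT_MAX };

/* next_state[current][event]; INVALID_STATE marks a forbidden transition. */
static const int next_state[STATE_MAX][EVENT_MAX] = {
	[S_A] = { [E_GO] = S_B,           [E_BACK] = INVALID_STATE },
	[S_B] = { [E_GO] = INVALID_STATE, [E_BACK] = S_A },
};

enum result { EV_OK, EV_VIOLATION, EV_RETRIES_EXHAUSTED };

/*
 * Apply one event. If another context changed the state between our read
 * and our update, the compare-and-swap fails, 'seen' is refreshed with the
 * current state and the event is evaluated again from there.
 */
static enum result handle_event(_Atomic int *curr, enum event ev)
{
	int seen = atomic_load(curr);

	for (int i = 0; i < MAX_RETRIES; i++) {
		int next = next_state[seen][ev];

		if (next == INVALID_STATE)
			return EV_VIOLATION;
		if (atomic_compare_exchange_strong(curr, &seen, next))
			return EV_OK;
		/* Lost the race: 'seen' now holds the fresh state, retry. */
	}
	return EV_RETRIES_EXHAUSTED;
}
```

A caller would report EV_VIOLATION as a model error and account
EV_RETRIES_EXHAUSTED separately, since the latter says nothing about the
correctness of the monitored system.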

NOTES

The nrp and sssw monitors include workarounds for racy conditions:

* A task going to sleep must first set its state to sleepable, but when
  a task sleeps on an rtlock, the set_sleepable and wakeup events race
  and we don't always see them in the right order:

 5d..2. 107.488369: event: 639: sleepable x set_sleepable -> sleepable
 4d..5. 107.488369: event: 639: sleepable x wakeup -> running (final)
 5d..3. 107.488385: error: 639: switch_suspend not expected in the state running

    wakeup()                    set_state()
        state=RUNNING
                                    trace_set_state()
        trace_wakeup()
                                    state=SLEEPING

  I added a special event (switch_block) but there may be a better way.
  Taking a pi_lock in rtlock_slowlock_locked when setting the state to
  TASK_RTLOCK_WAIT avoids this race, although this is likely not
  something we want to do.

* I consider preemption any scheduling with preempt==true and assume
  this can happen only if need resched is set.
  In practice, however, we may see a preemption where the flag
  is not set. This can happen in one specific condition:

  need_resched
                  preempt_schedule()
                                        preempt_schedule_irq()
                                            __schedule()
  !need_resched
                      __schedule()

  We start a standard preemption (e.g. from preempt_enable when the flag
  is set), an interrupt occurs before we schedule and, on its exit path,
  it schedules, which clears the need_resched flag.
  When the preempted task runs again, we continue the standard
  preemption started earlier, although the flag is no longer set.

  I added a workaround to allow the model not to fail in this condition,
  by allowing a preemption without need_resched if an interrupt is
  received. This might also let real violations go unnoticed (false
  negatives).
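
  Both conditions above can be replayed against toy models in plain C. This
  is a sketch under stated assumptions, not the sssw or nrp automata from
  the series: only the transitions visible in the trace above come from
  this letter, while the SUSPENDED state, the remaining edges, the
  irq_seen flag and every identifier below are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* sssw toy: SUSPENDED and the edges not shown in the trace are assumed. */
enum sssw_state { RUNNING, SLEEPABLE, SUSPENDED, SSSW_STATES };
enum sssw_event { SET_SLEEPABLE, WAKEUP, SWITCH_SUSPEND, SSSW_EVENTS };
#define REJECT (-1)

/* Columns follow enum sssw_event order. */
static const int sssw_next[SSSW_STATES][SSSW_EVENTS] = {
	[RUNNING]   = { SLEEPABLE, RUNNING, REJECT    },
	[SLEEPABLE] = { SLEEPABLE, RUNNING, SUSPENDED },
	[SUSPENDED] = { REJECT,    RUNNING, REJECT    },
};

/* Replay events from 'start'; return the index of the first rejected
 * event, or -1 if the whole sequence is accepted. */
static int sssw_replay(int start, const enum sssw_event *evs, size_t n)
{
	int state = start;

	for (size_t i = 0; i < n; i++) {
		state = sssw_next[state][evs[i]];
		if (state == REJECT)
			return (int)i;
	}
	return -1;
}

/* nrp toy: two flags instead of the real automaton. */
struct nrp_toy {
	bool need_resched;	/* set by SET_NEED_RESCHED, cleared by any switch */
	bool irq_seen;		/* workaround: one preemption allowed after an IRQ */
};

enum nrp_event { SET_NEED_RESCHED, IRQ_ENTRY, SCHEDULE, SCHEDULE_PREEMPT };

/* Return true if the event is accepted by the toy model. */
static bool nrp_toy_step(struct nrp_toy *m, enum nrp_event ev)
{
	switch (ev) {
	case SET_NEED_RESCHED:
		m->need_resched = true;
		return true;
	case IRQ_ENTRY:
		m->irq_seen = true;
		return true;
	case SCHEDULE:
		m->need_resched = false;
		return true;
	case SCHEDULE_PREEMPT:
		if (m->need_resched) {
			m->need_resched = false;
			return true;
		}
		if (m->irq_seen) {	/* preemption without the flag */
			m->irq_seen = false;
			return true;
		}
		return false;
	}
	return false;
}
```

  Starting from SLEEPABLE, the traced order (set_sleepable, wakeup,
  switch_suspend) is rejected at the third event, while the order in which
  the state was actually written (wakeup, set_sleepable, switch_suspend) is
  accepted. In the nrp toy, an IRQ followed by a preemption is accepted even
  if need_resched was never set, which is the kind of false negative
  mentioned above.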

Changes since V4:
* Drop already applied patches for the tools/ directory
* Avoid using smp_processor_id() from tracepoint context

Changes since V3 [1]:
* Fix condition for lines shorter than 100 columns in dot2c.
* Fix Kconfig tooltip for container monitors in rvgen.
* Improve condition not to skip pid-0 in userspace tool.
* Rearrange monitors to reduce needed tracepoints and arguments.
* Separately track errors when DAs run out of retries due to races.
* Fix issue in opid with multiple handlers in the same interrupt.
* Cleanup patches by removing, squashing and reordering.

Changes since RFC2:
 * Arrange commits to prevent failed build while bisecting.
 * Prevent dot2k generated files from reaching the column limit. (Nam Cao)
 * Rearrange and simplify da_monitor retry on racing events.
 * Improve nrp monitor to handle /nested/ preemption on IRQ.
 * Added minor patches (6-10).
 * Cleanup and rearrange order.

Changes since RFC [2]:
 * Remove wakeup tracepoint in try_to_block_task and use a different
   flavour of sched_set_state
 * Split the large srs monitor in two separate monitors for preemption
   and sleep. These no longer have a concept of running task, they just
   enforce the requirements for the different types of sched out.
 * Restore the snroc monitor to describe the relationship between
   generic sched out and sched in.
 * Add opid monitor.
 * Fix some build errors and cleanup.

[1] - https://lore.kernel.org/lkml/20250715071434.22508-1-gmonaco@redhat.com
[2] - https://lore.kernel.org/lkml/20250404084512.98552-11-gmonaco@redhat.com

Gabriele Monaco (9):
  rv: Add da_handle_start_run_event_ to per-task monitors
  rv: Remove trailing whitespace from tracepoint string
  rv: Use strings in da monitors tracepoints
  rv: Adjust monitor dependencies
  rv: Retry when da monitor detects race conditions
  sched: Adapt sched tracepoints for RV task model
  rv: Replace tss and sncid monitors with more complete sts
  rv: Add nrp and sssw per-task monitors
  rv: Add opid per-cpu monitor

 Documentation/trace/rv/monitor_sched.rst      | 307 +++++++++++++++---
 include/linux/rv.h                            |   3 +-
 include/linux/sched.h                         |   7 +-
 include/rv/da_monitor.h                       | 131 +++++---
 include/trace/events/sched.h                  |  12 +-
 kernel/sched/core.c                           |  13 +-
 kernel/trace/rv/Kconfig                       |  11 +-
 kernel/trace/rv/Makefile                      |   6 +-
 kernel/trace/rv/monitors/{tss => nrp}/Kconfig |  12 +-
 kernel/trace/rv/monitors/nrp/nrp.c            | 138 ++++++++
 kernel/trace/rv/monitors/nrp/nrp.h            |  75 +++++
 kernel/trace/rv/monitors/nrp/nrp_trace.h      |  15 +
 kernel/trace/rv/monitors/opid/Kconfig         |  19 ++
 kernel/trace/rv/monitors/opid/opid.c          | 168 ++++++++++
 kernel/trace/rv/monitors/opid/opid.h          | 104 ++++++
 .../sncid_trace.h => opid/opid_trace.h}       |   8 +-
 kernel/trace/rv/monitors/sched/Kconfig        |   1 +
 kernel/trace/rv/monitors/sco/sco.c            |   4 +-
 kernel/trace/rv/monitors/scpd/Kconfig         |   2 +-
 kernel/trace/rv/monitors/scpd/scpd.c          |   4 +-
 kernel/trace/rv/monitors/sncid/sncid.c        |  95 ------
 kernel/trace/rv/monitors/sncid/sncid.h        |  49 ---
 kernel/trace/rv/monitors/snep/Kconfig         |   2 +-
 kernel/trace/rv/monitors/snep/snep.c          |   4 +-
 .../trace/rv/monitors/{sncid => sssw}/Kconfig |  10 +-
 kernel/trace/rv/monitors/sssw/sssw.c          | 116 +++++++
 kernel/trace/rv/monitors/sssw/sssw.h          | 105 ++++++
 kernel/trace/rv/monitors/sssw/sssw_trace.h    |  15 +
 kernel/trace/rv/monitors/sts/Kconfig          |  19 ++
 kernel/trace/rv/monitors/sts/sts.c            | 156 +++++++++
 kernel/trace/rv/monitors/sts/sts.h            | 117 +++++++
 .../{tss/tss_trace.h => sts/sts_trace.h}      |   8 +-
 kernel/trace/rv/monitors/tss/tss.c            |  90 -----
 kernel/trace/rv/monitors/tss/tss.h            |  47 ---
 kernel/trace/rv/monitors/wip/Kconfig          |   2 +-
 kernel/trace/rv/rv_trace.h                    | 114 ++++---
 tools/verification/models/sched/nrp.dot       |  29 ++
 tools/verification/models/sched/opid.dot      |  35 ++
 tools/verification/models/sched/sncid.dot     |  18 -
 tools/verification/models/sched/sssw.dot      |  30 ++
 tools/verification/models/sched/sts.dot       |  38 +++
 tools/verification/models/sched/tss.dot       |  18 -
 42 files changed, 1665 insertions(+), 492 deletions(-)
 rename kernel/trace/rv/monitors/{tss => nrp}/Kconfig (51%)
 create mode 100644 kernel/trace/rv/monitors/nrp/nrp.c
 create mode 100644 kernel/trace/rv/monitors/nrp/nrp.h
 create mode 100644 kernel/trace/rv/monitors/nrp/nrp_trace.h
 create mode 100644 kernel/trace/rv/monitors/opid/Kconfig
 create mode 100644 kernel/trace/rv/monitors/opid/opid.c
 create mode 100644 kernel/trace/rv/monitors/opid/opid.h
 rename kernel/trace/rv/monitors/{sncid/sncid_trace.h => opid/opid_trace.h} (66%)
 delete mode 100644 kernel/trace/rv/monitors/sncid/sncid.c
 delete mode 100644 kernel/trace/rv/monitors/sncid/sncid.h
 rename kernel/trace/rv/monitors/{sncid => sssw}/Kconfig (58%)
 create mode 100644 kernel/trace/rv/monitors/sssw/sssw.c
 create mode 100644 kernel/trace/rv/monitors/sssw/sssw.h
 create mode 100644 kernel/trace/rv/monitors/sssw/sssw_trace.h
 create mode 100644 kernel/trace/rv/monitors/sts/Kconfig
 create mode 100644 kernel/trace/rv/monitors/sts/sts.c
 create mode 100644 kernel/trace/rv/monitors/sts/sts.h
 rename kernel/trace/rv/monitors/{tss/tss_trace.h => sts/sts_trace.h} (67%)
 delete mode 100644 kernel/trace/rv/monitors/tss/tss.c
 delete mode 100644 kernel/trace/rv/monitors/tss/tss.h
 create mode 100644 tools/verification/models/sched/nrp.dot
 create mode 100644 tools/verification/models/sched/opid.dot
 delete mode 100644 tools/verification/models/sched/sncid.dot
 create mode 100644 tools/verification/models/sched/sssw.dot
 create mode 100644 tools/verification/models/sched/sts.dot
 delete mode 100644 tools/verification/models/sched/tss.dot


base-commit: b8a7fba39cd49eab343bfe561d85bb5dc57541af
-- 
2.50.1

