Date:   Tue, 15 Nov 2022 15:19:51 +0100
From:   Peter Newman <peternewman@...gle.com>
To:     reinette.chatre@...el.com, fenghua.yu@...el.com
Cc:     bp@...en8.de, derkling@...gle.com, eranian@...gle.com,
        hpa@...or.com, james.morse@....com, jannh@...gle.com,
        kpsingh@...gle.com, linux-kernel@...r.kernel.org, mingo@...hat.com,
        tglx@...utronix.de, x86@...nel.org,
        Peter Newman <peternewman@...gle.com>
Subject: [PATCH v3 0/2] x86/resctrl: fix task CLOSID update race

Hi Reinette, Fenghua,

I've reorganized the patches for clarity, following James's guidance.

The patch series addresses the IPI race we discussed in the container
move RFD thread[1].

The first patch changes group-wide CLOSID/RMID updates to IPI all CPUs.
Now that the synchronization cost of correctly updating a single task
has turned out to be higher than originally thought, we believe it's
cheaper to IPI all CPUs than to build a more precise CPU mask by
synchronizing with every task in the rdtgroup, especially when the
group contains a large number of tasks. This change could hurt users
who frequently delete groups containing only a few tasks; if anyone is
aware of such a use case, we can consider mitigations.
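
To illustrate the direction, here is a minimal sketch of the broadcast
(the function names and bodies below are simplified and illustrative,
not the actual patch):

#include <linux/smp.h>
#include <asm/resctrl.h>

/* Sketch only: broadcast the CLOSID/RMID reload to every CPU rather
 * than computing a precise CPU mask. */
static void resctrl_reload_closid_rmid(void *info)
{
	/* Re-read current's closid/rmid into the PQR_ASSOC MSR. */
	resctrl_sched_in();
}

static void update_closid_rmid_all_cpus(void)
{
	/* wait=1: don't return until every CPU has run the callback. */
	on_each_cpu(resctrl_reload_closid_rmid, NULL, 1);
}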

The second patch uses the new task_call_func() interface to serialize
updating a task's closid and rmid with any context switch of that task.
AFAICT, the implementation of this function acts like a mutex against
the task's context switch, but I'm not certain whether it is intended
to be used as one. If this is not how task_call_func() is meant to be
used, I will instead move the code performing the update under sched/,
where it can be done while holding task_rq_lock() explicitly, as
Reinette has suggested before[2].
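
For reference, a rough sketch of how I'm using it (everything except
task_call_func() itself is illustrative, and I'm assuming the v6.1
signature int task_call_func(struct task_struct *, task_call_f, void *)):

#include <linux/sched.h>
#include <linux/wait.h>		/* task_call_func(), IIRC declared here */

struct closid_rmid {
	u32 closid;
	u32 rmid;
};

/* Called by task_call_func() while @t cannot be context switching. */
static int __task_set_closid_rmid(struct task_struct *t, void *arg)
{
	struct closid_rmid *ids = arg;

	/* Assumes CONFIG_X86_CPU_RESCTRL, so task_struct has these fields. */
	WRITE_ONCE(t->closid, ids->closid);
	WRITE_ONCE(t->rmid, ids->rmid);
	return 0;
}

static void task_set_closid_rmid(struct task_struct *t, u32 closid, u32 rmid)
{
	struct closid_rmid ids = { .closid = closid, .rmid = rmid };

	task_call_func(t, __task_set_closid_rmid, &ids);

	/*
	 * If @t is currently running, its CPU still needs an IPI so the
	 * new CLOSID/RMID take effect immediately (omitted in this sketch).
	 */
}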

Updates in v3:
 - Split the handling of multi-task and single-task operations into
   separate patches, now that they're handled differently.
 - Clarify the justification in the commit message, including moving
   some of it out of an inline code comment.
Updates in v2:
 - Following Reinette's suggestion: use task_call_func() for a
   single-task move and an IPI broadcast for group movements.
 - Rebased to v6.1-rc4

v1: https://lore.kernel.org/lkml/20221103141641.3055981-1-peternewman@google.com/
v2: https://lore.kernel.org/lkml/20221110135346.2209839-1-peternewman@google.com/

Thanks!
-Peter

[1] https://lore.kernel.org/all/CALPaoCg2-9ARbK+MEgdvdcjJtSy_2H6YeRkLrT97zgy8Aro3Vg@mail.gmail.com/
[2] https://lore.kernel.org/lkml/d3c06fa3-83a4-7ade-6b08-3a7259aa6c4b@intel.com/

Peter Newman (2):
  x86/resctrl: IPI all CPUs for group updates
  x86/resctrl: update task closid/rmid with task_call_func()

 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 128 +++++++++++--------------
 1 file changed, 58 insertions(+), 70 deletions(-)

--
2.38.1.493.g58b659f92b-goog
