Message-ID: <20221103141641.3055981-1-peternewman@google.com>
Date:   Thu,  3 Nov 2022 15:16:40 +0100
From:   Peter Newman <peternewman@...gle.com>
To:     Reinette Chatre <reinette.chatre@...el.com>,
        Fenghua Yu <fenghua.yu@...el.com>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        x86@...nel.org, "H. Peter Anvin" <hpa@...or.com>,
        linux-kernel@...r.kernel.org, jannh@...gle.com, eranian@...gle.com,
        kpsingh@...gle.com, derkling@...gle.com, james.morse@....com,
        Peter Newman <peternewman@...gle.com>
Subject: [PATCH 0/1] x86/resctrl: fix task CLOSID update race

Hi Reinette, Fenghua,

Below is my patch to address the IPI race we discussed in the container
move RFD thread[1].

The patch below uses the new task_call_func() interface to serialize
updates to a task's closid and rmid with any context switch of that
task. AFAICT, the implementation of this function acts like a mutex
with respect to context switches, but I'm not certain whether it is
intended to be used as one. If this is not how task_call_func() is
meant to be used, I will instead move the code performing the update
under sched/, where it can be done while holding task_rq_lock()
explicitly, as Reinette has suggested before[2].
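
For clarity, here's a minimal sketch of the idea (not the patch
itself). task_call_func(), task_curr(), task_cpu() and
smp_call_function_single() are the real kernel interfaces; the
callback and the IPI handler name are illustrative:

static int update_task_closid_rmid(struct task_struct *t, void *arg)
{
	struct rdtgroup *rdtgrp = arg;

	/*
	 * task_call_func() runs this with @t's pi/rq locks held, so
	 * @t cannot be context-switched in concurrently with the
	 * update below.
	 */
	t->closid = rdtgrp->closid;
	t->rmid = rdtgrp->mon.rmid;

	/*
	 * If @t is running right now, its CPU still has the old
	 * CLOSID/RMID loaded in PQR_ASSOC; report that so the caller
	 * can interrupt it.
	 */
	return task_curr(t);
}

	/* caller side, for each task being moved: */
	if (task_call_func(t, update_task_closid_rmid, rdtgrp))
		smp_call_function_single(task_cpu(t),
					 _update_cpu_closid_rmid, NULL, 1);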

From my own measurements, this change will double the time to complete a
mass-move operation, such as rmdir on an rdtgroup with a large task
list. But to the best of my knowledge, these large-scale reconfigurations
of the control groups are infrequent, and the baseline I'm measuring
against is racy anyway.

What's still unclear to me is whether, when processing a large task
list, obtaining the pi/rq locks for thousands of tasks (all while
read-locking the tasklist_lock) is better than just blindly notifying
all CPUs. My guess is that the situations where notifying all CPUs wins
are uncommon for most users, and probably more likely in Google's use
case than most others, as we have a use case for moving large container
jobs to a different MBA group. A sketch of that alternative follows.
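
For reference, the "blindly notify all CPUs" alternative is roughly the
following; on_each_cpu() is the real interface, resctrl_sched_in() is
the existing hook that re-programs MSR_IA32_PQR_ASSOC for the task
running on the local CPU, and the IPI callback name is illustrative:

static void refresh_cpu_closid_rmid(void *unused)
{
	/* reload PQR_ASSOC for whatever task is running here */
	resctrl_sched_in();
}

	/* after (racily) updating closid/rmid in every task struct: */
	on_each_cpu(refresh_cpu_closid_rmid, NULL, 1);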

Thanks!
-Peter

[1] https://lore.kernel.org/all/CALPaoCg2-9ARbK+MEgdvdcjJtSy_2H6YeRkLrT97zgy8Aro3Vg@mail.gmail.com/
[2] https://lore.kernel.org/lkml/d3c06fa3-83a4-7ade-6b08-3a7259aa6c4b@intel.com/

Peter Newman (1):
  x86/resctrl: serialize task CLOSID update with task_call_func()

 arch/x86/include/asm/resctrl.h         | 11 ++--
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 83 +++++++++++++++-----------
 2 files changed, 51 insertions(+), 43 deletions(-)

-- 
2.38.1.273.g43a17bfeac-goog
