Message-ID: <20211007192346.731667417@fedora.localdomain>
Date: Thu, 07 Oct 2021 16:23:46 -0300
From: Marcelo Tosatti <mtosatti@...hat.com>
To: linux-kernel@...r.kernel.org
Cc: Nitesh Lal <nilal@...hat.com>,
Nicolas Saenz Julienne <nsaenzju@...hat.com>,
Frederic Weisbecker <frederic@...nel.org>,
Christoph Lameter <cl@...ux.com>,
Juri Lelli <juri.lelli@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Alex Belits <abelits@...its.com>, Peter Xu <peterx@...hat.com>
Subject: [patch v4 0/8] extensible prctl task isolation interface and vmstat sync
The logic that disables the vmstat worker thread when entering
nohz_full does not cover all scenarios. For example, the following
sequence is possible:

1) enter nohz_full, which calls refresh_cpu_vm_stats, syncing the stats.
2) app runs mlock, which increases counters for mlock'ed pages.
3) start -RT loop

Since refresh_cpu_vm_stats from the nohz_full logic can run _before_
the mlock, the vmstat shepherd can restart the vmstat worker thread on
the CPU in question.
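
From the application side, steps 2 and 3 above can be sketched roughly
as follows. This is only an illustration of the ordering problem, not
code from the patchset: the helper names (lock_buffer, enter_rt, spin)
are hypothetical, and it uses only the existing mlock(2) and
sched_setscheduler(2) calls, not the new prctl interface.

```c
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

/* Step 2: lock a buffer in memory. This bumps per-CPU mlock counters
 * after the nohz_full entry path may already have synced the stats.
 * Returns 0 on success, -errno on failure (hypothetical helper). */
static int lock_buffer(void *buf, size_t len)
{
	assert(buf != NULL);
	return mlock(buf, len) == 0 ? 0 : -errno;
}

/* Step 3a: switch the calling thread to SCHED_FIFO.
 * Needs CAP_SYS_NICE or a suitable RLIMIT_RTPRIO.
 * Returns 0 on success, -errno on failure (hypothetical helper). */
static int enter_rt(int prio)
{
	struct sched_param sp;

	memset(&sp, 0, sizeof(sp));
	sp.sched_priority = prio;
	return sched_setscheduler(0, SCHED_FIFO, &sp) == 0 ? 0 : -errno;
}

/* Step 3b: the latency-sensitive loop, which makes no system calls.
 * If the per-CPU counters are still dirty from step 2, the vmstat
 * shepherd may queue the vmstat worker on this CPU while it runs. */
static uint64_t spin(uint64_t iterations)
{
	volatile uint64_t n = 0;

	for (uint64_t i = 0; i < iterations; i++)
		n++;
	return n;
}
```

An application would call lock_buffer(), then enter_rt(), then spin();
nothing in that sequence gives the kernel a point at which to sync the
counters dirtied by the mlock before the loop starts.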

To fix this, add a task isolation prctl interface to quiesce
deferred actions when returning to userspace.

The patchset is based on ideas and code from the
task isolation patchset from Alex Belits:
https://lwn.net/Articles/816298/

Please refer to Documentation/userspace-api/task_isolation.rst
(patch 1) for details.

Note: the prctl interface is independent of nohz_full=.

---------
v4:
- Switch to structures for parameters when possible
(which are more extensible).
- Switch to CFG_{S,G}ET naming and drop
"internal configuration" prctls (Frederic Weisbecker).
- Add summary of terms to documentation (Frederic Weisbecker).
- Examples for compute and one-shot modes (Thomas G/Christoph L).
v3:
- Split into smaller patches (Nitesh Lal).
- Misc cleanups (Nitesh Lal).
- Clarify nohz_full is not a dependency (Nicolas Saenz).
- Fix incorrect values for prctl definitions (kernel robot).
- Save configured state, so applications
can activate externally configured
task isolation parameters.
- Remove "system default" notion (chisol should
make it obsolete).
- Update documentation: add new section with explanation
about configuration/activation and code example.
- Update samples.
- Report configuration/activation state at
/proc/pid/task_isolation.
- Condense dirty information of per-CPU vmstats counters
in a bool.
- In-kernel KVM support.
v2:
- Finer-grained control of quiescing (Frederic Weisbecker / Nicolas Saenz).
- Avoid potential regressions by allowing applications
to use ISOL_F_QUIESCE_DEFMASK (whose default value
is configurable in /sys/). (Nitesh Lal / Nicolas Saenz).
v3 can be found at:
https://lore.kernel.org/lkml/20210824152423.300346181@fuller.cnet/
v2 can be found at:
https://lore.kernel.org/patchwork/project/lkml/list/?series=510225