Message-ID: <20210727103803.464432924@fuller.cnet>
Date: Tue, 27 Jul 2021 07:38:03 -0300
From: Marcelo Tosatti <mtosatti@...hat.com>
To: linux-kernel@...r.kernel.org
Cc: Nitesh Lal <nilal@...hat.com>,
Nicolas Saenz Julienne <nsaenzju@...hat.com>,
Frederic Weisbecker <frederic@...nel.org>,
Christoph Lameter <cl@...ux.com>,
Juri Lelli <juri.lelli@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Alex Belits <abelits@...vell.com>, Peter Xu <peterx@...hat.com>
Subject: [patch 0/4] prctl task isolation interface and vmstat sync
The logic that disables the vmstat worker thread when a CPU enters
nohz_full does not cover all scenarios. For example, the following
sequence is possible:
1) The CPU enters nohz_full, which calls refresh_cpu_vm_stats, syncing the stats.
2) The app runs mlock, which increases the counters for mlock'ed pages.
3) The app starts its -RT loop.
Since the refresh_cpu_vm_stats call from the nohz_full logic can happen
_before_ the mlock, the counters are dirty again once the -RT loop starts,
and the vmstat shepherd can restart the vmstat worker thread on
the CPU in question.
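
Seen from the application, steps 2 and 3 above might look like the
sketch below. It is only an illustration: the function names
(lock_buffer, make_realtime, rt_loop) and the buffer size are made up
here, and step 1 happens inside the kernel, so it does not appear.

```c
#include <errno.h>
#include <sched.h>
#include <sys/mman.h>

/* Hypothetical working buffer; the size is arbitrary. */
static char buffer[4096];

/*
 * Step 2: lock the buffer into RAM. This is the call that increases
 * the counters for mlock'ed pages after the stats were synced.
 * Returns 0 on success or a negative errno value.
 */
int lock_buffer(void)
{
	return mlock(buffer, sizeof(buffer)) == 0 ? 0 : -errno;
}

/*
 * Step 3 (setup): request SCHED_FIFO for the calling thread.
 * This usually needs CAP_SYS_NICE or a suitable RLIMIT_RTPRIO.
 * Returns 0 on success or a negative errno value.
 */
int make_realtime(int priority)
{
	struct sched_param sp = { .sched_priority = priority };

	return sched_setscheduler(0, SCHED_FIFO, &sp) == 0 ? 0 : -errno;
}

/*
 * Step 3: a loop that makes no system calls, standing in for the
 * latency-sensitive work. Returns the number of iterations executed.
 */
unsigned long rt_loop(unsigned long iterations)
{
	volatile unsigned long count = 0;
	unsigned long i;

	for (i = 0; i < iterations; i++)
		count++;
	return count;
}
```

A caller would invoke lock_buffer(), then make_realtime(), then
rt_loop(). Nothing between the mlock and the loop gives the kernel a
point at which to sync the stats again.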
To fix this, add a task isolation prctl interface that quiesces
deferred actions when returning to userspace.
==============================
Task isolation prctl interface
==============================
Set the thread isolation mode and parameters. This informs the
kernel that the application is executing latency-sensitive code
(where interruptions are undesired).
It is composed of 4 prctl commands (passed as arg1 to
prctl):
PR_ISOL_SET:   set isolation parameters for the task
PR_ISOL_GET:   get isolation parameters for the task
PR_ISOL_ENTER: indicate that the task should be considered
               isolated from this point on
PR_ISOL_EXIT:  indicate that the task should no longer be
               considered isolated from this point on
The isolation parameters and mode are not inherited by
children created by fork(2) and clone(2). The setting is
preserved across execve(2).
What isolated means is specified through the following parameters,
passed as arg2 when arg1 is PR_ISOL_SET or PR_ISOL_GET.
Isolation mode (PR_ISOL_MODE):
------------------------------
- PR_ISOL_MODE_NONE (arg3): no per-task isolation (default mode).
  PR_ISOL_EXIT sets mode to PR_ISOL_MODE_NONE.
- PR_ISOL_MODE_NORMAL (arg3): applications can perform system calls normally,
  and in case of interruption events, the notifications can be collected
  by BPF programs.
  In this mode, if system calls are performed, deferred actions initiated
  by the system call will be executed before return to userspace.
Other modes, for example one that sends signals upon interruption events,
can be implemented.
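
A minimal sketch of how a task might use the commands described above.
It is not the sample from this series. It assumes that the command is
the first prctl argument, followed by the parameter (PR_ISOL_MODE) and
then its value. The numeric values below are placeholders invented for
illustration (the real ones would come from the patched
<linux/prctl.h>), and the helper names isol_enter_normal and isol_exit
are hypothetical.

```c
#include <errno.h>
#include <sys/prctl.h>

/*
 * ASSUMPTION: placeholder numbers, for illustration only. An
 * unpatched kernel does not know these options and is expected to
 * reject them (typically with EINVAL).
 */
#ifndef PR_ISOL_SET
#define PR_ISOL_SET		0x49534f01
#define PR_ISOL_ENTER		0x49534f03
#define PR_ISOL_EXIT		0x49534f04
#define PR_ISOL_MODE		1
#define PR_ISOL_MODE_NORMAL	1
#endif

/*
 * Select PR_ISOL_MODE_NORMAL, then tell the kernel that the task
 * should be considered isolated from this point on.
 * Returns 0 on success or a negative errno value.
 */
int isol_enter_normal(void)
{
	if (prctl(PR_ISOL_SET, (unsigned long)PR_ISOL_MODE,
		  (unsigned long)PR_ISOL_MODE_NORMAL, 0UL, 0UL) == -1)
		return -errno;
	if (prctl(PR_ISOL_ENTER, 0UL, 0UL, 0UL, 0UL) == -1)
		return -errno;
	return 0;
}

/*
 * Leave isolation. Per the text above, PR_ISOL_EXIT sets the mode
 * back to PR_ISOL_MODE_NONE.
 * Returns 0 on success or a negative errno value.
 */
int isol_exit(void)
{
	if (prctl(PR_ISOL_EXIT, 0UL, 0UL, 0UL, 0UL) == -1)
		return -errno;
	return 0;
}
```

The latency-sensitive loop would run between isol_enter_normal() and
isol_exit(). Any setup that dirties per-CPU state, such as mlock,
belongs before the enter call, so that deferred actions are quiesced
on the return to userspace.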
Example
=======
The ``samples/task_isolation/`` directory contains a sample
application.