Message-ID: <20210706130925.GC107277@lothringen>
Date: Tue, 6 Jul 2021 15:09:25 +0200
From: Frederic Weisbecker <frederic@...nel.org>
To: Marcelo Tosatti <mtosatti@...hat.com>
Cc: linux-kernel@...r.kernel.org, Christoph Lameter <cl@...ux.com>,
Thomas Gleixner <tglx@...utronix.de>,
Juri Lelli <juri.lelli@...hat.com>,
Nitesh Lal <nilal@...hat.com>,
Peter Zijlstra <peterz@...radead.org>
Subject: Re: [patch 0/5] optionally sync per-CPU vmstats counter on return to userspace

On Fri, Jul 02, 2021 at 12:28:16PM -0300, Marcelo Tosatti wrote:
>
> Hi Frederic,
>
> On Fri, Jul 02, 2021 at 02:30:32PM +0200, Frederic Weisbecker wrote:
> > On Thu, Jul 01, 2021 at 06:03:36PM -0300, Marcelo Tosatti wrote:
> > > The logic to disable vmstat worker thread, when entering
> > > nohz full, does not cover all scenarios. For example, it is possible
> > > for the following to happen:
> > >
> > > 1) enter nohz_full, which calls refresh_cpu_vm_stats, syncing the stats.
> > > 2) app runs mlock, which increases counters for mlock'ed pages.
> > > 3) start -RT loop
> > >
> > > Since refresh_cpu_vm_stats from nohz_full logic can happen _before_
> > > the mlock, vmstat shepherd can restart vmstat worker thread on
> > > the CPU in question.
> > >
> > > To fix this, optionally sync the vmstat counters when returning
> > > to userspace, controllable by a new "vmstat_sync" isolcpus
> > > flag (default off).
> >
> > Wasn't the plan for such finegrained isolation features to do it at
> > the per task level using prctl()?
>
> Yes, but it's orthogonal: when we integrate the finegrained isolation
> interface, we will be able to use this code (to sync vmstat counters
> on return to userspace) only when userspace informs that it has entered
> isolated mode, so you don't incur the performance penalty of frequent
> vmstat counter writes when not using isolated apps.
>
> This is what the full task isolation patchset is doing
> as well (CC'ing Alex BTW).

Right, there can be two ways:

* A prctl request to sync vmstat only on exit from that prctl
* A prctl request to sync vmstat on all subsequent exits from
  kernel space.
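
To make the difference between the two concrete, here is a toy userspace
model in C. It is only a sketch: every name in it (vmstat_sync_mode,
cpu_model, the model_*() helpers) is hypothetical and does not correspond
to any existing kernel or prctl() interface. It models one CPU with a
pending per-CPU delta and a global counter, and folds the delta on "exit
to userspace" according to the requested mode.

```c
#include <stdbool.h>

/* Hypothetical names: not an existing kernel or prctl() interface. */
enum vmstat_sync_mode {
	VMSTAT_SYNC_OFF,	/* never sync on exit to userspace */
	VMSTAT_SYNC_ONCE,	/* sync only on exit from the requesting call */
	VMSTAT_SYNC_ALWAYS,	/* sync on every subsequent exit from kernel space */
};

/* Toy model of one CPU's vmstat state. */
struct cpu_model {
	long pending;		/* per-CPU delta not yet folded into the global counter */
	long global;		/* global counter */
	enum vmstat_sync_mode mode;
};

/* Stand-in for the prctl request: record which behaviour was asked for. */
static void model_request(struct cpu_model *cpu, enum vmstat_sync_mode mode)
{
	cpu->mode = mode;
}

/* Stand-in for kernel work (e.g. mlock) that dirties the per-CPU delta. */
static void model_kernel_work(struct cpu_model *cpu, long delta)
{
	cpu->pending += delta;
}

/* Stand-in for the exit-to-userspace path. */
static void model_exit_to_user(struct cpu_model *cpu)
{
	if (cpu->mode == VMSTAT_SYNC_OFF)
		return;

	/* Fold the pending delta into the global counter. */
	cpu->global += cpu->pending;
	cpu->pending = 0;

	/* A one-shot request is consumed by the first exit that honours it. */
	if (cpu->mode == VMSTAT_SYNC_ONCE)
		cpu->mode = VMSTAT_SYNC_OFF;
}

/* A non-zero pending delta is what would make the shepherd requeue the worker. */
static bool model_needs_worker(const struct cpu_model *cpu)
{
	return cpu->pending != 0;
}
```

In this model, a task that requests VMSTAT_SYNC_ONCE and then calls
something like mlock afterwards returns to userspace with a dirty delta,
which is the scenario from the cover letter. With VMSTAT_SYNC_ALWAYS the
delta is folded on that later exit as well, at the cost of doing the sync
on every exit.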
>
> This will require modifying applications (and the new kernel with the
> exposed interface).
>
> But there is demand for fixing this now, for currently existing
> binary only applications.

I would agree if it were a regression, but it's not. It's merely
a new feature, and we don't want to rush into a broken interface.
And I suspect some other people won't much like a new extension
to isolcpus.