Message-ID: <20210705144542.GA27275@fuller.cnet>
Date: Mon, 5 Jul 2021 11:45:42 -0300
From: Marcelo Tosatti <mtosatti@...hat.com>
To: Christoph Lameter <cl@...two.de>
Cc: linux-kernel@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
Frederic Weisbecker <frederic@...nel.org>,
Juri Lelli <juri.lelli@...hat.com>,
Nitesh Lal <nilal@...hat.com>
Subject: Re: [patch 0/5] optionally sync per-CPU vmstats counter on return to
userspace
On Mon, Jul 05, 2021 at 04:26:48PM +0200, Christoph Lameter wrote:
> On Fri, 2 Jul 2021, Marcelo Tosatti wrote:
>
> > > > The logic to disable vmstat worker thread, when entering
> > > > nohz full, does not cover all scenarios. For example, it is possible
> > > > for the following to happen:
> > > >
> > > > 1) enter nohz_full, which calls refresh_cpu_vm_stats, syncing the stats.
> > > > 2) app runs mlock, which increases counters for mlock'ed pages.
> > > > 3) start -RT loop
> > > >
> > > > Since refresh_cpu_vm_stats from nohz_full logic can happen _before_
> > > > the mlock, vmstat shepherd can restart vmstat worker thread on
> > > > the CPU in question.
> > >
> > > Can we enter nohz_full after the app runs mlock?
> >
> > Hum, i don't think its a good idea to use that route, because
> > entering or exiting nohz_full depends on a number of variable
> > outside of one's control (and additional variables might be
> > added in the future).
>
> Then I do not see any need for this patch. Because after a certain time
> of inactivity (after the mlock) the system will enter nohz_full again.
> If userspace has no direct control over nohz_full and can only wait then
> it just has to do so.

Sorry, I fail to see what you mean.

The problem (well, it's not a bug per se) is that the current
disablement of the vmstat_worker thread is not aggressive enough.

From the initial message:
1) enter nohz_full, which calls refresh_cpu_vm_stats, syncing the stats.
2) app runs mlock, which increases counters for mlock'ed pages.
3) start -RT loop

Note that any activity that changes the stat counters will cause this,
not only mlock. It just happened to be mlock in the test application I
was using; replace it with any other system call that triggers writes
to per-CPU vmstat counters.
You said:
"Because after a certain time of inactivity (after the mlock) the
system will enter nohz_full again."
Yes, but we can't tolerate any activity from vmstat worker thread
on this particular CPU.
Do you want the app to wait for an event saying "vmstat_worker is now
disabled; as long as you don't dirty vmstat counters, vmstat_shepherd
won't wake it up"?
Rather than that, what this patch does is sync the vmstat counters on
return to userspace, so that:
"We synced per-CPU vmstat counters to global counters and disabled
the local-CPU vmstat worker (on return to userspace). As long as you
don't dirty vmstat counters, vmstat_shepherd won't wake it up".
Makes sense?
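
To make the ordering concrete, here is a toy userspace model of the
sequence above. It is only a sketch, not kernel code: the struct and
every toy_* name are made up for illustration, and the shepherd is
reduced to "queue the worker on any CPU with a non-zero pending delta".

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Toy model, not kernel code: one global counter plus per-CPU deltas. */
struct toy_vmstat {
    long global;                 /* stands in for the global vmstat counter */
    long cpu_diff[NR_CPUS];      /* stands in for per-CPU pending deltas */
    bool worker_queued[NR_CPUS]; /* true if the per-CPU worker would run */
};

/* Fold one CPU's pending delta into the global counter
 * (the role refresh_cpu_vm_stats plays in the discussion). */
static void toy_sync_cpu(struct toy_vmstat *s, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    s->global += s->cpu_diff[cpu];
    s->cpu_diff[cpu] = 0;
}

/* A system call such as mlock dirties the local delta. */
static void toy_syscall_dirty(struct toy_vmstat *s, int cpu, long pages)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    s->cpu_diff[cpu] += pages;
}

/* Assumed shepherd behaviour: queue the worker wherever a delta is pending. */
static void toy_shepherd(struct toy_vmstat *s)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (s->cpu_diff[cpu] != 0)
            s->worker_queued[cpu] = true;
}

/* Returns true if the worker would end up interrupting the isolated CPU. */
static bool toy_scenario(bool sync_on_return)
{
    struct toy_vmstat s = {0};
    int cpu = 1;

    toy_sync_cpu(&s, cpu);          /* 1) nohz_full entry syncs the stats */
    toy_syscall_dirty(&s, cpu, 16); /* 2) mlock dirties the counters */
    if (sync_on_return)
        toy_sync_cpu(&s, cpu);      /* proposed: sync on return to userspace */
    toy_shepherd(&s);               /* 3) shepherd runs during the -RT loop */
    return s.worker_queued[cpu];
}
```

In this model toy_scenario(false) leaves a pending delta behind, so the
shepherd queues the worker on the isolated CPU, while toy_scenario(true)
does not, because the delta is folded before the -RT loop starts.
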
> > So preparing the system to function
> > while entering nohz_full at any location seems the sane thing to do.
> >
> > And that would be at return to userspace (since, if mlocked, after
> > that point there will be no more changes to propagate to vmstat
> > counters).
> >
> > Or am i missing something else you can think of ?
>
> I assumed that the "enter nohz full" was an action by the user
> space app because I saw some earlier patches to introduce such
> functionality in the past.
No, it meant "enter nohz full" (in the current Linux codebase, for
existing applications).