Message-ID: <20200525152401.GA375786@debian-buster-darwi.lab.linutronix.de>
Date: Mon, 25 May 2020 17:24:01 +0200
From: "Ahmed S. Darwish" <a.darwish@...utronix.de>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Ingo Molnar <mingo@...hat.com>, Will Deacon <will@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
"Paul E. McKenney" <paulmck@...nel.org>,
"Sebastian A. Siewior" <bigeasy@...utronix.de>,
Steven Rostedt <rostedt@...dmis.org>,
LKML <linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Konstantin Khlebnikov <khlebnikov@...dex-team.ru>,
linux-mm@...ck.org
Subject: Re: [PATCH v1 02/25] mm/swap: Don't abuse the seqcount latching API
Peter Zijlstra <peterz@...radead.org> wrote:
> On Tue, May 19, 2020 at 11:45:24PM +0200, Ahmed S. Darwish wrote:
> > @@ -713,10 +713,20 @@ static void lru_add_drain_per_cpu(struct work_struct *dummy)
> > */
> > void lru_add_drain_all(void)
> > {
>
Re-adding cut-out comment for context:
/*
* lru_drain_gen - Current generation of pages that could be in vectors
*
* (A) Definition: lru_drain_gen = x implies that all generations
* 0 < n <= x are already scheduled for draining.
*
* This is an optimization for the highly-contended use case where a
* user space workload keeps constantly generating a flow of pages
* for each CPU.
*/
> > + static unsigned int lru_drain_gen;
> > static struct cpumask has_work;
> > + static DEFINE_MUTEX(lock);
> > + int cpu, this_gen;
> >
> > /*
> > * Make sure nobody triggers this path before mm_percpu_wq is fully
> > @@ -725,21 +735,48 @@ void lru_add_drain_all(void)
> > if (WARN_ON(!mm_percpu_wq))
> > return;
> >
>
Re-adding cut-out comment for context:
/*
* (B) Cache the LRU draining generation number
*
* smp_rmb() ensures that the counter is loaded before the mutex is
* taken. It pairs with the smp_wmb() inside the mutex critical section
* at (D).
*/
> > + this_gen = READ_ONCE(lru_drain_gen);
> > + smp_rmb();
>
> this_gen = smp_load_acquire(&lru_drain_gen);
ACK, will do.
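For what it's worth, the pairing is easy to model in plain C11 atomics. The following userspace sketch (hypothetical names, not the kernel API) shows the pattern: the acquire load at (B) stands in for smp_load_acquire(&lru_drain_gen), and the release store at (D) stands in for the WRITE_ONCE() + smp_wmb() pair:

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Userspace sketch of the generation-check pattern. The names
 * cache_gen() and try_start_drain() are illustrative only.
 */
static _Atomic unsigned int drain_gen;

/* (B): snapshot the generation before taking the lock */
unsigned int cache_gen(void)
{
	return atomic_load_explicit(&drain_gen, memory_order_acquire);
}

/*
 * (C) + (D), to be called with the mutex held: reject a stale
 * snapshot, otherwise publish the next generation before draining.
 */
bool try_start_drain(unsigned int this_gen)
{
	if (this_gen != atomic_load_explicit(&drain_gen, memory_order_relaxed))
		return false;	/* a newer generation is already being drained */

	atomic_store_explicit(&drain_gen, this_gen + 1, memory_order_release);
	return true;
}
```

A second caller presenting the same cached generation after the bump fails the (C) check and exits early, which is exactly the contended fast path the patch optimizes.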
> >
> > mutex_lock(&lock);
> >
> > /*
> > + * (C) Exit the draining operation if a newer generation, from another
> > + * lru_add_drain_all(), was already scheduled for draining. Check (A).
> > */
> > + if (unlikely(this_gen != lru_drain_gen))
> > goto done;
> >
>
Re-adding cut-out comment for context:
/*
* (D) Increment generation number
*
* Pairs with READ_ONCE() and smp_rmb() at (B), outside of the critical
* section.
*
* This pairing must be done here, before the for_each_online_cpu loop
* below which drains the page vectors.
*
* Let x, y, and z represent some system CPU numbers, where x < y < z.
 * Assume CPU #z is in the middle of the for_each_online_cpu loop
* below and has already reached CPU #y's per-cpu data. CPU #x comes
* along, adds some pages to its per-cpu vectors, then calls
* lru_add_drain_all().
*
* If the paired smp_wmb() below is done at any later step, e.g. after
* the loop, CPU #x will just exit at (C) and miss flushing out all of
* its added pages.
*/
> > + WRITE_ONCE(lru_drain_gen, lru_drain_gen + 1);
> > + smp_wmb();
>
> You can leave this smp_wmb() out and rely on the smp_mb() implied by
> queue_work_on()'s test_and_set_bit().
>
Won't this be too implicit?
Isn't it possible that, over the years, the queue_work_on() implementation
changes and the test_and_set_bit()/smp_mb() gets removed?
If that happens, this commit will get *silently* broken and the local
CPU pages won't be drained.
> > cpumask_clear(&has_work);
> > -
> > for_each_online_cpu(cpu) {
> > struct work_struct *work = &per_cpu(lru_add_drain_work, cpu);
> >
>
> While you're here, do:
>
> s/cpumask_set_cpu/__&/
>
ACK.
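To spell out why the non-atomic variant is sufficient: has_work is only ever touched with the mutex held, so the atomic RMW inside cpumask_set_cpu() protects against nothing. A toy userspace model (hypothetical names, a single word instead of a real cpumask):

```c
#include <stdbool.h>

/* Toy single-word model of the has_work cpumask; names are made up. */
struct cpumask_sketch {
	unsigned long bits;
};

void mask_clear(struct cpumask_sketch *m)
{
	m->bits = 0;
}

/*
 * Plain, non-atomic bit set, like __cpumask_set_cpu(): safe here
 * because every writer holds the same mutex, so there is no
 * concurrent read-modify-write to guard against.
 */
void mask_set_cpu(struct cpumask_sketch *m, int cpu)
{
	m->bits |= 1UL << cpu;
}

bool mask_test_cpu(const struct cpumask_sketch *m, int cpu)
{
	return m->bits & (1UL << cpu);
}
```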
Thanks,
--
Ahmed S. Darwish
Linutronix GmbH