Message-ID: <20150416180227.GB17401@gmail.com>
Date: Thu, 16 Apr 2015 20:02:27 +0200
From: Ingo Molnar <mingo@...nel.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Steven Rostedt <rostedt@...dmis.org>, Mel Gorman <mel@....ul.ie>,
Rik van Riel <riel@...hat.com>,
Jason Low <jason.low2@...com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
linux-kernel@...r.kernel.org,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Oleg Nesterov <oleg@...hat.com>,
Mike Galbraith <umgwanakikbuti@...il.com>,
Frederic Weisbecker <fweisbec@...il.com>,
Mel Gorman <mgorman@...e.de>,
Preeti U Murthy <preeti@...ux.vnet.ibm.com>,
hideaki.kimura@...com, Aswin Chandramouleeswaran <aswin@...com>,
Scott J Norton <scott.norton@...com>
Subject: Re: [PATCH 1/3] sched, timer: Remove usages of ACCESS_ONCE in the
scheduler
* Peter Zijlstra <peterz@...radead.org> wrote:
> On Wed, Apr 15, 2015 at 09:46:01AM +0200, Ingo Molnar wrote:
>
> > @@ -2088,7 +2088,7 @@ void task_numa_fault(int last_cpupid, int mem_node, int pages, int flags)
> >
> > static void reset_ptenuma_scan(struct task_struct *p)
> > {
> > - ACCESS_ONCE(p->mm->numa_scan_seq)++;
> > + WRITE_ONCE(p->mm->numa_scan_seq, READ_ONCE(p->mm->numa_scan_seq) + 1);
>
> vs
>
> seq = ACCESS_ONCE(p->mm->numa_scan_seq);
> if (p->numa_scan_seq == seq)
> return;
> p->numa_scan_seq = seq;
>
>
> > So the original ACCESS_ONCE() barriers were misguided to begin with: I
> > think they tried to handle races with the scheduler balancing softirq
> > and tried to avoid having to use atomics for the sequence counter
> > (which would be overkill), but things like ACCESS_ONCE(x)++ never
> > guaranteed atomicity (or even coherency) of the update.
> >
> > But since in reality this is only statistical sampling code, all these
> > compiler barriers can be removed I think. Peter, Mel, Rik, do you
> > agree?
>
> ACCESS_ONCE() is not a compiler barrier
It's not a general compiler barrier (and I didn't claim so) but it is
still a compiler barrier: it's documented as a weak, variable specific
barrier in Documentation/memory-barriers.txt:
  COMPILER BARRIER
  ----------------

  The Linux kernel has an explicit compiler barrier function that prevents the
  compiler from moving the memory accesses either side of it to the other side:

        barrier();

  This is a general barrier -- there are no read-read or write-write variants
  of barrier().  However, ACCESS_ONCE() can be thought of as a weak form
  for barrier() that affects only the specific accesses flagged by the
  ACCESS_ONCE().
[...]
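To make the "variable specific barrier" idea concrete, here is a minimal
userspace sketch of how ACCESS_ONCE() works: a volatile cast applied to a
single lvalue. The macro body matches the kernel's definition as far as I
know; scan_seq, read_seq_once(), write_seq_once() and access_once_demo() are
hypothetical names used only for illustration, and __typeof__ assumes GCC or
Clang.

```c
#include <assert.h>

/* Userspace stand-in for the kernel macro (needs the GNU __typeof__ extension). */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical stand-in for mm->numa_scan_seq. */
int scan_seq;

/* The volatile access asks the compiler to emit exactly one load here. */
int read_seq_once(void)
{
	return ACCESS_ONCE(scan_seq);
}

/* The same cast on the store side asks for exactly one store. */
void write_seq_once(int val)
{
	ACCESS_ONCE(scan_seq) = val;
}

/* Small self-check: the marked store is visible to the marked load. */
void access_once_demo(void)
{
	write_seq_once(5);
	assert(read_seq_once() == 5);
}
```

Only the accesses that go through the macro are constrained; plain accesses
to other variables around them can still be reordered or merged by the
compiler, which is what makes it weaker than barrier().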
> The 'read' side uses ACCESS_ONCE() for two purposes:
> - to load the value once, we don't want the seq number to change under
> us for obvious reasons
> - to avoid load tearing and observe weird seq numbers
>
> The update side uses ACCESS_ONCE() to avoid write tearing, and
> strictly speaking it should also worry about read-tearing since its
> not hard serialized, although its very unlikely to actually have
> concurrency (IIRC).
So what bad effects can there be from the very unlikely read and write
tearing?
AFAICS nothing particularly bad. On the read side:
	seq = ACCESS_ONCE(p->mm->numa_scan_seq);
	if (p->numa_scan_seq == seq)
		return;
	p->numa_scan_seq = seq;
If p->mm->numa_scan_seq gets loaded twice (very unlikely), and the two
loads return different values, then we might get a 'double' NUMA
placement run - i.e. statistical noise.
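The read-side check can be sketched in userspace as below. This is only an
illustration under stated assumptions: struct mm_sketch, struct task_sketch,
placement_needed() and read_side_demo() are hypothetical names, not kernel
code, and the return value stands in for "a placement run would follow".

```c
#include <assert.h>

#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical cut-down stand-ins for mm_struct and task_struct. */
struct mm_sketch {
	int numa_scan_seq;	/* bumped by the scanner on each new pass */
};

struct task_sketch {
	struct mm_sketch *mm;
	int numa_scan_seq;	/* last pass this task has acted on */
};

/* Returns 1 if a placement run would happen, 0 if this pass was already seen. */
int placement_needed(struct task_sketch *p)
{
	/* Single load: 'seq' cannot change between the compare and the copy. */
	int seq = ACCESS_ONCE(p->mm->numa_scan_seq);

	if (p->numa_scan_seq == seq)
		return 0;
	p->numa_scan_seq = seq;
	return 1;
}

/* One placement run per pass: a second call in the same pass is a no-op. */
void read_side_demo(void)
{
	struct mm_sketch mm = { 1 };
	struct task_sketch t = { &mm, 0 };

	assert(placement_needed(&t) == 1);
	assert(placement_needed(&t) == 0);
}
```

If the compiler were free to reload the shared counter between the compare
and the assignment, the task could record a newer value than the one it
compared against, which is the harmless extra or shifted run described above.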
On the update side:
	ACCESS_ONCE(p->mm->numa_scan_seq)++;
	p->mm->numa_scan_offset = 0;
If the compiler tears that up we might skip an update - again
statistical noise at worst.
Nor is compiler tearing the only theoretical worry here: in theory,
with long cache coherency latencies we might get two updates 'mixed
up', resulting in a (single) missed update.
Only atomics would solve all the races, but I think that would be
overdoing it.
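A deterministic userspace sketch of the missed-update case: the increment is
a separate load and store, so two updaters that both load before either
stores lose one update. The interleaving is simulated by hand in a single
thread, C11 <stdatomic.h> stands in for the kernel's own atomics, and all
function names here are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/*
 * ACCESS_ONCE(seq)++ is a volatile load followed by a volatile store.
 * Replay the unlucky ordering where two updaters both load first.
 */
int interleaved_increment(int start)
{
	int seq = start;
	int a = ACCESS_ONCE(seq);	/* updater A loads */
	int b = ACCESS_ONCE(seq);	/* updater B loads the same value */

	ACCESS_ONCE(seq) = a + 1;	/* A stores */
	ACCESS_ONCE(seq) = b + 1;	/* B stores, overwriting A's update */
	return seq;
}

/* With an atomic read-modify-write, neither increment can be lost. */
int atomic_increment_twice(int start)
{
	atomic_int seq;

	atomic_init(&seq, start);
	atomic_fetch_add(&seq, 1);
	atomic_fetch_add(&seq, 1);
	return atomic_load(&seq);
}

/* Two increments from 7: the racy ordering yields 8, the atomic one 9. */
void update_side_demo(void)
{
	assert(interleaved_increment(7) == 8);
	assert(atomic_increment_twice(7) == 9);
}
```

For a sequence counter that only drives statistical sampling, the single
lost increment in the first function is the whole cost of skipping atomics.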
This is what I meant when I said there's no harm from this race.
Thanks,
Ingo