Date: Tue, 4 Apr 2023 18:00:38 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Yair Podemsky <ypodemsk@...hat.com>
Cc: linux@...linux.org.uk, mpe@...erman.id.au, npiggin@...il.com,
christophe.leroy@...roup.eu, hca@...ux.ibm.com, gor@...ux.ibm.com,
agordeev@...ux.ibm.com, borntraeger@...ux.ibm.com,
svens@...ux.ibm.com, davem@...emloft.net, tglx@...utronix.de,
mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com,
x86@...nel.org, hpa@...or.com, will@...nel.org,
aneesh.kumar@...ux.ibm.com, akpm@...ux-foundation.org,
arnd@...db.de, keescook@...omium.org, paulmck@...nel.org,
jpoimboe@...nel.org, samitolvanen@...gle.com, frederic@...nel.org,
ardb@...nel.org, juerg.haefliger@...onical.com,
rmk+kernel@...linux.org.uk, geert+renesas@...der.be,
tony@...mide.com, linus.walleij@...aro.org,
sebastian.reichel@...labora.com, nick.hawkins@....com,
linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
linuxppc-dev@...ts.ozlabs.org, linux-s390@...r.kernel.org,
sparclinux@...r.kernel.org, linux-arch@...r.kernel.org,
linux-mm@...ck.org, mtosatti@...hat.com, vschneid@...hat.com,
dhildenb@...hat.com, alougovs@...hat.com,
Frederic Weisbecker <fweisbec@...il.com>
Subject: Re: [PATCH 3/3] mm/mmu_gather: send tlb_remove_table_smp_sync IPI
only to CPUs in kernel mode
On Tue, Apr 04, 2023 at 05:12:17PM +0200, Peter Zijlstra wrote:
> > case 2:
> > CPU-A CPU-B
> >
> > modify pagetables
> > tlb_flush (memory barrier)
> > state == CONTEXT_USER
> > int state = atomic_read(&ct->state);
> > Kernel-enter:
> > state == CONTEXT_KERNEL
> > READ(pagetable values)
> > if (state & CT_STATE_MASK == CONTEXT_USER)
> >
Hmm, hold up; what about memory ordering? We need store-load ordering
between the page-table write and the context-tracking load, and store-load
ordering between the context-tracking update and the software page-table
walker's loads.
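IOW, I think we need something like the below (just a sketch on top of the
diagram above, not actual code; the smp_mb() on the flush side is the bit I
don't see provided anywhere):

  CPU-A                                     CPU-B

  WRITE(pagetable values)
  smp_mb();        /* StoreLoad */
  state = atomic_read(&ct->state);
                                            kernel-enter:
                                              ct_state_inc(); /* full barrier */
                                              READ(pagetable values)
  if ((state & CT_STATE_MASK) == CONTEXT_USER)
          skip IPI;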
Now, iirc page-table modification is done under pte_lock (or
page_table_lock) and that only provides a RELEASE barrier on this end,
which is insufficient to order against a later load.
Is there anything else?
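If not, then I think the IPI filter needs an explicit full barrier before the
state load; something like the below (completely untested sketch, the
function name is made up):

	static bool tlb_remove_table_cpu_in_kernel(int cpu, void *info)
	{
		struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);

		/*
		 * Order the preceding page-table stores against the
		 * ct->state load below; pairs with the full barrier
		 * implied by ct_state_inc() on the remote CPU.
		 */
		smp_mb();

		return (atomic_read(&ct->state) & CT_STATE_MASK) != CONTEXT_USER;
	}

(equivalently, a single smp_mb() before iterating the CPUs would do, since
the ordering only needs to happen once per flush).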
On the state tracking side, we have ct_state_inc(), which is
atomic_add_return() and thus a full barrier, which should be sufficient.
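For reference, IIRC that is roughly (modulo the exact atomic/per-cpu
flavour):

	static __always_inline int ct_state_inc(int incby)
	{
		return atomic_add_return(incby, this_cpu_ptr(&context_tracking.state));
	}

and per Documentation/atomic_t.txt a value-returning atomic RMW is fully
ordered, so the state store cannot be reordered against the later
page-table loads on that CPU.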