Message-ID: <f99dca08d332b01daec9eed7e4a55f042b551a67.camel@kernel.org>
Date: Thu, 11 Nov 2021 05:50:41 +0200
From: Jarkko Sakkinen <jarkko@...nel.org>
To: "Luck, Tony" <tony.luck@...el.com>
Cc: Reinette Chatre <reinette.chatre@...el.com>,
dave.hansen@...ux.intel.com, tglx@...utronix.de, bp@...en8.de,
mingo@...hat.com, linux-sgx@...r.kernel.org, x86@...nel.org,
seanjc@...gle.com, hpa@...or.com, linux-kernel@...r.kernel.org,
stable@...r.kernel.org
Subject: Re: [PATCH V2] x86/sgx: Fix free page accounting

On Wed, 2021-11-10 at 19:26 -0800, Luck, Tony wrote:
> On Thu, Nov 11, 2021 at 04:55:14AM +0200, Jarkko Sakkinen wrote:
> > On Wed, 2021-11-10 at 10:51 -0800, Reinette Chatre wrote:
> > > sgx_should_reclaim() would only succeed when sgx_nr_free_pages goes
> > > below the watermark. Once sgx_nr_free_pages becomes corrupted there is
> > > no clear way in which it can correct itself since it is only ever
> > > incremented or decremented.
> >
> > So one scenario would be:
> >
> > 1. CPU A does a READ of sgx_nr_free_pages.
> > 2. CPU B does a READ of sgx_nr_free_pages.
> > 3. CPU A does a STORE of sgx_nr_free_pages.
> > 4. CPU B does a STORE of sgx_nr_free_pages.
> >
> > ?
> >
> > That does corrupt the value, yes, but I don't see anything like this
> > in the commit message, so I'll have to check.
> >
> > I think the commit message is lacking a concurrency scenario, and the
> > current transcripts are a bit useless.
>
> What about this part:
>
> With sgx_nr_free_pages accessed and modified from a few places
> it is essential to ensure that these accesses are done safely but
> this is not the case. sgx_nr_free_pages is read without any
> protection and updated with inconsistent protection by any one
> of the spin locks associated with the individual NUMA nodes.
> For example:
>
>  CPU_A                                 CPU_B
>  -----                                 -----
>  spin_lock(&nodeA->lock);              spin_lock(&nodeB->lock);
>  ...                                   ...
>  sgx_nr_free_pages--;  /* NOT SAFE */  sgx_nr_free_pages--;
>
>  spin_unlock(&nodeA->lock);            spin_unlock(&nodeB->lock);
>
> Maybe you missed the "NOT SAFE" hidden in the middle of
> the picture?
>
> -Tony

For me, the ordering is not clear from that picture. Compare it with, e.g.,
https://www.kernel.org/doc/Documentation/memory-barriers.txt

/Jarkko
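
For reference, the four-step READ/STORE scenario quoted above can be
replayed deterministically in userspace. This is only a sketch under
stated assumptions: nr_free_pages is a hypothetical stand-in for
sgx_nr_free_pages, the function names are made up for illustration, and
the C11 atomic variant shows one common remedy for this kind of lost
update, not necessarily what the patch does.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the kernel's sgx_nr_free_pages counter. */
static long nr_free_pages;

/*
 * Replays steps 1-4 from the scenario in a fixed order: both "CPUs"
 * load the counter before either of them stores its decremented value.
 */
static long lost_update_replay(long initial)
{
	nr_free_pages = initial;

	long cpu_a = nr_free_pages;	/* 1. CPU A READ  */
	long cpu_b = nr_free_pages;	/* 2. CPU B READ  */
	nr_free_pages = cpu_a - 1;	/* 3. CPU A STORE */
	nr_free_pages = cpu_b - 1;	/* 4. CPU B STORE, overwrites A's result */

	return nr_free_pages;
}

/*
 * The same two decrements done as indivisible read-modify-write
 * operations (assumption: C11 atomics as a userspace analogue).
 */
static long atomic_replay(long initial)
{
	atomic_long counter;

	atomic_init(&counter, initial);
	atomic_fetch_sub(&counter, 1);	/* CPU A */
	atomic_fetch_sub(&counter, 1);	/* CPU B */

	return atomic_load(&counter);
}

/* Two decrements from 10 should give 8; the racy ordering gives 9. */
static void self_check(void)
{
	assert(lost_update_replay(10) == 9);
	assert(atomic_replay(10) == 8);
}
```

Calling self_check() from a test program exercises both paths. The
point of the first function is that taking two different node locks
does nothing to prevent this ordering, since neither lock excludes the
other CPU.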