Message-ID: <YYnPJ3a9PSSy/gFZ@iki.fi>
Date:   Tue, 9 Nov 2021 03:30:15 +0200
From:   Jarkko Sakkinen <jarkko@...nel.org>
To:     Reinette Chatre <reinette.chatre@...el.com>
Cc:     dave.hansen@...ux.intel.com, tglx@...utronix.de, bp@...en8.de,
        mingo@...hat.com, linux-sgx@...r.kernel.org, x86@...nel.org,
        seanjc@...gle.com, tony.luck@...el.com, hpa@...or.com,
        linux-kernel@...r.kernel.org, stable@...r.kernel.org
Subject: Re: [PATCH] x86/sgx: Fix free page accounting

On Mon, Nov 08, 2021 at 12:56:21PM -0800, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 11/8/2021 12:12 PM, Jarkko Sakkinen wrote:
> > On Mon, Nov 08, 2021 at 11:48:18AM -0800, Reinette Chatre wrote:
> > > Hi Jarkko,
> > > 
> > > On 11/7/2021 8:47 AM, Jarkko Sakkinen wrote:
> > > > On Sun, 2021-11-07 at 18:45 +0200, Jarkko Sakkinen wrote:
> > > > > On Thu, 2021-11-04 at 11:28 -0700, Reinette Chatre wrote:
> > > > > > The consequence of sgx_nr_free_pages not being protected is that
> > > > > > its value may not accurately reflect the actual number of free
> > > > > > pages on the system, impacting the availability of free pages in
> > > > > > support of many flows. The problematic scenario is when the
> > > > > > reclaimer never runs because it believes there to be sufficient
> > > > > > free pages while any attempt to allocate a page fails because there
> > > > > > are no free pages available. The worst scenario observed was a
> > > > > > > user space hang because of repeated page faults caused by
> > > > > > > no free pages ever being made available.
> > > > > 
> > > > > > Can you go into detail about the "concrete scenario" in the commit
> > > > > > message? It does not have to describe all the possible scenarios,
> > > > > > but it should give at least one sequence of events.
> > > 
> > > 
> > > I provided significant detail regarding the "concrete scenario" in a
> > > separate response to Greg:
> > > https://lore.kernel.org/lkml/a636290d-db04-be16-1c86-a8dcc3719b39@intel.com/
> > > 
> > > > That message details the test that was run (the test hangs before the fix
> > > > and completes after the fix), the traces captured at the time the test
> > > > hung, an analysis of the traces with the root cause of the hang, and
> > > > traces captured with the fix applied that demonstrate why user space is
> > > > able to make progress and why the test can complete.
> > 
> > For me that sequence looks like something that you could "abstract"
> > a bit and get a rough description of the concurrency scenario.
> > 
> > In this type of patch it is as important as the code change itself,
> > not least because having that info in some level of detail in the
> > commit log helps with future maintenance.
> 
> My apologies. I understood your comment to be a concern with the change
> itself instead of just the commit message. I will add more detail about the
> failing scenario encountered to the commit message.

Yeah, I went through the log and the code change makes sense :-)

/Jarkko
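
For readers following the thread, the free page accounting problem quoted
above can be illustrated with one possible sequence of events. The sketch
below is a user-space model, not the kernel code or the patch under review:
it replays a single unlucky interleaving of an unprotected counter decrement
(allocation) and increment (free) as explicit load/modify/store steps. The
names racy_counter(), atomic_counter(), reclaimer_would_run() and
LOW_WATERMARK are hypothetical, and using an atomic counter is assumed here
as one common remedy rather than taken from this message.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical threshold: the reclaimer runs only when counter < this. */
#define LOW_WATERMARK 1

/* Model of the reclaimer's decision, based solely on the counter. */
static bool reclaimer_would_run(long counter)
{
    return counter < LOW_WATERMARK;
}

/*
 * Unprotected counter. Start with 1 free page. CPU A allocates a page
 * while CPU B frees one, then a later allocation takes the last page.
 * The real number of free pages at the end is 1 - 1 + 1 - 1 = 0.
 */
static long racy_counter(void)
{
    long nr_free = 1;

    long a = nr_free;   /* CPU A (allocating) loads 1 */
    long b = nr_free;   /* CPU B (freeing) loads 1 */
    nr_free = a - 1;    /* CPU A stores 0 */
    nr_free = b + 1;    /* CPU B stores 2: A's decrement is lost */

    nr_free -= 1;       /* later, uncontended allocation */
    return nr_free;     /* counter claims 1 free page, but none exist */
}

/*
 * Same sequence with atomic read-modify-write updates. Each update is
 * indivisible, so no interleaving can drop one of them.
 */
static long atomic_counter(void)
{
    atomic_long nr_free;

    atomic_init(&nr_free, 1);
    atomic_fetch_sub(&nr_free, 1);  /* CPU A allocates */
    atomic_fetch_add(&nr_free, 1);  /* CPU B frees */
    atomic_fetch_sub(&nr_free, 1);  /* later allocation */
    return atomic_load(&nr_free);   /* counter matches reality: 0 */
}
```

In the racy replay the counter ends one higher than the real number of free
pages, so reclaimer_would_run() returns false even though every allocation
would fail, which matches the hang described in the quoted commit message.
With atomic updates the counter ends at zero and the reclaimer would run.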
