linux-kernel - Re: [PATCH] mm: don't call lru draining in the nested lru_cache

Message-ID: <Ya6d+zC/CsYAp0Gf@google.com>
Date:   Mon, 6 Dec 2021 15:34:19 -0800
From:   Minchan Kim <minchan@...nel.org>
To:     Andrew Morton <akpm@...ux-foundation.org>
Cc:     Michal Hocko <mhocko@...e.com>,
        David Hildenbrand <david@...hat.com>,
        linux-mm <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Suren Baghdasaryan <surenb@...gle.com>,
        John Dias <joaodias@...gle.com>
Subject: Re: [PATCH] mm: don't call lru draining in the nested
 lru_cache_disable

On Mon, Dec 06, 2021 at 03:04:21PM -0800, Andrew Morton wrote:
> On Mon,  6 Dec 2021 14:10:06 -0800 Minchan Kim <minchan@...nel.org> wrote:
> 
> > lru_cache_disable involves IPIs to drain the pagevec of each core,
> > which sometimes takes quite a long time to complete depending
> > on how busy the CPUs are, which makes allocation slow, up to
> > several hundred milliseconds. Furthermore, the repeated draining
> > in alloc_contig_range makes things worse, considering that callers
> > of alloc_contig_range usually try multiple times in a loop.
> > 
> > This patch makes lru_cache_disable aware of the fact that the
> > pagevec was already disabled. With that, users of alloc_contig_range
> > can disable the lru cache in advance in their own context during the
> > repeated trials, so they can avoid the multiple costly drainings
> > in cma allocation.
> 
> Isn't this racy?
>  
> > ...
> >
> > @@ -859,7 +869,12 @@ atomic_t lru_disable_count = ATOMIC_INIT(0);
> >   */
> >  void lru_cache_disable(void)
> >  {
> > -	atomic_inc(&lru_disable_count);
> > +	/*
> > +	 * If someone has already disabled lru_cache, just return after
> > +	 * increasing the lru_disable_count.
> > +	 */
> > +	if (atomic_inc_not_zero(&lru_disable_count))
> > +		return;
> >  #ifdef CONFIG_SMP
> >  	/*
> >  	 * lru_add_drain_all in the force mode will schedule draining on
> > @@ -873,6 +888,7 @@ void lru_cache_disable(void)
> >  #else
> >  	lru_add_and_bh_lrus_drain();
> >  #endif
> 
> There's a window here where lru_disable_count==0 and new pages can get
> added to lru?

Indeed. If core A has already increased the disable count but its
__lru_add_drain_all has not run yet, lru_cache_disable on core B will
return early and will not see those pages in the LRU. This needs to be fixed.

Thanks, Andrew.