linux-kernel - Re: [ANNOUNCE] v5.14-rc5-rt8

Message-ID: <c92fc2cb-03cd-d6a2-fb4a-7bc33e94e391@suse.cz>
Date:   Sun, 15 Aug 2021 11:35:53 +0200
From:   Vlastimil Babka <vbabka@...e.cz>
To:     Mike Galbraith <efault@....de>,
        Clark Williams <williams@...hat.com>
Cc:     Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Thomas Gleixner <tglx@...utronix.de>,
        LKML <linux-kernel@...r.kernel.org>,
        RT <linux-rt-users@...r.kernel.org>
Subject: Re: [ANNOUNCE] v5.14-rc5-rt8

On 8/15/21 6:17 AM, Mike Galbraith wrote:
> On Sat, 2021-08-14 at 21:08 +0200, Vlastimil Babka wrote:
>>
>> Aha! That's helpful. Hopefully it's just a small issue where we
>> opportunistically test flags on a page that's protected by the local
>> lock we didn't take yet, and I didn't realize there's the VM_BUG_ON
>> which can trigger if our page went away (which we would have realized
>> after taking the lock).
> 
> Speaking of optimistic peeking perhaps going badly, why is the below
> not true?  There's protection against ->partial disappearing
> during a preemption... but can't it just as easily appear, so where is
> that protection?

If it appears, it appears; we don't care. We just leave it there and
won't use it.

> If the other side of that window is safe, it could use a comment so
> dummies reading this code don't end up asking mm folks why the heck
> they don't just take the darn lock and be done with it instead of tap
> dancing all around the thing :)

Well, with your patch, ->partial might appear just after the unlock, so
does that really change anything?
The point is to avoid taking the lock if it's almost certain there
will be nothing to gain.
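
To illustrate the idea outside the kernel, here is a minimal userspace
sketch of "peek without the lock, then re-check under the lock". It is
not the mm/slub.c code: struct cpu_cache, cache_lock(), cache_unlock()
and take_partial() are hypothetical names, and a C11 atomic_flag spinlock
stands in for the local lock.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-CPU slab state; not kernel code. */
struct cpu_cache {
	atomic_flag lock;          /* stand-in for the local lock */
	_Atomic(void *) partial;   /* stand-in for c->partial */
};

static void cache_lock(struct cpu_cache *c)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
		;
}

static void cache_unlock(struct cpu_cache *c)
{
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

/*
 * Take whatever sits in the partial slot, or return NULL.
 * The unlocked peek is only a hint: if it says "empty" we skip the lock
 * entirely, and if something appears right afterwards we simply leave it
 * there. The value read under the lock is the one that counts.
 */
static void *take_partial(struct cpu_cache *c)
{
	void *item;

	/* Cheap peek: almost certainly nothing to gain, so don't lock. */
	if (!atomic_load_explicit(&c->partial, memory_order_relaxed))
		return NULL;

	cache_lock(c);
	/* Re-read under the lock; it may have gone away since the peek. */
	item = atomic_load_explicit(&c->partial, memory_order_relaxed);
	atomic_store_explicit(&c->partial, NULL, memory_order_relaxed);
	cache_unlock(c);

	return item;
}
```

A false "empty" from the peek only costs a trip to the slower path; a
false "non-empty" is caught by the re-read under the lock.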

c->partial appearing is easy to just ignore. c->page appearing, while we
got our own page, is worse as there can be only one c->page. But it's
unavoidable, we can't just keep holding the local lock while going to
the page allocator etc. That's why under retry_load_page: we have to
deactivate a c->page that appeared under us...
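
A rough userspace sketch of that last step, assuming a single slot that
may have been filled while we were off allocating without the lock. The
names struct cpu_slot, slot_lock(), slot_unlock(), deactivate() and
install_page() are hypothetical, and deactivate() here only counts calls;
this is an illustration of the retry idea, not the actual slub code.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-CPU state; not kernel code. */
struct cpu_slot {
	atomic_flag lock;     /* stand-in for the local lock */
	void *page;           /* stand-in for c->page, protected by lock */
	int deactivated;      /* how many pages we had to push out */
};

static void slot_lock(struct cpu_slot *c)
{
	while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
		;
}

static void slot_unlock(struct cpu_slot *c)
{
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

/* Placeholder for the real work of retiring a page; called unlocked. */
static void deactivate(struct cpu_slot *c, void *stale)
{
	(void)stale;
	c->deactivated++;
}

/*
 * Install 'fresh', which the caller obtained from a slow allocator
 * without holding the lock. There can be only one page in the slot, so
 * if one appeared in the meantime, take it out, drop the lock, retire
 * it, and try again.
 */
static void install_page(struct cpu_slot *c, void *fresh)
{
	void *stale;

retry:
	slot_lock(c);
	if (c->page) {
		stale = c->page;
		c->page = NULL;
		slot_unlock(c);
		deactivate(c, stale);   /* potentially slow, so done unlocked */
		goto retry;
	}
	c->page = fresh;
	slot_unlock(c);
}
```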

> ---
>  mm/slub.c |   14 ++++++--------
>  1 file changed, 6 insertions(+), 8 deletions(-)
> 
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2937,17 +2937,16 @@ static void *___slab_alloc(struct kmem_c
> 
>  new_slab:
> 
> +	/*
> +	 * To avoid false negative race with put_cpu_partial() during a
> +	 * preemption, we must call slub_percpu_partial() under lock.
> +	 */
> +	local_lock_irqsave(&s->cpu_slab->lock, flags);
>  	if (slub_percpu_partial(c)) {
> -		local_lock_irqsave(&s->cpu_slab->lock, flags);
>  		if (unlikely(c->page)) {
>  			local_unlock_irqrestore(&s->cpu_slab->lock, flags);
>  			goto reread_page;
>  		}
> -		if (unlikely(!slub_percpu_partial(c))) {
> -			local_unlock_irqrestore(&s->cpu_slab->lock, flags);
> -			/* we were preempted and partial list got empty */
> -			goto new_objects;
> -		}
> 
>  		page = c->page = slub_percpu_partial(c);
>  		slub_set_percpu_partial(c, page);
> @@ -2955,8 +2954,7 @@ static void *___slab_alloc(struct kmem_c
>  		stat(s, CPU_PARTIAL_ALLOC);
>  		goto redo;
>  	}
> -
> -new_objects:
> +	local_unlock_irqrestore(&s->cpu_slab->lock, flags);
> 
>  	freelist = get_partial(s, gfpflags, node, &page);
>  	if (freelist)
> 
>