lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7ef64e75-2150-01a9-074d-a754348683b3@suse.cz>
Date:   Tue, 27 Oct 2020 17:31:14 +0100
From:   Vlastimil Babka <vbabka@...e.cz>
To:     Laurent Dufour <ldufour@...ux.ibm.com>,
        Michal Hocko <mhocko@...e.com>
Cc:     linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        nathanl@...ux.ibm.com, cheloha@...ux.ibm.com,
        Christoph Lameter <cl@...ux.com>,
        Pekka Enberg <penberg@...nel.org>,
        David Rientjes <rientjes@...gle.com>,
        Joonsoo Kim <iamjoonsoo.kim@....com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        stable@...r.kernel.org
Subject: Re: [PATCH] mm/slub: fix panic in slab_alloc_node()

On 10/27/20 4:12 PM, Laurent Dufour wrote:
> Le 27/10/2020 à 16:03, Michal Hocko a écrit :
>> On Tue 27-10-20 15:39:46, Laurent Dufour wrote:
>>> Le 27/10/2020 à 15:24, Michal Hocko a écrit :
>>>> [Cc Vlastimil]
>>>>
>>>> On Tue 27-10-20 15:09:26, Laurent Dufour wrote:
>>>>
>>>> Could you be more specific? I am especially confused how the memory
>>>> hotplug is involved here. What kind of flush are we talking about?
>>>
>>> This happens when flush_cpu_slab() is called when a memory block is about to
>>> be offlined, see slab_mem_going_offline_callback() called by the
>>> MEM_GOING_OFFLINE's callback triggered by offline_pages().
>> 
>> This would be a very valuable information for the changelog. I have to
>> admit that a more detailed description would help somebody not really
>> familiar with slub internals like me.

Agreed, please include that.

>> I still fail to see why do we get an inconsistent state though. I
>> thought that no object is associated with an offlined page so how come
>> we have an object without any page?
> 
> The inconsistent state came from the IPI interrupt calling flush_cpu_slab()
> being taken between reading c->freelist and c->page.

Yes; also good to state explicitly.

>> How does this allocation path synchronizes with the offline callback?
> 
> My understanding is that this is done by the call to this_cpu_cmpxchg_double()
> done later, but I would let the slub experts detail that point.

Yes, cmpxchg will detect that c->freelist changed. If we managed to read both 
c->freelist and c->page before the interrupt (and thus not crash), 
cmpxchg_double will fail on the s->cpu_slab->tid part as flush_slab() will also 
bump the tid.

>>>>> In commit 6159d0f5c03e ("mm/slub.c: page is always non-NULL in
>>>>> node_match()") check on the page pointer has been removed assuming that
>>>>> page is always valid when it is called. It happens that this is not true in
>>>>> that particular case, so check for page before calling node_match() here.
>>>>>
>>>>> Fixes: 6159d0f5c03e ("mm/slub.c: page is always non-NULL in node_match()")
>>>>> Signed-off-by: Laurent Dufour <ldufour@...ux.ibm.com>

With the expanded changelog,

Acked-by: Vlastimil Babka <vbabka@...e.cz>
Thanks!

>>>>> Cc: Christoph Lameter <cl@...ux.com>
>>>>> Cc: Pekka Enberg <penberg@...nel.org>
>>>>> Cc: David Rientjes <rientjes@...gle.com>
>>>>> Cc: Joonsoo Kim <iamjoonsoo.kim@....com>
>>>>> Cc: Andrew Morton <akpm@...ux-foundation.org>
>>>>> Cc: stable@...r.kernel.org
>>>>> ---
>>>>>    mm/slub.c | 2 +-
>>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/mm/slub.c b/mm/slub.c
>>>>> index 8f66de8a5ab3..7dc5c6aaf4b7 100644
>>>>> --- a/mm/slub.c
>>>>> +++ b/mm/slub.c
>>>>> @@ -2852,7 +2852,7 @@ static __always_inline void *slab_alloc_node(struct kmem_cache *s,
>>>>>    	object = c->freelist;
>>>>>    	page = c->page;
>>>>> -	if (unlikely(!object || !node_match(page, node))) {
>>>>> +	if (unlikely(!object || !page || !node_match(page, node))) {
>>>>>    		object = __slab_alloc(s, gfpflags, node, addr, c);
>>>>>    	} else {
>>>>>    		void *next_object = get_freepointer_safe(s, object);
>>>>> -- 
>>>>> 2.29.1
>>>>>
>>>>
>>>
>> 
> 
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ