[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <714d353a-49c8-4cbd-88d6-e24ae8f78aaa@suse.cz>
Date: Tue, 4 Mar 2025 09:41:23 +0100
From: Vlastimil Babka <vbabka@...e.cz>
To: Lilith Gkini <lilithpgkini@...il.com>
Cc: Christoph Lameter <cl@...ux.com>, Pekka Enberg <penberg@...nel.org>,
David Rientjes <rientjes@...gle.com>, Joonsoo Kim <iamjoonsoo.kim@....com>,
Andrew Morton <akpm@...ux-foundation.org>,
Roman Gushchin <roman.gushchin@...ux.dev>,
Hyeonggon Yoo <42.hyeyoo@...il.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, harry.yoo@...cle.com
Subject: Re: [PATCH] slub: Fix Off-By-One in the While condition in
on_freelist()
On 3/4/25 09:24, Lilith Gkini wrote:
> On Mon, Mar 03, 2025 at 08:06:32PM +0100, Vlastimil Babka wrote:
>> Yeah, I think bools were not used initially int the kernel, but we're fine
>> with them now and changing a function for other reasons is a good
>> opportunity to modernize. There are some guidelines in
>> Documentation/process/coding-style.rst about this (paragraphs 16 and 17).
>> int is recommended if 0 means success and -EXXX for error, bool for simple
>> true/false which is the case here.
>
> Oh! because of the emote I thought you were being sarcastic that I didnt
> report it properly.
Ah, sorry about that misunderstanding.
> Thank you for clarifying! That should be an easy fix!
Great!
>> > When it reaches nr == slab->objects and we are still in the while loop
>> > it means that fp != NULL and therefore the freelist is corrupted (note
>> > that nr starts from 0).
>> >
>> > This would add fewer lines of code and there won't be any repeating
>> > code.
>> > It will enter in the "Freechain corrupt" branch and set the tail of
>> > the freelist to NULL, inform us of the error and it won't get a chance
>> > to do the nr++ part, leaving nr == slab->objects in that particular
>> > case, because it breaks of the loop afterwards.
>> >
>> > But it will not Null-out the freelist and set inuse to objects like you
>> > suggested. If that is the desired behavior instead then we could do
>> > something like you suggested.
>>
>> We could change if (object) to if (object && nr != slab->objects) to force
>> it into the "Freepointer corrupt" variant which is better. But then the
>
> We could add a ternary operator in addition to you suggestion.
> Changing this:
> `slab_err(s, slab, "Freepointer corrupt");`
>
> to this (needs adjusting for the proper formating ofc...):
> `slab_err(s, slab, (nr == slab->objects) ? "Freelist cycle detected" : "Freepointer corrupt");`
>
> But this might be too much voodoo...
Yeah it means 3 places where we check (nr == slab->objects) so it's not very
readable.
>> message should be also adjusted depending on nr... it should really report
>
> I m not sure what you have in mind about the adjusting the message on
> nr. Do we really need to report the nr in the error? Do we need to
> mention anything besides "Freelist cycle detected" like you mentioned?
I meant just the Freelist cycle detected" message. As "nr" equals
slab->objects it's not so interesting.
>> "Freelist cycle detected", but that's adding too many conditions just to
>> reuse the cleanup code so maybe it's more readable to check that outside of
>> the while loop after all.
>
> If the ternary operator is too unreadable we could do something like you
> suggested
>
> ```
> if (fp != NULL && nr == slab->objects) {
> slab_err(s, slab, "Freelist cycle detected");
> slab->freelist = NULL;
> slab->inuse = slab->objects;
> slab_fix(s, "Freelist cleared");
> return false;
> }
> ```
Yeah looks good.
>
> What more would you like to add in the error message?
>
> In a previous email you mentioned this
>
>> >> I think there's a problem that none of this will fix or even report the
>> >> situation properly. Even worse we'll set slab->inuse to 0 and thus pretend
>> >> all objects are free. This goes contrary to the other places that respond to
>> >> slab corruption by setting all objects to used and trying not to touch the
>> >> slab again at all.
>
> If nuking it is how we should hangle corrupted freelists shouldn't we
> also do the same in the "Freechain corrupt" branch? Otherwise it
> wouldn't be consistent. Instead the code now just sets the tail to NULL.
It sets the tail to NULL but then also breaks out of the loop (btw that
break; could be moved to the if (object) branch to make it more obvious) to
the code below, which should also set slab->inuse properly. So the result
should be consistent? In that case we're able to salvage at least the
uncorrupted part of the freelist. It's likely corrupted by a use-after-free
of a single object overwriting the freepointer.
In case of the cycle we could also in theory replace the freepointer causing
the cycle to NULL. But we'd have to spend more effort to determine which it
was. Cycle is also probably due to a more serious issue than single object
overwrite - it's unlikely a random overwrite would corrupt a freepointer in
a way that it's valid but causing a cycle. So throwing out everything seems
the easiest.
> In that case we'll need to do a lot more rewriting, but it might help
> out with avoiding the reuse of cleanup code.
Powered by blists - more mailing lists