Message-ID: <Z8a4r2mnIzTD2cZa@Arch>
Date: Tue, 4 Mar 2025 10:24:15 +0200
From: Lilith Gkini <lilithpgkini@...il.com>
To: Vlastimil Babka <vbabka@...e.cz>
Cc: Christoph Lameter <cl@...ux.com>, Pekka Enberg <penberg@...nel.org>,
David Rientjes <rientjes@...gle.com>,
Joonsoo Kim <iamjoonsoo.kim@....com>,
Andrew Morton <akpm@...ux-foundation.org>,
Roman Gushchin <roman.gushchin@...ux.dev>,
Hyeonggon Yoo <42.hyeyoo@...il.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, harry.yoo@...cle.com
Subject: Re: [PATCH] slub: Fix Off-By-One in the While condition in
on_freelist()
On Mon, Mar 03, 2025 at 08:06:32PM +0100, Vlastimil Babka wrote:
> On 3/3/25 17:41, Lilith Gkini wrote:
> > On Mon, Mar 03, 2025 at 12:06:58PM +0100, Vlastimil Babka wrote:
> >> On 3/2/25 19:01, Lilith Persefoni Gkini wrote:
> >> > If the `search` pattern is not found in the freelist then the function
> >> > should return `fp == search` where fp is the last freepointer from the
> >> > while loop.
> >> >
> >> > If the caller of the function was searching for NULL and the freelist is
> >> > valid it should return True (1), otherwise False (0).
> >>
> >> This suggests we should change the function return value to bool :)
> >>
> >
> > Alright, If you want to be more technical it's
> > `1 (true), otherwise 0 (false).`
> > Its just easier to communicate with the true or false concepts, but in C
> > we usually don't use bools cause its just 1s or 0s.
>
> Yeah, I think bools were not used initially in the kernel, but we're fine
> with them now and changing a function for other reasons is a good
> opportunity to modernize. There are some guidelines in
> Documentation/process/coding-style.rst about this (paragraphs 16 and 17).
> int is recommended if 0 means success and -EXXX for error, bool for simple
> true/false which is the case here.
Oh! Because of the emote I thought you were being sarcastic that I didn't
report it properly.
Thank you for clarifying! That should be an easy fix!
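For reference, a minimal sketch of the int-versus-bool convention described above. Both functions are hypothetical and exist only for illustration; they are not taken from mm/slub.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * int return: 0 means success, a negative errno means failure.
 * (Hypothetical example function; the 0..10 range is arbitrary.)
 */
static int toy_set_order(int *order, int value)
{
	if (value < 0 || value > 10)
		return -EINVAL;
	*order = value;
	return 0;
}

/* bool return: a plain yes/no answer, like on_freelist() gives. */
static bool toy_is_power_of_two(unsigned int x)
{
	return x != 0 && (x & (x - 1)) == 0;
}
```

A caller tests the first with `if (toy_set_order(&order, 5))` to catch an error, and the second reads naturally as a predicate: `if (toy_is_power_of_two(n))`.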
> >> I think there's a problem that none of this will fix or even report the
> >> situation properly. Even worse we'll set slab->inuse to 0 and thus pretend
> >> all objects are free. This goes contrary to the other places that respond to
> >> slab corruption by setting all objects to used and trying not to touch the
> >> slab again at all.
> >>
> >> So I think after the while loop we could determine there was a cycle if (nr
> >> == slab->objects && fp != NULL), right? In that case we could perform the
> >> same report and fix as in the "Freepointer corrupt" case?
> >
> > True! We could either add an if check after the while as you said to
> > replicate the "Freepointer corrupt" behavior...
> > Or...
> >
> > I hate to say it, or we could leave the while condition with the equal
> > sign intact, as it was, and change that `if` check from
> > `if (!check_valid_pointer(s, slab, fp)) {`
> > to
> > `if (!check_valid_pointer(s, slab, fp) || nr == slab->objects) {`
>
> You're right!
>
> > When it reaches nr == slab->objects and we are still in the while loop
> > it means that fp != NULL and therefore the freelist is corrupted (note
> > that nr starts from 0).
> >
> > This would add fewer lines of code and there won't be any repeating
> > code.
> > It will enter in the "Freechain corrupt" branch and set the tail of
> > the freelist to NULL, inform us of the error and it won't get a chance
> > to do the nr++ part, leaving nr == slab->objects in that particular
> > case, because it breaks out of the loop afterwards.
> >
> > But it will not Null-out the freelist and set inuse to objects like you
> > suggested. If that is the desired behavior instead then we could do
> > something like you suggested.
>
> We could change if (object) to if (object && nr != slab->objects) to force
> it into the "Freepointer corrupt" variant which is better. But then the
We could add a ternary operator in addition to your suggestion.
Changing this:
`slab_err(s, slab, "Freepointer corrupt");`
to this (needs adjusting for proper formatting, ofc...):
`slab_err(s, slab, (nr == slab->objects) ? "Freelist cycle detected" : "Freepointer corrupt");`
But this might be too much voodoo...
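One way to keep that line short would be to move the ternary into a small helper. The sketch below is user-space only: `toy_freelist_msg()` and `toy_report()` are hypothetical names, and `snprintf()` stands in for the kernel's printf-style reporting.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of mm/slub.c: pick the message from nr. */
static const char *toy_freelist_msg(unsigned int nr, unsigned int objects)
{
	return nr == objects ? "Freelist cycle detected"
			     : "Freepointer corrupt";
}

/* Stand-in for the reporting call: format the chosen message into buf. */
static int toy_report(char *buf, size_t len,
		      unsigned int nr, unsigned int objects)
{
	/* The message goes through "%s" so the format stays a literal. */
	return snprintf(buf, len, "%s", toy_freelist_msg(nr, objects));
}
```

In the kernel the call site would then presumably look like `slab_err(s, slab, "%s", msg)`, assuming slab_err() keeps its printf-style format argument.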
> message should be also adjusted depending on nr... it should really report
I'm not sure what you have in mind about adjusting the message depending on
nr. Do we really need to report nr in the error? Do we need to
mention anything besides "Freelist cycle detected" like you mentioned?
> "Freelist cycle detected", but that's adding too many conditions just to
> reuse the cleanup code so maybe it's more readable to check that outside of
> the while loop after all.
If the ternary operator is too unreadable we could do something like you
suggested:
```
if (fp != NULL && nr == slab->objects) {
	slab_err(s, slab, "Freelist cycle detected");
	slab->freelist = NULL;
	slab->inuse = slab->objects;
	slab_fix(s, "Freelist cleared");
	return false;
}
```
What more would you like to add in the error message?
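To sanity-check that condition outside the kernel, here is a rough user-space model of the walk. Everything prefixed `toy_` is hypothetical: objects are array slots, free pointers are indices, and `TOY_NIL` plays the role of NULL. It assumes the `<` loop condition from the patch and leaves out the max_objects check, so treat it as a sketch of the idea rather than the real on_freelist().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define TOY_MAX_OBJECTS 8
#define TOY_NIL (-1)	/* stands in for a NULL free pointer */

/* Hypothetical stand-in for struct slab: objects are array slots. */
struct toy_slab {
	int objects;			/* number of objects in the slab */
	int inuse;			/* objects accounted as allocated */
	int freelist;			/* index of the first free object */
	int next[TOY_MAX_OBJECTS];	/* per-object free pointer */
};

/* Stand-in for check_valid_pointer(): the index must name a slab object. */
static bool toy_valid(const struct toy_slab *slab, int fp)
{
	return fp >= 0 && fp < slab->objects;
}

/* Give up on the freelist and mark every object as used. */
static void toy_clear_freelist(struct toy_slab *slab, const char *why)
{
	fprintf(stderr, "toy slab: %s, freelist cleared\n", why);
	slab->freelist = TOY_NIL;
	slab->inuse = slab->objects;
}

/* True if 'search' is on the freelist; TOY_NIL asks "is the list sane?". */
static bool toy_on_freelist(struct toy_slab *slab, int search)
{
	int nr = 0;
	int prev = TOY_NIL;
	int fp = slab->freelist;

	assert(slab->objects <= TOY_MAX_OBJECTS);

	/* Assumes the '<' loop condition from the patch, not '<='. */
	while (fp != TOY_NIL && nr < slab->objects) {
		if (fp == search)
			return true;
		if (!toy_valid(slab, fp)) {
			if (prev == TOY_NIL) {
				toy_clear_freelist(slab, "Freepointer corrupt");
				return false;
			}
			/* "Freechain corrupt": cut after the last good object. */
			slab->next[prev] = TOY_NIL;
			break;
		}
		prev = fp;
		fp = slab->next[fp];
		nr++;
	}

	/* Every object was visited, yet the chain has not ended: a cycle. */
	if (fp != TOY_NIL && nr == slab->objects) {
		toy_clear_freelist(slab, "Freelist cycle detected");
		return false;
	}

	/* Simplified version of the object-count fixup. */
	slab->inuse = slab->objects - nr;
	return search == TOY_NIL;
}
```

With four objects, a chain 0 -> 1 -> 0 -> ... leaves the loop with nr == 4 and fp still set, so the cycle branch fires. A fully free but valid chain 0 -> 1 -> 2 -> 3 also ends with nr == 4, but fp is TOY_NIL there, which is why both halves of the condition are needed.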
In a previous email you mentioned this
> >> I think there's a problem that none of this will fix or even report the
> >> situation properly. Even worse we'll set slab->inuse to 0 and thus pretend
> >> all objects are free. This goes contrary to the other places that respond to
> >> slab corruption by setting all objects to used and trying not to touch the
> >> slab again at all.
If nuking it is how we should handle corrupted freelists, shouldn't we
also do the same in the "Freechain corrupt" branch? Otherwise it
wouldn't be consistent. Instead the code now just sets the tail to NULL.
In that case we'll need to do a lot more rewriting, but it might help
out with avoiding the reuse of cleanup code.
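To make the difference between the two policies concrete, here is a small user-space sketch. All `toy_` names are hypothetical and the slab is modelled as an array of indices; it only illustrates what each policy leaves behind, not how the kernel code would be restructured.

```c
#include <assert.h>

#define TOY_MAX_OBJECTS 8
#define TOY_NIL (-1)	/* stands in for a NULL free pointer */

/* Hypothetical stand-in for struct slab. */
struct toy_slab {
	int objects;			/* number of objects in the slab */
	int inuse;			/* objects accounted as allocated */
	int freelist;			/* index of the first free object */
	int next[TOY_MAX_OBJECTS];	/* per-object free pointer */
};

/* "Freechain corrupt" policy: keep the good prefix, cut the rest. */
static void toy_truncate_after(struct toy_slab *slab, int last_good)
{
	assert(last_good >= 0 && last_good < slab->objects);
	slab->next[last_good] = TOY_NIL;
}

/* "Freepointer corrupt" policy: drop the list, mark everything used. */
static void toy_quarantine(struct toy_slab *slab)
{
	slab->freelist = TOY_NIL;
	slab->inuse = slab->objects;
}

/*
 * Count the objects still reachable from the freelist head.
 * Only meant to be called after one of the repairs above, when
 * every remaining index is valid.
 */
static int toy_count_free(const struct toy_slab *slab)
{
	int n = 0;
	int fp = slab->freelist;

	while (fp != TOY_NIL && n < slab->objects) {
		n++;
		fp = slab->next[fp];
	}
	return n;
}
```

If object 1 holds a corrupt free pointer in a four-object slab, truncating after it leaves objects 0 and 1 allocatable, while quarantining leaves none and never touches the slab's free pointers again.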