[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Z31wEtweV8FlQsVA@quatroqueijos.cascardo.eti.br>
Date: Tue, 7 Jan 2025 15:18:58 -0300
From: Thadeu Lima de Souza Cascardo <cascardo@...lia.com>
To: Alexander Aring <aahringo@...hat.com>
Cc: gfs2@...ts.linux.dev, David Teigland <teigland@...hat.com>,
linux-kernel@...r.kernel.org, kernel-dev@...lia.com
Subject: Re: [PATCH] dlm: prevent NPD when writing a positive value to
event_done
On Mon, Jan 06, 2025 at 11:14:35AM -0500, Alexander Aring wrote:
> Hi,
>
> On Fri, Dec 20, 2024 at 9:28 AM Thadeu Lima de Souza Cascardo
> <cascardo@...lia.com> wrote:
> >
> > do_uevent returns the value written to event_done. In case it is a positive
> > value, new_lockspace would undo all the work, and lockspace would not be
> > set. __dlm_new_lockspace, however, would treat that positive value as a
> > success due to commit 8511a2728ab8 ("dlm: fix use count with multiple
> > joins").
> >
> > Down the line, device_create_lockspace would pass that NULL lockspace to
> > dlm_find_lockspace_local, leading to a NULL pointer dereference:
> >
>
> This issue starts to be an issue since commit 4db41bf4f04f ("dlm:
> remove ls_local_handle from struct dlm_ls"). Before
> dlm_find_lockspace_local() was then returning NULL and
> device_create_lockspace() returned -ENOENT as it wasn't found a NULL
> lockspace handle, see [0].
> After the patch it dlm_find_lockspace_local() will end in a
> null-pointer dereference because we don't do this lookup of pointers
> when we already should have this pointer already.
> This behaviour is kind of weird (as it makes no sense to lookup NULL)
> and yes, I didn't see that when I wrote commit 4db41bf4f04f ("dlm:
> remove ls_local_handle from struct dlm_ls"). Just this commit signals
> us we do something stupid with the error handling.
>
> May I ask you which user space software do you use? As for
> dlm_controld I can't see that there is a positive number case for
> telling the event_done attribute. [1]
I was just playing with it manually, that is, without dlm_controld. I was
writing down into configfs and sysfs manually while issuing ioctls and then
tripped over this NULL pointer dereference.
I see how commit 4db41bf4f04f ("dlm: remove ls_local_handle from struct
dlm_ls") leads to the NPD. In any case, I suppose the Fixes line could
either be the one I pointed to (since that is what leads to such positive
return values to undo lockspace creation and still be interpreted as
success) or 4db41bf4f04f, specially if we want to restrict the backports to
only the kernels with this latter commit.
Regards.
Cascardo.
>
> Your patch is fine anyway as a positive event_done should not be
> handled as an error (so far I can say that from the current DLM UAPI
> spec). We might need to review the whole error handling here again as
> the caller is only interested if it failed or not. Even kernel
> lockspaces clear positive error codes to zero whereas user space (our
> case) doesn't do that.
>
> - Alex
>
> [0] https://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm.git/tree/fs/dlm/user.c?h=dlm-6.9#n431
> [1] https://pagure.io/dlm/blob/main/f/dlm_controld/cpg.c#_1811
>
Powered by blists - more mailing lists