Message-ID: <20230601082928.4mk7hfi5hunaxm4y@intel.intel>
Date: Thu, 1 Jun 2023 10:29:28 +0200
From: Andi Shyti <andi.shyti@...nel.org>
To: 대인기/Tizen Platform Lab(SR)/삼성전자
<inki.dae@...sung.com>
Cc: 'lm0963' <lm0963hack@...il.com>, sw0312.kim@...sung.com,
kyungmin.park@...sung.com, airlied@...il.com, daniel@...ll.ch,
krzysztof.kozlowski@...aro.org, alim.akhtar@...sung.com,
dri-devel@...ts.freedesktop.org,
linux-arm-kernel@...ts.infradead.org,
linux-samsung-soc@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] drm/exynos: fix race condition UAF in
exynos_g2d_exec_ioctl
Hi Inki,
> > > > > > > If it is async, runqueue_node is freed in g2d_runqueue_worker on another
> > > > > > > worker thread. So in extreme cases, if g2d_runqueue_worker runs first, and
> > > > > > > then executes the following if statement, there will be use-after-free.
> > > > > > >
> > > > > > > Signed-off-by: Min Li <lm0963hack@...il.com>
> > > > > > > ---
> > > > > > > drivers/gpu/drm/exynos/exynos_drm_g2d.c | 2 +-
> > > > > > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> > > > > > > index ec784e58da5c..414e585ec7dd 100644
> > > > > > > --- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> > > > > > > +++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> > > > > > > @@ -1335,7 +1335,7 @@ int exynos_g2d_exec_ioctl(struct drm_device *drm_dev, void *data,
> > > > > > > /* Let the runqueue know that there is work to do. */
> > > > > > > queue_work(g2d->g2d_workq, &g2d->runqueue_work);
> > > > > > >
> > > > > > > - if (runqueue_node->async)
> > > > > > > + if (req->async)
> > > > > >
> > > > > > did you actually hit this? If you did, then the fix is not OK.
> > > > >
> > > > > No, I didn't actually hit this. I found it through code review. This
> > > > > is only a theoretical issue that can only be triggered in extreme
> > > > > cases.
> > > >
> > > > first of all runqueue is used again two lines below this, which
> > > > means that if you don't hit the uaf here you will hit it
> > > > immediately after.
> > >
> > > No, if async is true, then it will goto out, which will directly return.
> > >
> > > if (runqueue_node->async)
> > > goto out; // here, go to out, will directly return
> > >
> > > wait_for_completion(&runqueue_node->complete); // not hit
> > > g2d_free_runqueue_node(g2d, runqueue_node);
> > >
> > > out:
> > > return 0;
> >
> > that's right, sorry, I misread it.
> >
> > > > Second, if runqueue is freed, then we need to remove the part
> > > > where it's freed because it doesn't make sense to free runqueue
> > > > at this stage.
> > >
> > > It is freed by g2d_free_runqueue_node in g2d_runqueue_worker
> > >
> > > static void g2d_runqueue_worker(struct work_struct *work)
> > > {
> > > ......
> > > if (runqueue_node) {
> > > pm_runtime_mark_last_busy(g2d->dev);
> > > pm_runtime_put_autosuspend(g2d->dev);
> > >
> > > complete(&runqueue_node->complete);
> > > if (runqueue_node->async)
> > > g2d_free_runqueue_node(g2d, runqueue_node); // freed here
> >
> > this is what I'm wondering: is it correct to free a resource
> > here? The design looks to me a bit fragile and prone to mistakes.
>
> This question seems to deviate from the purpose of this patch. If you are providing additional opinions for code quality improvement unrelated to this patch, it would be more appropriate for me to answer instead of him.

It's not deviating, as the question was already raised in my first
review. It just looks strange to me that a piece of data shared
amongst processes can be freed without synchronizing. A bunch
of ifs does not make it robust enough.

The patch itself, from my point of view, is not really fixing much
and won't make any difference; it just exposes the weakness I
mentioned.

However, honestly speaking, I don't know the driver well enough
to suggest architectural changes, and that's why I r-b'ed this
one. But the first thing that comes to my mind, without looking
much at the code, is using krefs as a way to make sure that a
resource doesn't magically disappear under your nose.

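To make the kref idea concrete, here is a minimal userspace sketch of
the pattern, not driver code. It uses C11 atomics in place of the
kernel's struct kref, and every name in it (struct node, node_get,
node_put, worker_finish, submit) is hypothetical. The worker is called
inline before the submitter reads the flag, which mimics the ordering
the msleep() experiment forces; no real threads are involved.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's runqueue node, not the real struct. */
struct node {
	atomic_int refcount;	/* plays the role of struct kref */
	bool async;
};

static atomic_int nodes_freed;	/* demo bookkeeping only */

static struct node *node_alloc(bool async)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return NULL;
	atomic_init(&n->refcount, 1);	/* creator holds the first reference */
	n->async = async;
	return n;
}

/* Take an extra reference; the caller must already hold one. */
static void node_get(struct node *n)
{
	int old = atomic_fetch_add(&n->refcount, 1);

	assert(old > 0);
	(void)old;
}

/* Drop a reference; returns true if this call freed the node. */
static bool node_put(struct node *n)
{
	if (atomic_fetch_sub(&n->refcount, 1) == 1) {
		free(n);
		atomic_fetch_add(&nodes_freed, 1);
		return true;
	}
	return false;
}

/* Worker side: the job is done, so drop the reference the queue owned. */
static bool worker_finish(struct node *n)
{
	return node_put(n);
}

/*
 * Submitter side. The worker "runs first", yet reading n->async is
 * still safe because the submitter holds its own reference.
 * Returns true if the request was async and the node outlived the worker.
 */
static bool submit(bool async_req)
{
	struct node *n = node_alloc(async_req);
	bool freed_by_worker, async;

	if (!n)
		return false;

	node_get(n);				/* reference handed to the queue */
	freed_by_worker = worker_finish(n);	/* worker completes before we look */
	async = n->async;			/* valid: our reference pins the node */
	node_put(n);				/* last reference, node is freed here */

	return async && !freed_by_worker;
}
```

Calling submit(true) from a small main() exercises the async path. The
point is only that whoever drops the last reference frees the node, so
neither side needs to guess whether the other one is done.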
But, of course, this is up to you, and if in your opinion this is
OK and it fixes it... then you definitely know better :)

Thanks for this discussion,
Andi

> The runqueue node - which contains command list for g2d rendering - is generated when the user calls the ioctl system call. Therefore, if the user-requested command list is rendered by g2d device then there is no longer a reason to keep it. :)
>
> >
> > The patch per se is OK. It doesn't make much difference to me
> > where you actually read async, although this patch looks a bit
> > safer:
> >
> > Reviewed-by: Andi Shyti <andi.shyti@...nel.org>
>
> Thanks,
> Inki Dae
>
> >
> > However some refactoring might be needed to make it a bit more
> > robust.
> >
> > Thanks,
> > Andi
> >
> > > }
> > >
> > > >
> > > > Finally, can you elaborate on the code review that you did so
> > > > that we all understand it?
> > >
> > > queue_work(g2d->g2d_workq, &g2d->runqueue_work);
> > > msleep(100); // add sleep here to let g2d_runqueue_worker run first
> > > if (runqueue_node->async)
> > > goto out;
> > >
> > >
> > > >
> > > > Andi
> > >
> > >
> > >
> > > --
> > > Min Li
>
>