[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAA-cTWYKSiWRFipi6Ka7h9c99t-RAA1kVrwbjApZseErv3gFwg@mail.gmail.com>
Date: Fri, 6 Feb 2026 11:10:22 +0100
From: Jérôme Forissier <jerome@...issier.org>
To: Jens Wiklander <jens.wiklander@...aro.org>
Cc: Amirreza Zarrabi <amirreza.zarrabi@....qualcomm.com>, Sumit Garg <sumit.garg@...nel.org>,
Arnd Bergmann <arnd@...db.de>, Michael Wu <michael@...winnertech.com>, linux-arm-msm@...r.kernel.org,
op-tee@...ts.trustedfirmware.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3] tee: optee: prevent use-after-free when the client
exits before the supplicant
Hi,
On Fri, Feb 6, 2026 at 9:54 AM Jens Wiklander <jens.wiklander@...aro.org> wrote:
>
> Hi Amir,
>
> On Thu, Feb 5, 2026 at 3:13 AM Amirreza Zarrabi
> <amirreza.zarrabi@....qualcomm.com> wrote:
> >
> > Hi Jens,
> >
> > On 2/4/2026 6:46 PM, Jens Wiklander wrote:
> > > Hi Amir,
> > >
> > > On Tue, Feb 3, 2026 at 11:56 PM Amirreza Zarrabi
> > > <amirreza.zarrabi@....qualcomm.com> wrote:
> > >>
> > >> Hi Jens,
> > >>
> > >> On 2/3/2026 5:59 PM, Jens Wiklander wrote:
> > >>> Hi,
> > >>>
> > >>> On Tue, Feb 3, 2026 at 3:09 AM Amirreza Zarrabi
> > >>> <amirreza.zarrabi@....qualcomm.com> wrote:
> > >>>>
> > >>>> Hi Jens,
> > >>>>
> > >>>> On 2/2/2026 10:36 PM, Jens Wiklander wrote:
> > >>>>> Hi Amir,
> > >>>>>
> > >>>>> On Thu, Jan 29, 2026 at 4:22 AM Amirreza Zarrabi
> > >>>>> <amirreza.zarrabi@....qualcomm.com> wrote:
> > >>>>>>
> > >>>>>> Commit 70b0d6b0a199 ("tee: optee: Fix supplicant wait loop") made the
> > >>>>>> client wait as killable so it can be interrupted during shutdown or
> > >>>>>> after a supplicant crash. This changes the original lifetime expectations:
> > >>>>>> the client task can now terminate while the supplicant is still processing
> > >>>>>> its request.
> > >>>>>>
> > >>>>>> If the client exits first it removes the request from its queue and
> > >>>>>> kfree()s it, while the request ID remains in supp->idr. A subsequent
> > >>>>>> lookup on the supplicant path then dereferences freed memory, leading to
> > >>>>>> a use-after-free.
> > >>>>>>
> > >>>>>> Serialise access to the request with supp->mutex:
> > >>>>>>
> > >>>>>> * Hold supp->mutex in optee_supp_recv() and optee_supp_send() while
> > >>>>>> looking up and touching the request.
> > >>>>>> * Let optee_supp_thrd_req() notice that the client has terminated and
> > >>>>>> signal optee_supp_send() accordingly.
> > >>>>>>
> > >>>>>> With these changes the request cannot be freed while the supplicant still
> > >>>>>> has a reference, eliminating the race.
> > >>>>>>
> > >>>>>> Fixes: 70b0d6b0a199 ("tee: optee: Fix supplicant wait loop")
> > >>>>>> Signed-off-by: Amirreza Zarrabi <amirreza.zarrabi@....qualcomm.com>
> > >>>>>> ---
> > >>>>>> Changes in v3:
> > >>>>>> - Introduce processed flag instead of -1 for req->id.
> > >>>>>> - Update optee_supp_release() as reported by Michael Wu.
> > >>>>>> - Use mutex instead of guard.
> > >>>>>> - Link to v2: https://lore.kernel.org/r/20250617-fix-use-after-free-v2-1-1fbfafec5917@oss.qualcomm.com
> > >>>>>>
> > >>>>>> Changes in v2:
> > >>>>>> - Replace the static variable with a sentinel value.
> > >>>>>> - Fix the issue with returning the popped request to the supplicant.
> > >>>>>> - Link to v1: https://lore.kernel.org/r/20250605-fix-use-after-free-v1-1-a70d23bff248@oss.qualcomm.com
> > >>>>>> ---
> > >>>>>> drivers/tee/optee/supp.c | 122 +++++++++++++++++++++++++++++++++--------------
> > >>>>>> 1 file changed, 86 insertions(+), 36 deletions(-)
> > >>>>>
> [snip]
>
>
> > >>>>>> @@ -328,9 +378,8 @@ int optee_supp_send(struct tee_context *ctx, u32 ret, u32 num_params,
> > >>>>>>
> > >>>>>> mutex_lock(&supp->mutex);
> > >>>>>> req = supp_pop_req(supp, num_params, param, &num_meta);
> > >>>>>> - mutex_unlock(&supp->mutex);
> > >>>>>> -
> > >>>>>> if (IS_ERR(req)) {
> > >>>>>> + mutex_unlock(&supp->mutex);
> > >>>>>
> > >>>>> We need a way to tell the difference between an id not found and an id
> > >>>>> removed because of a killed requester.
> > >>>>> How about storing NULL for revoked requests instead of an err-pointer?
> > >>>>>
> > >>>>
> > >>>> Not sure I'm following correctly. Are you expecting supp_pop_req()
> > >>>> to return NULL instead of an err-pointer when a request has been revoked?
> > >>>
> > >>> I was looking at it again, and storing an err-pointer as you do in
> > >>> this patch has the advantage that we can tell whether the ID has been
> > >>> revoked or was never supplied. In the latter case, it suggests that
> > >>> the supplicant is doing something wrong and might as well restart in
> > >>> an attempt to recover. So, please keep using an err-pointer as a
> > >>> placeholder, but we must be able to distinguish a revoked request from
> > >>> other errors to make sure that the supplicant doesn't restart due to a
> > >>> revoked request.
> > >>>
> > >>
> > >> Understood. What if I switch the stored err-pointer to EBADF instead of ENOENT
> > >> (which seems more natural), so it doesn't overlap with other supp_pop_req() error
> > >> codes and the supplicant can reliably detect it.
> > >
> > > Any error returned by TEE_IOC_SUPPL_SEND (or TEE_IOC_SUPPL_RECV) will
> > > cause the tee-supplicant to exit. Even if we update it to ignore
> > > certain codes, we must also consider the installed base. There's not
> > > much tee-supplicant could do with this error, except logging it. But I
> > > don't think that is very useful either. Unless the tee-supplicant does
> > > anything wrong or if the device isn't working any longer, we shouldn't
> > > return an error.
> > >
> >
> > The direction of data flow in optee_supp_send is from the supplicant to optee,
> > so the only way I can return meaningful information back to the supplicant is
> > through the return value. I suppose I could simply ignore the revoked request
> > and return success, but it might be useful for the supplicant to know about it
> > in case it needs to roll back something.
> >
> > At this point I'm out of ideas :). Do you have any suggestions on how I can
> > inform the supplicant about a revoked request in optee_supp_send while returning
> > success return value?
>
> This became a bit harder than I first thought. At this point, to fix
> the possible use-after-free, we have two options:
> 1. Returning an error code: tee-supplicant will exit
> 2. Returning OK: worst case, tee-supplicant can leak memory
>
> During reboot, neither case is a problem. During normal operation,
> it's annoying if tee-supplicant exists, but you still need some
> privileges to kill the client. If we return an error, it's enough to
> update tee-supplicant to handle that error case, and we're done. The
> advantage is no added code to the kernel.
>
> I think we should do what you suggested and return an error. This will
> not happen during normal operation. We'll fix tee-supplicant to handle
> the return error properly. tee-supplicant doesn't care about what
> error code it gets. If there's an error in TEE_IOC_SUPPL_SEND, it
> knows that no one will receive whatever was sent, and cleanup is
> needed.
>
> Sumit and Jerome, what do you think?
Sounds good.
Thanks,
--
Jerome
>
> Cheers,
> Jens
Powered by blists - more mailing lists