[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHUa44GhTy=TWAuG7JY_6jjWG-uCGnLyZRn1JhGWgaWt28cH7w@mail.gmail.com>
Date: Fri, 6 Feb 2026 09:54:12 +0100
From: Jens Wiklander <jens.wiklander@...aro.org>
To: Amirreza Zarrabi <amirreza.zarrabi@....qualcomm.com>
Cc: Sumit Garg <sumit.garg@...nel.org>, Arnd Bergmann <arnd@...db.de>,
Michael Wu <michael@...winnertech.com>, linux-arm-msm@...r.kernel.org,
op-tee@...ts.trustedfirmware.org, linux-kernel@...r.kernel.org,
Jerome Forissier <jerome@...issier.org>
Subject: Re: [PATCH v3] tee: optee: prevent use-after-free when the client
exits before the supplicant
Hi Amir,
On Thu, Feb 5, 2026 at 3:13 AM Amirreza Zarrabi
<amirreza.zarrabi@....qualcomm.com> wrote:
>
> Hi Jens,
>
> On 2/4/2026 6:46 PM, Jens Wiklander wrote:
> > Hi Amir,
> >
> > On Tue, Feb 3, 2026 at 11:56 PM Amirreza Zarrabi
> > <amirreza.zarrabi@....qualcomm.com> wrote:
> >>
> >> Hi Jens,
> >>
> >> On 2/3/2026 5:59 PM, Jens Wiklander wrote:
> >>> Hi,
> >>>
> >>> On Tue, Feb 3, 2026 at 3:09 AM Amirreza Zarrabi
> >>> <amirreza.zarrabi@....qualcomm.com> wrote:
> >>>>
> >>>> Hi Jens,
> >>>>
> >>>> On 2/2/2026 10:36 PM, Jens Wiklander wrote:
> >>>>> Hi Amir,
> >>>>>
> >>>>> On Thu, Jan 29, 2026 at 4:22 AM Amirreza Zarrabi
> >>>>> <amirreza.zarrabi@....qualcomm.com> wrote:
> >>>>>>
> >>>>>> Commit 70b0d6b0a199 ("tee: optee: Fix supplicant wait loop") made the
> >>>>>> client wait as killable so it can be interrupted during shutdown or
> >>>>>> after a supplicant crash. This changes the original lifetime expectations:
> >>>>>> the client task can now terminate while the supplicant is still processing
> >>>>>> its request.
> >>>>>>
> >>>>>> If the client exits first it removes the request from its queue and
> >>>>>> kfree()s it, while the request ID remains in supp->idr. A subsequent
> >>>>>> lookup on the supplicant path then dereferences freed memory, leading to
> >>>>>> a use-after-free.
> >>>>>>
> >>>>>> Serialise access to the request with supp->mutex:
> >>>>>>
> >>>>>> * Hold supp->mutex in optee_supp_recv() and optee_supp_send() while
> >>>>>> looking up and touching the request.
> >>>>>> * Let optee_supp_thrd_req() notice that the client has terminated and
> >>>>>> signal optee_supp_send() accordingly.
> >>>>>>
> >>>>>> With these changes the request cannot be freed while the supplicant still
> >>>>>> has a reference, eliminating the race.
> >>>>>>
> >>>>>> Fixes: 70b0d6b0a199 ("tee: optee: Fix supplicant wait loop")
> >>>>>> Signed-off-by: Amirreza Zarrabi <amirreza.zarrabi@....qualcomm.com>
> >>>>>> ---
> >>>>>> Changes in v3:
> >>>>>> - Introduce processed flag instead of -1 for req->id.
> >>>>>> - Update optee_supp_release() as reported by Michael Wu.
> >>>>>> - Use mutex instead of guard.
> >>>>>> - Link to v2: https://lore.kernel.org/r/20250617-fix-use-after-free-v2-1-1fbfafec5917@oss.qualcomm.com
> >>>>>>
> >>>>>> Changes in v2:
> >>>>>> - Replace the static variable with a sentinel value.
> >>>>>> - Fix the issue with returning the popped request to the supplicant.
> >>>>>> - Link to v1: https://lore.kernel.org/r/20250605-fix-use-after-free-v1-1-a70d23bff248@oss.qualcomm.com
> >>>>>> ---
> >>>>>> drivers/tee/optee/supp.c | 122 +++++++++++++++++++++++++++++++++--------------
> >>>>>> 1 file changed, 86 insertions(+), 36 deletions(-)
> >>>>>
[snip]
> >>>>>> @@ -328,9 +378,8 @@ int optee_supp_send(struct tee_context *ctx, u32 ret, u32 num_params,
> >>>>>>
> >>>>>> mutex_lock(&supp->mutex);
> >>>>>> req = supp_pop_req(supp, num_params, param, &num_meta);
> >>>>>> - mutex_unlock(&supp->mutex);
> >>>>>> -
> >>>>>> if (IS_ERR(req)) {
> >>>>>> + mutex_unlock(&supp->mutex);
> >>>>>
> >>>>> We need a way to tell the difference between an id not found and an id
> >>>>> removed because of a killed requester.
> >>>>> How about storing NULL for revoked requests instead of an err-pointer?
> >>>>>
> >>>>
> >>>> Not sure I'm following correctly. Are you expecting supp_pop_req()
> >>>> to return NULL instead of an err-pointer when a request has been revoked?
> >>>
> >>> I was looking at it again, and storing an err-pointer as you do in
> >>> this patch has the advantage that we can tell whether the ID has been
> >>> revoked or was never supplied. In the latter case, it suggests that
> >>> the supplicant is doing something wrong and might as well restart in
> >>> an attempt to recover. So, please keep using an err-pointer as a
> >>> placeholder, but we must be able to distinguish a revoked request from
> >>> other errors to make sure that the supplicant doesn't restart due to a
> >>> revoked request.
> >>>
> >>
> >> Understood. What if I switch the stored err-pointer to EBADF instead of ENOENT
> >> (which seems more natural), so it doesn't overlap with other supp_pop_req() error
> >> codes and the supplicant can reliably detect it.
> >
> > Any error returned by TEE_IOC_SUPPL_SEND (or TEE_IOC_SUPPL_RECV) will
> > cause the tee-supplicant to exit. Even if we update it to ignore
> > certain codes, we must also consider the installed base. There's not
> > much tee-supplicant could do with this error, except logging it. But I
> > don't think that is very useful either. Unless the tee-supplicant does
> > anything wrong or if the device isn't working any longer, we shouldn't
> > return an error.
> >
>
> The direction of data flow in optee_supp_send is from the supplicant to optee,
> so the only way I can return meaningful information back to the supplicant is
> through the return value. I suppose I could simply ignore the revoked request
> and return success, but it might be useful for the supplicant to know about it
> in case it needs to roll back something.
>
> At this point I'm out of ideas :). Do you have any suggestions on how I can
> inform the supplicant about a revoked request in optee_supp_send while returning
> success return value?
This became a bit harder than I first thought. At this point, to fix
the possible use-after-free, we have two options:
1. Returning an error code: tee-supplicant will exit
2. Returning OK: worst case, tee-supplicant can leak memory
During reboot, neither case is a problem. During normal operation,
it's annoying if tee-supplicant exists, but you still need some
privileges to kill the client. If we return an error, it's enough to
update tee-supplicant to handle that error case, and we're done. The
advantage is no added code to the kernel.
I think we should do what you suggested and return an error. This will
not happen during normal operation. We'll fix tee-supplicant to handle
the return error properly. tee-supplicant doesn't care about what
error code it gets. If there's an error in TEE_IOC_SUPPL_SEND, it
knows that no one will receive whatever was sent, and cleanup is
needed.
Sumit and Jerome, what do you think?
Cheers,
Jens
Powered by blists - more mailing lists