linux-kernel - Re: [PATCH v3] tee: optee: prevent use-after-free when the client exits before the supplicant

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAHUa44EhaztJ+5piu0c-HQbqZFX47uLPJ+VDRp=Bp6BMeeARAQ@mail.gmail.com>
Date: Wed, 4 Feb 2026 08:46:43 +0100
From: Jens Wiklander <jens.wiklander@...aro.org>
To: Amirreza Zarrabi <amirreza.zarrabi@....qualcomm.com>
Cc: Sumit Garg <sumit.garg@...nel.org>, Arnd Bergmann <arnd@...db.de>, 
	Michael Wu <michael@...winnertech.com>, linux-arm-msm@...r.kernel.org, 
	op-tee@...ts.trustedfirmware.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3] tee: optee: prevent use-after-free when the client
 exits before the supplicant

Hi Amir,

On Tue, Feb 3, 2026 at 11:56 PM Amirreza Zarrabi
<amirreza.zarrabi@....qualcomm.com> wrote:
>
> Hi Jens,
>
> On 2/3/2026 5:59 PM, Jens Wiklander wrote:
> > Hi,
> >
> > On Tue, Feb 3, 2026 at 3:09 AM Amirreza Zarrabi
> > <amirreza.zarrabi@....qualcomm.com> wrote:
> >>
> >> Hi Jens,
> >>
> >> On 2/2/2026 10:36 PM, Jens Wiklander wrote:
> >>> Hi Amir,
> >>>
> >>> On Thu, Jan 29, 2026 at 4:22 AM Amirreza Zarrabi
> >>> <amirreza.zarrabi@....qualcomm.com> wrote:
> >>>>
> >>>> Commit 70b0d6b0a199 ("tee: optee: Fix supplicant wait loop") made the
> >>>> client wait as killable so it can be interrupted during shutdown or
> >>>> after a supplicant crash. This changes the original lifetime expectations:
> >>>> the client task can now terminate while the supplicant is still processing
> >>>> its request.
> >>>>
> >>>> If the client exits first it removes the request from its queue and
> >>>> kfree()s it, while the request ID remains in supp->idr. A subsequent
> >>>> lookup on the supplicant path then dereferences freed memory, leading to
> >>>> a use-after-free.
> >>>>
> >>>> Serialise access to the request with supp->mutex:
> >>>>
> >>>>   * Hold supp->mutex in optee_supp_recv() and optee_supp_send() while
> >>>>     looking up and touching the request.
> >>>>   * Let optee_supp_thrd_req() notice that the client has terminated and
> >>>>     signal optee_supp_send() accordingly.
> >>>>
> >>>> With these changes the request cannot be freed while the supplicant still
> >>>> has a reference, eliminating the race.
> >>>>
> >>>> Fixes: 70b0d6b0a199 ("tee: optee: Fix supplicant wait loop")
> >>>> Signed-off-by: Amirreza Zarrabi <amirreza.zarrabi@....qualcomm.com>
> >>>> ---
> >>>> Changes in v3:
> >>>> - Introduce processed flag instead of -1 for req->id.
> >>>> - Update optee_supp_release() as reported by Michael Wu.
> >>>> - Use mutex instead of guard.
> >>>> - Link to v2: https://lore.kernel.org/r/20250617-fix-use-after-free-v2-1-1fbfafec5917@oss.qualcomm.com
> >>>>
> >>>> Changes in v2:
> >>>> - Replace the static variable with a sentinel value.
> >>>> - Fix the issue with returning the popped request to the supplicant.
> >>>> - Link to v1: https://lore.kernel.org/r/20250605-fix-use-after-free-v1-1-a70d23bff248@oss.qualcomm.com
> >>>> ---
> >>>>  drivers/tee/optee/supp.c | 122 +++++++++++++++++++++++++++++++++--------------
> >>>>  1 file changed, 86 insertions(+), 36 deletions(-)
> >>>
> >>> I had forgotten about this. I'd like to prioritize getting this fixed
> >>> soon. By the way, how did you test this?
> >>>
> >>
> >> Thanks for the update. I currently don't have access to the setup required to run
> >> the tests myself. My plan is to finalize the design and implementation, then
> >> ask Michael Wu to run his use case. Based on his earlier feedback, the patch
> >> appears to be working as expected.
> >>
> >> https://lore.kernel.org/all/292653ba-3836-00f1-acd4-a28b1c54388c@allwinnertech.com/
> >
> > OK
> >
> >>
> >>>>
> >>>> diff --git a/drivers/tee/optee/supp.c b/drivers/tee/optee/supp.c
> >>>> index d0f397c90242..0ec66008df19 100644
> >>>> --- a/drivers/tee/optee/supp.c
> >>>> +++ b/drivers/tee/optee/supp.c
> >>>> @@ -10,7 +10,11 @@
> >>>>  struct optee_supp_req {
> >>>>         struct list_head link;
> >>>>
> >>>> +       int id;
> >>>> +
> >>>>         bool in_queue;
> >>>> +       bool processed;
> >>>> +
> >>>>         u32 func;
> >>>>         u32 ret;
> >>>>         size_t num_params;
> >>>> @@ -19,6 +23,9 @@ struct optee_supp_req {
> >>>>         struct completion c;
> >>>>  };
> >>>>
> >>>> +/* It is temporary request used for invalid pending request in supp->idr. */
> >>>> +#define INVALID_REQ_PTR ((struct optee_supp_req *)ERR_PTR(-ENOENT))
> >>>> +
> >>>>  void optee_supp_init(struct optee_supp *supp)
> >>>>  {
> >>>>         memset(supp, 0, sizeof(*supp));
> >>>> @@ -46,6 +53,10 @@ void optee_supp_release(struct optee_supp *supp)
> >>>>         /* Abort all request retrieved by supplicant */
> >>>>         idr_for_each_entry(&supp->idr, req, id) {
> >>>>                 idr_remove(&supp->idr, id);
> >>>> +               /* Skip if request was already marked invalid */
> >>>> +               if (IS_ERR(req))
> >>>> +                       continue;
> >>>> +
> >>>>                 req->ret = TEEC_ERROR_COMMUNICATION;
> >>>>                 complete(&req->c);
> >>>>         }
> >>>> @@ -102,6 +113,7 @@ u32 optee_supp_thrd_req(struct tee_context *ctx, u32 func, size_t num_params,
> >>>>         mutex_lock(&supp->mutex);
> >>>>         list_add_tail(&req->link, &supp->reqs);
> >>>>         req->in_queue = true;
> >>>> +       req->processed = false;
> >>>>         mutex_unlock(&supp->mutex);
> >>>>
> >>>>         /* Tell an eventual waiter there's a new request */
> >>>> @@ -117,21 +129,40 @@ u32 optee_supp_thrd_req(struct tee_context *ctx, u32 func, size_t num_params,
> >>>>         if (wait_for_completion_killable(&req->c)) {
> >>>>                 mutex_lock(&supp->mutex);
> >>>>                 if (req->in_queue) {
> >>>> +                       /* Supplicant has not seen this request yet. */
> >>>>                         list_del(&req->link);
> >>>>                         req->in_queue = false;
> >>>> +
> >>>> +                       ret = TEEC_ERROR_COMMUNICATION;
> >>>> +               } else if (req->processed) {
> >>>> +                       /*
> >>>> +                        * Supplicant has processed this request. Ignore the
> >>>> +                        * kill signal for now and submit the result.
> >>>> +                        */
> >>>> +                       ret = req->ret;
> >>>> +               } else {
> >>>> +                       /*
> >>>> +                        * Supplicant is in the middle of processing this
> >>>> +                        * request. Replace req with INVALID_REQ_PTR so that
> >>>> +                        * the ID remains busy, causing optee_supp_send() to
> >>>> +                        * fail on the next call to supp_pop_req() with this ID.
> >>>> +                        */
> >>>> +                       idr_replace(&supp->idr, INVALID_REQ_PTR, req->id);
> >>>> +                       ret = TEEC_ERROR_COMMUNICATION;
> >>>>                 }
> >>>> +
> >>>>                 mutex_unlock(&supp->mutex);
> >>>> -               req->ret = TEEC_ERROR_COMMUNICATION;
> >>>> +       } else {
> >>>> +               ret = req->ret;
> >>>>         }
> >>>>
> >>>> -       ret = req->ret;
> >>>>         kfree(req);
> >>>>
> >>>>         return ret;
> >>>>  }
> >>>>
> >>>>  static struct optee_supp_req  *supp_pop_entry(struct optee_supp *supp,
> >>>> -                                             int num_params, int *id)
> >>>> +                                             int num_params)
> >>>>  {
> >>>>         struct optee_supp_req *req;
> >>>>
> >>>> @@ -153,8 +184,8 @@ static struct optee_supp_req  *supp_pop_entry(struct optee_supp *supp,
> >>>>                 return ERR_PTR(-EINVAL);
> >>>>         }
> >>>>
> >>>> -       *id = idr_alloc(&supp->idr, req, 1, 0, GFP_KERNEL);
> >>>> -       if (*id < 0)
> >>>> +       req->id = idr_alloc(&supp->idr, req, 1, 0, GFP_KERNEL);
> >>>> +       if (req->id < 0)
> >>>>                 return ERR_PTR(-ENOMEM);
> >>>
> >>> Since we're now storing the supplicant request ID, wouldn't it make
> >>> sense to pre-allocate the ID when allocating the request to avoid this
> >>> error case?
> >>>
> >>
> >> True, but allocating the ID at this stage has one advantage.
> >> If an ID is not available, the request can remain on the request list,
> >> allowing the supplicant to retry later when resources become available.
> >> If ID allocation fails during request creation, I have no choice but
> >> to drop the request and report an error to optee.
> >
> > We're allocating in the range 1..INT_MAX, and not more than a handful
> > are expected to be active at a time. If we run out of IDs, we have
> > bigger problems.
> >
>
> Ack.
>
> >>
> >>>>
> >>>>         list_del(&req->link);
> >>>> @@ -214,7 +245,6 @@ int optee_supp_recv(struct tee_context *ctx, u32 *func, u32 *num_params,
> >>>>         struct optee *optee = tee_get_drvdata(teedev);
> >>>>         struct optee_supp *supp = &optee->supp;
> >>>>         struct optee_supp_req *req = NULL;
> >>>> -       int id;
> >>>>         size_t num_meta;
> >>>>         int rc;
> >>>>
> >>>> @@ -224,15 +254,48 @@ int optee_supp_recv(struct tee_context *ctx, u32 *func, u32 *num_params,
> >>>>
> >>>>         while (true) {
> >>>>                 mutex_lock(&supp->mutex);
> >>>> -               req = supp_pop_entry(supp, *num_params - num_meta, &id);
> >>>> -               mutex_unlock(&supp->mutex);
> >>>>
> >>>> -               if (req) {
> >>>> -                       if (IS_ERR(req))
> >>>> -                               return PTR_ERR(req);
> >>>> -                       break;
> >>>> +               req = supp_pop_entry(supp, *num_params - num_meta);
> >>>> +               if (!req) {
> >>>> +                       mutex_unlock(&supp->mutex);
> >>>> +                       goto wait_for_request;
> >>>> +               }
> >>>> +
> >>>> +               if (IS_ERR(req)) {
> >>>> +                       rc = PTR_ERR(req);
> >>>> +                       mutex_unlock(&supp->mutex);
> >>>> +
> >>>> +                       return rc;
> >>>>                 }
> >>>>
> >>>> +               /*
> >>>> +                * Process the request while holding the lock, so that
> >>>> +                * optee_supp_thrd_req() doesn't pull the request from under us.
> >>>> +                */
> >>>> +
> >>>> +               if (num_meta) {
> >>>> +                       /*
> >>>> +                        * tee-supplicant support meta parameters ->
> >>>> +                        * requests can be processed asynchronously.
> >>>> +                        */
> >>>> +                       param->attr = TEE_IOCTL_PARAM_ATTR_TYPE_VALUE_INOUT |
> >>>> +                                     TEE_IOCTL_PARAM_ATTR_META;
> >>>> +                       param->u.value.a = req->id;
> >>>> +                       param->u.value.b = 0;
> >>>> +                       param->u.value.c = 0;
> >>>> +               } else {
> >>>> +                       supp->req_id = req->id;
> >>>> +               }
> >>>> +
> >>>> +               *func = req->func;
> >>>> +               *num_params = req->num_params + num_meta;
> >>>> +               memcpy(param + num_meta, req->param,
> >>>> +                      sizeof(struct tee_param) * req->num_params);
> >>>> +
> >>>> +               mutex_unlock(&supp->mutex);
> >>>> +               return 0;
> >>>
> >>> Do we really need to move this into the loop? The structure of the
> >>> function becomes a bit unusual and harder to read.
> >>>
> >>
> >> Ack. I'll reorganize this function.
> >>
> >>>> +
> >>>> +wait_for_request:
> >>>>                 /*
> >>>>                  * If we didn't get a request we'll block in
> >>>>                  * wait_for_completion() to avoid needless spinning.
> >>>> @@ -243,29 +306,10 @@ int optee_supp_recv(struct tee_context *ctx, u32 *func, u32 *num_params,
> >>>>                  */
> >>>>                 if (wait_for_completion_interruptible(&supp->reqs_c))
> >>>>                         return -ERESTARTSYS;
> >>>> -       }
> >>>>
> >>>> -       if (num_meta) {
> >>>> -               /*
> >>>> -                * tee-supplicant support meta parameters -> requsts can be
> >>>> -                * processed asynchronously.
> >>>> -                */
> >>>> -               param->attr = TEE_IOCTL_PARAM_ATTR_TYPE_VALUE_INOUT |
> >>>> -                             TEE_IOCTL_PARAM_ATTR_META;
> >>>> -               param->u.value.a = id;
> >>>> -               param->u.value.b = 0;
> >>>> -               param->u.value.c = 0;
> >>>> -       } else {
> >>>> -               mutex_lock(&supp->mutex);
> >>>> -               supp->req_id = id;
> >>>> -               mutex_unlock(&supp->mutex);
> >>>> +               /* Check for the next request in the queue. */
> >>>>         }
> >>>>
> >>>> -       *func = req->func;
> >>>> -       *num_params = req->num_params + num_meta;
> >>>> -       memcpy(param + num_meta, req->param,
> >>>> -              sizeof(struct tee_param) * req->num_params);
> >>>> -
> >>>>         return 0;
> >>>>  }
> >>>>
> >>>> @@ -297,12 +341,18 @@ static struct optee_supp_req *supp_pop_req(struct optee_supp *supp,
> >>>>         if (!req)
> >>>>                 return ERR_PTR(-ENOENT);
> >>>>
> >>>> +       /* optee_supp_thrd_req() already returned to optee. */
> >>>> +       if (IS_ERR(req))
> >>>> +               goto failed_req;
> >>>> +
> >>>>         if ((num_params - nm) != req->num_params)
> >>>>                 return ERR_PTR(-EINVAL);
> >>>>
> >>>> +       *num_meta = nm;
> >>>> +failed_req:
> >>>>         idr_remove(&supp->idr, id);
> >>>>         supp->req_id = -1;
> >>>> -       *num_meta = nm;
> >>>> +
> >>>>
> >>>>         return req;
> >>>>  }
> >>>> @@ -328,9 +378,8 @@ int optee_supp_send(struct tee_context *ctx, u32 ret, u32 num_params,
> >>>>
> >>>>         mutex_lock(&supp->mutex);
> >>>>         req = supp_pop_req(supp, num_params, param, &num_meta);
> >>>> -       mutex_unlock(&supp->mutex);
> >>>> -
> >>>>         if (IS_ERR(req)) {
> >>>> +               mutex_unlock(&supp->mutex);
> >>>
> >>> We need a way to tell the difference between an id not found and an id
> >>> removed because of a killed requester.
> >>> How about storing NULL for revoked requests instead of an err-pointer?
> >>>
> >>
> >> Not sure I'm following correctly. Are you expecting supp_pop_req()
> >> to return NULL instead of an err-pointer when a request has been revoked?
> >
> > I was looking at it again, and storing an err-pointer as you do in
> > this patch has the advantage that we can tell whether the ID has been
> > revoked or was never supplied. In the latter case, it suggests that
> > the supplicant is doing something wrong and might as well restart in
> > an attempt to recover. So, please keep using an err-pointer as a
> > placeholder, but we must be able to distinguish a revoked request from
> > other errors to make sure that the supplicant doesn't restart due to a
> > revoked request.
> >
>
> Understood. What if I switch the stored err-pointer to EBADF instead of ENOENT
> (which seems more natural), so it doesn't overlap with other supp_pop_req() error
> codes and the supplicant can reliably detect it.

Any error returned by TEE_IOC_SUPPL_SEND (or TEE_IOC_SUPPL_RECV) will
cause the tee-supplicant to exit. Even if we update it to ignore
certain codes, we must also consider the installed base. There's not
much tee-supplicant could do with this error, except logging it. But I
don't think that is very useful either. Unless the tee-supplicant does
anything wrong or if the device isn't working any longer, we shouldn't
return an error.

Cheers,
Jens

>
> Best Regards,
> Amir
>
> > Cheers,
> > Jens
> >
> >>
> >> Best Rearads,
> >> Amir
> >>
> >>> Cheers,
> >>> Jens
> >>>
> >>>>                 /* Something is wrong, let supplicant restart. */
> >>>>                 return PTR_ERR(req);
> >>>>         }
> >>>> @@ -355,9 +404,10 @@ int optee_supp_send(struct tee_context *ctx, u32 ret, u32 num_params,
> >>>>                 }
> >>>>         }
> >>>>         req->ret = ret;
> >>>> -
> >>>> +       req->processed = true;
> >>>>         /* Let the requesting thread continue */
> >>>>         complete(&req->c);
> >>>> +       mutex_unlock(&supp->mutex);
> >>>>
> >>>>         return 0;
> >>>>  }
> >>>>
> >>>> ---
> >>>> base-commit: 3f24e4edcd1b8981c6b448ea2680726dedd87279
> >>>> change-id: 20250604-fix-use-after-free-8ff1b5d5d774
> >>>>
> >>>> Best regards,
> >>>> --
> >>>> Amirreza Zarrabi <amirreza.zarrabi@....qualcomm.com>
> >>>>
> >>
>