linux-kernel - Re: [PATCH v3] tee: optee: prevent use-after-free when the client exits before the supplicant

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAHUa44FfowG=GpgLcq=sp_TqRArJtf0_V-tsJY11HH+FEA6u+w@mail.gmail.com>
Date: Tue, 3 Feb 2026 07:59:16 +0100
From: Jens Wiklander <jens.wiklander@...aro.org>
To: Amirreza Zarrabi <amirreza.zarrabi@....qualcomm.com>
Cc: Sumit Garg <sumit.garg@...nel.org>, Arnd Bergmann <arnd@...db.de>, 
	Michael Wu <michael@...winnertech.com>, linux-arm-msm@...r.kernel.org, 
	op-tee@...ts.trustedfirmware.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3] tee: optee: prevent use-after-free when the client
 exits before the supplicant

Hi,

On Tue, Feb 3, 2026 at 3:09 AM Amirreza Zarrabi
<amirreza.zarrabi@....qualcomm.com> wrote:
>
> Hi Jens,
>
> On 2/2/2026 10:36 PM, Jens Wiklander wrote:
> > Hi Amir,
> >
> > On Thu, Jan 29, 2026 at 4:22 AM Amirreza Zarrabi
> > <amirreza.zarrabi@....qualcomm.com> wrote:
> >>
> >> Commit 70b0d6b0a199 ("tee: optee: Fix supplicant wait loop") made the
> >> client wait as killable so it can be interrupted during shutdown or
> >> after a supplicant crash. This changes the original lifetime expectations:
> >> the client task can now terminate while the supplicant is still processing
> >> its request.
> >>
> >> If the client exits first it removes the request from its queue and
> >> kfree()s it, while the request ID remains in supp->idr. A subsequent
> >> lookup on the supplicant path then dereferences freed memory, leading to
> >> a use-after-free.
> >>
> >> Serialise access to the request with supp->mutex:
> >>
> >>   * Hold supp->mutex in optee_supp_recv() and optee_supp_send() while
> >>     looking up and touching the request.
> >>   * Let optee_supp_thrd_req() notice that the client has terminated and
> >>     signal optee_supp_send() accordingly.
> >>
> >> With these changes the request cannot be freed while the supplicant still
> >> has a reference, eliminating the race.
> >>
> >> Fixes: 70b0d6b0a199 ("tee: optee: Fix supplicant wait loop")
> >> Signed-off-by: Amirreza Zarrabi <amirreza.zarrabi@....qualcomm.com>
> >> ---
> >> Changes in v3:
> >> - Introduce processed flag instead of -1 for req->id.
> >> - Update optee_supp_release() as reported by Michael Wu.
> >> - Use mutex instead of guard.
> >> - Link to v2: https://lore.kernel.org/r/20250617-fix-use-after-free-v2-1-1fbfafec5917@oss.qualcomm.com
> >>
> >> Changes in v2:
> >> - Replace the static variable with a sentinel value.
> >> - Fix the issue with returning the popped request to the supplicant.
> >> - Link to v1: https://lore.kernel.org/r/20250605-fix-use-after-free-v1-1-a70d23bff248@oss.qualcomm.com
> >> ---
> >>  drivers/tee/optee/supp.c | 122 +++++++++++++++++++++++++++++++++--------------
> >>  1 file changed, 86 insertions(+), 36 deletions(-)
> >
> > I had forgotten about this. I'd like to prioritize getting this fixed
> > soon. By the way, how did you test this?
> >
>
> Thanks for the update. I currently don't have access to the setup required to run
> the tests myself. My plan is to finalize the design and implementation, then
> ask Michael Wu to run his use case. Based on his earlier feedback, the patch
> appears to be working as expected.
>
> https://lore.kernel.org/all/292653ba-3836-00f1-acd4-a28b1c54388c@allwinnertech.com/

OK

>
> >>
> >> diff --git a/drivers/tee/optee/supp.c b/drivers/tee/optee/supp.c
> >> index d0f397c90242..0ec66008df19 100644
> >> --- a/drivers/tee/optee/supp.c
> >> +++ b/drivers/tee/optee/supp.c
> >> @@ -10,7 +10,11 @@
> >>  struct optee_supp_req {
> >>         struct list_head link;
> >>
> >> +       int id;
> >> +
> >>         bool in_queue;
> >> +       bool processed;
> >> +
> >>         u32 func;
> >>         u32 ret;
> >>         size_t num_params;
> >> @@ -19,6 +23,9 @@ struct optee_supp_req {
> >>         struct completion c;
> >>  };
> >>
> >> +/* It is temporary request used for invalid pending request in supp->idr. */
> >> +#define INVALID_REQ_PTR ((struct optee_supp_req *)ERR_PTR(-ENOENT))
> >> +
> >>  void optee_supp_init(struct optee_supp *supp)
> >>  {
> >>         memset(supp, 0, sizeof(*supp));
> >> @@ -46,6 +53,10 @@ void optee_supp_release(struct optee_supp *supp)
> >>         /* Abort all request retrieved by supplicant */
> >>         idr_for_each_entry(&supp->idr, req, id) {
> >>                 idr_remove(&supp->idr, id);
> >> +               /* Skip if request was already marked invalid */
> >> +               if (IS_ERR(req))
> >> +                       continue;
> >> +
> >>                 req->ret = TEEC_ERROR_COMMUNICATION;
> >>                 complete(&req->c);
> >>         }
> >> @@ -102,6 +113,7 @@ u32 optee_supp_thrd_req(struct tee_context *ctx, u32 func, size_t num_params,
> >>         mutex_lock(&supp->mutex);
> >>         list_add_tail(&req->link, &supp->reqs);
> >>         req->in_queue = true;
> >> +       req->processed = false;
> >>         mutex_unlock(&supp->mutex);
> >>
> >>         /* Tell an eventual waiter there's a new request */
> >> @@ -117,21 +129,40 @@ u32 optee_supp_thrd_req(struct tee_context *ctx, u32 func, size_t num_params,
> >>         if (wait_for_completion_killable(&req->c)) {
> >>                 mutex_lock(&supp->mutex);
> >>                 if (req->in_queue) {
> >> +                       /* Supplicant has not seen this request yet. */
> >>                         list_del(&req->link);
> >>                         req->in_queue = false;
> >> +
> >> +                       ret = TEEC_ERROR_COMMUNICATION;
> >> +               } else if (req->processed) {
> >> +                       /*
> >> +                        * Supplicant has processed this request. Ignore the
> >> +                        * kill signal for now and submit the result.
> >> +                        */
> >> +                       ret = req->ret;
> >> +               } else {
> >> +                       /*
> >> +                        * Supplicant is in the middle of processing this
> >> +                        * request. Replace req with INVALID_REQ_PTR so that
> >> +                        * the ID remains busy, causing optee_supp_send() to
> >> +                        * fail on the next call to supp_pop_req() with this ID.
> >> +                        */
> >> +                       idr_replace(&supp->idr, INVALID_REQ_PTR, req->id);
> >> +                       ret = TEEC_ERROR_COMMUNICATION;
> >>                 }
> >> +
> >>                 mutex_unlock(&supp->mutex);
> >> -               req->ret = TEEC_ERROR_COMMUNICATION;
> >> +       } else {
> >> +               ret = req->ret;
> >>         }
> >>
> >> -       ret = req->ret;
> >>         kfree(req);
> >>
> >>         return ret;
> >>  }
> >>
> >>  static struct optee_supp_req  *supp_pop_entry(struct optee_supp *supp,
> >> -                                             int num_params, int *id)
> >> +                                             int num_params)
> >>  {
> >>         struct optee_supp_req *req;
> >>
> >> @@ -153,8 +184,8 @@ static struct optee_supp_req  *supp_pop_entry(struct optee_supp *supp,
> >>                 return ERR_PTR(-EINVAL);
> >>         }
> >>
> >> -       *id = idr_alloc(&supp->idr, req, 1, 0, GFP_KERNEL);
> >> -       if (*id < 0)
> >> +       req->id = idr_alloc(&supp->idr, req, 1, 0, GFP_KERNEL);
> >> +       if (req->id < 0)
> >>                 return ERR_PTR(-ENOMEM);
> >
> > Since we're now storing the supplicant request ID, wouldn't it make
> > sense to pre-allocate the ID when allocating the request to avoid this
> > error case?
> >
>
> True, but allocating the ID at this stage has one advantage.
> If an ID is not available, the request can remain on the request list,
> allowing the supplicant to retry later when resources become available.
> If ID allocation fails during request creation, I have no choice but
> to drop the request and report an error to optee.

We're allocating in the range 1..INT_MAX, and not more than a handful
are expected to be active at a time. If we run out of IDs, we have
bigger problems.

>
> >>
> >>         list_del(&req->link);
> >> @@ -214,7 +245,6 @@ int optee_supp_recv(struct tee_context *ctx, u32 *func, u32 *num_params,
> >>         struct optee *optee = tee_get_drvdata(teedev);
> >>         struct optee_supp *supp = &optee->supp;
> >>         struct optee_supp_req *req = NULL;
> >> -       int id;
> >>         size_t num_meta;
> >>         int rc;
> >>
> >> @@ -224,15 +254,48 @@ int optee_supp_recv(struct tee_context *ctx, u32 *func, u32 *num_params,
> >>
> >>         while (true) {
> >>                 mutex_lock(&supp->mutex);
> >> -               req = supp_pop_entry(supp, *num_params - num_meta, &id);
> >> -               mutex_unlock(&supp->mutex);
> >>
> >> -               if (req) {
> >> -                       if (IS_ERR(req))
> >> -                               return PTR_ERR(req);
> >> -                       break;
> >> +               req = supp_pop_entry(supp, *num_params - num_meta);
> >> +               if (!req) {
> >> +                       mutex_unlock(&supp->mutex);
> >> +                       goto wait_for_request;
> >> +               }
> >> +
> >> +               if (IS_ERR(req)) {
> >> +                       rc = PTR_ERR(req);
> >> +                       mutex_unlock(&supp->mutex);
> >> +
> >> +                       return rc;
> >>                 }
> >>
> >> +               /*
> >> +                * Process the request while holding the lock, so that
> >> +                * optee_supp_thrd_req() doesn't pull the request from under us.
> >> +                */
> >> +
> >> +               if (num_meta) {
> >> +                       /*
> >> +                        * tee-supplicant support meta parameters ->
> >> +                        * requests can be processed asynchronously.
> >> +                        */
> >> +                       param->attr = TEE_IOCTL_PARAM_ATTR_TYPE_VALUE_INOUT |
> >> +                                     TEE_IOCTL_PARAM_ATTR_META;
> >> +                       param->u.value.a = req->id;
> >> +                       param->u.value.b = 0;
> >> +                       param->u.value.c = 0;
> >> +               } else {
> >> +                       supp->req_id = req->id;
> >> +               }
> >> +
> >> +               *func = req->func;
> >> +               *num_params = req->num_params + num_meta;
> >> +               memcpy(param + num_meta, req->param,
> >> +                      sizeof(struct tee_param) * req->num_params);
> >> +
> >> +               mutex_unlock(&supp->mutex);
> >> +               return 0;
> >
> > Do we really need to move this into the loop? The structure of the
> > function becomes a bit unusual and harder to read.
> >
>
> Ack. I'll reorganize this function.
>
> >> +
> >> +wait_for_request:
> >>                 /*
> >>                  * If we didn't get a request we'll block in
> >>                  * wait_for_completion() to avoid needless spinning.
> >> @@ -243,29 +306,10 @@ int optee_supp_recv(struct tee_context *ctx, u32 *func, u32 *num_params,
> >>                  */
> >>                 if (wait_for_completion_interruptible(&supp->reqs_c))
> >>                         return -ERESTARTSYS;
> >> -       }
> >>
> >> -       if (num_meta) {
> >> -               /*
> >> -                * tee-supplicant support meta parameters -> requsts can be
> >> -                * processed asynchronously.
> >> -                */
> >> -               param->attr = TEE_IOCTL_PARAM_ATTR_TYPE_VALUE_INOUT |
> >> -                             TEE_IOCTL_PARAM_ATTR_META;
> >> -               param->u.value.a = id;
> >> -               param->u.value.b = 0;
> >> -               param->u.value.c = 0;
> >> -       } else {
> >> -               mutex_lock(&supp->mutex);
> >> -               supp->req_id = id;
> >> -               mutex_unlock(&supp->mutex);
> >> +               /* Check for the next request in the queue. */
> >>         }
> >>
> >> -       *func = req->func;
> >> -       *num_params = req->num_params + num_meta;
> >> -       memcpy(param + num_meta, req->param,
> >> -              sizeof(struct tee_param) * req->num_params);
> >> -
> >>         return 0;
> >>  }
> >>
> >> @@ -297,12 +341,18 @@ static struct optee_supp_req *supp_pop_req(struct optee_supp *supp,
> >>         if (!req)
> >>                 return ERR_PTR(-ENOENT);
> >>
> >> +       /* optee_supp_thrd_req() already returned to optee. */
> >> +       if (IS_ERR(req))
> >> +               goto failed_req;
> >> +
> >>         if ((num_params - nm) != req->num_params)
> >>                 return ERR_PTR(-EINVAL);
> >>
> >> +       *num_meta = nm;
> >> +failed_req:
> >>         idr_remove(&supp->idr, id);
> >>         supp->req_id = -1;
> >> -       *num_meta = nm;
> >> +
> >>
> >>         return req;
> >>  }
> >> @@ -328,9 +378,8 @@ int optee_supp_send(struct tee_context *ctx, u32 ret, u32 num_params,
> >>
> >>         mutex_lock(&supp->mutex);
> >>         req = supp_pop_req(supp, num_params, param, &num_meta);
> >> -       mutex_unlock(&supp->mutex);
> >> -
> >>         if (IS_ERR(req)) {
> >> +               mutex_unlock(&supp->mutex);
> >
> > We need a way to tell the difference between an id not found and an id
> > removed because of a killed requester.
> > How about storing NULL for revoked requests instead of an err-pointer?
> >
>
> Not sure I'm following correctly. Are you expecting supp_pop_req()
> to return NULL instead of an err-pointer when a request has been revoked?

I was looking at it again, and storing an err-pointer as you do in
this patch has the advantage that we can tell whether the ID has been
revoked or was never supplied. In the latter case, it suggests that
the supplicant is doing something wrong and might as well restart in
an attempt to recover. So, please keep using an err-pointer as a
placeholder, but we must be able to distinguish a revoked request from
other errors to make sure that the supplicant doesn't restart due to a
revoked request.

Cheers,
Jens

>
> Best Rearads,
> Amir
>
> > Cheers,
> > Jens
> >
> >>                 /* Something is wrong, let supplicant restart. */
> >>                 return PTR_ERR(req);
> >>         }
> >> @@ -355,9 +404,10 @@ int optee_supp_send(struct tee_context *ctx, u32 ret, u32 num_params,
> >>                 }
> >>         }
> >>         req->ret = ret;
> >> -
> >> +       req->processed = true;
> >>         /* Let the requesting thread continue */
> >>         complete(&req->c);
> >> +       mutex_unlock(&supp->mutex);
> >>
> >>         return 0;
> >>  }
> >>
> >> ---
> >> base-commit: 3f24e4edcd1b8981c6b448ea2680726dedd87279
> >> change-id: 20250604-fix-use-after-free-8ff1b5d5d774
> >>
> >> Best regards,
> >> --
> >> Amirreza Zarrabi <amirreza.zarrabi@....qualcomm.com>
> >>
>