[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAHUa44EhaztJ+5piu0c-HQbqZFX47uLPJ+VDRp=Bp6BMeeARAQ@mail.gmail.com>
Date: Wed, 4 Feb 2026 08:46:43 +0100
From: Jens Wiklander <jens.wiklander@...aro.org>
To: Amirreza Zarrabi <amirreza.zarrabi@....qualcomm.com>
Cc: Sumit Garg <sumit.garg@...nel.org>, Arnd Bergmann <arnd@...db.de>,
Michael Wu <michael@...winnertech.com>, linux-arm-msm@...r.kernel.org,
op-tee@...ts.trustedfirmware.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3] tee: optee: prevent use-after-free when the client
exits before the supplicant
Hi Amir,
On Tue, Feb 3, 2026 at 11:56 PM Amirreza Zarrabi
<amirreza.zarrabi@....qualcomm.com> wrote:
>
> Hi Jens,
>
> On 2/3/2026 5:59 PM, Jens Wiklander wrote:
> > Hi,
> >
> > On Tue, Feb 3, 2026 at 3:09 AM Amirreza Zarrabi
> > <amirreza.zarrabi@....qualcomm.com> wrote:
> >>
> >> Hi Jens,
> >>
> >> On 2/2/2026 10:36 PM, Jens Wiklander wrote:
> >>> Hi Amir,
> >>>
> >>> On Thu, Jan 29, 2026 at 4:22 AM Amirreza Zarrabi
> >>> <amirreza.zarrabi@....qualcomm.com> wrote:
> >>>>
> >>>> Commit 70b0d6b0a199 ("tee: optee: Fix supplicant wait loop") made the
> >>>> client wait as killable so it can be interrupted during shutdown or
> >>>> after a supplicant crash. This changes the original lifetime expectations:
> >>>> the client task can now terminate while the supplicant is still processing
> >>>> its request.
> >>>>
> >>>> If the client exits first it removes the request from its queue and
> >>>> kfree()s it, while the request ID remains in supp->idr. A subsequent
> >>>> lookup on the supplicant path then dereferences freed memory, leading to
> >>>> a use-after-free.
> >>>>
> >>>> Serialise access to the request with supp->mutex:
> >>>>
> >>>> * Hold supp->mutex in optee_supp_recv() and optee_supp_send() while
> >>>> looking up and touching the request.
> >>>> * Let optee_supp_thrd_req() notice that the client has terminated and
> >>>> signal optee_supp_send() accordingly.
> >>>>
> >>>> With these changes the request cannot be freed while the supplicant still
> >>>> has a reference, eliminating the race.
> >>>>
> >>>> Fixes: 70b0d6b0a199 ("tee: optee: Fix supplicant wait loop")
> >>>> Signed-off-by: Amirreza Zarrabi <amirreza.zarrabi@....qualcomm.com>
> >>>> ---
> >>>> Changes in v3:
> >>>> - Introduce processed flag instead of -1 for req->id.
> >>>> - Update optee_supp_release() as reported by Michael Wu.
> >>>> - Use mutex instead of guard.
> >>>> - Link to v2: https://lore.kernel.org/r/20250617-fix-use-after-free-v2-1-1fbfafec5917@oss.qualcomm.com
> >>>>
> >>>> Changes in v2:
> >>>> - Replace the static variable with a sentinel value.
> >>>> - Fix the issue with returning the popped request to the supplicant.
> >>>> - Link to v1: https://lore.kernel.org/r/20250605-fix-use-after-free-v1-1-a70d23bff248@oss.qualcomm.com
> >>>> ---
> >>>> drivers/tee/optee/supp.c | 122 +++++++++++++++++++++++++++++++++--------------
> >>>> 1 file changed, 86 insertions(+), 36 deletions(-)
> >>>
> >>> I had forgotten about this. I'd like to prioritize getting this fixed
> >>> soon. By the way, how did you test this?
> >>>
> >>
> >> Thanks for the update. I currently don't have access to the setup required to run
> >> the tests myself. My plan is to finalize the design and implementation, then
> >> ask Michael Wu to run his use case. Based on his earlier feedback, the patch
> >> appears to be working as expected.
> >>
> >> https://lore.kernel.org/all/292653ba-3836-00f1-acd4-a28b1c54388c@allwinnertech.com/
> >
> > OK
> >
> >>
> >>>>
> >>>> diff --git a/drivers/tee/optee/supp.c b/drivers/tee/optee/supp.c
> >>>> index d0f397c90242..0ec66008df19 100644
> >>>> --- a/drivers/tee/optee/supp.c
> >>>> +++ b/drivers/tee/optee/supp.c
> >>>> @@ -10,7 +10,11 @@
> >>>> struct optee_supp_req {
> >>>> struct list_head link;
> >>>>
> >>>> + int id;
> >>>> +
> >>>> bool in_queue;
> >>>> + bool processed;
> >>>> +
> >>>> u32 func;
> >>>> u32 ret;
> >>>> size_t num_params;
> >>>> @@ -19,6 +23,9 @@ struct optee_supp_req {
> >>>> struct completion c;
> >>>> };
> >>>>
> >>>> +/* It is temporary request used for invalid pending request in supp->idr. */
> >>>> +#define INVALID_REQ_PTR ((struct optee_supp_req *)ERR_PTR(-ENOENT))
> >>>> +
> >>>> void optee_supp_init(struct optee_supp *supp)
> >>>> {
> >>>> memset(supp, 0, sizeof(*supp));
> >>>> @@ -46,6 +53,10 @@ void optee_supp_release(struct optee_supp *supp)
> >>>> /* Abort all request retrieved by supplicant */
> >>>> idr_for_each_entry(&supp->idr, req, id) {
> >>>> idr_remove(&supp->idr, id);
> >>>> + /* Skip if request was already marked invalid */
> >>>> + if (IS_ERR(req))
> >>>> + continue;
> >>>> +
> >>>> req->ret = TEEC_ERROR_COMMUNICATION;
> >>>> complete(&req->c);
> >>>> }
> >>>> @@ -102,6 +113,7 @@ u32 optee_supp_thrd_req(struct tee_context *ctx, u32 func, size_t num_params,
> >>>> mutex_lock(&supp->mutex);
> >>>> list_add_tail(&req->link, &supp->reqs);
> >>>> req->in_queue = true;
> >>>> + req->processed = false;
> >>>> mutex_unlock(&supp->mutex);
> >>>>
> >>>> /* Tell an eventual waiter there's a new request */
> >>>> @@ -117,21 +129,40 @@ u32 optee_supp_thrd_req(struct tee_context *ctx, u32 func, size_t num_params,
> >>>> if (wait_for_completion_killable(&req->c)) {
> >>>> mutex_lock(&supp->mutex);
> >>>> if (req->in_queue) {
> >>>> + /* Supplicant has not seen this request yet. */
> >>>> list_del(&req->link);
> >>>> req->in_queue = false;
> >>>> +
> >>>> + ret = TEEC_ERROR_COMMUNICATION;
> >>>> + } else if (req->processed) {
> >>>> + /*
> >>>> + * Supplicant has processed this request. Ignore the
> >>>> + * kill signal for now and submit the result.
> >>>> + */
> >>>> + ret = req->ret;
> >>>> + } else {
> >>>> + /*
> >>>> + * Supplicant is in the middle of processing this
> >>>> + * request. Replace req with INVALID_REQ_PTR so that
> >>>> + * the ID remains busy, causing optee_supp_send() to
> >>>> + * fail on the next call to supp_pop_req() with this ID.
> >>>> + */
> >>>> + idr_replace(&supp->idr, INVALID_REQ_PTR, req->id);
> >>>> + ret = TEEC_ERROR_COMMUNICATION;
> >>>> }
> >>>> +
> >>>> mutex_unlock(&supp->mutex);
> >>>> - req->ret = TEEC_ERROR_COMMUNICATION;
> >>>> + } else {
> >>>> + ret = req->ret;
> >>>> }
> >>>>
> >>>> - ret = req->ret;
> >>>> kfree(req);
> >>>>
> >>>> return ret;
> >>>> }
> >>>>
> >>>> static struct optee_supp_req *supp_pop_entry(struct optee_supp *supp,
> >>>> - int num_params, int *id)
> >>>> + int num_params)
> >>>> {
> >>>> struct optee_supp_req *req;
> >>>>
> >>>> @@ -153,8 +184,8 @@ static struct optee_supp_req *supp_pop_entry(struct optee_supp *supp,
> >>>> return ERR_PTR(-EINVAL);
> >>>> }
> >>>>
> >>>> - *id = idr_alloc(&supp->idr, req, 1, 0, GFP_KERNEL);
> >>>> - if (*id < 0)
> >>>> + req->id = idr_alloc(&supp->idr, req, 1, 0, GFP_KERNEL);
> >>>> + if (req->id < 0)
> >>>> return ERR_PTR(-ENOMEM);
> >>>
> >>> Since we're now storing the supplicant request ID, wouldn't it make
> >>> sense to pre-allocate the ID when allocating the request to avoid this
> >>> error case?
> >>>
> >>
> >> True, but allocating the ID at this stage has one advantage.
> >> If an ID is not available, the request can remain on the request list,
> >> allowing the supplicant to retry later when resources become available.
> >> If ID allocation fails during request creation, I have no choice but
> >> to drop the request and report an error to optee.
> >
> > We're allocating in the range 1..INT_MAX, and not more than a handful
> > are expected to be active at a time. If we run out of IDs, we have
> > bigger problems.
> >
>
> Ack.
>
> >>
> >>>>
> >>>> list_del(&req->link);
> >>>> @@ -214,7 +245,6 @@ int optee_supp_recv(struct tee_context *ctx, u32 *func, u32 *num_params,
> >>>> struct optee *optee = tee_get_drvdata(teedev);
> >>>> struct optee_supp *supp = &optee->supp;
> >>>> struct optee_supp_req *req = NULL;
> >>>> - int id;
> >>>> size_t num_meta;
> >>>> int rc;
> >>>>
> >>>> @@ -224,15 +254,48 @@ int optee_supp_recv(struct tee_context *ctx, u32 *func, u32 *num_params,
> >>>>
> >>>> while (true) {
> >>>> mutex_lock(&supp->mutex);
> >>>> - req = supp_pop_entry(supp, *num_params - num_meta, &id);
> >>>> - mutex_unlock(&supp->mutex);
> >>>>
> >>>> - if (req) {
> >>>> - if (IS_ERR(req))
> >>>> - return PTR_ERR(req);
> >>>> - break;
> >>>> + req = supp_pop_entry(supp, *num_params - num_meta);
> >>>> + if (!req) {
> >>>> + mutex_unlock(&supp->mutex);
> >>>> + goto wait_for_request;
> >>>> + }
> >>>> +
> >>>> + if (IS_ERR(req)) {
> >>>> + rc = PTR_ERR(req);
> >>>> + mutex_unlock(&supp->mutex);
> >>>> +
> >>>> + return rc;
> >>>> }
> >>>>
> >>>> + /*
> >>>> + * Process the request while holding the lock, so that
> >>>> + * optee_supp_thrd_req() doesn't pull the request from under us.
> >>>> + */
> >>>> +
> >>>> + if (num_meta) {
> >>>> + /*
> >>>> + * tee-supplicant support meta parameters ->
> >>>> + * requests can be processed asynchronously.
> >>>> + */
> >>>> + param->attr = TEE_IOCTL_PARAM_ATTR_TYPE_VALUE_INOUT |
> >>>> + TEE_IOCTL_PARAM_ATTR_META;
> >>>> + param->u.value.a = req->id;
> >>>> + param->u.value.b = 0;
> >>>> + param->u.value.c = 0;
> >>>> + } else {
> >>>> + supp->req_id = req->id;
> >>>> + }
> >>>> +
> >>>> + *func = req->func;
> >>>> + *num_params = req->num_params + num_meta;
> >>>> + memcpy(param + num_meta, req->param,
> >>>> + sizeof(struct tee_param) * req->num_params);
> >>>> +
> >>>> + mutex_unlock(&supp->mutex);
> >>>> + return 0;
> >>>
> >>> Do we really need to move this into the loop? The structure of the
> >>> function becomes a bit unusual and harder to read.
> >>>
> >>
> >> Ack. I'll reorganize this function.
> >>
> >>>> +
> >>>> +wait_for_request:
> >>>> /*
> >>>> * If we didn't get a request we'll block in
> >>>> * wait_for_completion() to avoid needless spinning.
> >>>> @@ -243,29 +306,10 @@ int optee_supp_recv(struct tee_context *ctx, u32 *func, u32 *num_params,
> >>>> */
> >>>> if (wait_for_completion_interruptible(&supp->reqs_c))
> >>>> return -ERESTARTSYS;
> >>>> - }
> >>>>
> >>>> - if (num_meta) {
> >>>> - /*
> >>>> - * tee-supplicant support meta parameters -> requsts can be
> >>>> - * processed asynchronously.
> >>>> - */
> >>>> - param->attr = TEE_IOCTL_PARAM_ATTR_TYPE_VALUE_INOUT |
> >>>> - TEE_IOCTL_PARAM_ATTR_META;
> >>>> - param->u.value.a = id;
> >>>> - param->u.value.b = 0;
> >>>> - param->u.value.c = 0;
> >>>> - } else {
> >>>> - mutex_lock(&supp->mutex);
> >>>> - supp->req_id = id;
> >>>> - mutex_unlock(&supp->mutex);
> >>>> + /* Check for the next request in the queue. */
> >>>> }
> >>>>
> >>>> - *func = req->func;
> >>>> - *num_params = req->num_params + num_meta;
> >>>> - memcpy(param + num_meta, req->param,
> >>>> - sizeof(struct tee_param) * req->num_params);
> >>>> -
> >>>> return 0;
> >>>> }
> >>>>
> >>>> @@ -297,12 +341,18 @@ static struct optee_supp_req *supp_pop_req(struct optee_supp *supp,
> >>>> if (!req)
> >>>> return ERR_PTR(-ENOENT);
> >>>>
> >>>> + /* optee_supp_thrd_req() already returned to optee. */
> >>>> + if (IS_ERR(req))
> >>>> + goto failed_req;
> >>>> +
> >>>> if ((num_params - nm) != req->num_params)
> >>>> return ERR_PTR(-EINVAL);
> >>>>
> >>>> + *num_meta = nm;
> >>>> +failed_req:
> >>>> idr_remove(&supp->idr, id);
> >>>> supp->req_id = -1;
> >>>> - *num_meta = nm;
> >>>> +
> >>>>
> >>>> return req;
> >>>> }
> >>>> @@ -328,9 +378,8 @@ int optee_supp_send(struct tee_context *ctx, u32 ret, u32 num_params,
> >>>>
> >>>> mutex_lock(&supp->mutex);
> >>>> req = supp_pop_req(supp, num_params, param, &num_meta);
> >>>> - mutex_unlock(&supp->mutex);
> >>>> -
> >>>> if (IS_ERR(req)) {
> >>>> + mutex_unlock(&supp->mutex);
> >>>
> >>> We need a way to tell the difference between an id not found and an id
> >>> removed because of a killed requester.
> >>> How about storing NULL for revoked requests instead of an err-pointer?
> >>>
> >>
> >> Not sure I'm following correctly. Are you expecting supp_pop_req()
> >> to return NULL instead of an err-pointer when a request has been revoked?
> >
> > I was looking at it again, and storing an err-pointer as you do in
> > this patch has the advantage that we can tell whether the ID has been
> > revoked or was never supplied. In the latter case, it suggests that
> > the supplicant is doing something wrong and might as well restart in
> > an attempt to recover. So, please keep using an err-pointer as a
> > placeholder, but we must be able to distinguish a revoked request from
> > other errors to make sure that the supplicant doesn't restart due to a
> > revoked request.
> >
>
> Understood. What if I switch the stored err-pointer to EBADF instead of ENOENT
> (which seems more natural), so it doesn't overlap with other supp_pop_req() error
> codes and the supplicant can reliably detect it.
Any error returned by TEE_IOC_SUPPL_SEND (or TEE_IOC_SUPPL_RECV) will
cause the tee-supplicant to exit. Even if we update it to ignore
certain codes, we must also consider the installed base. There's not
much tee-supplicant could do with this error, except logging it. But I
don't think that is very useful either. Unless the tee-supplicant does
anything wrong or if the device isn't working any longer, we shouldn't
return an error.
Cheers,
Jens
>
> Best Regards,
> Amir
>
> > Cheers,
> > Jens
> >
> >>
> >> Best Rearads,
> >> Amir
> >>
> >>> Cheers,
> >>> Jens
> >>>
> >>>> /* Something is wrong, let supplicant restart. */
> >>>> return PTR_ERR(req);
> >>>> }
> >>>> @@ -355,9 +404,10 @@ int optee_supp_send(struct tee_context *ctx, u32 ret, u32 num_params,
> >>>> }
> >>>> }
> >>>> req->ret = ret;
> >>>> -
> >>>> + req->processed = true;
> >>>> /* Let the requesting thread continue */
> >>>> complete(&req->c);
> >>>> + mutex_unlock(&supp->mutex);
> >>>>
> >>>> return 0;
> >>>> }
> >>>>
> >>>> ---
> >>>> base-commit: 3f24e4edcd1b8981c6b448ea2680726dedd87279
> >>>> change-id: 20250604-fix-use-after-free-8ff1b5d5d774
> >>>>
> >>>> Best regards,
> >>>> --
> >>>> Amirreza Zarrabi <amirreza.zarrabi@....qualcomm.com>
> >>>>
> >>
>
Powered by blists - more mailing lists