Message-ID: <4c9e4f5f-e0f3-42aa-852f-064f4024af26@oss.qualcomm.com>
Date: Wed, 4 Feb 2026 09:56:12 +1100
From: Amirreza Zarrabi <amirreza.zarrabi@....qualcomm.com>
To: Jens Wiklander <jens.wiklander@...aro.org>
Cc: Sumit Garg <sumit.garg@...nel.org>, Arnd Bergmann <arnd@...db.de>,
Michael Wu <michael@...winnertech.com>, linux-arm-msm@...r.kernel.org,
op-tee@...ts.trustedfirmware.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3] tee: optee: prevent use-after-free when the client
exits before the supplicant
Hi Jens,
On 2/3/2026 5:59 PM, Jens Wiklander wrote:
> Hi,
>
> On Tue, Feb 3, 2026 at 3:09 AM Amirreza Zarrabi
> <amirreza.zarrabi@....qualcomm.com> wrote:
>>
>> Hi Jens,
>>
>> On 2/2/2026 10:36 PM, Jens Wiklander wrote:
>>> Hi Amir,
>>>
>>> On Thu, Jan 29, 2026 at 4:22 AM Amirreza Zarrabi
>>> <amirreza.zarrabi@....qualcomm.com> wrote:
>>>>
>>>> Commit 70b0d6b0a199 ("tee: optee: Fix supplicant wait loop") made the
>>>> client wait as killable so it can be interrupted during shutdown or
>>>> after a supplicant crash. This changes the original lifetime expectations:
>>>> the client task can now terminate while the supplicant is still processing
>>>> its request.
>>>>
>>>> If the client exits first it removes the request from its queue and
>>>> kfree()s it, while the request ID remains in supp->idr. A subsequent
>>>> lookup on the supplicant path then dereferences freed memory, leading to
>>>> a use-after-free.
>>>>
>>>> Serialise access to the request with supp->mutex:
>>>>
>>>> * Hold supp->mutex in optee_supp_recv() and optee_supp_send() while
>>>> looking up and touching the request.
>>>> * Let optee_supp_thrd_req() notice that the client has terminated and
>>>> signal optee_supp_send() accordingly.
>>>>
>>>> With these changes the request cannot be freed while the supplicant still
>>>> has a reference, eliminating the race.
>>>>
>>>> Fixes: 70b0d6b0a199 ("tee: optee: Fix supplicant wait loop")
>>>> Signed-off-by: Amirreza Zarrabi <amirreza.zarrabi@....qualcomm.com>
>>>> ---
>>>> Changes in v3:
>>>> - Introduce processed flag instead of -1 for req->id.
>>>> - Update optee_supp_release() as reported by Michael Wu.
>>>> - Use mutex instead of guard.
>>>> - Link to v2: https://lore.kernel.org/r/20250617-fix-use-after-free-v2-1-1fbfafec5917@oss.qualcomm.com
>>>>
>>>> Changes in v2:
>>>> - Replace the static variable with a sentinel value.
>>>> - Fix the issue with returning the popped request to the supplicant.
>>>> - Link to v1: https://lore.kernel.org/r/20250605-fix-use-after-free-v1-1-a70d23bff248@oss.qualcomm.com
>>>> ---
>>>> drivers/tee/optee/supp.c | 122 +++++++++++++++++++++++++++++++++--------------
>>>> 1 file changed, 86 insertions(+), 36 deletions(-)
>>>
>>> I had forgotten about this. I'd like to prioritize getting this fixed
>>> soon. By the way, how did you test this?
>>>
>>
>> Thanks for the update. I currently don't have access to the setup required to run
>> the tests myself. My plan is to finalize the design and implementation, then
>> ask Michael Wu to run his use case. Based on his earlier feedback, the patch
>> appears to be working as expected.
>>
>> https://lore.kernel.org/all/292653ba-3836-00f1-acd4-a28b1c54388c@allwinnertech.com/
>
> OK
>
>>
>>>>
>>>> diff --git a/drivers/tee/optee/supp.c b/drivers/tee/optee/supp.c
>>>> index d0f397c90242..0ec66008df19 100644
>>>> --- a/drivers/tee/optee/supp.c
>>>> +++ b/drivers/tee/optee/supp.c
>>>> @@ -10,7 +10,11 @@
>>>> struct optee_supp_req {
>>>> struct list_head link;
>>>>
>>>> + int id;
>>>> +
>>>> bool in_queue;
>>>> + bool processed;
>>>> +
>>>> u32 func;
>>>> u32 ret;
>>>> size_t num_params;
>>>> @@ -19,6 +23,9 @@ struct optee_supp_req {
>>>> struct completion c;
>>>> };
>>>>
>>>> +/* Placeholder stored in supp->idr for a pending request that is no longer valid. */
>>>> +#define INVALID_REQ_PTR ((struct optee_supp_req *)ERR_PTR(-ENOENT))
>>>> +
>>>> void optee_supp_init(struct optee_supp *supp)
>>>> {
>>>> memset(supp, 0, sizeof(*supp));
>>>> @@ -46,6 +53,10 @@ void optee_supp_release(struct optee_supp *supp)
>>>> /* Abort all request retrieved by supplicant */
>>>> idr_for_each_entry(&supp->idr, req, id) {
>>>> idr_remove(&supp->idr, id);
>>>> + /* Skip if request was already marked invalid */
>>>> + if (IS_ERR(req))
>>>> + continue;
>>>> +
>>>> req->ret = TEEC_ERROR_COMMUNICATION;
>>>> complete(&req->c);
>>>> }
>>>> @@ -102,6 +113,7 @@ u32 optee_supp_thrd_req(struct tee_context *ctx, u32 func, size_t num_params,
>>>> mutex_lock(&supp->mutex);
>>>> list_add_tail(&req->link, &supp->reqs);
>>>> req->in_queue = true;
>>>> + req->processed = false;
>>>> mutex_unlock(&supp->mutex);
>>>>
>>>> /* Tell an eventual waiter there's a new request */
>>>> @@ -117,21 +129,40 @@ u32 optee_supp_thrd_req(struct tee_context *ctx, u32 func, size_t num_params,
>>>> if (wait_for_completion_killable(&req->c)) {
>>>> mutex_lock(&supp->mutex);
>>>> if (req->in_queue) {
>>>> + /* Supplicant has not seen this request yet. */
>>>> list_del(&req->link);
>>>> req->in_queue = false;
>>>> +
>>>> + ret = TEEC_ERROR_COMMUNICATION;
>>>> + } else if (req->processed) {
>>>> + /*
>>>> + * Supplicant has processed this request. Ignore the
>>>> + * kill signal for now and submit the result.
>>>> + */
>>>> + ret = req->ret;
>>>> + } else {
>>>> + /*
>>>> + * Supplicant is in the middle of processing this
>>>> + * request. Replace req with INVALID_REQ_PTR so that
>>>> + * the ID remains busy, causing optee_supp_send() to
>>>> + * fail on the next call to supp_pop_req() with this ID.
>>>> + */
>>>> + idr_replace(&supp->idr, INVALID_REQ_PTR, req->id);
>>>> + ret = TEEC_ERROR_COMMUNICATION;
>>>> }
>>>> +
>>>> mutex_unlock(&supp->mutex);
>>>> - req->ret = TEEC_ERROR_COMMUNICATION;
>>>> + } else {
>>>> + ret = req->ret;
>>>> }
>>>>
>>>> - ret = req->ret;
>>>> kfree(req);
>>>>
>>>> return ret;
>>>> }
>>>>
>>>> static struct optee_supp_req *supp_pop_entry(struct optee_supp *supp,
>>>> - int num_params, int *id)
>>>> + int num_params)
>>>> {
>>>> struct optee_supp_req *req;
>>>>
>>>> @@ -153,8 +184,8 @@ static struct optee_supp_req *supp_pop_entry(struct optee_supp *supp,
>>>> return ERR_PTR(-EINVAL);
>>>> }
>>>>
>>>> - *id = idr_alloc(&supp->idr, req, 1, 0, GFP_KERNEL);
>>>> - if (*id < 0)
>>>> + req->id = idr_alloc(&supp->idr, req, 1, 0, GFP_KERNEL);
>>>> + if (req->id < 0)
>>>> return ERR_PTR(-ENOMEM);
>>>
>>> Since we're now storing the supplicant request ID, wouldn't it make
>>> sense to pre-allocate the ID when allocating the request to avoid this
>>> error case?
>>>
>>
>> True, but allocating the ID at this stage has one advantage.
>> If an ID is not available, the request can remain on the request list,
>> allowing the supplicant to retry later when resources become available.
>> If ID allocation fails during request creation, I have no choice but
>> to drop the request and report an error to optee.
>
> We're allocating in the range 1..INT_MAX, and not more than a handful
> are expected to be active at a time. If we run out of IDs, we have
> bigger problems.
>
Ack.
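In that case, something along these lines is what I have in mind (untested
sketch; the kill/in_queue path and optee_supp_release() would also need to
drop the pre-allocated ID for requests that never reach the supplicant):

	/* In optee_supp_thrd_req(), when queuing the request. */
	mutex_lock(&supp->mutex);
	/* Reserve the ID up front so supp_pop_entry() can no longer fail. */
	req->id = idr_alloc(&supp->idr, req, 1, 0, GFP_KERNEL);
	if (req->id < 0) {
		mutex_unlock(&supp->mutex);
		kfree(req);
		return TEEC_ERROR_OUT_OF_MEMORY;
	}
	list_add_tail(&req->link, &supp->reqs);
	req->in_queue = true;
	req->processed = false;
	mutex_unlock(&supp->mutex);
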
>>
>>>>
>>>> list_del(&req->link);
>>>> @@ -214,7 +245,6 @@ int optee_supp_recv(struct tee_context *ctx, u32 *func, u32 *num_params,
>>>> struct optee *optee = tee_get_drvdata(teedev);
>>>> struct optee_supp *supp = &optee->supp;
>>>> struct optee_supp_req *req = NULL;
>>>> - int id;
>>>> size_t num_meta;
>>>> int rc;
>>>>
>>>> @@ -224,15 +254,48 @@ int optee_supp_recv(struct tee_context *ctx, u32 *func, u32 *num_params,
>>>>
>>>> while (true) {
>>>> mutex_lock(&supp->mutex);
>>>> - req = supp_pop_entry(supp, *num_params - num_meta, &id);
>>>> - mutex_unlock(&supp->mutex);
>>>>
>>>> - if (req) {
>>>> - if (IS_ERR(req))
>>>> - return PTR_ERR(req);
>>>> - break;
>>>> + req = supp_pop_entry(supp, *num_params - num_meta);
>>>> + if (!req) {
>>>> + mutex_unlock(&supp->mutex);
>>>> + goto wait_for_request;
>>>> + }
>>>> +
>>>> + if (IS_ERR(req)) {
>>>> + rc = PTR_ERR(req);
>>>> + mutex_unlock(&supp->mutex);
>>>> +
>>>> + return rc;
>>>> }
>>>>
>>>> + /*
>>>> + * Process the request while holding the lock, so that
>>>> + * optee_supp_thrd_req() doesn't pull the request from under us.
>>>> + */
>>>> +
>>>> + if (num_meta) {
>>>> + /*
>>>> + * tee-supplicant support meta parameters ->
>>>> + * requests can be processed asynchronously.
>>>> + */
>>>> + param->attr = TEE_IOCTL_PARAM_ATTR_TYPE_VALUE_INOUT |
>>>> + TEE_IOCTL_PARAM_ATTR_META;
>>>> + param->u.value.a = req->id;
>>>> + param->u.value.b = 0;
>>>> + param->u.value.c = 0;
>>>> + } else {
>>>> + supp->req_id = req->id;
>>>> + }
>>>> +
>>>> + *func = req->func;
>>>> + *num_params = req->num_params + num_meta;
>>>> + memcpy(param + num_meta, req->param,
>>>> + sizeof(struct tee_param) * req->num_params);
>>>> +
>>>> + mutex_unlock(&supp->mutex);
>>>> + return 0;
>>>
>>> Do we really need to move this into the loop? The structure of the
>>> function becomes a bit unusual and harder to read.
>>>
>>
>> Ack. I'll reorganize this function.
>>
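Roughly, I'm planning to keep the loop only for obtaining a request and do
the copy-out after it, still under the mutex so optee_supp_thrd_req() can't
free the request under us. An untested sketch:

	while (true) {
		mutex_lock(&supp->mutex);
		req = supp_pop_entry(supp, *num_params - num_meta);
		if (req)
			break;	/* leave the loop with supp->mutex held */
		mutex_unlock(&supp->mutex);

		/* No request yet, block until one is queued. */
		if (wait_for_completion_interruptible(&supp->reqs_c))
			return -ERESTARTSYS;
	}

	if (IS_ERR(req)) {
		rc = PTR_ERR(req);
		mutex_unlock(&supp->mutex);
		return rc;
	}

	if (num_meta) {
		param->attr = TEE_IOCTL_PARAM_ATTR_TYPE_VALUE_INOUT |
			      TEE_IOCTL_PARAM_ATTR_META;
		param->u.value.a = req->id;
		param->u.value.b = 0;
		param->u.value.c = 0;
	} else {
		supp->req_id = req->id;
	}

	*func = req->func;
	*num_params = req->num_params + num_meta;
	memcpy(param + num_meta, req->param,
	       sizeof(struct tee_param) * req->num_params);
	mutex_unlock(&supp->mutex);

	return 0;
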
>>>> +
>>>> +wait_for_request:
>>>> /*
>>>> * If we didn't get a request we'll block in
>>>> * wait_for_completion() to avoid needless spinning.
>>>> @@ -243,29 +306,10 @@ int optee_supp_recv(struct tee_context *ctx, u32 *func, u32 *num_params,
>>>> */
>>>> if (wait_for_completion_interruptible(&supp->reqs_c))
>>>> return -ERESTARTSYS;
>>>> - }
>>>>
>>>> - if (num_meta) {
>>>> - /*
>>>> - * tee-supplicant support meta parameters -> requsts can be
>>>> - * processed asynchronously.
>>>> - */
>>>> - param->attr = TEE_IOCTL_PARAM_ATTR_TYPE_VALUE_INOUT |
>>>> - TEE_IOCTL_PARAM_ATTR_META;
>>>> - param->u.value.a = id;
>>>> - param->u.value.b = 0;
>>>> - param->u.value.c = 0;
>>>> - } else {
>>>> - mutex_lock(&supp->mutex);
>>>> - supp->req_id = id;
>>>> - mutex_unlock(&supp->mutex);
>>>> + /* Check for the next request in the queue. */
>>>> }
>>>>
>>>> - *func = req->func;
>>>> - *num_params = req->num_params + num_meta;
>>>> - memcpy(param + num_meta, req->param,
>>>> - sizeof(struct tee_param) * req->num_params);
>>>> -
>>>> return 0;
>>>> }
>>>>
>>>> @@ -297,12 +341,18 @@ static struct optee_supp_req *supp_pop_req(struct optee_supp *supp,
>>>> if (!req)
>>>> return ERR_PTR(-ENOENT);
>>>>
>>>> + /* optee_supp_thrd_req() already returned to optee. */
>>>> + if (IS_ERR(req))
>>>> + goto failed_req;
>>>> +
>>>> if ((num_params - nm) != req->num_params)
>>>> return ERR_PTR(-EINVAL);
>>>>
>>>> + *num_meta = nm;
>>>> +failed_req:
>>>> idr_remove(&supp->idr, id);
>>>> supp->req_id = -1;
>>>> - *num_meta = nm;
>>>> +
>>>>
>>>> return req;
>>>> }
>>>> @@ -328,9 +378,8 @@ int optee_supp_send(struct tee_context *ctx, u32 ret, u32 num_params,
>>>>
>>>> mutex_lock(&supp->mutex);
>>>> req = supp_pop_req(supp, num_params, param, &num_meta);
>>>> - mutex_unlock(&supp->mutex);
>>>> -
>>>> if (IS_ERR(req)) {
>>>> + mutex_unlock(&supp->mutex);
>>>
>>> We need a way to tell the difference between an id not found and an id
>>> removed because of a killed requester.
>>> How about storing NULL for revoked requests instead of an err-pointer?
>>>
>>
>> Not sure I'm following correctly. Are you expecting supp_pop_req()
>> to return NULL instead of an err-pointer when a request has been revoked?
>
> I was looking at it again, and storing an err-pointer as you do in
> this patch has the advantage that we can tell whether the ID has been
> revoked or was never supplied. In the latter case, it suggests that
> the supplicant is doing something wrong and might as well restart in
> an attempt to recover. So, please keep using an err-pointer as a
> placeholder, but we must be able to distinguish a revoked request from
> other errors to make sure that the supplicant doesn't restart due to a
> revoked request.
>
Understood. What if I switch the stored err-pointer from ENOENT to EBADF (which
seems a more natural fit here)? That way it doesn't overlap with the other error
codes supp_pop_req() can return, and the supplicant can reliably distinguish a
revoked request.
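Roughly (untested sketch; REVOKED_REQ_PTR is just a placeholder name, and
returning 0 for a revoked request is my assumption of the desired behaviour):

/* Marks an ID whose requesting thread has already been killed. */
#define REVOKED_REQ_PTR ((struct optee_supp_req *)ERR_PTR(-EBADF))

	/* optee_supp_thrd_req(): client killed while the request is in flight */
	idr_replace(&supp->idr, REVOKED_REQ_PTR, req->id);

	/* optee_supp_send() */
	req = supp_pop_req(supp, num_params, param, &num_meta);
	if (IS_ERR(req)) {
		mutex_unlock(&supp->mutex);
		if (PTR_ERR(req) == -EBADF) {
			/* Requester is gone, just drop the answer. */
			return 0;
		}
		/* Something is wrong, let supplicant restart. */
		return PTR_ERR(req);
	}
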
Best Regards,
Amir
> Cheers,
> Jens
>
>>
>> Best Regards,
>> Amir
>>
>>> Cheers,
>>> Jens
>>>
>>>> /* Something is wrong, let supplicant restart. */
>>>> return PTR_ERR(req);
>>>> }
>>>> @@ -355,9 +404,10 @@ int optee_supp_send(struct tee_context *ctx, u32 ret, u32 num_params,
>>>> }
>>>> }
>>>> req->ret = ret;
>>>> -
>>>> + req->processed = true;
>>>> /* Let the requesting thread continue */
>>>> complete(&req->c);
>>>> + mutex_unlock(&supp->mutex);
>>>>
>>>> return 0;
>>>> }
>>>>
>>>> ---
>>>> base-commit: 3f24e4edcd1b8981c6b448ea2680726dedd87279
>>>> change-id: 20250604-fix-use-after-free-8ff1b5d5d774
>>>>
>>>> Best regards,
>>>> --
>>>> Amirreza Zarrabi <amirreza.zarrabi@....qualcomm.com>
>>>>
>>