[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20260206100218.4163478-1-joonwonkang@google.com>
Date: Fri, 6 Feb 2026 10:02:18 +0000
From: Joonwon Kang <joonwonkang@...gle.com>
To: jassisinghbrar@...il.com
Cc: alexey.klimov@....com, jonathanh@...dia.com, joonwonkang@...gle.com,
linux-kernel@...r.kernel.org, linux-tegra@...r.kernel.org,
stable@...r.kernel.org, sudeep.holla@....com, thierry.reding@...il.com
Subject: Re: [PATCH 1/2 RESEND] mailbox: Use per-thread completion to fix
> > Previously, a sender thread in mbox_send_message() could be woken up at
> > a wrong time in blocking mode. It is because there was only a single
> > completion for a channel whereas messages from multiple threads could be
> > sent on the same channel in any order; since the shared completion could
> > be signalled in any order, it could wake up a wrong sender thread.
> >
> > This commit resolves the false wake-up issue with the following changes:
> > - Completions are created just as many as the number of concurrent sender
> > threads
> > - A completion is created on a sender thread's stack
> > - Each slot of the message queue, i.e. `msg_data`, contains a pointer to
> > its target completion
> > - tx_tick() signals the completion of the currently active slot of the
> > message queue
> >
> Mailbox API does not support shared channels. Each channel is supposed
> to be owned by one client. Though a client can serve multiple users of
> the channel, but then it will have to serialize access to the channel.
> The implication is mailbox_send_message should not be called before
> the last call returns (in blocking mode).
This sounds like a suddenly big change to the mailbox API after all other docs
or discussion, e.g. Link in the commit message, imply that it supports multi-
thread use case. I think it would be better to make it support multi-thread not
to cause breaking changes to the mailbox client drivers. At least, we may be
able to implement mbox_send_message() using mutex or other locks to still
support multi-thread.
> Even with this patch, consider when threadA is active and threadB too
> is waiting next. If the tx_tout races with threadA's transmission,
> threadB may timeout and call tx_tick() on the channel thereby
> affecting threadA. Which also eventually proceeds to complete on
> threadB's tx_complete which was on the stack and hence no more exists
> thereby causing UAF.
Indeed, thanks for this input. Here we need to consider how tx_tick() is
is called in conjunction with other events like timeout. tx_tick() can be
called either when timeout occurs or by client or controller. Below is the
break down of the cases.
Case 1) Thread A is active and no timeout occurs to Thread B
In this case, tx_tick() will be called once the tx is done for Thread A and
then Thread B will go next. So, no problem.
Case 2) Thread A is active but timeout occurs to Thread B
This is the case that you pointed out. In this case, we could cancel the
request for Thread B not disrupting the active request of Thread A and it
should be okay to cancel it since it is before sending it to the controller,
i.e. before the call to mbox->ops->send_data(). By not sending it, tx_tick()
should also not be called either by client or controller. So, it should be no
problem.
Will create a new version of patch with this change.
Case 3) Thread A is active but timeout occurs to Thread A and tx_tick() is
called for Thread A
Case 4) Thread A is done with tx, Thread B is now active but timeout occurs to
Thread B and tx_tick() is called for Thread B
These cases could occur, e.g. when there is no way for controller to know a tx
done interrupt that it receives is for the currently active request or for the
previous one, or when the timeout occurrs just before the controller is about
to call mbox_chan_txdone(). If it happens anyway, it could also cause
inconsistency of the mailbox internal status, but UAF will not occur with
the patch which handles Case 2. However, these cases are out of topic for
multi-thread support. It is more about the timeout support. It could occur even
in a single thread as follows.
- Thread A is active with a request and timeout occurs.
- Thread A goes with the next request and is active again.
- However, the controller calls mbox_chan_txdone(), thus tx_tick(), intending
for the first request.
- The mailbox internal status goes inconsistent!
- The controller may soon call another mbox_chan_txdone() for the second
request.
So, these timeout cases are existing issues regardless of whether it is single-
thread or multi-thread and should be considered orthogonally.
Powered by blists - more mailing lists