linux-hardening - Re: [PATCH v2 05/10] mailbox: add MediaTek GPUEB IPI mailbox

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <8577914.T7Z3S40VBb@workhorse>
Date: Mon, 15 Sep 2025 14:38:02 +0200
From: Nicolas Frattaroli <nicolas.frattaroli@...labora.com>
To: Chia-I Wu <olvaffe@...il.com>
Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@...labora.com>,
 Boris Brezillon <boris.brezillon@...labora.com>,
 Steven Price <steven.price@....com>, Liviu Dudau <liviu.dudau@....com>,
 Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
 Maxime Ripard <mripard@...nel.org>, Thomas Zimmermann <tzimmermann@...e.de>,
 David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>,
 Rob Herring <robh@...nel.org>, Krzysztof Kozlowski <krzk+dt@...nel.org>,
 Conor Dooley <conor+dt@...nel.org>,
 Matthias Brugger <matthias.bgg@...il.com>,
 MyungJoo Ham <myungjoo.ham@...sung.com>,
 Kyungmin Park <kyungmin.park@...sung.com>,
 Chanwoo Choi <cw00.choi@...sung.com>, Jassi Brar <jassisinghbrar@...il.com>,
 Kees Cook <kees@...nel.org>, "Gustavo A. R. Silva" <gustavoars@...nel.org>,
 Chen-Yu Tsai <wenst@...omium.org>, kernel@...labora.com,
 dri-devel@...ts.freedesktop.org, devicetree@...r.kernel.org,
 linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
 linux-mediatek@...ts.infradead.org, linux-pm@...r.kernel.org,
 linux-hardening@...r.kernel.org
Subject: Re: [PATCH v2 05/10] mailbox: add MediaTek GPUEB IPI mailbox

On Saturday, 13 September 2025 00:11:10 Central European Summer Time Chia-I Wu wrote:
> On Fri, Sep 12, 2025 at 11:38 AM Nicolas Frattaroli
> <nicolas.frattaroli@...labora.com> wrote:
> <snipped>
> > +static irqreturn_t mtk_gpueb_mbox_thread(int irq, void *data)
> > +{
> > +       struct mtk_gpueb_mbox_chan *ch = data;
> > +       int status;
> > +
> > +       status = atomic_cmpxchg(&ch->rx_status,
> > +                               MBOX_FULL | MBOX_CLOGGED, MBOX_FULL);
> > +       if (status == (MBOX_FULL | MBOX_CLOGGED)) {
> > +               mtk_gpueb_mbox_read_rx(ch);
> > +               writel(BIT(ch->num), ch->ebm->mbox_ctl + MBOX_CTL_IRQ_CLR);
> > +               mbox_chan_received_data(&ch->ebm->mbox.chans[ch->num],
> > +                                       ch->rx_buf);
> Given what other drivers do, and how mtk_mfg consumes the data, we should
> 
>   char buf[MAX_OF_RX_LEN]; //  MAX_OF_RX_LEN is 32; we can also
> allocate it during probe
>   mtk_gpueb_mbox_read_rx(ch);
>   mbox_chan_received_data(..., buf);
> 
> mtx_mfg makes a copy eventually anyway.

We don't right now, at least not until after the callback returns.
So we need to have the copy in the mtk_mfg callback, not after the
completion. That's fine and I do want to do this as this is what
the mailbox framework seems to expect clients to do.

> We don't need to maintain any
> extra copy.
> 
> Then we might not need rx_status.

We can probably get rid of it if we keep the per-channel
interrupt handler. Otherwise, we may still need clogged,
as we don't want to process interrupts on channels we have
no user for.

> 
> > +               atomic_set(&ch->rx_status, 0);
> > +               return IRQ_HANDLED;
> > +       }
> > +
> > +       return IRQ_NONE;
> > +}
> > +
> > +static int mtk_gpueb_mbox_send_data(struct mbox_chan *chan, void *data)
> > +{
> > +       struct mtk_gpueb_mbox_chan *ch = chan->con_priv;
> > +       int i;
> > +       u32 *values = data;
> > +
> > +       if (atomic_read(&ch->rx_status))
> > +               return -EBUSY;
> > +
> > +       /*
> > +        * We don't want any fancy nonsense, just write the 32-bit values in
> > +        * order. memcpy_toio/__iowrite32_copy don't work here, because fancy.
> > +        */
> > +       for (i = 0; i < ch->c->tx_len; i += 4)
> > +               writel(values[i / 4], ch->ebm->mbox_mmio + ch->c->tx_offset + i);
> > +
> > +       writel(BIT(ch->num), ch->ebm->mbox_ctl + MBOX_CTL_IRQ_SET);
> > +
> > +       return 0;
> > +}
> > +
> > +static int mtk_gpueb_mbox_startup(struct mbox_chan *chan)
> > +{
> > +       struct mtk_gpueb_mbox_chan *ch = chan->con_priv;
> > +       int ret;
> > +
> > +       atomic_set(&ch->rx_status, 0);
> > +
> > +       ret = clk_enable(ch->ebm->clk);
> > +       if (ret) {
> > +               dev_err(ch->ebm->dev, "Failed to enable EB clock: %pe\n",
> > +                       ERR_PTR(ret));
> > +               goto err_clog;
> > +       }
> > +
> > +       writel(BIT(ch->num), ch->ebm->mbox_ctl + MBOX_CTL_IRQ_CLR);
> > +
> > +       ret = devm_request_threaded_irq(ch->ebm->dev, ch->ebm->irq, mtk_gpueb_mbox_isr,
> > +                                       mtk_gpueb_mbox_thread, IRQF_SHARED | IRQF_ONESHOT,
> > +                                       ch->full_name, ch);
> I don't think this warrants a per-channel irq thread.
> 
> mbox_chan_received_data is atomic. I think wecan start simple with
> just a devm_request_irq for all channels. mtk_gpueb_mbox_isr can
> 
>   read bits from MBOX_CTL_RX_STS
>   for each bit set:
>     read data from rx
>     mbox_chan_received_data
>   write bits to MBOX_CTL_IRQ_CLR
> 

I don't like this approach. It brings us back to having to process
multiple channels per ISR, keep track of when the interrupt should
be enabled and disabled based on how many channels are in use, and
also is not in line with what e.g. omap-mailbox.c does.

Remember that `mbox_chan_received_data` synchronously calls the
mailbox client's rx_callback. In mediatek_mfg's case, this is
fairly small, though with the request to not make the rx buffer
persist beyond the rx_callback it will gain an additional memory
copy. But we can't guarantee that someone isn't going to put a
slow operation in the path. Sure, it's going to be atomic, but
waiting for a spinlock is atomic and not something an ISR would
enjoy. I don't think mailbox clients would expect that if they
take their time they'll stall the interrupt handler for every
other channel.

So we'd keep the interrupt disabled for all channels until the
client that received a message has processed it.

I can see myself getting rid of the handler and just having the
thread function as the bottom half, but I'd really like to keep
the one-IRQ-request-per-channel thing I've got going now as it
made the code a lot easier to reason about. However, doing this
would mean the interrupt is re-enabled after the generic upper
half, when all the business logic that needs to not run
concurrently for an individual channel is in the bottom half.

As far as I can tell, this would then mean we'd have to add
some concurrency exclusion mechanism to the bottom half.

Moving all the logic into the upper half handler function
would make that handler somewhat longer, and I don't know
if IRQF_ONESHOT masks the interrupt for all users of that
IRQ number or just for those with that dev_id. If it's per
dev_id, then I'm fine with moving stuff up there. But from
my reading of the core IRQ handling code, that does not
appear to be the case; one channel getting a reply would
mask *all* channels of the mailbox until the upper half is
completed, and if the upper half calls into a driver
callback synchronously, that may take a hot minute.

Put differently: Is there a problem with one thread per used
channel, or are we going off vibes here? The way it currently
works uses the shared interrupt to mark just that one channel
as busy with rx_status before letting the IRQ for all channels
be unmasked again, which seems ideal to me.