Message-ID: <20251218134012.1.I614e18fb4f7b9121e9702d15ffc715cb26ee9afc@changeid>
Date: Thu, 18 Dec 2025 13:40:13 -0800
From: Douglas Anderson <dianders@...omium.org>
To: Jassi Brar <jassisinghbrar@...il.com>
Cc: Douglas Anderson <dianders@...omium.org>,
	linux-kernel@...r.kernel.org
Subject: [PATCH] mailbox: Document the behavior of mbox_send_message() w/ NULL mssg

The way the mailbox core behaves when you pass a NULL `mssg` parameter
to mbox_send_message() seems a little questionable. Specifically, the
mailbox core stores the currently active message directly in its
`active_req` field. In at least two places it decides that if this
field is `NULL` then there is no active request. That means if `mssg`
is NULL it will always think there is no active request. The two
places where it does this are:

1. When a client calls mbox_send_message() and `active_req` is NULL,
   the core calls the mailbox controller to send the new message even
   if the controller hasn't yet called mbox_chan_txdone() on the
   previous (NULL) message.
2. The mailbox core will never call the client's `tx_done()` callback
   with a NULL message because `tx_tick()` returns early whenever the
   message is NULL.
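
The two cases above can be sketched with a small userspace model. This
is not the kernel code: every `toy_*` name is hypothetical, locking is
omitted, and the controller's send_data() and the client's tx_done()
are reduced to counters. It only illustrates why storing the message
pointer in `active_req` makes a NULL message look like "no request in
flight".

```c
#include <assert.h>
#include <stddef.h>

#define TOY_QUEUE_LEN 20	/* arbitrary queue depth for this model */

/* Hypothetical stand-in for a mailbox channel; not the kernel struct. */
struct toy_chan {
	void *queue[TOY_QUEUE_LEN];	/* pending messages, FIFO */
	unsigned int head;		/* index of the oldest pending message */
	unsigned int count;		/* number of pending messages */
	void *active_req;		/* message in flight; NULL reads as "idle" */
	unsigned int sends;		/* times the toy controller was asked to send */
	unsigned int tx_done_calls;	/* times the toy client's tx_done() ran */
};

/* Hand the oldest queued message to the controller if the channel looks idle. */
static void toy_msg_submit(struct toy_chan *chan)
{
	void *data;

	if (!chan->count || chan->active_req)
		return;

	data = chan->queue[chan->head];
	chan->head = (chan->head + 1) % TOY_QUEUE_LEN;
	chan->count--;

	chan->sends++;			/* stands in for the controller's send_data() */
	chan->active_req = data;	/* a NULL message leaves this NULL */
}

/* Queue a message and try to submit it; returns 0 on success, -1 if full. */
static int toy_send_message(struct toy_chan *chan, void *mssg)
{
	if (chan->count == TOY_QUEUE_LEN)
		return -1;

	chan->queue[(chan->head + chan->count) % TOY_QUEUE_LEN] = mssg;
	chan->count++;
	toy_msg_submit(chan);
	return 0;
}

/* The controller reports completion of the in-flight message. */
static void toy_txdone(struct toy_chan *chan)
{
	void *mssg = chan->active_req;

	chan->active_req = NULL;
	toy_msg_submit(chan);

	if (!mssg)
		return;			/* NULL message: the client is never notified */

	chan->tx_done_calls++;		/* stands in for the client's tx_done() */
}

/* Walk through both cases from the list above. */
static void toy_demo(void)
{
	struct toy_chan nulls = {0}, normal = {0};
	int a = 1, b = 2;

	/* Two NULL messages both reach the controller without a txdone. */
	toy_send_message(&nulls, NULL);
	toy_send_message(&nulls, NULL);
	assert(nulls.sends == 2 && nulls.count == 0);
	toy_txdone(&nulls);
	assert(nulls.tx_done_calls == 0);

	/* Two non-NULL messages: the second waits for the first to finish. */
	toy_send_message(&normal, &a);
	toy_send_message(&normal, &b);
	assert(normal.sends == 1 && normal.count == 1);
	toy_txdone(&normal);
	assert(normal.sends == 2 && normal.tx_done_calls == 1);
}
```

Calling `toy_demo()` exercises both paths: the NULL-message channel
never appears busy and never reports completion, while the non-NULL
channel serializes its messages as usual.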

The above could be seen as bugs and perhaps could be fixed. However,
doing a `git grep mbox_send_message.*NULL` shows 14 hits in mainline
today and people may be relying on the current behavior. It is,
perhaps, better to accept the current behavior.

The current behavior can actually serve the purpose of providing a
simple way to assert an edge-triggered interrupt to the remote
processor on the other side of the mailbox. Specifically:

1. Like a normal edge-triggered interrupt, if multiple edges arrive
   before the interrupt is Acked they are coalesced.
2. Like a normal edge-triggered interrupt, as long as the receiver
   (the remote processor in this case) "Ack"s the interrupt _before_
   checking for work and the sender (the mailbox client in this case)
   posts the interrupt _after_ adding new work then we can always be
   certain that new work will be noticed. This assumes that the
   mailbox client and remote processor have some out-of-band way to
   communicate work and the mailbox is just being used as an
   interrupt.
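
The ordering rule in the list above can be illustrated with a
single-threaded model. All names here are hypothetical, and the
out-of-band work channel is reduced to two counters. In a real system
the doorbell ring would be a mbox_send_message() call with a NULL
`mssg`, and the shared work state would need proper memory ordering
between the two processors, which this sketch does not attempt to
show.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a doorbell plus out-of-band work counters. */
struct toy_doorbell {
	unsigned int work_posted;	/* items published by the sender */
	unsigned int work_handled;	/* items consumed by the receiver */
	bool irq_pending;		/* latched edge: set by a ring, cleared by an ack */
};

/* Sender: publish the work first, then raise the edge. */
static void toy_post_work(struct toy_doorbell *db)
{
	db->work_posted++;
	db->irq_pending = true;	/* ringing while already pending coalesces */
}

/* Receiver, step 1: ack the interrupt before looking for work. */
static void toy_ack(struct toy_doorbell *db)
{
	db->irq_pending = false;
}

/* Receiver, step 2: consume everything posted so far; returns the count. */
static unsigned int toy_drain(struct toy_doorbell *db)
{
	unsigned int n = db->work_posted - db->work_handled;

	db->work_handled = db->work_posted;
	return n;
}

/* One interleaving in which new work arrives between the ack and the drain. */
static void toy_doorbell_demo(void)
{
	struct toy_doorbell db = {0};

	toy_post_work(&db);
	toy_post_work(&db);		/* second edge coalesces with the first */
	assert(db.irq_pending);

	toy_ack(&db);
	toy_post_work(&db);		/* arrives while the receiver is handling */
	assert(toy_drain(&db) == 3);	/* the late item is picked up anyway */

	/* The late ring left the edge latched, so the receiver runs once more. */
	assert(db.irq_pending);
	toy_ack(&db);
	assert(toy_drain(&db) == 0);	/* spurious pass, but nothing was lost */
	assert(db.work_handled == db.work_posted);
}
```

Because the receiver acks before draining and the sender rings after
posting, a late post either lands in the current drain or leaves the
edge latched for another pass. The cost is an occasional pass that
finds no work.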

Document the current behavior so that people can rely on it and know
that it will keep working the same way.

NOTE: if a given mailbox client mixes NULL and non-NULL messages,
things could misbehave without additional code changes and rules.
Without code changes, if we transfer a non-NULL message then we'd
stop coalescing future NULL messages until the queue clears. Also, if
we were transferring a NULL message and a non-NULL message came in,
we'd send it right away but potentially report `tx_done()` too early.
For now, document mixing NULL and non-NULL messages as undefined.

Signed-off-by: Douglas Anderson <dianders@...omium.org>
---
This feels hacky, but I'm worried that if we do something else we'll
break people. I haven't spent tons of time analyzing all of the
existing mailbox users that pass NULL for the message, but I do know
that at least some downstream Pixel code does it and seems to rely on
the current behavior: "don't queue up another message but instead just
make sure the remote processor will get an interrupt in the future."

 drivers/mailbox/mailbox.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c
index 2acc6ec229a4..80861caeb848 100644
--- a/drivers/mailbox/mailbox.c
+++ b/drivers/mailbox/mailbox.c
@@ -238,6 +238,15 @@ EXPORT_SYMBOL_GPL(mbox_client_peek_data);
  * This function could be called from atomic context as it simply
  * queues the data and returns a token against the request.
  *
+ * NOTE: If 'mssg' is NULL, the function has some rather different behavior.
+ *	 - The mailbox controller will be informed of the message but it
+ *	   won't be considered "busy". Future messages will continue to be
+ *	   passed onto the controller even if it never called mbox_chan_txdone()
+ *	 - The client's tx_done() callback will never be called.
+ *	 The above rules allow asserting an "edge-triggered" interrupt to the
+ *	 remote processor in a race-free way. Behavior is undefined if a given
+ *	 channel sometimes has a NULL message and sometimes doesn't.
+ *
  * Return: Non-negative integer for successful submission (non-blocking mode)
  *	or transmission over chan (blocking mode).
  *	Negative value denotes failure.
-- 
2.52.0.322.g1dd061c0dc-goog

