Date: Tue, 20 Feb 2024 10:21:07 +0800
From: Jeremy Kerr <jk@...econstruct.com.au>
To: "Ramaiah, DharmaBhushan" <Dharma.Ramaiah@...l.com>, 
	"netdev@...r.kernel.org"
	 <netdev@...r.kernel.org>, "matt@...econstruct.com.au"
	 <matt@...econstruct.com.au>
Subject: Re: MCTP - Socket Queue Behavior

Hi Dharma,

> Linux implementation of MCTP uses socket for communication with MCTP
> capable EP's. Socket calls can be made ASYNC by using fcntl. I have a
> query based on ASYNC properties of the MCTP socket.

Some of your questions aren't really specific to non-blocking sockets;
it seems like you're assuming that the blocking send case will wait for
a response before returning; that's not the case, as sendmsg() will
complete once the outgoing message is queued (more on what that means
below).

So, you still have the following case, still using a blocking socket:

  sendmsg(message1)
  sendmsg(message2)

  recvmsg() -> reply 1
  recvmsg() -> reply 2

- as it's entirely possible to have multiple messages in flight - either
  as queued skbs, or having been sent to the remote endpoint.

> 1. Does kernel internally maintain queue, for the ASYNC requests?

There is no difference between blocking or non-blocking mode in the
queueing implementation. There is no MCTP-protocol-specific queue for
sent messages.

(the blocking/nonblocking mode may affect how we wait to allocate a
skb, but it doesn't sound like that's what you're asking here)

However, once a message is packetised (possibly being fragmented into
multiple packets), those *packets* may be queued to the device by
the netdev core. The transport device driver may have its own queues as
well.

In the case where you have multiple concurrent sendmsg() calls
(typically through separate threads, and either on one or multiple
sockets), it may be possible for packets belonging to two messages to be
interleaved on the wire. That scenario is well-supported by the MCTP
protocol through the packet tag mechanism.

> a. If so, what is the queue depth (can one send multiple requests
> without waiting for the response 

The device queue depth depends on a few things, but has no impact on
ordering of requests to responses. It's certainly possible to have
multiple requests in flight at any one time: just call sendmsg()
multiple times, even in blocking mode.

(the practical limit for pending messages is 8, limited by the number of
MCTP tag values for any (remote-EID, local-EID, tag) tuple)

> and expect reply in order of requests)?

We have no control over reply ordering. It's entirely possible that
replies are sent out of sequence by the remote endpoint:

  local application          remote endpoint

  sendmsg(message 1)
  sendmsg(message 2)
                             receives message 1
                             receives message 2
                             sends a reply 2 to message 2
                             sends a reply 1 to message 1
  recvmsg() -> reply 2
  recvmsg() -> reply 1

So if a userspace application sends multiple messages concurrently, it
must have some mechanism to correlate the incoming replies with the
original request state. All of the upper-layer protocols that I have
seen have facilities for this (Instance ID in MCTP Control protocol,
Command Slot Index in NVMe-MI, Instance ID in PLDM, ...)

(You could also use the MCTP tags to achieve this correlation, but there
are very likely better ways in the upper-layer protocol)

> b. Does the Kernel maintain queue per socket connection?

MCTP is datagram-oriented; there is no "connection".

In terms of per-socket queues: there is the incoming socket queue
that holds received messages that are waiting for userspace to dequeue
via recvmsg() (or similar). However, there is nothing MCTP-specific
about this, it's all generic socket code.

> 2. Is FASYNC a mechanism for handling asynchronous events associated
> with a file descriptor and it doesn't provide parallelism for
> multiple send operation?

The non-blocking socket interfaces (FASYNC, O_NONBLOCK, MSG_DONTWAIT)
are mostly unrelated to whether your application sends multiple messages
at once. It's entirely possible to have multiple messages in flight
while using the blocking interfaces.

Cheers,


Jeremy
