Message-ID: <ce4f62a8-1114-47b9-af08-51656e08c2b5@redhat.com>
Date: Tue, 26 Aug 2025 11:05:51 +0200
From: Paolo Abeni <pabeni@...hat.com>
To: John Ousterhout <ouster@...stanford.edu>, netdev@...r.kernel.org
Cc: edumazet@...gle.com, horms@...nel.org, kuba@...nel.org
Subject: Re: [PATCH net-next v15 03/15] net: homa: create shared Homa header
files
On 8/18/25 10:55 PM, John Ousterhout wrote:
> +/**
> + * struct homa_net - Contains Homa information that is specific to a
> + * particular network namespace.
> + */
> +struct homa_net {
> + /** @net: Network namespace corresponding to this structure. */
> + struct net *net;
> +
> + /** @homa: Global Homa information. */
> + struct homa *homa;
It's not clear why the above 2 fields are needed. You could access the
global struct homa instance directly, and 'struct net' is usually
already available wherever a struct homa_net is.
> +/**
> + * is_homa_pkt() - Return true if @skb is a Homa packet, false otherwise.
> + * @skb: Packet buffer to check.
> + * Return: see above.
> + */
> +static inline bool is_homa_pkt(struct sk_buff *skb)
> +{
> + int protocol;
> +
> + /* If the network header hasn't been created yet, assume it's a
> + * Homa packet (Homa never generates any non-Homa packets).
> + */
> + if (skb->network_header == 0)
> + return true;
> + protocol = (skb_is_ipv6(skb)) ? ipv6_hdr(skb)->nexthdr :
> + ip_hdr(skb)->protocol;
> + return protocol == IPPROTO_HOMA;
> +}
This helper is apparently unused in this series; just drop it and add
it back later when it's needed.
> +#define UNIT_LOG(...)
> +#define UNIT_HOOK(...)
Also apparently unused.
> +extern unsigned int homa_net_id;
> +
> +/**
> + * homa_net_from_net() - Return the struct homa_net associated with a particular
> + * struct net.
> + * @net: Get the Homa data for this net namespace.
> + * Return: see above.
> + */
> +static inline struct homa_net *homa_net_from_net(struct net *net)
The customary name for this kind of helper is homa_net()
> +{
> + return (struct homa_net *)net_generic(net, homa_net_id);
> +}
> +
> +/**
> + * homa_from_skb() - Return the struct homa associated with a particular
> + * sk_buff.
> + * @skb: Get the struct homa for this packet buffer.
> + * Return: see above.
> + */
> +static inline struct homa *homa_from_skb(struct sk_buff *skb)
> +{
> + struct homa_net *hnet;
> +
> + hnet = net_generic(dev_net(skb->dev), homa_net_id);
> + return hnet->homa;
You can implement this using homa_net_from_skb(), avoiding some code
duplication
> +}
> +
> +/**
> + * homa_net_from_skb() - Return the struct homa_net associated with a particular
> + * sk_buff.
> + * @skb: Get the struct homa for this packet buffer.
> + * Return: see above.
> + */
> +static inline struct homa_net *homa_net_from_skb(struct sk_buff *skb)
> +{
> + struct homa_net *hnet;
> +
> + hnet = net_generic(dev_net(skb->dev), homa_net_id);
> + return hnet;
You can implement this using homa_net(), avoiding some code duplication.
> +}
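i.e. layer the helpers so that each one is a one-liner on top of the
previous one. Untested userspace sketch with stand-in types (struct net,
struct sk_buff, net_generic() etc. are mocked here; only the layering
is the point):

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types: in the kernel these are the real struct net,
 * struct sk_buff, struct net_device, and net_generic()/dev_net().
 */
struct net { void *priv; };
struct homa { int id; };
struct homa_net { struct net *net; struct homa *homa; };
struct net_device { struct net *nd_net; };
struct sk_buff { struct net_device *dev; };

static unsigned int homa_net_id;

static struct net *dev_net(const struct net_device *dev)
{
	return dev->nd_net;
}

static void *net_generic(struct net *net, unsigned int id)
{
	(void)id;
	return net->priv;
}

/* Base helper, with the customary short name. */
static struct homa_net *homa_net(struct net *net)
{
	return net_generic(net, homa_net_id);
}

/* Each further helper just delegates; no duplicated lookups. */
static struct homa_net *homa_net_from_skb(struct sk_buff *skb)
{
	return homa_net(dev_net(skb->dev));
}

static struct homa *homa_from_skb(struct sk_buff *skb)
{
	return homa_net_from_skb(skb)->homa;
}
```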
> +
> +/**
> + * homa_clock() - Return a fine-grain clock value that is monotonic and
> + * consistent across cores.
> + * Return: see above.
> + */
> +static inline u64 homa_clock(void)
> +{
> + /* As of May 2025 there does not appear to be a portable API that
> + * meets Homa's needs:
> + * - The Intel X86 TSC works well but is not portable.
> + * - sched_clock() does not guarantee monotonicity or consistency.
> + * - ktime_get_mono_fast_ns and ktime_get_raw_fast_ns are very slow
> + * (27 ns to read, vs 8 ns for TSC)
> + * Thus we use a hybrid approach that uses TSC (via get_cycles) where
> + * available (which should be just about everywhere Homa runs).
> + */
> +#ifdef CONFIG_X86_TSC
> + return get_cycles();
> +#else
> + return ktime_get_mono_fast_ns();
> +#endif /* CONFIG_X86_TSC */
> +}
The ktime_get*() variants are fast enough to let e.g. pktgen handle
millions of packets per second. Both the raw TSC and
ktime_get_mono_fast_ns() suffer from various inconsistencies which will
cause the most unexpected issues in the most dangerous situations. I
strongly advise against this early optimization.
> +/**
> + * homa_usecs_to_cycles() - Convert from units of microseconds to units of
> + * homa_clock().
> + * @usecs: A time measurement in microseconds
> + * Return: The time in homa_clock() units corresponding to @usecs.
> + */
> +static inline u64 homa_usecs_to_cycles(u64 usecs)
> +{
> + u64 tmp;
> +
> + tmp = usecs * homa_clock_khz();
> + do_div(tmp, 1000);
> + return tmp;
> +}
Apparently not used in this series.
FWIW, do_div() would likely be much more costly than the timestamp
fetch itself.
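e.g. if homa_clock() counts nanoseconds (as the ktime_get*() family
does), the conversion degenerates to a multiply and the do_div()
disappears entirely (sketch):

```c
#include <stdint.h>

/* With a nanosecond-based homa_clock(), microseconds-to-clock-units
 * conversion needs no division at all: 1 us == 1000 ns.
 */
static uint64_t homa_usecs_to_cycles(uint64_t usecs)
{
	return usecs * 1000ull;
}
```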
> +
> +/* Homa Locking Strategy:
> + *
> + * (Note: this documentation is referenced in several other places in the
> + * Homa code)
> + *
> + * In the Linux TCP/IP stack the primary locking mechanism is a sleep-lock
> + * per socket. However, per-socket locks aren't adequate for Homa, because
> + * sockets are "larger" in Homa. In TCP, a socket corresponds to a single
> + * connection between two peers; an application can have hundreds or
> + * thousands of sockets open at once, so per-socket locks leave lots of
> + * opportunities for concurrency. With Homa, a single socket can be used for
> + * communicating with any number of peers, so there will typically be just
> + * one socket per thread. As a result, a single Homa socket must support many
> + * concurrent RPCs efficiently, and a per-socket lock would create a bottleneck
> + * (Homa tried this approach initially).
> + *
> + * Thus, the primary locks used in Homa are spinlocks at RPC granularity.
> + * This
> + * allows operations on different RPCs for the same socket to proceed
> + * concurrently. Homa also has socket locks (which are spinlocks different
> + * from the official socket sleep-locks) but these are used much less
> + * frequently than RPC locks.
> + *
> + * Lock Ordering:
> + *
> + * There are several other locks in Homa besides RPC locks, all of which
> + * are spinlocks. When multiple locks are held, they must be acquired in a
> + * consistent order in order to prevent deadlock. Here are the rules for Homa:
> + * 1. Except for RPC and socket locks, all locks should be considered
> + * "leaf" locks: don't acquire other locks while holding them.
> + * 2. The lock order is:
> + * * RPC lock
> + * * Socket lock
> + * * Other lock
> + * 3. It is not safe to wait on an RPC lock while holding any other lock.
> + * 4. It is safe to wait on a socket lock while holding an RPC lock, but
> + * not while holding any other lock.
The last 2 points are not needed: they are obviously implied by the
previous ones.
> + *
> + * It may seem surprising that RPC locks are acquired *before* socket locks,
> + * but this is essential for high performance. Homa has been designed so that
> + * many common operations (such as processing input packets) can be performed
> + * while holding only an RPC lock; this allows operations on different RPCs
> + * to proceed in parallel. Only a few operations, such as handing off an
> + * incoming message to a waiting thread, require the socket lock. If socket
> + * locks had to be acquired first, any operation that might eventually need
> + * the socket lock would have to acquire it before the RPC lock, which would
> + * severely restrict concurrency.
FWIW, I think the above scheme can offer good performance if and only
if every operation requiring the socket lock is slow-path: multiple
RPCs/cores contending for the same socket lock will experience false
sharing and cache contention/misses, which will hurt performance badly.
And if the operations requiring the socket lock really are slow-path,
the RPC/socket lock order should be irrelevant for performance.
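For reference, the documented order boils down to something like the
following. Userspace sketch with pthread mutexes standing in for the
kernel spinlocks; the structure and field names here are made up for
illustration:

```c
#include <pthread.h>

/* Hypothetical structures, just to illustrate rule 2 above: take the
 * RPC lock first, and the socket lock only for the rare socket-level
 * work (e.g. handing off an incoming message).
 */
struct homa_sock {
	pthread_mutex_t lock;
	int ready_count;	/* messages ready for handoff */
};

struct homa_rpc {
	pthread_mutex_t lock;
	struct homa_sock *hsk;
};

static void homa_handoff(struct homa_rpc *rpc)
{
	pthread_mutex_lock(&rpc->lock);		/* RPC lock first */
	/* ... per-RPC work needing no other lock runs here ... */
	pthread_mutex_lock(&rpc->hsk->lock);	/* socket lock second */
	rpc->hsk->ready_count++;
	pthread_mutex_unlock(&rpc->hsk->lock);
	pthread_mutex_unlock(&rpc->lock);
}
```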
[...]
> +static inline void homa_skb_get(struct sk_buff *skb, void *dest, int offset,
> + int length)
> +{
> + memcpy(dest, skb_transport_header(skb) + offset, length);
> +}
Apparently unused.
/P