Message-ID: <CAGXJAmxwZ7Vc1UFfNDxNKvhizL8ks0gw8GosiJkc66V+XUUWHw@mail.gmail.com>
Date: Wed, 7 May 2025 11:30:40 -0700
From: John Ousterhout <ouster@...stanford.edu>
To: Paolo Abeni <pabeni@...hat.com>
Cc: netdev@...r.kernel.org, edumazet@...gle.com, horms@...nel.org,
kuba@...nel.org
Subject: Re: [PATCH net-next v8 06/15] net: homa: create homa_sock.h and homa_sock.c
On Mon, May 5, 2025 at 9:46 AM Paolo Abeni <pabeni@...hat.com> wrote:
>
> On 5/3/25 1:37 AM, John Ousterhout wrote:
> > + /* Initialize Homa-specific fields. */
> > + spin_lock_bh(&socktab->write_lock);
> > + atomic_set(&hsk->protect_count, 0);
> > + spin_lock_init(&hsk->lock);
> > + atomic_set(&hsk->protect_count, 0);
>
> Duplicate 'atomic_set(&hsk->protect_count, 0);' statement above
Oops; fixed now.
> > + hsk->port = homa->prev_default_port;
> > + hsk->inet.inet_num = hsk->port;
> > + hsk->inet.inet_sport = htons(hsk->port);
>
> The above code looks like a bind() operation, but it's unclear why
> it's performed at init time.
All Homa sockets are automatically assigned a port at creation time,
so there's no need to call bind() in the common case where a socket is
used only on the client side. bind() needs to be called only if the
application wants a well-known port number.
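Roughly, the intent at creation time is the following (simplified
sketch; the real code also handles wraparound and port collisions
under socktab->write_lock):

    spin_lock_bh(&socktab->write_lock);
    homa->prev_default_port++;
    hsk->port = homa->prev_default_port;     /* auto-assigned port */
    hsk->inet.inet_num = hsk->port;          /* same fields bind() fills */
    hsk->inet.inet_sport = htons(hsk->port);
    spin_unlock_bh(&socktab->write_lock);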
> > + for (i = 0; i < HOMA_CLIENT_RPC_BUCKETS; i++) {
> > + struct homa_rpc_bucket *bucket = &hsk->client_rpc_buckets[i];
> > +
> > + spin_lock_init(&bucket->lock);
> > + bucket->id = i;
> > + INIT_HLIST_HEAD(&bucket->rpcs);
> > + }
> > + for (i = 0; i < HOMA_SERVER_RPC_BUCKETS; i++) {
> > + struct homa_rpc_bucket *bucket = &hsk->server_rpc_buckets[i];
> > +
> > + spin_lock_init(&bucket->lock);
> > + bucket->id = i + 1000000;
> > + INIT_HLIST_HEAD(&bucket->rpcs);
> > + }
>
> I'm under the impression that using rhashtable for both the client
> and the server RPCs will deliver both more efficient memory usage
> and better performance.
I wasn't aware of rhashtable; I'll take a look.
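For the record, a minimal setup would look something like the
following (untested sketch; the rpc_table and ht_node fields are
hypothetical):

    #include <linux/rhashtable.h>

    static const struct rhashtable_params homa_rpc_ht_params = {
            .key_len = sizeof(u64),                  /* RPC id */
            .key_offset = offsetof(struct homa_rpc, id),
            .head_offset = offsetof(struct homa_rpc, ht_node),
            .automatic_shrinking = true,
    };

    /* Once per socket, e.g. in homa_sock_init(): */
    err = rhashtable_init(&hsk->rpc_table, &homa_rpc_ht_params);

    /* Insertion, and lookup under rcu_read_lock(): */
    err = rhashtable_insert_fast(&hsk->rpc_table, &rpc->ht_node,
                                 homa_rpc_ht_params);
    rpc = rhashtable_lookup_fast(&hsk->rpc_table, &id,
                                 homa_rpc_ht_params);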
> > +
> > + tx_memory = refcount_read(&hsk->sock.sk_wmem_alloc);
> > + if (tx_memory != 1) {
> > + pr_err("%s found sk_wmem_alloc %llu bytes, port %d\n",
> > + __func__, tx_memory, hsk->port);
> > + }
>
> Just:
> WARN_ON_ONCE(refcount_read(&sk->sk_wmem_alloc) != 1);
Done (but this is a bit unsatisfying because it generates less useful
information in the log).
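One possible middle ground (untested): WARN_ON_ONCE() returns its
condition, so it can gate a more detailed message:

    if (WARN_ON_ONCE(refcount_read(&sk->sk_wmem_alloc) != 1))
            pr_err("%s: sk_wmem_alloc %u, port %d\n", __func__,
                   refcount_read(&sk->sk_wmem_alloc), hsk->port);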
> > +/**
> > + * homa_sock_destroy() - Destructor for homa_sock objects. This function
> > + * only cleans up the parts of the object that are owned by Homa.
> > + * @hsk: Socket to destroy.
> > + */
> > +void homa_sock_destroy(struct homa_sock *hsk)
> > +{
> > + homa_sock_shutdown(hsk);
> > + sock_set_flag(&hsk->inet.sk, SOCK_RCU_FREE);
>
> Why is the flag set only now and not at creation time?
No reason that I can think of; I've now moved it to creation time.
After asking ChatGPT about this flag, I'm no longer certain that Homa
needs it. Can you help me understand the conditions that would make
the flag necessary/unnecessary?
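My tentative understanding is that it matters when a socket can be
found under rcu_read_lock() without taking a reference, as the socktab
lookups here do; SOCK_RCU_FREE defers freeing of the sock to an RCU
grace period so that this pattern stays safe (sketch):

    rcu_read_lock();
    hsk = homa_sock_find(socktab, port);    /* no reference taken */
    if (hsk) {
            /* Touching hsk here is safe only because SOCK_RCU_FREE
             * keeps the sock from being freed until a grace period
             * elapses, even if it is closed concurrently.
             */
    }
    rcu_read_unlock();

Is that the right mental model?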
> [...]
> > +/**
> > + * struct homa_socktab - A hash table that maps from port numbers (either
> > + * client or server) to homa_sock objects.
> > + *
> > + * This table is managed exclusively by homa_socktab.c, using RCU to
> > + * minimize synchronization during lookups.
> > + */
> > +struct homa_socktab {
> > + /**
> > + * @write_lock: Controls all modifications to this object; not needed
> > + * for socket lookups (RCU is used instead). Also used to
> > + * synchronize port allocation.
> > + */
> > + spinlock_t write_lock;
> > +
> > + /**
> > + * @buckets: Heads of chains for hash table buckets. Chains
> > + * consist of homa_sock objects.
> > + */
> > + struct hlist_head buckets[HOMA_SOCKTAB_BUCKETS];
> > +};
>
> This is probably a bit too large to be unconditionally allocated for
> each netns. You are probably better off with a global hash table, with
> the lookup key including the netns itself.
OK, will do.
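Presumably the bucket computation would then fold the netns into the
hash, e.g. with net_hash_mix() (sketch; HOMA_SOCKTAB_BUCKET_BITS is
hypothetical):

    #include <linux/hash.h>
    #include <net/netns/hash.h>

    static inline u32 homa_socktab_bucket(const struct net *net, u16 port)
    {
            return hash_32(port ^ net_hash_mix(net),
                           HOMA_SOCKTAB_BUCKET_BITS);
    }

Chain walks would then also need to compare sock_net(&hsk->inet.sk)
in addition to the port.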
> [...]
> > +/**
> > + * homa_sock_lock() - Acquire the lock for a socket.
> > + * @hsk: Socket to lock.
> > + */
> > +static inline void homa_sock_lock(struct homa_sock *hsk)
> > + __acquires(&hsk->lock)
> > +{
> > + spin_lock_bh(&hsk->lock);
>
> I was wondering how the hsk socket lock could be nested under a
> spinlock... The above can't work unless you prevent any core and
> inet-related operations on hsk sockets. That in turn means either
> duplicating a lot of code or preventing a lot of basic functionality
> from working on Homa sockets.
>
> Homa sockets are still inet ones. It's expected that SOL_SOCKET and
> SOL_IP[V6] socket options work on them (or at least could be implemented).
>
> I think this point is very critical.
Can you provide an example of a specific situation that you think will
be problematic? My belief (hope?) is that Homa does not use any socket
operations that require the official socket lock. Homa implements
socket options using its own spinlock.
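Concretely, the pattern Homa uses for its own options is just the
following (sketch; the option field shown is hypothetical):

    /* Homa-private option: protected by hsk->lock (a spinlock),
     * never by lock_sock(), so the core socket lock is never
     * nested under a spinlock.
     */
    homa_sock_lock(hsk);
    hsk->buffer_opt = val;
    homa_sock_unlock(hsk);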
> Somewhat related: the patch order IMHO makes the review complex,
> because I often need to look at later patches to get the needed
> context.
Unfortunately I don't know how to fix this problem. I'm a bit
skeptical that it is even possible to understand this much code in a
purely linear fashion. If you have suggestions for how I can organize
the patches to make them easier to review, I'd be happy to hear them.
At the same time, it's been a struggle for me to extract these patches
cleanly from the full Homa source and evolve them while allowing
concurrent development of the full source; this has led to some of the
awkward chunks of code you have noticed.
-John-