Message-ID: <CAGXJAmyLPOaTJ9Dap-CZ9XT=K8LpnjCJCveNh4B_pvqHELeqjg@mail.gmail.com>
Date: Sun, 31 Aug 2025 16:29:40 -0700
From: John Ousterhout <ouster@...stanford.edu>
To: Paolo Abeni <pabeni@...hat.com>
Cc: netdev@...r.kernel.org, edumazet@...gle.com, horms@...nel.org, 
	kuba@...nel.org
Subject: Re: [PATCH net-next v15 06/15] net: homa: create homa_sock.h and homa_sock.c

On Tue, Aug 26, 2025 at 3:11 AM Paolo Abeni <pabeni@...hat.com> wrote:

> > +/**
> > + * homa_sock_init() - Constructor for homa_sock objects. This function
> > + * initializes only the parts of the socket that are owned by Homa.
> > + * @hsk:    Object to initialize. The Homa-specific parts must have been
> > + *          initialized to zeroes by the caller.
> > + *
> > + * Return: 0 for success, otherwise a negative errno.
> > + */
> > +int homa_sock_init(struct homa_sock *hsk)
> > +{
> > +    ...
> > +     /* Pick a default port. Must keep the socktab locked from now
> > +      * until the new socket is added to the socktab, to ensure that
> > +      * no other socket chooses the same port.
> > +      */
> > +     spin_lock_bh(&socktab->write_lock);
> > +     starting_port = hnet->prev_default_port;
> > +     while (1) {
> > +             hnet->prev_default_port++;
> > +             if (hnet->prev_default_port < HOMA_MIN_DEFAULT_PORT)
> > +                     hnet->prev_default_port = HOMA_MIN_DEFAULT_PORT;
> > +             other = homa_sock_find(hnet, hnet->prev_default_port);
> > +             if (!other)
> > +                     break;
> > +             sock_put(&other->sock);
> > +             if (hnet->prev_default_port == starting_port) {
> > +                     spin_unlock_bh(&socktab->write_lock);
> > +                     hsk->shutdown = true;
> > +                     hsk->homa = NULL;
> > +                     result = -EADDRNOTAVAIL;
> > +                     goto error;
> > +             }
>
> You likely need to add a cond_resched here (releasing and re-acquiring
> the lock as needed)

Done.

> Do all the above initialization steps need to be done under the socktab
> lock?

No; I have now reorganized to minimize the amount of time the socktab
lock is held. Socket creation is a pretty rare event in Homa
(typically once per process) so this optimization probably doesn't
matter much...

> > +int homa_sock_bind(struct homa_net *hnet, struct homa_sock *hsk,
> > +                u16 port)
> > +{
> > +     ...
> > +     owner = homa_sock_find(hnet, port);
> > +     if (owner) {
> > +             sock_put(&owner->sock);
>
> homa_sock_find() is used in multiple places to check for port usage. I
> think it would be useful to add a variant of this helper that does not
> increment the socket refcount.

It's only used this way in 2 places, both in this file. The
alternatives (either adding another parameter to homa_sock_find or
creating a separate method homa_port_in_use) both seem like they would
add more complexity than the current approach, so my preference is to
leave it as is. If you feel strongly about this, let me know which
option you prefer and I'll implement it. Note also that adding another
parameter to homa_sock_find would be awkward: it could result in a
socket being returned without its reference count incremented, meaning
it isn't really safe for the caller to use, and I worry that this
would lead people to write buggy code.

> > +/**
> > + * homa_sock_wait_wmem() - Block the thread until @hsk's usage of tx
> > + * packet memory drops below the socket's limit.
> > + * @hsk:          Socket of interest.
> > + * @nonblocking:  If there's not enough memory, return -EWOULDBLOCK
> > + *                instead of blocking.
> > + * Return: 0 for success, otherwise a negative errno.
> > + */
> > +int homa_sock_wait_wmem(struct homa_sock *hsk, int nonblocking)
> > +{
> > +     long timeo = hsk->sock.sk_sndtimeo;
> > +     int result;
> > +
> > +     if (nonblocking)
> > +             timeo = 0;
> > +     set_bit(SOCK_NOSPACE, &hsk->sock.sk_socket->flags);
> > +     result = wait_event_interruptible_timeout(*sk_sleep(&hsk->sock),
> > +                             homa_sock_wmem_avl(hsk) || hsk->shutdown,
> > +                             timeo);
> > +     if (signal_pending(current))
> > +             return -EINTR;
> > +     if (result == 0)
> > +             return -EWOULDBLOCK;
> > +     return 0;
> > +}
>
> Perhaps you could use sock_wait_for_wmem() ?

sock_wait_for_wmem is not accessible to modules (it's declared "static").

> > diff --git a/net/homa/homa_sock.h b/net/homa/homa_sock.h
>
> > +/**
> > + * struct homa_sock - Information about an open socket.
> > + */
> > +struct homa_sock {
> > +     ...
> > +
> > +     /**
> > +      * @homa: Overall state about the Homa implementation. NULL
> > +      * means this socket was never initialized or has been deleted.
> > +      */
> > +     struct homa *homa;
> > +
> > +     /**
> > +      * @hnet: Overall state specific to the network namespace for
> > +      * this socket.
> > +      */
> > +     struct homa_net *hnet;
>
> Both the above should likely be removed

What is the motivation for removing them? The homa field can be
accessed through hnet, so I suppose it could be removed, but that will
result in extra instructions (both time and icache space) every time
it is accessed (and there are a bunch of accesses). hnet is more
expensive to remove: it can be accessed through the socket, but the
code path is longer, which, again, wastes time and icache space.

> > +     /**
> > +      * @client_rpc_buckets: Hash table for fast lookup of client RPCs.
> > +      * Modifications are synchronized with bucket locks, not
> > +      * the socket lock.
> > +      */
> > +     struct homa_rpc_bucket client_rpc_buckets[HOMA_CLIENT_RPC_BUCKETS];
> > +
> > +     /**
> > +      * @server_rpc_buckets: Hash table for fast lookup of server RPCs.
> > +      * Modifications are synchronized with bucket locks, not
> > +      * the socket lock.
> > +      */
> > +     struct homa_rpc_bucket server_rpc_buckets[HOMA_SERVER_RPC_BUCKETS];
>
> The above 2 arrays are quite large and should probably be allocated
> separately.

What's the benefit of doing multiple allocations? The individual
arrays will still be 16KB, which isn't exactly small. Is there general
advice on how to decide whether large objects should be split up into
smaller ones?

> > +/**
> > + * homa_sock_wakeup_wmem() - Invoked when tx packet memory has been freed;
> > + * if memory usage is below the limit and there are tasks waiting for memory,
> > + * wake them up.
> > + * @hsk:   Socket of interest.
> > + */
> > +static inline void homa_sock_wakeup_wmem(struct homa_sock *hsk)
> > +{
> > +     if (test_bit(SOCK_NOSPACE, &hsk->sock.sk_socket->flags) &&
> > +         homa_sock_wmem_avl(hsk)) {
> > +             clear_bit(SOCK_NOSPACE, &hsk->sock.sk_socket->flags);
> > +             wake_up_interruptible_poll(sk_sleep(&hsk->sock), EPOLLOUT);
>
> Can hsk be orphaned at this point? I think so.

I don't think so. This function is invoked only from homa_rpc_reap,
and I don't believe homa_rpc_reap can be invoked once a socket is
orphaned (that would be problematic on its own). Also, as part of
socket shutdown all RPCs are deleted and homa_rpc_reap is invoked to
free their resources, so it will wake up anyone waiting for wmem. By
the time socket cleanup has completed (a) there are no RPCs, and (b)
there should be no-one waiting for wmem.

Do you have a particular pathway in mind by which
homa_sock_wakeup_wmem could be invoked after a socket has been
orphaned?

-John-
