Message-ID: <CAGXJAmyYmizvm350vSGmJqdOt8d+d0soP95FGhBUQ5nr8kNqnw@mail.gmail.com>
Date: Fri, 10 Jan 2025 16:19:24 -0800
From: John Ousterhout <ouster@...stanford.edu>
To: "D. Wythe" <alibuda@...ux.alibaba.com>
Cc: netdev@...r.kernel.org, pabeni@...hat.com, edumazet@...gle.com,
horms@...nel.org, kuba@...nel.org
Subject: Re: [PATCH net-next v5 07/12] net: homa: create homa_sock.h and homa_sock.c
On Fri, Jan 10, 2025 at 1:25 AM D. Wythe <alibuda@...ux.alibaba.com> wrote:
>
> > +void homa_sock_unlink(struct homa_sock *hsk)
> > +{
> > + struct homa_socktab *socktab = hsk->homa->port_map;
> > + struct homa_socktab_scan *scan;
> > +
> > + /* If any scans refer to this socket, advance them to refer to
> > + * the next socket instead.
> > + */
> > + spin_lock_bh(&socktab->write_lock);
> > + list_for_each_entry(scan, &socktab->active_scans, scan_links) {
> > + if (!scan->next || scan->next->sock != hsk)
> > + continue;
> > + scan->next = (struct homa_socktab_links *)
> > + rcu_dereference(hlist_next_rcu(&scan->next->hash_links));
> > + }
>
> I don't get it. Why not just mark this sock as unavailable and skip it
> when the iterator accesses it?
>
> The iterator is used under RCU, and given that your sock has the
> SOCK_RCU_FREE flag set, it appears that there should be no concerns
> regarding dangling pointers.
The RCU lock needn't be held for the entire lifetime of an iterator,
but rather only when certain functions are invoked, such as
homa_socktab_next. Thus it's possible for a socket to be reclaimed and
freed while a scan is in progress. This is described in the comments
for homa_socktab_start_scan. This behavior is necessary because of
homa_timer, which needs to call schedule in the middle of a scan and
that can't be done without releasing the RCU lock. I don't like this
complexity but I haven't been able to find a better alternative.
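
To make the timing concrete, here is roughly what the scan pattern in
homa_timer looks like. This is a simplified sketch, not the actual
code: I'm approximating the signatures of the scan functions, and
do_timer_work stands in for the real per-socket processing.

    struct homa_socktab_scan scan;
    struct homa_sock *hsk;

    /* The RCU read lock is held only around the calls that walk the
     * socket hash table; it is dropped while each socket is processed.
     */
    rcu_read_lock();
    hsk = homa_socktab_start_scan(homa->port_map, &scan);
    rcu_read_unlock();
    while (hsk) {
            /* No RCU lock is held here, so homa_timer may call
             * schedule. If this socket is unlinked in the meantime,
             * homa_sock_unlink advances scan.next so the scan itself
             * stays valid (the body still needs its own protection,
             * such as the socket lock, for the socket it is using).
             */
            do_timer_work(hsk);

            rcu_read_lock();
            hsk = homa_socktab_next(&scan);
            rcu_read_unlock();
    }

Because the RCU lock is dropped between iterations, advancing scan.next
in homa_sock_unlink is what keeps a concurrent scan from ending up with
a pointer into freed memory.
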
> > + hsk->shutdown = true;
>
> From the actual usage of the shutdown member, I think you should use
> sock_set_flag(SOCK_DEAD) and check it with sock_flag(SOCK_DEAD).
I wasn't aware of SOCK_DEAD until your email. After poking around a
bit to learn more about SOCK_DEAD, I am nervous about following your
advice. I'm still not certain exactly when SOCK_DEAD is set or who is
allowed to set it. The best information I could find was from ChatGPT,
which says this:
"The SOCK_DEAD flag indicates that the socket is no longer referenced
by any user-space file descriptors or kernel entities. Essentially,
the socket is considered "dead" and ready to be cleaned up."
If ChatGPT isn't hallucinating, this would suggest that Homa shouldn't
set SOCK_DEAD, since the conditions above might not yet be true when
homa_sock_shutdown is invoked.
Moreover, I'm concerned that some other entity might set SOCK_DEAD
before homa_sock_shutdown is invoked, in which case homa_sock_shutdown
would not clean up the socket properly.
Thus, it seems safest to me for Homa to have its own shutdown flag.
Let me know if you still think Homa should use SOCK_DEAD.
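
For concreteness, here is how the two alternatives would read at a
typical check site (just an illustrative sketch; sk stands for the
struct sock embedded in hsk, and the -ESHUTDOWN return is only an
example):

    /* What Homa does now: a private flag, set only by
     * homa_sock_shutdown.
     */
    hsk->shutdown = true;
    /* ... later, e.g. at the top of a socket operation ... */
    if (hsk->shutdown)
            return -ESHUTDOWN;

    /* The suggested alternative, using the generic socket flag. */
    sock_set_flag(sk, SOCK_DEAD);
    /* ... later ... */
    if (sock_flag(sk, SOCK_DEAD))
            return -ESHUTDOWN;

With the second form Homa couldn't tell whether the flag was set by
homa_sock_shutdown or by some other part of the networking stack, which
is exactly the ambiguity that worries me.
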
> > +
> > + while (!list_empty(&hsk->dead_rpcs))
> > + homa_rpc_reap(hsk, 1000);
>
> I took a quick look at homa_rpc_reap; although there is no possibility
> of an infinite loop currently, it still raises concerns.
>
> It might be better to let homa_rpc_reap() handle this kind of action by itself.
> For example, code like that:
>
> homa_rpc_reap(hsk, 0, flags=RPC_FORCE_REAP|RPC_REAP_ALL);
>
> In this way, anyone making modifications to homa_rpc_reap() in the future will
> at least be aware that there is such a case that needs to be handled well.
I have changed the API for homa_rpc_reap to this:
int homa_rpc_reap(struct homa_sock *hsk, bool reap_all)
The caller can no longer specify a count. When reap_all is false,
homa_rpc_reap determines for itself what represents a
"small amount of work" to perform; no need for the caller to figure
this out.
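
With that change, the reap loop in homa_sock_shutdown above collapses
to a single call, and the incremental callers just pass false (a
sketch; the exact call sites may differ slightly):

    /* In homa_sock_shutdown: reap everything, however long it takes. */
    homa_rpc_reap(hsk, true);

    /* Elsewhere, on time-sensitive paths: do only a small,
     * self-limited amount of reaping; homa_rpc_reap chooses the batch
     * size itself.
     */
    homa_rpc_reap(hsk, false);
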
-John-