Message-ID: <20240409190244.26402-1-kuniyu@amazon.com>
Date: Tue, 9 Apr 2024 12:02:44 -0700
From: Kuniyuki Iwashima <kuniyu@...zon.com>
To: <mhal@...x.co>
CC: <davem@...emloft.net>, <edumazet@...gle.com>, <kuba@...nel.org>,
<kuniyu@...zon.com>, <netdev@...r.kernel.org>, <pabeni@...hat.com>
Subject: Re: [PATCH net 1/2] af_unix: Fix garbage collector racing against connect()
From: Michal Luczaj <mhal@...x.co>
Date: Tue, 9 Apr 2024 11:16:35 +0200
> On 4/9/24 02:22, Kuniyuki Iwashima wrote:
> > From: Michal Luczaj <mhal@...x.co>
> > Date: Tue, 9 Apr 2024 01:25:23 +0200
> >> On 4/8/24 23:18, Kuniyuki Iwashima wrote:
> >>> From: Michal Luczaj <mhal@...x.co>
> >>> Date: Mon, 8 Apr 2024 17:58:45 +0200
> >> ...
> >>>> list_for_each_entry_safe(u, next, &gc_inflight_list, link) {
> >
> > Please move sk declaration here and
> >
> >>>> - long total_refs;
> >>>> -
> >>>> - total_refs = file_count(u->sk.sk_socket->file);
> >
> > keep these 3 lines for reverse xmas tree order.
>
> Tricky to have all 3 of them in reverse xmas tree order. Did you mean
>
> struct sock *sk = &u->sk;
> long total_refs;
>
> total_refs = file_count(sk->sk_socket->file);
>
> ?
Yes, it's netdev convention.
https://docs.kernel.org/process/maintainer-netdev.html#local-variable-ordering-reverse-xmas-tree-rcs
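To make the ordering concrete, here is a minimal userspace sketch. The `*_model` types and `file_count_model()` are made-up stand-ins for this example, not the kernel's definitions. Declarations run from the longest line to the shortest, so the initialized `sk` comes first and the `total_refs` assignment becomes its own statement.

```c
/* Hypothetical stand-ins for the kernel types; only the shape matters. */
struct file_model { long count; };
struct socket_model { struct file_model *file; };
struct sock_model { struct socket_model *sk_socket; };
struct unix_sock_model { struct sock_model sk; };

/* Stand-in for file_count(); an assumption, not the kernel helper. */
static long file_count_model(const struct file_model *f)
{
	return f->count;
}

static long total_refs_of(struct unix_sock_model *u)
{
	/* Reverse xmas tree: longest declaration line first. */
	struct sock_model *sk = &u->sk;
	long total_refs;

	/* Assignment split from the declaration to keep the ordering. */
	total_refs = file_count_model(sk->sk_socket->file);

	return total_refs;
}
```

Initializing `total_refs` in its declaration would make that line the longest one and force it above `sk`, which it depends on. That is why the assignment is kept separate.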
>
> >>> connect(S, addr)          sendmsg(S, [V]); close(V)  __unix_gc()
> >>> ----------------          -------------------------  -----------
> >>> NS = unix_create1()
> >>> skb1 = sock_wmalloc(NS)
> >>> L = unix_find_other(addr)
> >>>                                                      for u in gc_inflight_list:
> >>>                                                        if (total_refs == inflight_refs)
> >>>                                                          add u to gc_candidates
> >>>                                                      // L was already traversed
> >>>                                                      // in a previous iteration.
> >>> unix_state_lock(L)
> >>> unix_peer(S) = NS
> >>>
> >>>                                                      // gc_candidates={L, V}
> >>>
> >>>                                                      for u in gc_candidates:
> >>>                                                        scan_children(u, dec_inflight)
> >>>
> >>>                                                      // embryo (skb1) was not
> >>>                                                      // reachable from L yet, so V's
> >>>                                                      // inflight remains unchanged
> >>> __skb_queue_tail(L, skb1)
> >>> unix_state_unlock(L)
> >>>                                                      for u in gc_candidates:
> >>>                                                        if (u.inflight)
> >>>                                                          scan_children(u, inc_inflight_move_tail)
> >>>
> >>>                                                      // V count=1 inflight=2 (!)
> >>
> >> If I understand your question, in this case L's queue technically does change
> >> between scan_children()s: embryo appears, but that's meaningless. __unix_gc()
> >> already holds unix_gc_lock, so the enqueued embryo can not carry any SCM_RIGHTS
> >> (i.e. it doesn't affect the inflight graph). Note that unix_inflight() takes the
> >> same unix_gc_lock.
> >>
> >> Is there something I'm missing?
> >
> > Ah exactly, you are right.
> >
> > Could you repost this patch with only my comment above addressed?
>
> Yeah, sure. One question though: what I wrote above is basically a rephrasing of
> the commit message:
>
> (...) After flipping the lock, a possibly SCM-laden embryo is already
> enqueued. And if there is another connect() coming, its embryo won't
> carry SCM_RIGHTS as we already took the unix_gc_lock.
>
> As I understand, the important missing part was the clarification that embryo,
> even though enqueued after the lock flipping, won't affect the inflight graph,
> right? So how about:
>
> (...) After flipping the lock, a possibly SCM-laden embryo is already
> enqueued. And if there is another embryo coming, it can not possibly carry
> SCM_RIGHTS. At this point, unix_inflight() can not happen because
> unix_gc_lock is already taken. Inflight graph remains unaffected.
Sounds good to me.
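For readers following along, the argument can be modelled in userspace roughly as below. This is only a sketch under stated assumptions: `gc_lock_model` is a C11 `atomic_flag` standing in for `unix_gc_lock`, `try_inflight_model()` stands in for `unix_inflight()`, and every `*_model` name is made up for illustration. In the kernel the sender would spin on the lock rather than be refused; the model uses a try-lock only so the effect can be observed from a single thread.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Model only: a userspace stand-in for unix_gc_lock, not kernel code. */
static atomic_flag gc_lock_model = ATOMIC_FLAG_INIT;

struct usock_model {
	long total_refs;	/* stand-in for file_count() */
	long inflight;		/* stand-in for the inflight counter */
};

/* Try to take the model lock; returns true on success. */
static bool gc_trylock_model(void)
{
	return !atomic_flag_test_and_set(&gc_lock_model);
}

static void gc_unlock_model(void)
{
	atomic_flag_clear(&gc_lock_model);
}

/*
 * Stand-in for unix_inflight(): it needs the same lock as the GC,
 * so it cannot bump the counter while a GC pass holds that lock.
 */
static bool try_inflight_model(struct usock_model *u)
{
	if (!gc_trylock_model())
		return false;

	u->inflight++;
	gc_unlock_model();

	return true;
}

/* Candidate test from the diagram: total_refs == inflight_refs. */
static bool is_gc_candidate(const struct usock_model *u)
{
	return u->total_refs == u->inflight;
}

/*
 * Model of one GC pass: with the lock held, an attempted inflight
 * update is refused, so the counter is the same before and after.
 */
static bool gc_pass_sees_stable_inflight(struct usock_model *u)
{
	bool attempted;
	long before;

	if (!gc_trylock_model())
		return false;

	before = u->inflight;
	attempted = try_inflight_model(u);	/* refused: lock is held */
	gc_unlock_model();

	return !attempted && u->inflight == before;
}
```

Once the pass releases the lock, `try_inflight_model()` succeeds again. That mirrors the point above: an embryo enqueued during the pass cannot carry new SCM_RIGHTS, so the inflight graph seen by the GC stays unchanged.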
Thanks!