Message-ID: <c3f212f8-01a5-4037-af76-39170aa6a6ce@rbox.co>
Date: Tue, 9 Apr 2024 01:25:23 +0200
From: Michal Luczaj <mhal@...x.co>
To: Kuniyuki Iwashima <kuniyu@...zon.com>
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
netdev@...r.kernel.org, pabeni@...hat.com
Subject: Re: [PATCH net 1/2] af_unix: Fix garbage collector racing against connect()
On 4/8/24 23:18, Kuniyuki Iwashima wrote:
> From: Michal Luczaj <mhal@...x.co>
> Date: Mon, 8 Apr 2024 17:58:45 +0200
...
>>  	list_for_each_entry_safe(u, next, &gc_inflight_list, link) {
>> -		long total_refs;
>> -
>> -		total_refs = file_count(u->sk.sk_socket->file);
>> +		struct sock *sk = &u->sk;
>> +		long total_refs = file_count(sk->sk_socket->file);
>>
>>  		WARN_ON_ONCE(!u->inflight);
>>  		WARN_ON_ONCE(total_refs < u->inflight);
>> @@ -286,6 +295,11 @@ static void __unix_gc(struct work_struct *work)
>>  			list_move_tail(&u->link, &gc_candidates);
>>  			__set_bit(UNIX_GC_CANDIDATE, &u->gc_flags);
>>  			__set_bit(UNIX_GC_MAYBE_CYCLE, &u->gc_flags);
>> +
>> +			if (sk->sk_state == TCP_LISTEN) {
>> +				unix_state_lock(sk);
>> +				unix_state_unlock(sk);
>
> Less likely though, what if the same connect() happens after this ?
>
>   connect(S, addr)            sendmsg(S, [V]); close(V)   __unix_gc()
>   ----------------            -------------------------   -----------
>   NS = unix_create1()
>   skb1 = sock_wmalloc(NS)
>   L = unix_find_other(addr)
>                                                           for u in gc_inflight_list:
>                                                             if (total_refs == inflight_refs)
>                                                               add u to gc_candidates
>                                                               // L was already traversed
>                                                               // in a previous iteration.
>   unix_state_lock(L)
>   unix_peer(S) = NS
>
>                                                           // gc_candidates={L, V}
>
>                                                           for u in gc_candidates:
>                                                             scan_children(u, dec_inflight)
>
>                                                           // embryo (skb1) was not
>                                                           // reachable from L yet, so V's
>                                                           // inflight remains unchanged
>   __skb_queue_tail(L, skb1)
>   unix_state_unlock(L)
>                                                           for u in gc_candidates:
>                                                             if (u.inflight)
>                                                               scan_children(u, inc_inflight_move_tail)
>
>                                                           // V count=1 inflight=2 (!)
If I understand your question correctly: yes, in this case L's queue does
technically change between the scan_children() calls, because the embryo
appears. That is harmless, though. __unix_gc() already holds unix_gc_lock, and
unix_inflight() takes the same lock, so the enqueued embryo cannot carry any
SCM_RIGHTS (i.e. it doesn't affect the inflight graph).
Is there something I'm missing?
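To make the locking argument concrete, here is a minimal userspace sketch. It is
not kernel code: model_gc_lock, model_inflight() and the other model_* names are
hypothetical stand-ins for unix_gc_lock and unix_inflight(), and a C11
atomic_flag try-lock replaces the spinlock so that a single-threaded run can
observe the exclusion instead of blocking.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for unix_gc_lock: a flag-based try-lock. */
static atomic_flag model_gc_lock = ATOMIC_FLAG_INIT;
static long model_inflight_total;	/* models the inflight graph */

static bool model_trylock(void)
{
	/* test_and_set returns the previous value: false means we took it. */
	return !atomic_flag_test_and_set(&model_gc_lock);
}

static void model_unlock(void)
{
	atomic_flag_clear(&model_gc_lock);
}

/* Models unix_inflight(): the graph changes only under the lock.
 * The kernel would spin here; the model fails fast instead. */
static bool model_inflight(void)
{
	if (!model_trylock())
		return false;
	model_inflight_total++;
	model_unlock();
	return true;
}

/* Models a GC pass: while it holds the lock, a sender's attempt to
 * register an fd is refused, so the graph stays unchanged. */
static bool model_gc_graph_is_stable(void)
{
	bool locked = model_trylock();

	assert(locked);
	long before = model_inflight_total;
	bool sender_got_in = model_inflight();
	bool stable = !sender_got_in && before == model_inflight_total;

	model_unlock();
	return stable;
}
```

Calling model_inflight() outside a GC pass succeeds and bumps the counter, while
the attempt made inside model_gc_graph_is_stable() is refused. In the model, that
is why nothing enqueued during the pass can add edges to the graph.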
> As you pointed out, this GC's assumption is basically wrong; the GC
> works correctly only when the set of traversed sockets does not change
> over 3 scan_children() calls.
>
> That's why I reworked the GC not to rely on receive queue.
> https://lore.kernel.org/netdev/20240325202425.60930-1-kuniyu@amazon.com/
Right, I'll try to get my head around your series :)
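For reference while reading, a toy model of the three passes discussed above
(candidate selection, dec_inflight, inc_inflight). This is only a sketch under
simplifying assumptions: struct toy_sock and all toy_* names are hypothetical,
the receive queue is a fixed array, recursion replaces the kernel's
list_move_tail() bookkeeping, and nothing runs concurrently. That last
assumption is exactly the one under discussion: the result is only meaningful
if the queues do not change between the passes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_QUEUE 4

/* Hypothetical toy socket, not the kernel's struct unix_sock. */
struct toy_sock {
	long total_refs;	/* models file_count() */
	long inflight;		/* references held by queued SCM_RIGHTS */
	bool candidate;		/* models UNIX_GC_CANDIDATE */
	bool maybe_cycle;	/* models UNIX_GC_MAYBE_CYCLE */
	size_t nqueued;
	struct toy_sock *queue[TOY_MAX_QUEUE];	/* sockets in flight in our queue */
};

/* Pass 2 helper: drop references that come from candidate u's queue. */
static void toy_dec_children(struct toy_sock *u)
{
	assert(u->nqueued <= TOY_MAX_QUEUE);
	for (size_t i = 0; i < u->nqueued; i++)
		if (u->queue[i]->candidate)
			u->queue[i]->inflight--;
}

/* Pass 3 helper: u is externally reachable, so restore its children. */
static void toy_inc_children(struct toy_sock *u)
{
	for (size_t i = 0; i < u->nqueued; i++) {
		struct toy_sock *c = u->queue[i];

		if (!c->candidate)
			continue;
		c->inflight++;
		if (c->maybe_cycle) {
			c->maybe_cycle = false;
			toy_inc_children(c);
		}
	}
}

/* Returns the number of sockets that only cyclic references keep alive. */
static size_t toy_gc(struct toy_sock **socks, size_t n)
{
	size_t garbage = 0;

	for (size_t i = 0; i < n; i++) {	/* pass 1: pick candidates */
		struct toy_sock *u = socks[i];

		u->candidate = u->inflight > 0 && u->total_refs == u->inflight;
		u->maybe_cycle = u->candidate;
	}
	for (size_t i = 0; i < n; i++)		/* pass 2: dec_inflight */
		if (socks[i]->candidate)
			toy_dec_children(socks[i]);
	for (size_t i = 0; i < n; i++) {	/* pass 3: inc_inflight */
		struct toy_sock *u = socks[i];

		if (u->candidate && u->maybe_cycle && u->inflight > 0) {
			u->maybe_cycle = false;
			toy_inc_children(u);
		}
	}
	for (size_t i = 0; i < n; i++)
		garbage += socks[i]->candidate && socks[i]->maybe_cycle;
	return garbage;
}
```

With two closed sockets that hold each other's fd, plus a chain hanging off a
socket that userspace still has open, toy_gc() flags only the two-socket cycle.
If a queue gained or lost an entry between pass 2 and pass 3, the decrements and
increments would no longer pair up, which is the failure mode in the diagram.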