Message-ID: <20240510050341.27782-1-kuniyu@amazon.com>
Date: Fri, 10 May 2024 14:03:41 +0900
From: Kuniyuki Iwashima <kuniyu@...zon.com>
To: <pabeni@...hat.com>
CC: <billy@...rlabs.sg>, <davem@...emloft.net>, <edumazet@...gle.com>,
<kuba@...nel.org>, <kuni1840@...il.com>, <kuniyu@...zon.com>,
<netdev@...r.kernel.org>
Subject: Re: [PATCH v1 net] af_unix: Update unix_sk(sk)->oob_skb under sk_receive_queue lock.
From: Paolo Abeni <pabeni@...hat.com>
Date: Thu, 09 May 2024 11:12:38 +0200
> On Tue, 2024-05-07 at 10:00 -0700, Kuniyuki Iwashima wrote:
> > Billy Jheng Bing-Jhong reported a race between __unix_gc() and
> > queue_oob().
> >
> > __unix_gc() tries to garbage-collect close()d inflight sockets,
> > and if the socket has an MSG_OOB skb in unix_sk(sk)->oob_skb, GC
> > drops the reference and sets the pointer to NULL locklessly.
> >
> > However, the peer socket can still send an MSG_OOB message to the
> > GC candidate, and queue_oob() can update unix_sk(sk)->oob_skb
> > concurrently, resulting in a NULL pointer dereference. [0]
> >
> > To avoid the race, let's update unix_sk(sk)->oob_skb under the
> > sk_receive_queue's lock.
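To make the race concrete, the two paths look roughly like this before
the fix (simplified from net/unix/af_unix.c and net/unix/garbage.c, not
verbatim):

	/* old __unix_gc(): drop the OOB reference of a close()d in-flight
	 * socket without holding any per-socket lock.
	 */
	skb = u->oob_skb;
	if (skb) {
		u->oob_skb = NULL;
		kfree_skb(skb);
	}

	/* queue_oob(): the peer can still install a new OOB skb on the GC
	 * candidate concurrently, holding only unix_state_lock(other),
	 * which GC does not take.
	 */
	unix_state_lock(other);
	if (ousk->oob_skb)
		consume_skb(ousk->oob_skb);
	WRITE_ONCE(ousk->oob_skb, skb);
	skb_queue_tail(&other->sk_receive_queue, skb);
	unix_state_unlock(other);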
>
> I'm sorry to delay this fix but...
>
> AFAICS every time AF_UNIX touches the oob_skb, it's under the receiver's
> unix_state_lock. The only exception is __unix_gc(). What about just
> acquiring that lock there?
In the new GC, there is unix_state_lock -> gc_lock ordering, so going
with unix_state_lock would need another fix there.
That's why I chose to lock the recvq for the old GC too.
https://lore.kernel.org/netdev/20240507172606.85532-1-kuniyu@amazon.com/
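Both sides then serialize on the recvq lock, roughly like this (sketch
only, not the exact hunks of the patch):

	/* queue_oob(), send side: publish the new oob_skb and queue the skb
	 * under the receive queue lock, still inside unix_state_lock(other).
	 */
	spin_lock(&other->sk_receive_queue.lock);
	if (ousk->oob_skb)
		consume_skb(ousk->oob_skb);
	WRITE_ONCE(ousk->oob_skb, skb);
	__skb_queue_tail(&other->sk_receive_queue, skb);
	spin_unlock(&other->sk_receive_queue.lock);

	/* old __unix_gc(): clear the dangling oob_skb under the same lock.
	 * (Taking unix_state_lock() here instead would not work for the new
	 * GC, which has the unix_state_lock -> gc_lock ordering.)
	 */
	spin_lock(&u->sk.sk_receive_queue.lock);
	skb = u->oob_skb;
	if (skb) {
		u->oob_skb = NULL;
		kfree_skb(skb);
	}
	spin_unlock(&u->sk.sk_receive_queue.lock);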
Also, Linus says:
  I really get the feeling that 'sb->oob_skb' should actually be forced
  to always be in sync with the receive queue by always doing the
  accesses under the receive_queue lock.
(That's in the security@ thread I added you to, but I just noticed
Linus replied to the previous mail. I'll forward the mails to you.)
> Otherwise, there are other chunks touching the oob_skb where this patch
> does not add the receive queue spin lock protection, e.g. in
> unix_stream_recv_urg(), making the code a bit inconsistent.
Yes, with this patch the receive path is protected by unix_state_lock(),
and the send path by unix_state_lock() and the recvq lock.
Ideally, as Linus suggested, we should acquire the recvq lock everywhere
oob_skb is touched and remove the additional reference taken by skb_get(),
but I thought that would be too much for a fix, so I'd do that
refactoring in the next cycle.
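Something along these lines on the recv side, for example (a very rough,
hypothetical sketch; the skb lifetime after unlock is exactly the part
the refactoring has to sort out):

	/* e.g. unix_stream_recv_urg(): read/clear oob_skb only with
	 * sk_receive_queue.lock held, matching the send side and GC, so
	 * oob_skb would no longer need its own reference from skb_get().
	 */
	spin_lock(&sk->sk_receive_queue.lock);
	oob_skb = u->oob_skb;
	if (oob_skb && !(state->flags & MSG_PEEK))
		WRITE_ONCE(u->oob_skb, NULL);
	spin_unlock(&sk->sk_receive_queue.lock);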
What do you think?