Message-ID: <b29d7ead-6e2c-4a52-9a0a-56892e0015b6@rbox.co>
Date: Thu, 20 Jun 2024 22:35:55 +0200
From: Michal Luczaj <mhal@...x.co>
To: Kuniyuki Iwashima <kuniyu@...zon.com>
Cc: cong.wang@...edance.com, davem@...emloft.net, edumazet@...gle.com,
kuba@...nel.org, kuni1840@...il.com, netdev@...r.kernel.org,
pabeni@...hat.com
Subject: Re: [PATCH v2 net 01/15] af_unix: Set sk->sk_state under
unix_state_lock() for truly disconencted peer.

On 6/19/24 21:19, Kuniyuki Iwashima wrote:
> From: Michal Luczaj <mhal@...x.co>
> Date: Wed, 19 Jun 2024 20:14:48 +0200
>> On 6/17/24 20:21, Kuniyuki Iwashima wrote:
>>> From: Michal Luczaj <mhal@...x.co>
>>> Date: Mon, 17 Jun 2024 01:28:52 +0200
>>>> (...)
>>>> Another AF_UNIX sockmap issue is with OOB. When OOB packet is sent, skb is
>>>> added to recv queue, but also u->oob_skb is set. Here's the problem: when
>>>> this skb goes through bpf_sk_redirect_map() and is moved between socks,
>>>> oob_skb remains set on the original sock.
>>>
>>> Good catch!
>>>
>>>>
>>>> [ 23.688994] WARNING: CPU: 2 PID: 993 at net/unix/garbage.c:351 unix_collect_queue+0x6c/0xb0
>>>> [ 23.689019] CPU: 2 PID: 993 Comm: kworker/u32:13 Not tainted 6.10.0-rc2+ #137
>>>> [ 23.689021] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
>>>> [ 23.689024] Workqueue: events_unbound __unix_gc
>>>> [ 23.689027] RIP: 0010:unix_collect_queue+0x6c/0xb0
>>>>
>>>> I wanted to write a patch, but then I realized I'm not sure what's the
>>>> expected behaviour. Should the oob_skb setting follow to the skb's new sock
>>>> or should it be dropped (similarly to what is happening today with
>>>> scm_fp_list, i.e. redirect strips inflights)?
>>>
>>> The former will require large refactoring as we need to check if the
>>> redirect happens for BPF_F_INGRESS and if the redirected sk is also
>>> SOCK_STREAM etc.
>>>
>>> So, I'd go with the latter. Probably we can check if skb is u->oob_skb
>>> and drop OOB data and retry next in unix_stream_read_skb(), and forbid
>>> MSG_OOB in unix_bpf_recvmsg().
>>> (...)
>>
>> Yeah, sounds reasonable. I'm just not sure I understand the retry part. For
>> each skb_queue_tail() there's one ->sk_data_ready() (which does
>> ->read_skb()). Why bother with a retry?
>
> Exactly.
>
>
>>
>> This is what I was thinking:
>>
>
> When you post it, please make sure to CC bpf@ and sockmap maintainers too.

Done: https://lore.kernel.org/netdev/20240620203009.2610301-1-mhal@rbox.co/

Thanks!

In fact, should I try to document those not-so-obvious OOB/sockmap
interactions? And speaking of documentation, an astute reader noted that
`man unix` is lying:

  Sockets API
       ...
       UNIX domain sockets do not support the transmission of out-of-band
       data (the MSG_OOB flag for send(2) and recv(2)).

  NOTES
       ...
       UNIX domain stream sockets do not support the notion of out-of-band
       data.
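
For anyone who wants to check the man page claim against a running kernel,
here is a minimal userspace sketch. It assumes Linux with AF_UNIX OOB
support built in (CONFIG_AF_UNIX_OOB); the helper name unix_oob_roundtrip()
is made up for illustration. It sends a single MSG_OOB byte over a stream
socketpair and tries to read it back with MSG_OOB. Where the kernel rejects
the flag, the helper returns None instead of raising.

```python
import socket


def unix_oob_roundtrip():
    """Send one MSG_OOB byte over an AF_UNIX stream socketpair.

    Returns the byte read back with MSG_OOB, or None if the running
    kernel refuses MSG_OOB on AF_UNIX (for example EOPNOTSUPP on send).
    """
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        # Without OOB support this send is expected to fail.
        a.send(b"x", socket.MSG_OOB)
        # MSG_DONTWAIT keeps the probe from blocking if no OOB byte
        # is pending on the receiving side.
        return b.recv(1, socket.MSG_OOB | socket.MSG_DONTWAIT)
    except OSError:
        # Assumption: any socket error here means "OOB not usable".
        return None
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(unix_oob_roundtrip())
```

A non-None result means the kernel did carry out-of-band data over an
AF_UNIX stream socket, which is the opposite of what the man page says.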