[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240709021839.53278-1-kuniyu@amazon.com>
Date: Mon, 8 Jul 2024 19:18:39 -0700
From: Kuniyuki Iwashima <kuniyu@...zon.com>
To: <john.fastabend@...il.com>
CC: <Rao.Shoaib@...cle.com>, <bpf@...r.kernel.org>, <cong.wang@...edance.com>,
<davem@...emloft.net>, <edumazet@...gle.com>, <jakub@...udflare.com>,
<kuba@...nel.org>, <kuniyu@...zon.com>, <mhal@...x.co>,
<netdev@...r.kernel.org>, <pabeni@...hat.com>, <takamitz@...zon.co.jp>
Subject: Re: [PATCH bpf v3 1/4] af_unix: Disable MSG_OOB handling for sockets in sockmap/sockhash
From: John Fastabend <john.fastabend@...il.com>
Date: Mon, 08 Jul 2024 18:24:02 -0700
> Kuniyuki Iwashima wrote:
> > From: Michal Luczaj <mhal@...x.co>
> > Date: Sun, 7 Jul 2024 23:28:22 +0200
> > > AF_UNIX socket tracks the most recent OOB packet (in its receive queue)
> > > with an `oob_skb` pointer. BPF redirecting does not account for that: when
> > > an OOB packet is moved between sockets, `oob_skb` is left outdated. This
> > > results in a single skb that may be accessed from two different sockets.
> > >
> > > Take the easy way out: silently drop MSG_OOB data targeting any socket that
> > > is in a sockmap or a sockhash. Note that such silent drop is akin to the
> > > fate of redirected skb's scm_fp_list (SCM_RIGHTS, SCM_CREDENTIALS).
> > >
> > > For symmetry, forbid MSG_OOB in unix_bpf_recvmsg().
> > >
> > > Suggested-by: Kuniyuki Iwashima <kuniyu@...zon.com>
> > > Fixes: 314001f0bf92 ("af_unix: Add OOB support")
> > > Signed-off-by: Michal Luczaj <mhal@...x.co>
> >
>
> Why does af_unix put the oob data on the sk_receive_queue()? Wouldn't it
> be enough to just have the ousk->oob_skb hold the reference to the skb?
We need to remember when oob_skb was sent for some reasons:
1. OOB data must stop recv() at the point
2. ioctl(SIOCATMARK) must return true if OOB data is at the head of
the recvq regardless that the data is already consumed or not
See tools/testing/selftests/net/af_unix/msg_oob.c
And actually TCP has OOB data in the normal skb ququed in recvq.
TCP also holds the copied data in tp->urg_data and SEQ# in tp->urg_seq.
Later when recv() passes through the OOB SEQ#, the recv() is stopped at
OOB data.
Note that the implementation has some bugs, e.g. it doesn't have a flag
to indicate if each OOB data is consumed, and we can recv() the stale
OOB data twice.
(CCed Takamitsu who is working on the URG bugs on the TCP side)
> I think for TCP/UDP at least I'll want to handle MSG_OOB data correctly.
UDP doesn't support MSG_OOB.
> For redirect its probably fine to just drop or skip it, but when we are
> just reading recv msgs and parsing/observing it would be nice to not change
> how the application works. In practice I don't recall anyone reporting
> issues on TCP side though from incorrectly handling URG data.
IIUC, the normal read case is processed by tcp_recvmsg(), right ?
Then, OOB data is handled at the found_ok_skb/skip_copy: labels.
>
> From TCP side I believe we can fix the OOB case by checking the oob queue
> before doing the recvmsg handling. If the urg data wasn't on the general
> sk_receive_queue we could do similar here for af_unix? My argument for
> URG not working for redirect would be to let userspace handle it if they
> cared.
For the redirect cse, in tcp_read_skb(), we need to check tp->urg_data and
tp->{copied,urg}_seq and clear them if needed as done in tcp_recvmsg().
Powered by blists - more mailing lists