netdev - Re: [PATCH] bpf/sockmap: read psock ingress_msg before sk_receive

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <5e1620c332f3c_159a2af0aa9505b861@john-XPS-13-9370.notmuch>
Date:   Wed, 08 Jan 2020 10:34:43 -0800
From:   John Fastabend <john.fastabend@...il.com>
To:     Daniel Borkmann <daniel@...earbox.net>,
        John Fastabend <john.fastabend@...il.com>,
        Lingpeng Chen <forrest0579@...il.com>
Cc:     netdev@...r.kernel.org, bpf@...r.kernel.org
Subject: Re: [PATCH] bpf/sockmap: read psock ingress_msg before
 sk_receive_queue

Daniel Borkmann wrote:
> On 1/8/20 7:01 PM, John Fastabend wrote:
> > Daniel Borkmann wrote:
> >> On Wed, Jan 08, 2020 at 12:57:08PM +0800, Lingpeng Chen wrote:
> >>> Right now in tcp_bpf_recvmsg, sock read data first from sk_receive_queue
> >>> if not empty than psock->ingress_msg otherwise. If a FIN packet arrives
> >>> and there's also some data in psock->ingress_msg, the data in
> >>> psock->ingress_msg will be purged. It is always happen when request to a
> >>> HTTP1.0 server like python SimpleHTTPServer since the server send FIN
> >>> packet after data is sent out.
> >>>
> >>> Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
> >>> Reported-by: Arika Chen <eaglesora@...il.com>
> >>> Suggested-by: Arika Chen <eaglesora@...il.com>
> >>> Signed-off-by: Lingpeng Chen <forrest0579@...il.com>
> >>> Signed-off-by: John Fastabend <john.fastabend@...il.com>
> >>> ---
> >>>   net/ipv4/tcp_bpf.c | 7 ++++---
> >>>   1 file changed, 4 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
> >>> index e38705165ac9..f7e902868fce 100644
> >>> --- a/net/ipv4/tcp_bpf.c
> >>> +++ b/net/ipv4/tcp_bpf.c
> >>> @@ -123,12 +123,13 @@ int tcp_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
> >>>   
> >>>   	if (unlikely(flags & MSG_ERRQUEUE))
> >>>   		return inet_recv_error(sk, msg, len, addr_len);
> >>
> >> Shouldn't we also move the error queue handling below the psock test as
> >> well and let tcp_recvmsg() natively do it in case of !psock?
> >>
> > 
> > You mean the MSG_ERRQUEUE flag handling? If the user sets MSG_ERRQUEUE
> > they expect to receive any queued errors it would be wrong to return
> > psock data in this case if psock is attached and has data on queue and
> > user passes MSG_ERRQUEUE flag.
> > 
> >   MSG_ERRQUEUE (since Linux 2.2)
> >    This flag specifies that queued errors should be received from the socket
> >    error queue.  The error is passed in an ancillary message with a type
> >    dependent on the protocol (for IPv4 IP_RECVERR).  The user should supply
> >    a buffer of sufficient size. See cmsg(3) and ip(7) for more information.
> >    The payload of the original packet that caused the error is passed as
> >    normal data via msg_iovec. The original destination address of the
> >    datagram that caused the error is supplied via msg_name.
> > 
> > I believe it needs to be where it is.
> 
> I meant that it should have looked as follows (aka moving both below the
> psock test) ...
> 
>          psock = sk_psock_get(sk);
>          if (unlikely(!psock))
>              return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len);
>          if (unlikely(flags & MSG_ERRQUEUE))
>              return inet_recv_error(sk, msg, len, addr_len);
> 	if (!skb_queue_empty(&sk->sk_receive_queue) && [...]
> 
> ... since when detached it's handled already via tcp_recvmsg() internals.
> 

Ah, OK the 'if (unlikely(!psock))' is a very rare case so I'm not sure
it matters much. But, I agree it is slightly nicer.

Lingpeng, can you spin a v2 moving the MSG_ERRQUEUE check down and also
keep the ACK and stable tags I don't think its a very big change the
Acks can stick around IMO.

Thanks.