Message-ID: <CAM_iQpV=XPW08hS3UyakLxPZrujS_HV-BB9bRbnZ1m+vWQytcQ@mail.gmail.com>
Date:   Wed, 19 May 2021 13:17:47 -0700
From:   Cong Wang <xiyou.wangcong@...il.com>
To:     John Fastabend <john.fastabend@...il.com>
Cc:     Linux Kernel Network Developers <netdev@...r.kernel.org>,
        bpf <bpf@...r.kernel.org>, Cong Wang <cong.wang@...edance.com>,
        Daniel Borkmann <daniel@...earbox.net>,
        Jakub Sitnicki <jakub@...udflare.com>,
        Lorenz Bauer <lmb@...udflare.com>
Subject: Re: [Patch bpf] udp: fix a memory leak in udp_read_sock()

On Wed, May 19, 2021 at 12:06 PM John Fastabend
<john.fastabend@...il.com> wrote:
>
> Cong Wang wrote:
> > On Tue, May 18, 2021 at 12:56 PM John Fastabend
> > <john.fastabend@...il.com> wrote:
> > >
> > > Cong Wang wrote:
> > > > On Mon, May 17, 2021 at 10:36 PM John Fastabend
> > > > <john.fastabend@...il.com> wrote:
> > > > >
> > > > > Cong Wang wrote:
> > > > > > From: Cong Wang <cong.wang@...edance.com>
> > > > > >
> > > > > > sk_psock_verdict_recv() clones the skb and uses the clone
> > > > > > afterward, so udp_read_sock() should free the original skb after
> > > > > > done using it.
> > > > >
> > > > > The clone only happens if sk_psock_verdict_recv() returns >0.
> > > >
> > > > Sure, in case of error, no one uses the original skb either,
> > > > so we still need to free it.
> > >
> > > But the data is going to be dropped then. I'm questioning if this
> > > is the best we can do or not. It's the simplest, sure, but could we
> > > do a bit more work and peek those skbs or requeue them? Otherwise
> > > if you cross memory limits for a bit you're likely to drop these
> > > unnecessarily.
> >
> > What are the benefits of not dropping it? When sockmap takes
> > over sk->sk_data_ready() it should have total control over the skb's
> > in the receive queue. Otherwise user-space recvmsg() would race
> > with sockmap when they try to read the first skb at the same time,
> > therefore potentially user-space could get duplicated data (one via
> > recvmsg(), one via sockmap). I don't see any benefits but races here.
>
> The benefit of _not_ dropping it is the packet gets to the receiver
> side. We've spent a bit of effort to get a packet across the network
> and received on the stack, so dropping it at the last point is not
> so friendly.

Well, at least udp_recvmsg() could drop packets too in various
scenarios, for example, a copy error. So, I do not think sockmap
is special.

>
> About races: is the socket locked by the caller here? Or is this
> not the case for UDP?

Unlike TCP, the socket is not locked during BH in the UDP receive path.
Locking it is not the answer here, because 1) we certainly do not want
to slow down the UDP fast path; 2) UDP lacks sk->sk_backlog_rcv().

>
> It's OK in the end to say "it's UDP and lossy" but ideally we don't
> make things worse by adding sockmap into the stack. We had these
> problems already on the TCP side, where they are much more severe
> because the sender believes retransmits will happen, and we have fixed
> them by now. It would be nice if the UDP side also didn't introduce
> drops.

Like I said, the normal UDP receive path drops packets too, so
sockmap is no different here. TCP does peek packets, for two
reasons: 1) it has to support splice(); 2) it has locked the socket
during BH receive. Neither applies to UDP, so UDP can't peek
packets here.
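
To make the ownership rule behind the patch concrete, here is a minimal
user-space sketch in C. It is a toy model, not kernel code: struct buf,
struct payload, buf_clone(), verdict_recv(), read_one() and the
live_bufs counter are all hypothetical names standing in for sk_buff,
skb_clone(), sk_psock_verdict_recv() and the per-skb step of
udp_read_sock(). It assumes, as described in this thread, that the
verdict callback keeps only its clone, so the reader frees the original
on both the success and the error path.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the shared packet data; not a kernel type. */
struct payload {
    int refs;               /* number of buffer heads sharing this data */
    size_t len;
    char data[64];
};

/* Hypothetical stand-in for struct sk_buff (head only). */
struct buf {
    struct payload *p;
};

static int live_bufs;       /* buffer heads not yet freed; nonzero at exit means a leak */
static struct buf *kept;    /* the clone retained by the consumer */

static struct buf *buf_alloc(const char *msg)
{
    struct buf *b = malloc(sizeof(*b));
    size_t len = strlen(msg);

    if (!b)
        return NULL;
    b->p = calloc(1, sizeof(*b->p));
    if (!b->p) {
        free(b);
        return NULL;
    }
    if (len > sizeof(b->p->data))
        len = sizeof(b->p->data);       /* truncate oversized input */
    memcpy(b->p->data, msg, len);
    b->p->len = len;
    b->p->refs = 1;
    live_bufs++;
    return b;
}

/* New head sharing the same payload, loosely modelled on skb_clone(). */
static struct buf *buf_clone(struct buf *orig)
{
    struct buf *c = malloc(sizeof(*c));

    if (!c)
        return NULL;
    c->p = orig->p;
    c->p->refs++;
    live_bufs++;
    return c;
}

/* Drop one head; the payload goes away with its last head. */
static void buf_free(struct buf *b)
{
    if (!b)
        return;
    assert(b->p->refs > 0);
    if (--b->p->refs == 0)
        free(b->p);
    free(b);
    live_bufs--;
}

/* Model of the verdict callback: keeps a clone on success, nothing on error. */
static int verdict_recv(struct buf *orig, int fail)
{
    struct buf *c;

    if (fail)
        return -1;
    c = buf_clone(orig);
    if (!c)
        return -1;
    kept = c;
    return (int)orig->p->len;
}

/* Model of the reader: the original is freed whatever the verdict was. */
static int read_one(struct buf *orig, int fail)
{
    int ret = verdict_recv(orig, fail);

    buf_free(orig);     /* without this call the original head leaks */
    return ret;
}
```

In this model, removing the buf_free() call in read_one() leaves
live_bufs nonzero after every packet, which is the toy equivalent of
the leak being fixed.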

Thanks.
