Date:   Thu, 25 Mar 2021 12:27:17 -0700
From:   Cong Wang <xiyou.wangcong@...il.com>
To:     John Fastabend <john.fastabend@...il.com>
Cc:     Andrii Nakryiko <andrii@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Alexei Starovoitov <ast@...com>, bpf <bpf@...r.kernel.org>,
        Linux Kernel Network Developers <netdev@...r.kernel.org>,
        Lorenz Bauer <lmb@...udflare.com>
Subject: Re: [bpf PATCH 2/2] bpf, sockmap: fix incorrect fwd_alloc accounting

On Wed, Mar 24, 2021 at 7:46 PM John Fastabend <john.fastabend@...il.com> wrote:
>
> Cong Wang wrote:
> > On Wed, Mar 24, 2021 at 2:00 PM John Fastabend <john.fastabend@...il.com> wrote:
> > >
> > > Incorrect accounting of fwd_alloc can result in a warning when the socket
> > > is torn down,
> > >
>
> [...]
>
> > > To resolve this, let's only account for sockets on the ingress queue that are
> > > still associated with the current socket. In the redirect case we will
> > > check memory limits per 6fa9201a89898, but will omit fwd_alloc accounting
> > > until skb is actually enqueued. When the skb is sent via skb_send_sock_locked
> > > or received with sk_psock_skb_ingress memory will be claimed on psock_other.
>                      ^^^^^^^^^^^^^^^^^^^^
> >
> > You mean sk_psock_skb_ingress(), right?
>
> Yes.

skb_send_sock_locked() actually allocates its own skb when sending, hence
it uses a different skb for memory accounting.

>
> [...]
>
> > > @@ -880,12 +876,13 @@ static void sk_psock_strp_read(struct strparser *strp, struct sk_buff *skb)
> > >                 kfree_skb(skb);
> > >                 goto out;
> > >         }
> > > -       skb_set_owner_r(skb, sk);
> > >         prog = READ_ONCE(psock->progs.skb_verdict);
> > >         if (likely(prog)) {
> > > +               skb->sk = psock->sk;
> >
> > Why is skb_orphan() not needed here?
>
> These come from strparser which do not have skb->sk set.

Hmm, but sk_psock_verdict_recv() passes a clone too, like
strparser, so either we need it for both, or not at all. Clones
do not have skb->sk, so I think you can remove the one in
sk_psock_verdict_recv() too.
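
To illustrate the point about clones, here is a minimal userspace sketch.
It is not kernel code: the mock_* types and functions are hypothetical
stand-ins, and the only behaviour modelled is that a clone starts out with
no owner socket and no destructor, so there is nothing to orphan.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins, NOT the kernel's struct sock/sk_buff. */
struct mock_sock {
	int id;
};

struct mock_skb {
	struct mock_sock *sk;                      /* owning socket, if any */
	void (*destructor)(struct mock_skb *skb);  /* uncharges the owner   */
	const char *data;                          /* shared payload        */
	size_t len;
};

/* Placeholder destructor; a real one would release accounted memory. */
static void mock_rfree(struct mock_skb *skb)
{
	(void)skb;
}

/*
 * Model of cloning: the payload description is copied, but ownership is
 * not inherited. The clone has neither an owner nor a destructor.
 */
static void mock_skb_clone(struct mock_skb *dst, const struct mock_skb *src)
{
	*dst = *src;
	dst->sk = NULL;
	dst->destructor = NULL;
}
```

In this model an owned skb (sk and destructor both set) yields a clone with
both fields NULL, which is why an orphan step on the clone would be a no-op.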


>
> >
> > Nit: You can just use 'sk' here, so "skb->sk = sk".
>
> Sure that is a bit nicer, will respin with this.
>
> >
> >
> > >                 tcp_skb_bpf_redirect_clear(skb);
> > >                 ret = sk_psock_bpf_run(psock, prog, skb);
> > >                 ret = sk_psock_map_verd(ret, tcp_skb_bpf_redirect_fetch(skb));
> > > +               skb->sk = NULL;
> >
> > Why do you want to set it to NULL here?
>
> So we don't cause the stack to throw other errors later if we
> were to call skb_orphan for example. Various places in the skb
> helpers expect both skb->sk and skb->destructor to be set together
> and here we are just using it as a mechanism to feed the sk into
> the BPF program side. The above skb_set_owner_r for example
> would likely BUG().

Sounds reasonable.
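
For readers following along, the pattern can be sketched in userspace as
below. This is only a rough model under stated assumptions: the mock_*
names are hypothetical, and mock_orphan_would_bug() is loosely based on the
check in skb_orphan() that an skb without a destructor must not have
skb->sk set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins, NOT the kernel's struct sock/sk_buff. */
struct mock_sock {
	int id;
};

struct mock_skb {
	struct mock_sock *sk;
	void (*destructor)(struct mock_skb *skb);
};

/* Stand-in for a verdict program that only reads skb->sk. */
typedef int (*mock_verdict_fn)(const struct mock_skb *skb);

/*
 * Loose model of the consistency check: an owner without a destructor is
 * the inconsistent state that would trip the kernel's BUG_ON().
 */
static int mock_orphan_would_bug(const struct mock_skb *skb)
{
	return skb->destructor == NULL && skb->sk != NULL;
}

/*
 * Lend the socket to the program through skb->sk, then clear it again so
 * that later owner-aware helpers never see sk set without a destructor.
 */
static int mock_run_verdict(struct mock_skb *skb, struct mock_sock *sk,
			    mock_verdict_fn prog)
{
	int ret;

	skb->sk = sk;    /* visible to the program only */
	ret = prog(skb);
	skb->sk = NULL;  /* restore the consistent "unowned" state */
	return ret;
}

/* Example program: returns the socket id it was handed, or -1 if none. */
static int mock_prog_read_sk(const struct mock_skb *skb)
{
	return skb->sk ? skb->sk->id : -1;
}
```

With this model, the program sees the socket during the run, and the skb is
back in the unowned state afterwards; leaving skb->sk set instead would make
mock_orphan_would_bug() report the inconsistent state.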

Thanks.
