linux-kernel - Re: [PATCH] net: correct zerocopy refcnt with newly allocated UDP or RAW uarg


Message-ID: <CA+FuTSd+wGq2V2G1uDBdmyUGe1cquBXdVtSW_7BCwYQEd8E3ag@mail.gmail.com>
Date:   Fri, 14 Aug 2020 17:44:53 +0200
From:   Willem de Bruijn <willemdebruijn.kernel@...il.com>
To:     linmiaohe <linmiaohe@...wei.com>
Cc:     David Miller <davem@...emloft.net>,
        Alexey Kuznetsov <kuznet@....inr.ac.ru>,
        Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
        Jakub Kicinski <kuba@...nel.org>,
        Network Development <netdev@...r.kernel.org>,
        linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] net: correct zerocopy refcnt with newly allocated UDP or
 RAW uarg

On Fri, Aug 14, 2020 at 10:17 AM linmiaohe <linmiaohe@...wei.com> wrote:
>
> Willem de Bruijn <willemdebruijn.kernel@...il.com> wrote:
> >On Thu, Aug 13, 2020 at 1:59 PM Miaohe Lin <linmiaohe@...wei.com> wrote:
> >>
> >> The var extra_uref is introduced to pass the initial reference taken
> >> in sock_zerocopy_alloc to the first generated skb. But now we may fail
> >> to pass the initial reference with newly allocated UDP or RAW uarg
> >> when the skb is zcopied.
> >
> >extra_uref is true if there is no previous skb to append to, or if there is a previous skb that does not yet have zerocopy data associated with it (because the previous call(s) did not set MSG_ZEROCOPY).
> >
> >In other words, when first (allocating and) associating a zerocopy struct with the skb.
>
> Many thanks for your explanation. The variable extra_uref plays the role you describe. I just borrowed its description from the previous commit log here.
>
> >
> >> -               extra_uref = !skb_zcopy(skb);   /* only ref on new uarg */
> >> +               /* Only ref on newly allocated uarg. */
> >> +               if (!skb_zcopy(skb) || (sk->sk_type != SOCK_STREAM && skb_zcopy(skb) != uarg))
> >> +                       extra_uref = true;
> >
> >SOCK_STREAM does not use __ip_append_data.
> >
> >This leaves skb_zcopy(skb) && skb_zcopy(skb) != uarg as the only new branch.
> >
> >This function can only acquire a uarg through sock_zerocopy_realloc, which on skb_zcopy(skb) only returns the existing uarg or NULL (for sockets that are not SOCK_STREAM).
> >
> >So I don't see when that condition can happen.
> >
>
> On skb_zcopy(skb), we return the existing uarg iff (uarg->id + uarg->len == atomic_read(&sk->sk_zckey)) in sock_zerocopy_realloc. So we may get a newly allocated
> uarg via sock_zerocopy_alloc(). Though we may not trigger this codepath now, it's still a potential problem that we may lose track of the right uarg.

I don't think that can happen.

The question is when this branch is false:

                next = (u32)atomic_read(&sk->sk_zckey);
                if ((u32)(uarg->id + uarg->len) == next) {

I cannot come up with a case. I think it might be vestigial. The goal
is to ensure that only a consecutive range of notification IDs is
appended. Each notification ID corresponds to a sendmsg invocation
with MSG_ZEROCOPY. In both TCP and UDP with corking, data is ordered
and changes to these fields happen together as a transaction:

                /* realloc only when socket is locked (TCP, UDP cork),
                 * so uarg->len and sk_zckey access is serialized
                 */
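
To illustrate the invariant, here is a minimal user-space sketch, not
kernel code. The toy_* names and structs are hypothetical stand-ins for
uarg->id, uarg->len and sk->sk_zckey, and the allocation step assumes the
first uarg takes the current key and covers one ID. It only shows that
when both fields are updated together, id + len keeps matching the
socket key, including across u32 wraparound:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the id/len fields of a zerocopy uarg. */
struct toy_uarg {
	uint32_t id;	/* first notification ID covered */
	uint32_t len;	/* number of consecutive IDs covered */
};

/* Hypothetical stand-in for sk->sk_zckey. */
struct toy_sock {
	uint32_t zckey;	/* next notification ID to hand out */
};

/* Assumed allocation step: take the current key, cover exactly one ID. */
static void toy_alloc(struct toy_uarg *uarg, struct toy_sock *sk)
{
	uarg->id = sk->zckey++;
	uarg->len = 1;
}

/* The check from the thread: does the uarg end where the key begins? */
static bool toy_can_extend(const struct toy_uarg *uarg,
			   const struct toy_sock *sk)
{
	return (uint32_t)(uarg->id + uarg->len) == sk->zckey;
}

/* One more MSG_ZEROCOPY send: both fields change as one transaction. */
static bool toy_extend(struct toy_uarg *uarg, struct toy_sock *sk)
{
	if (!toy_can_extend(uarg, sk))
		return false;
	uarg->len++;
	sk->zckey++;
	return true;
}
```

In this model the check can only fail if the key is advanced without
extending the uarg, which is what the serialization noted in the
comment above is meant to rule out.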