netdev - Re: [PATCH bpf-next v3 03/12] bpf: XDP metadata RX kfuncs

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <a5a636cc-5b03-686f-4be0-000383b05cfc@linux.dev>
Date:   Thu, 8 Dec 2022 14:53:37 -0800
From:   Martin KaFai Lau <martin.lau@...ux.dev>
To:     Stanislav Fomichev <sdf@...gle.com>
Cc:     ast@...nel.org, daniel@...earbox.net, andrii@...nel.org,
        song@...nel.org, yhs@...com, john.fastabend@...il.com,
        kpsingh@...nel.org, haoluo@...gle.com, jolsa@...nel.org,
        David Ahern <dsahern@...il.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Willem de Bruijn <willemb@...gle.com>,
        Jesper Dangaard Brouer <brouer@...hat.com>,
        Anatoly Burakov <anatoly.burakov@...el.com>,
        Alexander Lobakin <alexandr.lobakin@...el.com>,
        Magnus Karlsson <magnus.karlsson@...il.com>,
        Maryam Tahhan <mtahhan@...hat.com>, xdp-hints@...-project.net,
        netdev@...r.kernel.org, bpf@...r.kernel.org
Subject: Re: [PATCH bpf-next v3 03/12] bpf: XDP metadata RX kfuncs

On 12/8/22 11:07 AM, Stanislav Fomichev wrote:
>>> @@ -102,11 +112,25 @@ int bpf_prog_offload_init(struct bpf_prog *prog, union bpf_attr *attr)
>>>        if (err)
>>>                goto err_maybe_put;
>>>
>>> +     prog->aux->offload_requested = !(attr->prog_flags & BPF_F_XDP_HAS_METADATA);
>>> +
>>
>> If I read the set correctly, bpf prog can either use metadata kfunc or offload
>> but not both. It is fine to start with only supporting metadata kfunc when there
>> is no offload but will be useful to understand the reason. I assume an offloaded
>> bpf prog should still be able to call the bpf helpers like adjust_head/tail and
>> the same should go for any kfunc?
> 
> Yes, I'm assuming there should be some work on the offloaded device
> drivers to support metadata kfuncs.
> Offloaded kfuncs, in general, seem hard (how do we call kernel func
> from the device-offloaded prog?); so refusing kfuncs early for the
> offloaded case seems fair for now?

Ah, ok.  I was actually thinking the HW offloaded prog can just use the software 
ndo_* kfunc (like other bpf-helpers).  From skimming some 
bpf_prog_offload_ops:prepare implementation, I think you are right and it seems 
BPF_PSEUDO_KFUNC_CALL has not been recognized yet.

[ ... ]

>>> @@ -226,10 +263,17 @@ static void __bpf_prog_offload_destroy(struct bpf_prog *prog)
>>>
>>>    void bpf_prog_offload_destroy(struct bpf_prog *prog)
>>>    {
>>> +     struct net_device *netdev = NULL;
>>> +
>>>        down_write(&bpf_devs_lock);
>>> -     if (prog->aux->offload)
>>> +     if (prog->aux->offload) {
>>> +             netdev = prog->aux->offload->netdev;
>>>                __bpf_prog_offload_destroy(prog);
>>> +     }
>>>        up_write(&bpf_devs_lock);
>>> +
>>> +     if (netdev)
>>
>> May be I have missed a refcnt or lock somewhere.  Is it possible that netdev may
>> have been freed?
> 
> Yeah, with the offload framework, there are no refcnts. We put an
> "offloaded" device into a separate hashtable (protected by
> rtnl/semaphore).
> maybe_remove_bound_netdev will re-grab the locks (due to ordering:
> rtnl->bpf_devs_lock) and remove the device from the hashtable if it's
> still there.
> At least this is how, I think, it should work; LMK if something is
> still fishy here...
> 
> Or is the concern here that somebody might allocate new netdev reusing
> the same address? I think I have enough checks in
> maybe_remove_bound_netdev to guard against that. Or, at least, to make
> it safe :-)

Race is ok because ondev needs to be removed anyway when '!ondev->offdev && 
list_empty(&ondev->progs)'?  hmmm... tricky, please add a comment. :)

Why it cannot be done together in the bpf_devs_lock above?  The above cannot 
take an extra rtnl_lock before bpf_devs_lock?