lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200325191454.ub5x3kayowsc75vg@ast-mbp>
Date:   Wed, 25 Mar 2020 12:14:54 -0700
From:   Alexei Starovoitov <alexei.starovoitov@...il.com>
To:     Jakub Kicinski <kuba@...nel.org>
Cc:     John Fastabend <john.fastabend@...il.com>,
        Toke Høiland-Jørgensen <toke@...hat.com>,
        Andrii Nakryiko <andrii.nakryiko@...il.com>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Martin KaFai Lau <kafai@...com>,
        Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
        Andrii Nakryiko <andriin@...com>,
        "David S. Miller" <davem@...emloft.net>,
        Jesper Dangaard Brouer <brouer@...hat.com>,
        Lorenz Bauer <lmb@...udflare.com>,
        Andrey Ignatov <rdna@...com>,
        Networking <netdev@...r.kernel.org>, bpf <bpf@...r.kernel.org>
Subject: Re: [PATCH bpf-next 1/4] xdp: Support specifying expected existing
 program when attaching XDP

On Wed, Mar 25, 2020 at 11:20:05AM -0700, Jakub Kicinski wrote:
> On Wed, 25 Mar 2020 11:06:38 -0700 Alexei Starovoitov wrote:
> > On Tue, Mar 24, 2020 at 07:15:54PM -0700, Jakub Kicinski wrote:
> > > It is the way to configure XDP today, so it's only natural to
> > > scrutinize the attempts to replace it.   
> > 
> > No one is replacing it.
> 
> You're blocking extensions to the existing API, that means that part 
> of the API is frozen and is being replaced.

two things are wrong in the above stmt:
1. extensions are not frozen in general.
2. api is not being replaced. ownership is lacking. It needs to be added.
   It's a new concept. Not a replacement.

> > > Also I personally don't think you'd see this much push back trying to
> > > add bpf_link-based stuff to cls_bpf, that's an add-on. XDP is
> > > integrated very fundamentally with the networking stack at this point.
> > >   
> > > > Details are important and every case is different. So imo:
> > > > converting ethtool to netlink - great stuff.
> > > > converting netdev irq/queue management to netlink - great stuff too.
> > > > adding more netlink api for xdp - really bad idea.  
> > > 
> > > Why is it a bad idea?  
> > 
> > I explained in three other emails. tldr: lack of ownership.
> 
> Those came later, I think, thanks.
> 
> Fine, maybe one day someone will find the extension you're proposing
> useful. To me that's not a justification to freeze the existing API
> (you said "adding more netlink api for xdp - really bad idea").
> 
> Besides, if you look at Toke's libxdp work (which exists), what's the
> ownership of the attached program? Whichever application touched it
> last?
> 
> The whole auto-detachment thing may work nicely in cls_bpf and
> sub-programs attached to the root XDP program, but it's a bit hard 
> to imagine how its useful for the singleton root XDP program.

bpf_link introduces two new things: 1. ownership 2. auto-detach
They are both useful. Looks like the use case for 2 is obvious, but
1 can exist without being FD based.

> 
> > > There are plenty things which will only be available over netlink.
> > > Configuring the interface so installing the XDP program is possible
> > > (disabling features, configuring queues etc.). Chances are user gets
> > > the ifindex of the interface to attach to over netlink in the first
> > > place. The queue configuration (which you agree belongs in netlink)
> > > will definitely get more complex to allow REDIRECTs to work more
> > > smoothly. AF_XDP needs all sort of netlink stuff.  
> > 
> > sure. that has nothing to do with ownership of attachment.
> 
> AFAICT the allure to John is the uniform API, and no need for netlink.
> I was explaining how that's a bad goal to have.

You clearly misunderstood. Neither John nor I were saying that there is
no need for netlink.

> 
> > > Netlink gives us the notification mechanism which is how we solve
> > > coordination across daemons (something that BPF subsystem is only 
> > > now trying to solve).  
> > 
> > I don't care about notifications on attachment and no one is trying to
> > solve that as far as I can see. It's not a problem to solve in the first place.
> 
> Well, it's the existing solution to the "ownership" problem.
> I think most people simply didn't know about it.

Toke's set introduces the same thing to XDP as
commit 7dd68b3279f1 ("bpf: Support replacing cgroup-bpf program in MULTI mode")
did for cgroup-bpf.
Both are trying to address the same issue and both are NOT doing.
That cgroup-bpf commit looked like a great solution just three month ago.
Now it's clear it's not fixing the underlying issue.
Same thing with Toke's fix. It feels good now, but going to be uselss
without introducing ownership.

Why that cgroup-bpf commit not fixing it?
Take a look at that commit. The first paragraph is
"
The common use-case in production is to have multiple cgroup-bpf
programs per attach type that cover multiple use-cases. Such programs
are attached with BPF_F_ALLOW_MULTI and can be maintained by different
people.
"
Then the description goes into explaining how one service wants to replace its prog.
In this case it sort of works because it's single c++ service with multiple
progs that do different things. There is a 'centralized daemon' (kinda) that
can try to orchestrate. It breaks when there are two c++ services.
That replace_bpf_fd is trying to be a link identifier. But the kernel lacks
that identifier.
I think it would be simpler to understand the ownership if bpf_link had
its own IDR for every link. Every attachment(link) would be an object with its
own id. We could have iterated over all attachments with GET_NEXT_ID, for example.
But that's nice to have. Not strictly necessary.
The ownership of the attachment needs to be permanent. It needs to belong
to a task and other tasks should not be able to break that attachment.
That cgroup-bpf commit addressing part of the issue by "inventing" an identifier
for the attachment (in the form of prog_fd that suppose to be there in that
attachment), but not addressing the owner part of the attachment.
Only the task(s) that own that attachment should be able to modify the attachment.

One can imagine how attachment ID can be completely implemented with netlink.
Is it good idea? Not really, because there is no mechanism to transfer the ownership.
Having an FD that points to a kernel object that represents the ownership makes it
easy for user space to pass the ownership (by passing an FD).
Auto-detach part comes for free with FD based bpf_link, but that's not the main feature.
May be we will add a flag to disable auto-detach too.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ