lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <602c19dd5b1c2_6b719208e6@john-XPS-13-9370.notmuch>
Date:   Tue, 16 Feb 2021 11:15:41 -0800
From:   John Fastabend <john.fastabend@...il.com>
To:     Toke Høiland-Jørgensen <toke@...hat.com>,
        Björn Töpel <bjorn.topel@...el.com>,
        John Fastabend <john.fastabend@...il.com>,
        Maciej Fijalkowski <maciej.fijalkowski@...el.com>,
        daniel@...earbox.net, ast@...nel.org, bpf@...r.kernel.org,
        netdev@...r.kernel.org
Cc:     andrii@...nel.org, magnus.karlsson@...el.com,
        ciara.loftus@...el.com
Subject: Re: [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link

Toke Høiland-Jørgensen wrote:
> Björn Töpel <bjorn.topel@...el.com> writes:
> 
> > On 2021-02-15 21:49, John Fastabend wrote:
> >> Maciej Fijalkowski wrote:
> >>> Currently, if there are multiple xdpsock instances running on a single
> >>> interface and in case one of the instances is terminated, the rest of
> >>> them are left in an inoperable state due to the fact of unloaded XDP
> >>> prog from interface.
> >>>
> >>> To address that, step away from setting bpf prog in favour of bpf_link.
> >>> This means that refcounting of BPF resources will be done automatically
> >>> by bpf_link itself.
> >>>
> >>> When setting up BPF resources during xsk socket creation, check whether
> >>> bpf_link for a given ifindex already exists via set of calls to
> >>> bpf_link_get_next_id -> bpf_link_get_fd_by_id -> bpf_obj_get_info_by_fd
> >>> and comparing the ifindexes from bpf_link and xsk socket.
> >>>
> >>> If there's no bpf_link yet, create one for a given XDP prog and unload
> >>> explicitly existing prog if XDP_FLAGS_UPDATE_IF_NOEXIST is not set.
> >>>
> >>> If bpf_link is already at a given ifindex and underlying program is not
> >>> AF-XDP one, bail out or update the bpf_link's prog given the presence of
> >>> XDP_FLAGS_UPDATE_IF_NOEXIST.
> >>>
> >>> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@...el.com>
> >>> ---

[...]

> >>> -	ctx->prog_fd = prog_fd;
> >>> +	link_fd = bpf_link_create(ctx->prog_fd, xsk->ctx->ifindex, BPF_XDP, &opts);
> >>> +	if (link_fd < 0) {
> >>> +		pr_warn("bpf_link_create failed: %s\n", strerror(errno));
> >>> +		return link_fd;
> >>> +	}
> >>> +
> >> 
> >> This can leave the system in a bad state where it unloaded the XDP program
> >> above, but then failed to create the link. So we should somehow fix that
> >> if possible or at minimum put a note somewhere so users can't claim they
> >> shouldn't know this.
> >> 
> >> Also related, its not good for real systems to let XDP program go missing
> >> for some period of time. I didn't check but we should make
> >> XDP_FLAGS_UPDATE_IF_NOEXIST the default if its not already.
> >>
> >
> > This is the default for XDP sockets library. The
> > "bpf_set_link_xdp_fd(...-1)" way is only when a user sets it explicitly.
> > One could maybe argue that the "force remove" would be out of scope for
> > AF_XDP; Meaning that if an XDP program is running, attached via netlink,
> > the AF_XDP library simply cannot remove it. The user would need to rely
> > on some other mechanism.
> 
> Yeah, I'd tend to agree with that. In general, I think the proliferation
> of "just force-remove (or override) the running program" in code and
> instructions has been a mistake; and application should only really be
> adding and removing its own program...
> 
> -Toke
> 

I'll try to consolidate some of my opinions from a couple threads here. It
looks to me many of these issues are self-inflicted by the API. We built
the API with out the right abstractions or at least different abstractions
from the rest of the BPF APIs. Not too surprising seeing the kernel side
and user side were all being built up at once.

For example this specific issue where the xsk API also deletes the XDP
program seems to be due to merging the xsk with the creation of the XDP
programs.

I'm not a real user of AF_XDP (yet.), but here is how I would expect it
to work based on how the sockmap pieces work, which are somewhat similar
given they also deal with sockets.

Program 
(1) load and pin an XDP BPF program
    - obj = bpf_object__open(prog);
    - bpf_object__load_xattr(&attr);
    - bpf_program__pin()
(2) pin the map, find map_xsk using any of the map APIs
    - bpf_map__pin(map_xsk, path_to_pin)
(3) attach to XDP
    - link = bpf_program__attach_xdp()
    - bpf_link__pin()

At this point you have a BPF program loaded, a xsk map, and a link all
pinned and ready. And we can add socks using the process found in
`enter_xsks_into_map` in the sample. This can be the same program that
loaded/pinned the XDP program or some other program it doesn't really
matter.

 - create xsk fd
      . xsk_umem__create()
      . xsk_socket__create
 - open map @ pinned path
 - bpf_map_update_elem(xsks_map, &key, &fd, 0);

Then it looks like we don't have any conflicts? The XDP program is pinned
and exists in its normal scope. The xsk elements can be added/deleted
as normal. If the XDP program is removed and the map referencing (using
normal ref rules) reaches zero its also deleted. Above is more or less
the same flow we use for any BPF program so looks good to me.

The trouble seems to pop up when using the higher abstraction APIs
xsk_setup_xdp_prog and friends I guess? I just see above as already
fairly easy to use and we have good helpers to create the sockets it looks
like. Maybe I missed some design considerations. IMO higher level
abstractions should go in new libxdp and above should stay in libbpf.

/rant off ;)

Thanks,
John

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ