Date:   Thu, 26 Sep 2019 14:53:47 +0200
From:   Daniel Borkmann <daniel@...earbox.net>
To:     Toke Høiland-Jørgensen <toke@...hat.com>
Cc:     netdev@...r.kernel.org, bpf@...r.kernel.org
Subject: Re: Are BPF tail calls only supposed to work with pinned maps?

Hi Toke,

On Thu, Sep 26, 2019 at 01:23:38PM +0200, Toke Høiland-Jørgensen wrote:
[...]
> While working on a prototype of the XDP chain call feature, I ran into
> some strange behaviour with tail calls: If I create a userspace program
> that loads two XDP programs, one of which tail calls the other, the tail
> call map would appear to be empty even though the userspace program
> populates it as part of the program loading.
> 
> I eventually tracked this down to this commit:
> c9da161c6517 ("bpf: fix clearing on persistent program array maps")

Correct.
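
Just for illustration, a minimal sketch of the kind of setup you describe
could look like the below. Names like jmp_table, xdp_first and xdp_second
are made up here, and the BTF-defined map syntax follows current libbpf
conventions rather than your actual prototype:

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct {
        __uint(type, BPF_MAP_TYPE_PROG_ARRAY);
        __uint(max_entries, 1);
        __uint(key_size, sizeof(__u32));
        __uint(value_size, sizeof(__u32));
} jmp_table SEC(".maps");

SEC("xdp")
int xdp_second(struct xdp_md *ctx)
{
        return XDP_PASS;
}

SEC("xdp")
int xdp_first(struct xdp_md *ctx)
{
        /* Jump to slot 0 of jmp_table; if the slot is empty (e.g. the
         * PROG_ARRAY was flushed when its last uref went away), this
         * falls through to the return below. */
        bpf_tail_call(ctx, &jmp_table, 0);
        return XDP_PASS;
}

char _license[] SEC("license") = "GPL";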

> Which clears PROG_ARRAY maps whenever the last uref to it disappears
> (which it does when my loader exits after attaching the XDP program).
> 
> This effectively means that tail calls only work if the PROG_ARRAY map
> is pinned (or the process creating it keeps running). And as far as I
> can tell, the inner_map reference in bpf_map_fd_get_ptr() doesn't bump
> the uref either, so presumably if one were to create a map-in-map
> construct with a tail call pointer in the inner map(s), each inner map
> would also need to be pinned (haven't tested this case)?

There is no map-in-map support for tail calls today.
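
A hypothetical loader along these lines (object file, map and program
names are made up, and it uses the current libbpf API) would only see
the tail call work for as long as it keeps running:

#include <bpf/libbpf.h>
#include <bpf/bpf.h>

int main(void)
{
        struct bpf_object *obj;
        int map_fd, prog_fd;
        __u32 key = 0;

        obj = bpf_object__open_file("xdp_prog.o", NULL);
        if (!obj)
                return 1;
        if (bpf_object__load(obj))
                return 1;

        /* Error handling trimmed for brevity. */
        map_fd = bpf_object__find_map_fd_by_name(obj, "jmp_table");
        prog_fd = bpf_program__fd(
                bpf_object__find_program_by_name(obj, "xdp_second"));

        /* The entry is visible as long as this process holds the map fd. */
        bpf_map_update_elem(map_fd, &key, &prog_fd, BPF_ANY);

        /* ... attach xdp_first to an interface here ... */

        /* Once the process exits without pinning jmp_table, the map's last
         * uref is dropped and the PROG_ARRAY is flushed, so the tail call
         * in xdp_first silently stops working. */
        return 0;
}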

> Is this really how things are supposed to work? From an XDP use case PoV
> this seems somewhat surprising...
> 
> Or am I missing something obvious here?

It was done this way back then in order to break up cyclic dependencies:
otherwise the programs and maps involved would never get freed, since they
reference each other and would live on in the kernel forever, potentially
consuming a large amount of resources. Orchestration tools like Cilium
therefore typically just pin the maps in bpf fs (like most other maps they
use and access from the agent side) in order to up/downgrade the agent
while keeping the BPF datapath intact.
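
As a sketch of that workaround (the path and map name are only examples,
assuming bpffs is mounted at /sys/fs/bpf):

#include <bpf/bpf.h>

/* Pinning the PROG_ARRAY keeps a reference alive after the loader exits,
 * so the tail call entries are not flushed. */
int pin_jmp_table(int map_fd)
{
        return bpf_obj_pin(map_fd, "/sys/fs/bpf/jmp_table");
}

The same can be done from the command line with bpftool map pin id <ID>
/sys/fs/bpf/jmp_table, and a later loader can pick the pinned map back up
via bpf_obj_get().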

Thanks,
Daniel
