Message-ID: <20180711034017.o2ehf27tv5hpl3td@ast-mbp.dhcp.thefacebook.com>
Date: Tue, 10 Jul 2018 20:40:19 -0700
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: Lorenzo Colitti <lorenzo@...gle.com>
Cc: Chenbo Feng <fengc@...gle.com>, dancol@...gle.com,
mathieu.desnoyers@...icios.com, Joel Fernandes <joelaf@...gle.com>,
Alexei Starovoitov <ast@...com>,
lkml <linux-kernel@...r.kernel.org>,
Tim Murray <timmurray@...gle.com>,
Daniel Borkmann <daniel@...earbox.net>, netdev@...r.kernel.org
Subject: Re: [RFC] Add BPF_SYNCHRONIZE bpf(2) command
On Wed, Jul 11, 2018 at 11:46:19AM +0900, Lorenzo Colitti wrote:
> On Wed, Jul 11, 2018 at 8:52 AM Alexei Starovoitov
> <alexei.starovoitov@...il.com> wrote:
> >
> > we need to make sure we have a detailed description of BPF_SYNC_MAP_ACCESS
> > in uapi/bpf.h, since I feel the confusion regarding its usage is starting already.
> > This new cmd will only make sense for map-in-map type of maps.
> > Expecting that BPF_SYNC_MAP_ACCESS somehow implies the end of
> > the program or does some other map synchronization is not correct.
> > Commit log of this patch got it right:
> > """
> > For example, userspace can update a map->map entry to point to a new map,
> > use BPF_SYNCHRONIZE to wait for any BPF programs using the old map to
> > complete, and then drain the old map without fear that BPF programs
> > may still be updating it.
> > """
>
> +1 for detailed documentation. For example, consider what happens if
> we have two map fds, one active and one standby, and a map-in-map with
> one element that contains a pointer to the currently-active map fd.
yes. that's exactly the setup that folks use.
> The kernel program might do:
>
> =====
> const int current_map_key = 1;
> void *current_map = bpf_map_lookup_elem(outer_map, &current_map_key);
>
> int stats_key = 42;
> uint64_t *stats_value = bpf_map_lookup_elem(current_map, &stats_key);
> __sync_fetch_and_add(stats_value, 1);
> =====
>
> If userspace does:
>
> 1. Write new fd to outer_map[1].
> 2. Call BPF_SYNC_MAP_ACCESS.
> 3. Start deleting everything in the old map.
>
> How can we guarantee that the __sync_fetch_and_add will not add to the
> old map?
without any changes to the kernel, sys_membarrier will work.
And that's what folks use already.
BPF_SYNC_MAP_ACCESS implemented via synchronize_rcu() will work
as well, both in the current implementation where rcu_lock/unlock
is done outside of the program and in the future when
rcu_lock/unlock are called by the program itself.
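
A rough sketch of the userspace side of the three steps quoted above, using
sys_membarrier as the wait. It assumes a hash-type inner map with 4-byte
keys and goes through the raw bpf(2) syscall. The helper names (sys_bpf,
swap_inner_map, wait_for_old_readers, drain_map, swap_and_drain) are made up
for illustration and are not from the patch. Whether MEMBARRIER_CMD_SHARED
really waits for all running programs depends on how the kernel runs BPF
programs, so treat that part as an assumption too.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/bpf.h>
#include <linux/membarrier.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* glibc has no bpf() wrapper, so go through syscall(2). */
static long sys_bpf(int cmd, union bpf_attr *attr)
{
    return syscall(__NR_bpf, cmd, attr, sizeof(*attr));
}

/* Step 1: point outer_map[slot] at the new inner map.
 * For map-in-map types userspace passes the inner map's fd as the value. */
static int swap_inner_map(int outer_fd, uint32_t slot, int new_inner_fd)
{
    union bpf_attr attr;
    uint32_t value = (uint32_t)new_inner_fd;

    memset(&attr, 0, sizeof(attr));
    attr.map_fd = (uint32_t)outer_fd;
    attr.key = (uint64_t)(uintptr_t)&slot;
    attr.value = (uint64_t)(uintptr_t)&value;
    attr.flags = BPF_ANY;
    return sys_bpf(BPF_MAP_UPDATE_ELEM, &attr) < 0 ? -1 : 0;
}

/* Step 2: wait for programs that may still hold the old inner map.
 * Assumption: a system-wide membarrier covers running BPF programs. */
static int wait_for_old_readers(void)
{
    return syscall(__NR_membarrier, MEMBARRIER_CMD_SHARED, 0, 0) < 0 ? -1 : 0;
}

/* Step 3: delete every entry of the old map (4-byte keys assumed).
 * A real consumer would read each value before deleting it. */
static int drain_map(int map_fd)
{
    union bpf_attr attr;
    uint32_t key;

    for (;;) {
        memset(&attr, 0, sizeof(attr));
        attr.map_fd = (uint32_t)map_fd;
        attr.key = 0; /* NULL key: ask for the first key in the map */
        attr.next_key = (uint64_t)(uintptr_t)&key;
        if (sys_bpf(BPF_MAP_GET_NEXT_KEY, &attr) < 0)
            return errno == ENOENT ? 0 : -1; /* ENOENT: map is empty */

        memset(&attr, 0, sizeof(attr));
        attr.map_fd = (uint32_t)map_fd;
        attr.key = (uint64_t)(uintptr_t)&key;
        if (sys_bpf(BPF_MAP_DELETE_ELEM, &attr) < 0)
            return -1;
    }
}

/* Steps 1-3 in order; returns 0 on success, -1 on the first failure. */
int swap_and_drain(int outer_fd, uint32_t slot, int new_inner_fd,
                   int old_inner_fd)
{
    if (swap_inner_map(outer_fd, slot, new_inner_fd) < 0)
        return -1;
    if (wait_for_old_readers() < 0)
        return -1;
    return drain_map(old_inner_fd);
}
```

The ordering is the whole point: the drain must not start until the wait in
step 2 has returned, otherwise a program that looked up the old inner map
before the swap could still be adding to it.
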
> Will the verifier automatically
> hold the RCU lock for as long as a pointer to an inner map is valid?
the verifier will guarantee that future explicit
lock/unlock by the program is equivalent to the current situation of
implicit lock/unlock by the kernel.
The verifier will track that bpf_map_lookup_elem() is done
after rcu_lock and that the value returned by this helper is
not accessed after rcu_unlock. Baby steps of dataflow analysis.
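
To make the rule concrete, here is a toy model of that check over a straight
line of operations. It is only an illustration: the enum, the function name
check_rcu_usage, and the no-nesting rule are invented here, and the real
verifier would have to track this state across branches and registers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical operations a program could perform, in program order. */
enum op {
    OP_RCU_LOCK,
    OP_RCU_UNLOCK,
    OP_MAP_LOOKUP,   /* bpf_map_lookup_elem() */
    OP_DEREF_VALUE   /* access through the pointer the lookup returned */
};

/* Returns true if every lookup happens under the lock and no looked-up
 * value is accessed after the matching unlock. */
bool check_rcu_usage(const enum op *prog, size_t len)
{
    bool locked = false;      /* currently between lock and unlock? */
    bool value_valid = false; /* is a lookup result still safe to use? */

    for (size_t i = 0; i < len; i++) {
        switch (prog[i]) {
        case OP_RCU_LOCK:
            if (locked)
                return false; /* toy model: no nested locking */
            locked = true;
            break;
        case OP_RCU_UNLOCK:
            if (!locked)
                return false; /* unlock without lock */
            locked = false;
            value_valid = false; /* lookup results die at unlock */
            break;
        case OP_MAP_LOOKUP:
            if (!locked)
                return false; /* lookup outside the lock */
            value_valid = true;
            break;
        case OP_DEREF_VALUE:
            if (!value_valid)
                return false; /* access after unlock or before lookup */
            break;
        }
    }
    return !locked; /* the program must not exit holding the lock */
}
```

A sequence of lock, lookup, access, unlock passes, while moving the access
after the unlock is rejected.
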