Message-ID: <20180428010439.qryq3ejdyyhtb25u@ast-mbp>
Date: Fri, 27 Apr 2018 18:04:41 -0700
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: Chenbo Feng <fengc@...gle.com>
Cc: netdev@...r.kernel.org, Daniel Borkmann <daniel@...earbox.net>,
Lorenzo Colitti <lorenzo@...gle.com>,
Joel Fernandes <joelaf@...gle.com>
Subject: Re: Suggestions on iterating eBPF maps

On Fri, Apr 27, 2018 at 06:33:56PM +0000, Chenbo Feng wrote:
> resend with plain text
>
> On Fri, Apr 27, 2018 at 11:22 AM Chenbo Feng <fengc@...gle.com> wrote:
>
> > Hi net-next,
>
> > When doing eBPF tools user-space development I noticed that the map
> > iteration process in user space has some small flaws. If we want to dump
> > the whole map, the only way I know of now is to start the iteration with
> > a null key and keep calling bpf_get_next_key and bpf_look_up_elem for
> > each new key/value pair until we reach the end of the map. I noticed that
> > the recently added bpftool uses a similar approach.
>
> > The overhead of repeated syscalls is acceptable, but the race that comes
> > with this iteration process is a little annoying. If the key we are
> > currently using gets deleted before we do the syscall to get the next
> > key, the next key returned will start from the beginning of the map
> > again, and some entries will be dumped again depending on the position of
> > the deleted key. If the race is within the same user-space process, it
> > can easily be fixed by adding some read/write locks. However, if multiple
> > processes are reading the map through a pinned fd while one process is
> > editing map entries or the kernel program is deleting entries, it becomes
> > harder to get a consistent and correct map dump.
>
> > We are wondering whether there is already an implementation in the
> > mainline kernel that we didn't notice which improves this iteration
> > process and addresses the race mentioned above. If not, what can be done
> > to address the issue? One thing we came up with is to use a single-entry
> > bpf map as a cross-process lock to prevent multiple user-space processes
> > from reading/writing other maps at the same time. But I don't know how
> > safe this solution is, since there will still be a race between reading
> > the lock map value and setting up the lock.

To avoid seeing duplicate keys caused by parallel removal, one can first
walk all keys with get_next_key, remove the duplicate keys, and only then
look up their values. By that time some elements may already have been
removed, so those lookups will fail.
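
A minimal sketch of that two-pass walk, written in Python against a toy
in-memory model rather than a real BPF map. ToyMap, dump_keys_then_values
and racer are hypothetical names used only for illustration. The one kernel
behaviour assumed is the one described in this thread: asking for the key
after a deleted key restarts from the first key.

```python
class ToyMap:
    """Toy stand-in for a BPF hash map; not a kernel or libbpf API."""

    def __init__(self, items):
        self._items = dict(items)  # dicts keep insertion order

    def get_next_key(self, key):
        keys = list(self._items)
        if key is None or key not in self._items:
            # Assumed behaviour from the thread: an unknown (for example
            # just-deleted) key restarts the walk at the first key.
            return keys[0] if keys else None
        nxt = keys.index(key) + 1
        return keys[nxt] if nxt < len(keys) else None  # None: end of map

    def lookup(self, key):
        return self._items.get(key)  # None models a failed lookup

    def delete(self, key):
        self._items.pop(key, None)


def dump_keys_then_values(m, between_calls=None):
    # Pass 1: walk keys only, dropping duplicates caused by restarts.
    keys, seen = [], set()
    key = m.get_next_key(None)
    while key is not None:
        if key not in seen:
            seen.add(key)
            keys.append(key)
        if between_calls is not None:
            between_calls(key)  # hook to simulate a concurrent writer
        key = m.get_next_key(key)
    # Pass 2: look up values; entries deleted in the meantime are skipped.
    result = {}
    for k in keys:
        value = m.lookup(k)
        if value is not None:
            result[k] = value
    return result


m = ToyMap({1: "a", 2: "b", 3: "c", 4: "d"})


def racer(key):
    # Delete key 3 right after the walker has fetched it.
    if key == 3:
        m.delete(3)


snapshot = dump_keys_then_values(m, racer)
```

Here the walk restarts once after key 3 disappears, yet each surviving key
is reported a single time and the deleted one is dropped in the second
pass. The sketch does not bound the number of restarts; a real tool would
probably want to.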

Another approach could be to use map-in-map and do an almost atomic
replacement of the whole map with a new, potentially empty, map. The prog
can continue using the new map while user space walks the old map, which
is no longer accessed.
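
The swap idea can be sketched with a toy model as well. OuterMap,
prog_count and swap_and_dump are hypothetical names, and plain Python dicts
stand in for the outer and inner maps; nothing here is a real BPF API.

```python
class OuterMap:
    """Toy stand-in for a map-in-map with a single slot; not a real API."""

    def __init__(self):
        self.slot = {}  # the inner map currently visible to the prog


def prog_count(outer, key):
    # Models one prog run: fetch the current inner map, then update it.
    inner = outer.slot
    inner[key] = inner.get(key, 0) + 1


def swap_and_dump(outer):
    # Install a fresh empty inner map with a single outer-slot update,
    # then walk the old one, which new prog runs no longer reach.
    old = outer.slot
    outer.slot = {}
    return dict(old)


outer = OuterMap()
for k in ("tcp", "udp", "tcp"):
    prog_count(outer, k)

old_stats = swap_and_dump(outer)
prog_count(outer, "udp")  # lands in the new inner map only
```

The sketch does not model the "almost": presumably a prog run that fetched
the old inner map just before the swap could still write to it for a short
time, so user space may want to wait briefly before walking the old map.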

Yet another approach would be to introduce a knob that user space controls
and make the prog obey that knob. When the knob is on, the prog won't
delete or update map entries.
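
A rough sketch of the knob, again as a toy Python model. The "freeze" flag,
prog_on_event and dump_with_knob are hypothetical; in practice the knob
could be, for example, a value in a small control map that the prog reads
on every run.

```python
def prog_on_event(stats, control, key):
    # Models one prog run that honours the user-space knob.
    if control.get("freeze"):
        return False  # knob on: leave the map untouched
    stats[key] = stats.get(key, 0) + 1
    return True


def dump_with_knob(stats, control, events_during_dump=()):
    control["freeze"] = True  # user space turns the knob on
    try:
        for key in events_during_dump:
            prog_on_event(stats, control, key)  # these runs are no-ops
        return dict(stats)
    finally:
        control["freeze"] = False  # always turn the knob back off


stats = {"a": 1}
control = {"freeze": False}
prog_on_event(stats, control, "a")
dumped = dump_with_knob(stats, control, ["a", "b"])
prog_on_event(stats, control, "b")
```

In this sketch, events that arrive while the knob is on are simply
dropped. Whether that is acceptable, or whether the prog should stash them
elsewhere, depends on the use case.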