netdev - Re: [RFC bpf-next 0/3] tools: bpftool: add subcommand to count map entries

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ab11a9f2-0fbd-d35f-fee1-784554a2705a@netronome.com>
Date:   Wed, 14 Aug 2019 10:42:05 +0100
From:   Quentin Monnet <quentin.monnet@...ronome.com>
To:     Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc:     Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>, bpf@...r.kernel.org,
        netdev@...r.kernel.org, oss-drivers@...ronome.com
Subject: Re: [RFC bpf-next 0/3] tools: bpftool: add subcommand to count map
 entries

2019-08-13 18:51 UTC-0700 ~ Alexei Starovoitov
<alexei.starovoitov@...il.com>
> On Tue, Aug 13, 2019 at 02:09:18PM +0100, Quentin Monnet wrote:
>> This series adds a "bpftool map count" subcommand to count the number of
>> entries present in a BPF map. This results from a customer request for a
>> tool to count the number of entries in BPF maps used in production (for
>> example, to know how many free entries are left in a given map).
>>
>> The first two commits actually contain some clean-up in preparation for the
>> new subcommand.
>>
>> The third commit adds the new subcommand. Because what data should count as
>> an entry is not entirely clear for all map types, we actually dump several
>> counters, and leave it to the users to interpret the values.
>>
>> Sending as a RFC because I'm looking for feedback on the approach. Is
>> printing several values the good thing to do? Also, note that some map
>> types such as queue/stack maps do not support any type of counting, this
>> would need to be implemented in the kernel I believe.
>>
>> More generally, we have a use case where (hash) maps are under pressure
>> (many additions/deletions from the BPF program), and counting the entries
>> by iterating other the different keys is not at all reliable. Would that
>> make sense to add a new bpf() subcommand to count the entries on the kernel
>> side instead of cycling over the entries in bpftool? If so, we would need
>> to agree on what makes an entry for each kind of map.
> 
> I don't mind new bpftool sub-command, but against adding kernel interface.
> Can you elaborate what is the actual use case?

Hi Alexei, thanks for your feedback.

The use case is a network processing application (close to a NAT), where
a hash map is used to keep track of flows, many of them being
short-lived. The BPF program spends a good chunk of time adding and
deleting entries to/from the map. The overall size (number of entries)
increases slowly, and when it grows past a certain threshold some action
must be taken (some flows are deleted from user space, possibly copied
to another map or whatever) to ensure we still have some room for new
incoming flows.

> The same can be achieved by 'bpftool map dump|grep key|wc -l', no?

To some extent (with subtleties for some other map types); and we use a
similar command line as a workaround for now. But because of the rate of
inserts/deletes in the map, the process often reports a number higher
than the max number of entries (we observed up to ~750k when max_entries
is 500k), even is the map is only half-full on average during the count.
On the worst case (though not frequent), an entry is deleted just before
we get the next key from it, and iteration starts all over again. This
is not reliable to determine how much space is left in the map.

I cannot see a solution that would provide a more accurate count from
user space, when the map is under pressure?

> 
>> Note that we are also facing similar issues for purging map from their
>> entries (deleting all entries at once). We can iterate on the keys and
>> delete elements one by one, but this is very inefficient when entries are
>> being added/removed in parallel from the BPF program, and having another
>> dedicated command accessible from the bpf() system call might help here as
>> well.
> 
> I think that fits into the batch processing of map commands discussion.
> 

This is also what we do at the moment, but we hit similar limitations
when iterating over the keys.

Thanks,
Quentin