linux-kernel - Re: [PATCH v3] Add BPF_SYNCHRONIZE_MAP_TO_MAP

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAKOZueuACjK0W4byW7omPvEJ-H-8MK1OU3ORrAfNtsEi8NxoPQ@mail.gmail.com>
Date:   Tue, 14 Aug 2018 13:37:12 -0700
From:   Daniel Colascione <dancol@...gle.com>
To:     Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc:     Daniel Borkmann <daniel@...earbox.net>,
        Jakub Kicinski <jakub.kicinski@...ronome.com>,
        Joel Fernandes <joelaf@...gle.com>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Tim Murray <timmurray@...gle.com>,
        netdev <netdev@...r.kernel.org>,
        Lorenzo Colitti <lorenzo@...gle.com>,
        Chenbo Feng <fengc@...gle.com>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        Alexei Starovoitov <ast@...com>
Subject: Re: [PATCH v3] Add BPF_SYNCHRONIZE_MAP_TO_MAP_REFERENCES bpf(2) command

On Fri, Aug 10, 2018 at 3:52 PM, Alexei Starovoitov
<alexei.starovoitov@...il.com> wrote:
> On Tue, Jul 31, 2018 at 02:36:39AM -0700, Daniel Colascione wrote:
>>
>> > An API command name
>> > such as BPF_SYNCHRONIZE_MAP_TO_MAP_REFERENCES is simply non-generic, and
>> > exposes specific map details (here: map-in-map) into the UAPI whereas it
>> > should reside within a specific implementation instead similar to other ops
>> > we have for maps.
>>
>> But synchronize isn't conceptually a command that applies to a
>> specific map. It waits on all references. Did you address my point
>> about your proposed map-specific interface requiring redundant
>> synchronize_rcu calls in the case where we swap multiple maps and want
>> to wait for all the references to drain? Under my proposal, you'd just
>> BPF_SYNCHRONIZE_WHATEVER and call schedule_rcu once. Under your
>> proposal, we'd make it a per-map operation, so we'd issue one
>> synchronize_rcu per map.
>
> optimizing for multi-map sync sounds like premature optimization.

Maybe, but the per-map proposal is less efficient *and* more
complicated! I don't want to spend more code just to go slower.

> I believe the only issue being discussed is user space doesn't know
> when it's ok to start draining the inner map when it was replaced
> by bpf_map_update syscall command with another map, right?

Yes.

> If we agree on that, should bpf_map_update handle it then?
> Wouldn't it be much easier to understand and use from user pov?
> No new commands to learn. map_update syscall replaced the map
> and old map is no longer accessed by the program via this given map-in-map.

Maybe with a new BPF_SYNCHRONIZE flag for BPF_MAP_UPDATE_ELEM and
BPF_MAP_DELETE_ELEM. Otherwise, it seems wrong to make every user of
these commands pay for synchronization that only a few will need.

> But if the replaced map is used directly or it sits in some other
> map-in-map slot the progs can still access it.
>
> My issue with DanielC SYNC cmd that it exposes implementation details

What implementation details? The command semantics are defined
entirely in terms of existing user-visible primitives.

> and introduces complex 'synchronization' semantics. To majority of
> the users it won't be obvious what is being synchronized.
>
> My issue with DanielB WAIT_REF map_fd cmd that it needs to wait for all refs
> to this map to be dropped. I think combination of usercnt and refcnt
> can answer that, but feels dangerous to sleep potentially forever
> in a syscall until all prog->map references are gone, though such
> cmd is useful beyond map-in-map use case.

In what scenarios?

In any case, can we submit _something_?