netdev - Re: [PATCH net-next v3 4/7] netfilter: flowtable: allow updating offloaded rules asynchronously

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <87o7qoc9sg.fsf@nvidia.com>
Date:   Tue, 24 Jan 2023 15:57:17 +0200
From:   Vlad Buslov <vladbu@...dia.com>
To:     Pablo Neira Ayuso <pablo@...filter.org>
CC:     <davem@...emloft.net>, <kuba@...nel.org>, <pabeni@...hat.com>,
        <netdev@...r.kernel.org>, <netfilter-devel@...r.kernel.org>,
        <jhs@...atatu.com>, <xiyou.wangcong@...il.com>, <jiri@...nulli.us>,
        <ozsh@...dia.com>, <marcelo.leitner@...il.com>,
        <simon.horman@...igine.com>
Subject: Re: [PATCH net-next v3 4/7] netfilter: flowtable: allow updating
 offloaded rules asynchronously

On Tue 24 Jan 2023 at 11:19, Vlad Buslov <vladbu@...dia.com> wrote:
> On Tue 24 Jan 2023 at 09:41, Pablo Neira Ayuso <pablo@...filter.org> wrote:
>> Hi Vlad,
>>
>> On Tue, Jan 24, 2023 at 09:06:13AM +0200, Vlad Buslov wrote:
>>> 
>>> On Fri 20 Jan 2023 at 12:41, Pablo Neira Ayuso <pablo@...filter.org> wrote:
>>> > Hi Vlad,
>>> >
>>> > On Thu, Jan 19, 2023 at 08:51:01PM +0100, Vlad Buslov wrote:
>>> >> Following patches in series need to update flowtable rule several times
>>> >> during its lifetime in order to synchronize hardware offload with actual ct
>>> >> status. However, reusing existing 'refresh' logic in act_ct would cause
>>> >> data path to potentially schedule significant amount of spurious tasks in
>>> >> 'add' workqueue since it is executed per-packet. Instead, introduce a new
>>> >> flow 'update' flag and use it to schedule async flow refresh in flowtable
>>> >> gc which will only be executed once per gc iteration.
>>> >
>>> > So the idea is to use a NF_FLOW_HW_UPDATE which triggers the update
>>> > from the garbage collector. I understand the motivation here is to
>>> > avoid adding more work to the workqueue, by simply letting the gc
>>> > thread pick up for the update.
>>> >
>>> > I already proposed in the last year alternative approaches to improve
>>> > the workqueue logic, including cancelation of useless work. For
>>> > example, cancel a flying "add" work if "delete" just arrive and the
>>> > work is still sitting in the queue. Same approach could be use for
>>> > this update logic, ie. cancel an add UDP unidirectional or upgrade it
>>> > to bidirectional if, by the time we see traffic in both directions,
>>> > then work is still sitting in the queue.
>>> 
>>> Thanks for the suggestion. I'll try to make this work over regular
>>> workqueues without further extending the flow flags and/or putting more
>>> stuff into gc.
>>
>> Let me make a second pass to sort out thoughts on this.
>>
>> Either we use regular workqueues (without new flags) or we explore
>> fully consolidating this hardware offload workqueue infrastructure
>> around flags, ie. use flags not only for update events, but also for
>> new and delete.
>>
>> This would go more in the direction of your _UPDATE flag idea:
>>
>> - Update the hardware offload workqueue to iterate over the
>>   flowtable. The hardware offload workqueue would be "scanning" for
>>   entries in the flowtable that require some sort of update in the
>>   hardware. The flags would tell what kind of action is needed.
>>
>> - Add these flags:
>>
>> NF_FLOW_HW_NEW
>> NF_FLOW_HW_UPDATE
>> NF_FLOW_HW_DELETE
>>
>> and remove the work object (flow_offload_work) and the existing list.
>> If the workqueue finds an entry with:
>>
>> NEW|DELETE, this means this is short lived flow, not worth to waste
>> cycles to offload it.
>> NEW|UPDATE, this means this is an UDP flow that is bidirectional.
>>
>> Then, there will be no more work allocation + "flying" work objects to
>> the hardware offload workqueue. Instead, the hardware offload
>> workqueue will be iterating.
>>
>> This approach would need _DONE flags to annotate if the offload
>> updates have been applied to hardware already (similar to the
>> conntrack _DONE flags).
>>
>> (Oh well, this proposal is adding even more flags. But I think flags
>> are not the issue, but the mixture of the existing flow_offload_work
>> approach with this new _UPDATE flag and the gc changes).
>>
>> If flow_offload_work is removed, we would also need to add a:
>>
>>  struct nf_flowtable *flowtable;
>>
>> field to the flow_offload entry, which is an entry field that is
>> passed via flow_offload_work. So it is one extra field for the each
>> flow_offload entry.
>>
>> The other alternative is to use the existing nf_flow_offload_add_wq
>> with UPDATE command, which might result in more flying objects in
>> turn. I think this is what you are trying to avoid with the _UPDATE
>> flag approach.
>
> This looks interesting, but is very ambitious and will probably be a
> bigger change than this whole series. I have an idea how we can leverage
> existing 'refresh' mechanism for updating flow state that doesn't
> involve large-scale refactoring of existing offload infrastructure,
> which I would prefer to try first. WDYT?

Update: to illustrate this point I prepared V4 that uses regular refresh
to update the flow and also tries to prevent excessive wq spam or
updating flow offload to a state that is already outdated.