Message-ID: <vbfd0eby6qv.fsf@mellanox.com>
Date: Fri, 1 Nov 2019 15:08:12 +0000
From: Vlad Buslov <vladbu@...lanox.com>
To: John Hurley <john.hurley@...ronome.com>
CC: Vlad Buslov <vladbu@...lanox.com>, Jiri Pirko <jiri@...lanox.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"simon.horman@...ronome.com" <simon.horman@...ronome.com>,
"jakub.kicinski@...ronome.com" <jakub.kicinski@...ronome.com>,
"louis.peens@...ronome.com" <louis.peens@...ronome.com>,
"oss-drivers@...ronome.com" <oss-drivers@...ronome.com>
Subject: Re: [RFC net-next v2 1/1] net: sched: prevent duplicate flower rules
from tcf_proto destroy race
On Fri 01 Nov 2019 at 16:54, John Hurley <john.hurley@...ronome.com> wrote:
> On Fri, Nov 1, 2019 at 1:01 PM Vlad Buslov <vladbu@...lanox.com> wrote:
>>
>>
>> On Thu 31 Oct 2019 at 21:08, John Hurley <john.hurley@...ronome.com> wrote:
>> > When a new filter is added to cls_api, the function
>> > tcf_chain_tp_insert_unique() looks up the protocol/priority/chain to
>> > determine if the tcf_proto is duplicated in the chain's hashtable. It then
>> > creates a new entry or continues with an existing one. In cls_flower, this
>> > allows the function fl_ht_insert_unique to determine if a filter is a
>> > duplicate and reject appropriately, meaning that the duplicate will not be
>> > passed to drivers via the offload hooks. However, when a tcf_proto is
>> > destroyed it is removed from its chain before a hardware remove hook is
>> > hit. This can lead to a race whereby the driver has not yet received the
>> > remove message when a duplicate flow is accepted. This, in turn, can
>> > lead to the offload driver receiving incorrect duplicate flows and
>> > out-of-order add/delete messages.
>> >
>> > Prevent duplicates by utilising an approach suggested by Vlad Buslov. A
>> > hash table per block stores each unique chain/protocol/prio being
>> > destroyed. This entry is only removed when the full destroy (and hardware
>> > offload) has completed. If a new flow is being added with the same
>> > identifiers as a tcf_proto being destroyed, then the add request is replayed
>> > until the destroy is complete.
>> >
>> > v1->v2:
>> > - put tcf_proto into block->proto_destroy_ht rather than creating new key
>> >   (Vlad Buslov)
>> > - add error log for cases when hash entry insertion fails. Previously it
>> >   failed silently with no indication. (Vlad Buslov)
>> >
>> > Fixes: 8b64678e0af8 ("net: sched: refactor tp insert/delete for concurrent execution")
>> > Signed-off-by: John Hurley <john.hurley@...ronome.com>
>> > Reviewed-by: Simon Horman <simon.horman@...ronome.com>
>> > Reported-by: Louis Peens <louis.peens@...ronome.com>
>> > ---
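
To spell out the ordering for anyone skimming the thread: (1) the tp is
removed from the chain's hashtable, (2) a concurrent add of an identical
flow no longer finds it, so the duplicate is accepted and offloaded,
(3) the hardware remove hook for the old tp finally runs, and the driver
sees a duplicate add followed by a late delete. The fix amounts to
roughly the following (an illustrative sketch only; the helper names are
made up, and locking, allocation and the hash_del() on destroy
completion are omitted, so don't read it as the actual patch):

#include <linux/hashtable.h>
#include <linux/jhash.h>

#define PROTO_DESTROY_HASH_BITS 8	/* 2^8 = 256 buckets */

/* Identity of a tcf_proto whose destroy (including the hardware
 * offload remove) has not completed yet.
 */
struct tp_destroy_entry {
	u32 chain_index;
	u32 prio;
	__be16 protocol;
	struct hlist_node hnode;
};

/* Assuming the refactored version embeds a static table in
 * struct tcf_block:
 * DECLARE_HASHTABLE(proto_destroy_ht, PROTO_DESTROY_HASH_BITS);
 */

static u32 tp_destroy_hash(u32 chain_index, u32 prio, __be16 protocol)
{
	return jhash_3words(chain_index, prio, (__force u32)protocol, 0);
}

/* Called when tp destruction starts: remember the identity until the
 * hardware remove hook has completed.
 */
static void tp_destroy_track(struct tcf_block *block,
			     struct tp_destroy_entry *e)
{
	hash_add(block->proto_destroy_ht, &e->hnode,
		 tp_destroy_hash(e->chain_index, e->prio, e->protocol));
}

/* Called on filter add: if a tp with the same identity is still being
 * destroyed, tell the caller to replay the request later.
 */
static int tp_destroy_check(struct tcf_block *block, u32 chain_index,
			    u32 prio, __be16 protocol)
{
	struct tp_destroy_entry *e;
	u32 h = tp_destroy_hash(chain_index, prio, protocol);

	hash_for_each_possible(block->proto_destroy_ht, e, hnode, h) {
		if (e->chain_index == chain_index && e->prio == prio &&
		    e->protocol == protocol)
			return -EAGAIN;	/* replay until destroy completes */
	}
	return 0;
}
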
>>
>> Hi John,
>>
>> Patch looks good! However, I think we can simplify it even more, remove
>> the duplication of data in tcf_proto (the hashtable key contains a copy
>> of data that is already available in the struct itself) and remove all
>> dynamic memory allocations. I have refactored your patch accordingly
>> (attached). WDYT?
>>
>
> Hi Vlad,
> Thanks for the review/modifications.
> The changes look good to me, but I can fire them through our test setup to be sure.
>
> The only thing I'm a bit concerned about here is the size of the
> static allocation.
> We are now defining quite a large hash table per block (65536 buckets).
> It's hard to know exactly how many elements will be in this at any one time.
> With flushes of large chains it may well become quite full, but I'd
> expect it to be empty the majority of the time.
> A resizable hash table seems a bit more appropriate here.
> Do you have any opinions on this?
>
> John
Yeah, I agree that 65536 buckets is quite an overkill for this. I think
cls API assumes that there are not too many tp instances, because they
are attached to the chain through a regular linked list and each lookup
is a linear search. Given that, I assume the proto_destroy_ht size can
be safely reduced to 256 buckets, if not fewer. A resizable table is an
option, but that also sounds like overkill to me, since the linear
search over the list of chains attached to a block, or over the list of
tp's attached to a chain, will be the main bottleneck if a user creates
hundreds of thousands of proto instances.
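
For reference on the memory side: with the kernel's static hashtable
each bucket is a single struct hlist_head (one pointer), so the table
costs 8 * 2^bits bytes per block on 64-bit. That is 512KiB at 16 bits
(the 65536 buckets you mention) versus 2KiB at 8 bits. Assuming the
current version declares the table with 16 bits, shrinking it is a
one-line change, e.g.:

-	DECLARE_HASHTABLE(proto_destroy_ht, 16);
+	DECLARE_HASHTABLE(proto_destroy_ht, 8);	/* 256 buckets */
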
>
>> [...]
>>