Message-ID: <CAK+XE==iPdOq65FK1_MvTr7x6=dRQDYe7kASLDEkux+D5zUh+g@mail.gmail.com>
Date: Fri, 1 Nov 2019 14:54:49 +0000
From: John Hurley <john.hurley@...ronome.com>
To: Vlad Buslov <vladbu@...lanox.com>
Cc: Jiri Pirko <jiri@...lanox.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"simon.horman@...ronome.com" <simon.horman@...ronome.com>,
"jakub.kicinski@...ronome.com" <jakub.kicinski@...ronome.com>,
"louis.peens@...ronome.com" <louis.peens@...ronome.com>,
"oss-drivers@...ronome.com" <oss-drivers@...ronome.com>
Subject: Re: [RFC net-next v2 1/1] net: sched: prevent duplicate flower rules
from tcf_proto destroy race
On Fri, Nov 1, 2019 at 1:01 PM Vlad Buslov <vladbu@...lanox.com> wrote:
>
>
> On Thu 31 Oct 2019 at 21:08, John Hurley <john.hurley@...ronome.com> wrote:
> > When a new filter is added to cls_api, the function
> > tcf_chain_tp_insert_unique() looks up the protocol/priority/chain to
> > determine if the tcf_proto is duplicated in the chain's hashtable. It then
> > creates a new entry or continues with an existing one. In cls_flower, this
> > allows the function fl_ht_insert_unique to determine if a filter is a
> > duplicate and reject appropriately, meaning that the duplicate will not be
> > passed to drivers via the offload hooks. However, when a tcf_proto is
> > destroyed it is removed from its chain before a hardware remove hook is
> > hit. This can lead to a race whereby the driver has not received the
> > remove message but duplicate flows can be accepted. This, in turn, can
> > lead to the offload driver receiving incorrect duplicate flows and out of
> > order add/delete messages.
> >
> > Prevent duplicates by utilising an approach suggested by Vlad Buslov. A
> > hash table per block stores each unique chain/protocol/prio being
> > destroyed. This entry is only removed when the full destroy (and hardware
> > offload) has completed. If a new flow is being added with the same
> > identifiers as a tcf_proto being destroyed, then the add request is replayed
> > until the destroy is complete.
> >
> > v1->v2:
> > - put tcf_proto into block->proto_destroy_ht rather than creating new key
> > (Vlad Buslov)
> > - add error log for cases where hash entry insertion fails. Previously it
> > failed silently with no indication. (Vlad Buslov)
> >
> > Fixes: 8b64678e0af8 ("net: sched: refactor tp insert/delete for concurrent execution")
> > Signed-off-by: John Hurley <john.hurley@...ronome.com>
> > Reviewed-by: Simon Horman <simon.horman@...ronome.com>
> > Reported-by: Louis Peens <louis.peens@...ronome.com>
> > ---
>
> Hi John,
>
> Patch looks good! However, I think we can simplify it even more and
> remove duplication of data in tcf_proto (hashtable key contains copy of
> data that is already available in the struct itself) and remove all
> dynamic memory allocations. I have refactored your patch accordingly
> (attached). WDYT?
>
Hi Vlad,
Thanks for the review/modifications.
The changes look good to me, but I'll fire them through our test setup to be sure.
The only thing I'm a bit concerned about here is the size of the
static allocation.
We are now defining quite a large hash table per block (65536 buckets).
It's hard to know exactly how many elements will be in it at any one time.
With flushes of large chains it may well become quite full, but I'd
expect it to be empty the majority of the time.
A resizable hash table seems a bit more appropriate here.
Do you have any opinions on this?
John
> [...]
>