Message-ID: <87jzpso53a.fsf@nvidia.com>
Date: Tue, 5 Dec 2023 20:34:03 +0200
From: Vlad Buslov <vladbu@...dia.com>
To: Pedro Tammela <pctammela@...atatu.com>
CC: <netdev@...r.kernel.org>, <davem@...emloft.net>, <edumazet@...gle.com>,
<kuba@...nel.org>, <pabeni@...hat.com>, <jhs@...atatu.com>,
<xiyou.wangcong@...il.com>, <jiri@...nulli.us>, <marcelo.leitner@...il.com>
Subject: Re: [PATCH net-next 1/2] net/sched: act_api: rely on rcu in
tcf_idr_check_alloc
On Tue 05 Dec 2023 at 12:30, Pedro Tammela <pctammela@...atatu.com> wrote:
> Instead of relying only on the idrinfo->lock mutex for
> bind/alloc logic, rely on a combination of rcu + mutex + atomics
> to better scale the case where multiple rtnl-less filters are
> binding to the same action object.
>
> Action binding happens when an action index is specified explicitly and
> an action exists which such index exists. Example:
Nit: the first sentence looks mangled: there's an extra 'exists' and
'which' should probably be 'with'.
> tc actions add action drop index 1
> tc filter add ... matchall action drop index 1
> tc filter add ... matchall action drop index 1
> tc filter add ... matchall action drop index 1
> tc filter ls ...
> filter protocol all pref 49150 matchall chain 0 filter protocol all pref 49150 matchall chain 0 handle 0x1
> not_in_hw
> action order 1: gact action drop
> random type none pass val 0
> index 1 ref 4 bind 3
>
> filter protocol all pref 49151 matchall chain 0 filter protocol all pref 49151 matchall chain 0 handle 0x1
> not_in_hw
> action order 1: gact action drop
> random type none pass val 0
> index 1 ref 4 bind 3
>
> filter protocol all pref 49152 matchall chain 0 filter protocol all pref 49152 matchall chain 0 handle 0x1
> not_in_hw
> action order 1: gact action drop
> random type none pass val 0
> index 1 ref 4 bind 3
>
> When no index is specified, as before, grab the mutex and allocate
> in the idr the next available id. In this version, as opposed to before,
> it's simplified to store the -EBUSY pointer instead of the previous
> alloc + replace combination.
>
> When an index is specified, rely on rcu to find if there's an object in
> such index. If there's none, fallback to the above, serializing on the
> mutex and reserving the specified id. If there's one, it can be an -EBUSY
> pointer, in which case we just try again until it's an action, or an action.
> Given the rcu guarantees, the action found could be dead and therefore
> we need to bump the refcount if it's not 0, handling the case it's
> in fact 0.
>
> As bind and the action refcount are already atomics, these increments can
> happen without the mutex protection while many tcf_idr_check_alloc race
> to bind to the same action instance.
>
> Signed-off-by: Pedro Tammela <pctammela@...atatu.com>
> ---
> net/sched/act_api.c | 56 +++++++++++++++++++++++++++------------------
> 1 file changed, 34 insertions(+), 22 deletions(-)
>
> diff --git a/net/sched/act_api.c b/net/sched/act_api.c
> index abec5c45b5a4..79a044d2ae02 100644
> --- a/net/sched/act_api.c
> +++ b/net/sched/act_api.c
> @@ -824,43 +824,55 @@ int tcf_idr_check_alloc(struct tc_action_net *tn, u32 *index,
> struct tcf_idrinfo *idrinfo = tn->idrinfo;
> struct tc_action *p;
> int ret;
> + u32 max;
>
> -again:
> - mutex_lock(&idrinfo->lock);
> if (*index) {
> +again:
> + rcu_read_lock();
> p = idr_find(&idrinfo->action_idr, *index);
> +
> if (IS_ERR(p)) {
> /* This means that another process allocated
> * index but did not assign the pointer yet.
> */
> - mutex_unlock(&idrinfo->lock);
> + rcu_read_unlock();
> goto again;
> }
>
> - if (p) {
> - refcount_inc(&p->tcfa_refcnt);
> - if (bind)
> - atomic_inc(&p->tcfa_bindcnt);
> - *a = p;
> - ret = 1;
> - } else {
> - *a = NULL;
> - ret = idr_alloc_u32(&idrinfo->action_idr, NULL, index,
> - *index, GFP_KERNEL);
> - if (!ret)
> - idr_replace(&idrinfo->action_idr,
> - ERR_PTR(-EBUSY), *index);
> + if (!p) {
> + /* Empty slot, try to allocate it */
> + max = *index;
> + rcu_read_unlock();
> + goto new;
> + }
> +
> + if (!refcount_inc_not_zero(&p->tcfa_refcnt)) {
> + /* Action was deleted in parallel */
> + rcu_read_unlock();
> + return -ENOENT;
Current version doesn't return ENOENT since it is synchronous. You are
now basically introducing a UAPI change, since users of this function
(individual actions) are not prepared to retry on ENOENT and will
propagate the error up the call chain. I guess you need to try to
create a new action with the specified index instead.
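Something along these lines, maybe (completely untested, just to
illustrate falling back to allocation like the empty-slot case above):

	if (!refcount_inc_not_zero(&p->tcfa_refcnt)) {
		/* Action was deleted in parallel, fall back to
		 * allocating the specified index ourselves instead
		 * of returning an error to the caller.
		 */
		max = *index;
		rcu_read_unlock();
		goto new;
	}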
> }
> +
> + if (bind)
> + atomic_inc(&p->tcfa_bindcnt);
> + *a = p;
> +
> + rcu_read_unlock();
> +
> + return 1;
> } else {
> + /* Find a slot */
> *index = 1;
> - *a = NULL;
> - ret = idr_alloc_u32(&idrinfo->action_idr, NULL, index,
> - UINT_MAX, GFP_KERNEL);
> - if (!ret)
> - idr_replace(&idrinfo->action_idr, ERR_PTR(-EBUSY),
> - *index);
> + max = UINT_MAX;
> }
> +
> +new:
> + *a = NULL;
> +
> + mutex_lock(&idrinfo->lock);
> + ret = idr_alloc_u32(&idrinfo->action_idr, ERR_PTR(-EBUSY), index, max,
> + GFP_KERNEL);
What if multiple concurrent tasks don't find the action by index with
rcu and all get here, serializing on the idrinfo->lock? It looks like
after the one that got the lock first successfully allocates the index,
everyone else will fail (also propagating ENOSPC to the user). I guess
you need some mechanism to account for such a case and retry.
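Maybe something like this (again untested, using max == *index to tell
the user-specified-index case apart from the "find a slot" case):

	mutex_lock(&idrinfo->lock);
	ret = idr_alloc_u32(&idrinfo->action_idr, ERR_PTR(-EBUSY), index, max,
			    GFP_KERNEL);
	mutex_unlock(&idrinfo->lock);

	/* Another task allocated the requested index first; go back
	 * and try to bind to the action it is installing.
	 */
	if (ret == -ENOSPC && max == *index)
		goto again;

	return ret;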
> mutex_unlock(&idrinfo->lock);
> +
> return ret;
> }
> EXPORT_SYMBOL(tcf_idr_check_alloc);