[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <ygnhv98z9pfj.fsf@nvidia.com>
Date: Tue, 6 Apr 2021 22:35:28 +0300
From: Vlad Buslov <vladbu@...dia.com>
To: Cong Wang <xiyou.wangcong@...il.com>
CC: Linux Kernel Network Developers <netdev@...r.kernel.org>,
"Kumar Kartikeya Dwivedi" <memxor@...il.com>,
David Miller <davem@...emloft.net>,
Jamal Hadi Salim <jhs@...atatu.com>,
Jiri Pirko <jiri@...nulli.us>,
"Jakub Kicinski" <kuba@...nel.org>,
Toke Høiland-Jørgensen <toke@...hat.com>
Subject: Re: [PATCH RFC 2/4] net: sched: fix err handler in tcf_action_init()
On Tue 06 Apr 2021 at 01:56, Cong Wang <xiyou.wangcong@...il.com> wrote:
> On Sat, Apr 3, 2021 at 3:01 AM Vlad Buslov <vladbu@...dia.com> wrote:
>> So, the following happens in reproduction provided in commit message
>> when executing "tc actions add action simple sdata \"1\" index 1
>> action simple sdata \"2\" index 2" command:
>>
>> 1. tcf_action_init() is called with batch of two actions of same type,
>> no module references are held, 'actions' array is empty:
>>
>> act_simple refcnt balance = 0
>> actions[] = {}
>>
>> 2. tc_action_load_ops() is called for first action:
>>
>> act_simple refcnt balance = +1
>> actions[] = {}
>>
>> 3. tc_action_load_ops() is called for second action:
>>
>> act_simple refcnt balance = +2
>> actions[] = {}
>>
>> 4. tcf_action_init_1() called for first action, succeeds, action
>> instance is assigned to 'actions' array:
>>
>> act_simple refcnt balance = +2
>> actions[] = { [0]=act1 }
>>
>> 5. tcf_action_init_1() fails for second action, 'actions' array not
>> changed, goto err:
>>
>> act_simple refcnt balance = +2
>> actions[] = { [0]=act1 }
>>
>> 6. tcf_action_destroy() is called for 'actions' array, last reference to
>> first action is released, tcf_action_destroy_1() calls module_put() for
>> actions module:
>>
>> act_simple refcnt balance = +1
>> actions[] = {}
>>
>> 7. err_mod loop starts iterating over ops array, executes module_put()
>> for first actions ops:
>>
>> act_simple refcnt balance = 0
>> actions[] = {}
>>
>> 7. err_mod loop executes module_put() for second actions ops:
>>
>> act_simple refcnt balance = -1
>> actions[] = {}
>>
>>
>> The goal of my fix is to not unconditionally release the module
>> reference for successfully initialized actions because this is already
>> handled by action destroy code. Hope this explanation clarifies things.
>
> Great explanation! It seems harder and harder to understand the
> module refcnt here. How about we just take the refcnt when we
> successfully create an action? Something like this:
>
> diff --git a/net/sched/act_api.c b/net/sched/act_api.c
> index b919826939e0..075cc80480bf 100644
> --- a/net/sched/act_api.c
> +++ b/net/sched/act_api.c
> @@ -493,6 +493,7 @@ int tcf_idr_create(struct tc_action_net *tn, u32
> index, struct nlattr *est,
> }
>
> p->idrinfo = idrinfo;
> + __module_get(ops->owner);
> p->ops = ops;
> *a = p;
> return 0;
> @@ -1035,13 +1036,6 @@ struct tc_action *tcf_action_init_1(struct net
> *net, struct tcf_proto *tp,
> if (!name)
> a->hw_stats = hw_stats;
>
> - /* module count goes up only when brand new policy is created
> - * if it exists and is only bound to in a_o->init() then
> - * ACT_P_CREATED is not returned (a zero is).
> - */
> - if (err != ACT_P_CREATED)
> - module_put(a_o->owner);
> -
> return a;
>
> err_out:
> @@ -1100,7 +1094,8 @@ int tcf_action_init(struct net *net, struct
> tcf_proto *tp, struct nlattr *nla,
> tcf_idr_insert_many(actions);
>
> *attr_size = tcf_action_full_attrs_size(sz);
> - return i - 1;
> + err = i - 1;
> + goto err_mod:
>
> err:
> tcf_action_destroy(actions, bind);
>
> The idea is on the higher level we hold refcnt when loading module and
> put it back _unconditionally_ when returning, and hold a refcnt only when
> we create an action and conditionally put it back when an error happens.
> With pseudo code, it is something like this:
>
> load_ops() // module refcnt +1
> init_actions(); // module refcnt +1 only when create a new one
> if (err)
> // module refcnt -1 when we delete one
> module_put();
> module_put(); // module refcnt -1
>
> This looks much easier to track. What do you think?
>
> Thanks!
Indeed, your suggestion looks more straightforward. The only thing we
need to mind is that action->init() callbacks assume that caller
releases the module even after calling tcf_idr_create(), so we also need
to modify tcf_idr_release() (used by error handlers in action->init()
implementations) to release the module.
I'll run some tests tomorrow to verify that I'm not missing anything
else.
Regards,
Vlad
Powered by blists - more mailing lists