netdev - Re: [PATCH RFC 2/4] net: sched: fix err handler in tcf_action

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <ygnhv98z9pfj.fsf@nvidia.com>
Date:   Tue, 6 Apr 2021 22:35:28 +0300
From:   Vlad Buslov <vladbu@...dia.com>
To:     Cong Wang <xiyou.wangcong@...il.com>
CC:     Linux Kernel Network Developers <netdev@...r.kernel.org>,
        "Kumar Kartikeya Dwivedi" <memxor@...il.com>,
        David Miller <davem@...emloft.net>,
        Jamal Hadi Salim <jhs@...atatu.com>,
        Jiri Pirko <jiri@...nulli.us>,
        "Jakub Kicinski" <kuba@...nel.org>,
        Toke Høiland-Jørgensen <toke@...hat.com>
Subject: Re: [PATCH RFC 2/4] net: sched: fix err handler in tcf_action_init()


On Tue 06 Apr 2021 at 01:56, Cong Wang <xiyou.wangcong@...il.com> wrote:
> On Sat, Apr 3, 2021 at 3:01 AM Vlad Buslov <vladbu@...dia.com> wrote:
>> So, the following happens in reproduction provided in commit message
>> when executing "tc actions add action simple sdata \"1\" index 1
>> action simple sdata \"2\" index 2" command:
>>
>> 1. tcf_action_init() is called with batch of two actions of same type,
>> no module references are held, 'actions' array is empty:
>>
>> act_simple refcnt balance = 0
>> actions[] = {}
>>
>> 2. tc_action_load_ops() is called for first action:
>>
>> act_simple refcnt balance = +1
>> actions[] = {}
>>
>> 3. tc_action_load_ops() is called for second action:
>>
>> act_simple refcnt balance = +2
>> actions[] = {}
>>
>> 4. tcf_action_init_1() called for first action, succeeds, action
>> instance is assigned to 'actions' array:
>>
>> act_simple refcnt balance = +2
>> actions[] = { [0]=act1 }
>>
>> 5. tcf_action_init_1() fails for second action, 'actions' array not
>> changed, goto err:
>>
>> act_simple refcnt balance = +2
>> actions[] = { [0]=act1 }
>>
>> 6. tcf_action_destroy() is called for 'actions' array, last reference to
>> first action is released, tcf_action_destroy_1() calls module_put() for
>> actions module:
>>
>> act_simple refcnt balance = +1
>> actions[] = {}
>>
>> 7. err_mod loop starts iterating over ops array, executes module_put()
>> for first actions ops:
>>
>> act_simple refcnt balance = 0
>> actions[] = {}
>>
>> 7. err_mod loop executes module_put() for second actions ops:
>>
>> act_simple refcnt balance = -1
>> actions[] = {}
>>
>>
>> The goal of my fix is to not unconditionally release the module
>> reference for successfully initialized actions because this is already
>> handled by action destroy code. Hope this explanation clarifies things.
>
> Great explanation! It seems harder and harder to understand the
> module refcnt here. How about we just take the refcnt when we
> successfully create an action? Something like this:
>
> diff --git a/net/sched/act_api.c b/net/sched/act_api.c
> index b919826939e0..075cc80480bf 100644
> --- a/net/sched/act_api.c
> +++ b/net/sched/act_api.c
> @@ -493,6 +493,7 @@ int tcf_idr_create(struct tc_action_net *tn, u32
> index, struct nlattr *est,
>         }
>
>         p->idrinfo = idrinfo;
> +       __module_get(ops->owner);
>         p->ops = ops;
>         *a = p;
>         return 0;
> @@ -1035,13 +1036,6 @@ struct tc_action *tcf_action_init_1(struct net
> *net, struct tcf_proto *tp,
>         if (!name)
>                 a->hw_stats = hw_stats;
>
> -       /* module count goes up only when brand new policy is created
> -        * if it exists and is only bound to in a_o->init() then
> -        * ACT_P_CREATED is not returned (a zero is).
> -        */
> -       if (err != ACT_P_CREATED)
> -               module_put(a_o->owner);
> -
>         return a;
>
>  err_out:
> @@ -1100,7 +1094,8 @@ int tcf_action_init(struct net *net, struct
> tcf_proto *tp, struct nlattr *nla,
>         tcf_idr_insert_many(actions);
>
>         *attr_size = tcf_action_full_attrs_size(sz);
> -       return i - 1;
> +       err = i - 1;
> +       goto err_mod:
>
>  err:
>         tcf_action_destroy(actions, bind);
>
> The idea is on the higher level we hold refcnt when loading module and
> put it back _unconditionally_ when returning, and hold a refcnt only when
> we create an action and conditionally put it back when an error happens.
> With pseudo code, it is something like this:
>
> load_ops() // module refcnt +1
> init_actions(); // module refcnt +1 only when create a new one
> if (err)
>   // module refcnt -1 when we delete one
>   module_put();
> module_put(); // module refcnt -1
>
> This looks much easier to track. What do you think?
>
> Thanks!

Indeed, your suggestion looks more straightforward. The only thing we
need to mind is that action->init() callbacks assume that caller
releases the module even after calling tcf_idr_create(), so we also need
to modify tcf_idr_release() (used by error handlers in action->init()
implementations) to release the module.

I'll run some tests tomorrow to verify that I'm not missing anything
else.

Regards,
Vlad