netdev - Re: [PATCH net 1/2] net/sched: cls_u32: fix refcount leak in the error path of u32

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <vbfr2102swb.fsf@mellanox.com>
Date:   Thu, 19 Dec 2019 16:15:53 +0000
From:   Vlad Buslov <vladbu@...lanox.com>
To:     Jamal Hadi Salim <jhs@...atatu.com>
CC:     Davide Caratti <dcaratti@...hat.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Cong Wang <xiyou.wangcong@...il.com>,
        Jiri Pirko <jiri@...nulli.us>,
        Vlad Buslov <vladbu@...lanox.com>,
        Roman Mashak <mrv@...atatu.com>
Subject: Re: [PATCH net 1/2] net/sched: cls_u32: fix refcount leak in the
 error path of u32_change()


On Thu 19 Dec 2019 at 15:32, Jamal Hadi Salim <jhs@...atatu.com> wrote:
> Hi Davide,
>
> I ran your test on my laptop (4.19) and nothing dangled.
> Looking at that area of the code difference 4.19 vs current net-next
> there was a destroy() in there that migrated into the inner guts of
> tcf_chain_tp_delete_empty() Vlad? My gut feeling is restoring the old
> logic like this would work at least for u32:
>
> ------------
> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
> index 6a0eacafdb19..34a1d4e7e6e3 100644
> --- a/net/sched/cls_api.c
> +++ b/net/sched/cls_api.c
> @@ -2135,8 +2135,10 @@ static int tc_new_tfilter(struct sk_buff *skb, struct
> nlmsghdr *n,
>         }
>
>  errout:
> -       if (err && tp_created)
> +       if (err && tp_created) {
> +               tcf_proto_destroy(tp, rtnl_held, true, NULL);
>                 tcf_chain_tp_delete_empty(chain, tp, rtnl_held, NULL);
> +       }
>  errout_tp:
>         if (chain) {
>                 if (tp && !IS_ERR(tp))
> -----
>
> Maybe even better tcf_proto_put(tp, rtnl_held, NULL) directly instead
> and no need for tcf_chain_tp_delete_empty().
>
> Of course above not even compile tested and may have consequences
> for other classifiers (flower would be a good test) and concurency
> intentions. Thoughts?
>
> cheers,
> jamal
>
> On 2019-12-18 9:23 a.m., Jamal Hadi Salim wrote:
>>
>> On 2019-12-17 6:00 p.m., Davide Caratti wrote:
>>> when users replace cls_u32 filters with new ones having wrong parameters,
>>> so that u32_change() fails to validate them, the kernel doesn't roll-back
>>> correctly, and leaves semi-configured rules.
>>>
>>> Fix this in u32_walk(), avoiding a call to the walker function on filters
>>> that don't have a match rule connected. The side effect is, these "empty"
>>> filters are not even dumped when present; but that shouldn't be a problem
>>> as long as we are restoring the original behaviour, where semi-configured
>>> filters were not even added in the error path of u32_change().
>>>
>>> Fixes: 6676d5e416ee ("net: sched: set dedicated tcf_walker flag when tp is
>>> empty")
>>> Signed-off-by: Davide Caratti <dcaratti@...hat.com>
>>
>> Hi Davide,
>>
>> Great catch (and good test case addition),
>> but I am not sure about the fix.
>>
>> Unless I am  misunderstanding the flow is:
>> You enter bad rules, validation fails, partial state had already been
>> created (in this case root hts) before validation failure, you then
>> leave the partial state in the kernel but when someone dumps you hide
>> these bad tables?
>>
>> It sounds like the root cause is there is a missing destroy()
>> invocation somewhere during the create/validation failure - which is
>> what needs fixing... Those dangling tables should not have been
>> inserted; maybe somewhere along the code path for tc_new_tfilter().
>> Note "replace" is essentially "create if it doesnt
>> exist" semantic therefore NLM_F_CREATE will be set.
>>
>> Since this is a core cls issue - I would check all other classifiers
>> with similar aggrevation - make some attribute fail in the middle or
>> end.
>> Very likely only u32 is the victim.
>>
>> cheers,
>> jamal
>>

Hi Jamal,

Just destroying tp unconditionally will break unlocked case (flower)
because of possibility of concurrent insertion of new filters to the
same tp instance.

The root cause here is precisely described by Davide in cover letter -
to accommodate concurrent insertions cls API verifies that tp instance
is empty before deleting it and since there is no cls ops to do it
directly, it relies on checking that walk() stopped without accessing
any filters instead. Unfortunately, somw classifier implementations
assumed that there is always at least one filter on classifier (I fixed
several of these) and now Davide also uncovered this leak in u32.

As a simpler solution to fix such issues once and for all I can propose
not to perform the walk() check at all and assume that any classifier
implementation that doesn't have TCF_PROTO_OPS_DOIT_UNLOCKED flag set is
empty in tcf_chain_tp_delete_empty() (there is no possibility of
concurrent insertion when synchronizing with rtnl).

WDYT?

Regards,
Vlad