lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <vbfmum6lovf.fsf@mellanox.com>
Date:   Thu, 7 Mar 2019 14:51:07 +0000
From:   Vlad Buslov <vladbu@...lanox.com>
To:     Davide Caratti <dcaratti@...hat.com>
CC:     Cong Wang <xiyou.wangcong@...il.com>,
        "David S. Miller" <davem@...emloft.net>,
        Jamal Hadi Salim <jhs@...atatu.com>,
        Jiri Pirko <jiri@...nulli.us>, Paolo Abeni <pabeni@...hat.com>,
        Linux Kernel Network Developers <netdev@...r.kernel.org>
Subject: Re: [PATCH net 03/16] net/sched: act_csum: validate the control
 action inside init()


On Thu 07 Mar 2019 at 15:56, Davide Caratti <dcaratti@...hat.com> wrote:
> On Fri, 2019-03-01 at 16:04 -0800, Cong Wang wrote:
>> > On Fri, Mar 1, 2019 at 10:02 AM Davide Caratti <dcaratti@...hat.com> wrote:
>> >
>> > if I well understand the question, you are worried about
>> > tcf_action_goto_chain_exec(), that can dereference 'oldchain' while we are
>> > overwriting the action. A call to tcf_chain_put_by_act(oldchain) would
>> > decrease refcounts and eventually call kfree(oldchain).
>> >
>> > But this would result in a use-after-free only in case the chain has only
>> > refcount held by 1 action (the one we are overwriting), and 0 filters: is
>> > this a condition where packets can go through this action's data plane?
>>
>> Hmm? Isn't goto chain can be arbitrary? Packets can be routed
>> from this action to any filter chain, so even if the target chain has 0
>> filter this action still has traffic as long as itself is not on the same
>> chain?
>
> hi,
>
> sorry for the delay: it took some time to verify experimentally if we
> really need this or not. So, we want to ensure the control path doesn't do
>
>     tcf_csum_init(..., ovr=1, ...)
>       tcf_chain_put_by_act(oldchain)
>         tcf_chain_destroy(oldchain, ...)
> 	  kfree(oldchain);
>
> while the traffic path of the action is doing
>
>     res->goto_tp = rcu_dereference_bh(oldchain->filter_chain);
>
> I iterated this script many times on a KASAN kernel,
>
> # tc chain add dev crash0 chain 42 ingress protocol ip flower ip_proto udp action pass
> # tc filter add dev crash0 ingress protocol ip flower ip_proto udp action csum udp goto chain 42 index 66
> # tc chain del dev crash0 chain 42 ingress
> (start netperf traffic)
> # tc action replace action csum udp pass index 66
>
> and reproduced twice the following splat:
>
>  ==================================================================
>  BUG: KASAN: null-ptr-deref in tcf_action_exec+0x181/0x200
>  Read of size 8 at addr 0000000000000000 by task netperf/5777
>
>  CPU: 0 PID: 5777 Comm: netperf Not tainted 5.0.0-rc7.goto_chain_fix+ #551
>  Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
>  Call Trace:
>  <IRQ>
>   dump_stack+0xc7/0x15b
>   ? show_regs_print_info+0x5/0x5
>   ? _raw_read_lock_irq+0x40/0x40
>   ? tcf_action_exec+0x181/0x200
>   ? tcf_action_exec+0x181/0x200
>   kasan_report+0x176/0x192
>   ? tcf_action_exec+0x181/0x200
>   tcf_action_exec+0x181/0x200
>   ? find_dump_kind+0x170/0x170
>   ? rcu_irq_exit+0x153/0x210
>   fl_classify+0x81a/0x830
>
> so, I think that the answer to your question:
>
> On Wed, 2019-02-27 at 17:50 -0800, Cong Wang wrote:
>> > > > +       if (oldchain)
>> > > > +               tcf_chain_put_by_act(oldchain);
>> > >
>> > > Do we need to respect RCU grace period here?
>
> is a "yes, we do".
> Now I'm trying something similar to what's done in tcf_bpf_init(), to
> release the bpf program on 'replace' operations:
>
> 365         if (res == ACT_P_CREATED) {
> 366                 tcf_idr_insert(tn, *act);
> 367         } else {
> 368                 /* make sure the program being replaced is no longer executing */
> 369                 synchronize_rcu();
> 370                 tcf_bpf_cfg_cleanup(&old);
> 371         }
>
> do you think it's worth going in this direction?
> thank you in advance!

Hi Davide,

Using synchronize_rcu() will impact rule update rate performance and I
don't think we really need it. I don't see any reason why we can't just
update chain to be rcu-friendly. Data path is already rcu_read
protected, in fact it only needs chain to read rcu-pointer to tp list
when jumping to chain. So it should be enough to do the following:

1) Update tcf_chain_destroy() to free chain after rcu grace period.

2) Convert tc_action->goto_chain to be a proper rcu pointer. (mark it
with "__rcu", assign with rcu_assign_pointer(), read it with
rcu_dereference{_bh}(), etc.)

Am I missing anything?

Regards,
Vlad

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ