netdev - Re: [PATCH net-next] net: sched: set dedicated tcf

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <vbfwoljc3uk.fsf@mellanox.com>
Date:   Thu, 28 Feb 2019 15:49:43 +0000
From:   Vlad Buslov <vladbu@...lanox.com>
To:     Cong Wang <xiyou.wangcong@...il.com>
CC:     Linux Kernel Network Developers <netdev@...r.kernel.org>,
        Jamal Hadi Salim <jhs@...atatu.com>,
        Jiri Pirko <jiri@...nulli.us>,
        David Miller <davem@...emloft.net>
Subject: Re: [PATCH net-next] net: sched: set dedicated tcf_walker flag when
 tp is empty


On Wed 27 Feb 2019 at 23:09, Cong Wang <xiyou.wangcong@...il.com> wrote:
> On Wed, Feb 27, 2019 at 6:28 AM Vlad Buslov <vladbu@...lanox.com> wrote:
>>
>>
>> On Tue 26 Feb 2019 at 22:38, Cong Wang <xiyou.wangcong@...il.com> wrote:
>> > On Tue, Feb 26, 2019 at 7:08 AM Vlad Buslov <vladbu@...lanox.com> wrote:
>> >>
>> >>
>> >> On Mon 25 Feb 2019 at 22:52, Cong Wang <xiyou.wangcong@...il.com> wrote:
>> >> > On Mon, Feb 25, 2019 at 7:38 AM Vlad Buslov <vladbu@...lanox.com> wrote:
>> >> >>
>> >> >> Using tcf_walker->stop flag to determine when tcf_walker->fn() was called
>> >> >> at least once is unreliable. Some classifiers set 'stop' flag on error
>> >> >> before calling walker callback, other classifiers used to call it with NULL
>> >> >> filter pointer when empty. In order to prevent further regressions, extend
>> >> >> tcf_walker structure with dedicated 'nonempty' flag. Set this flag in
>> >> >> tcf_walker->fn() implementation that is used to check if classifier has
>> >> >> filters configured.
>> >> >
>> >> >
>> >> > So, after this patch commits like 31a998487641 ("net: sched: fw: don't
>> >> > set arg->stop in fw_walk() when empty") can be reverted??
>> >>
>> >> Yes, it is safe now to revert following commits:
>> >>
>> >> 3027ff41f67c ("net: sched: route: don't set arg->stop in route4_walk() when empty")
>> >> 31a998487641 ("net: sched: fw: don't set arg->stop in fw_walk() when empty")
>> >
>> > Yeah, and probably commit d66022cd1623
>> > ("net: sched: matchall: verify that filter is not NULL in mall_walk()").
>> >
>> > Please send a patch to revert them all.
>> >
>> > Thanks.
>>
>> I think commit d66022cd1623 ("net: sched: matchall: verify that filter
>> is not NULL in mall_walk()") and commit 8b58d12f4ae1 ("net: sched:
>> cgroup: verify that filter is not NULL during walk") shouldn't be
>> reverted. They are still necessary to prevent tcf_chain_dump() from
>> dumping NULL filter pointer. It can happen when dump is initiated in
>> parallel with inserting first filter to unlocked classifier.
>> tcf_fill_node() verifies that filter pointer is not NULL, so it will not
>> crash, but will output tcf_proto info for second time. This might
>> "confuse" user-space.
>
> I don't get this.
>
> First of all, what's confused here?

Let me describe it in more details. tcf_chain_dump() calls
tcf_fill_node() with NULL filter for every tp. When called like that
tcf_fill_node() outputs general tp information (iproute2 tc text
output):

filter protocol ip pref 1 flower chain 0

After that tcf_fill_node() is called for each filter on tp by
ops->walk(). When invoked in with non-NULL filter pointer it outputs tp
info and filter details:

filter protocol ip pref 1 flower chain 0 handle 0x1
  dst_mac e4:12:00:00:00:00
  src_mac e4:11:00:00:00:00
  eth_type ipv4
  ip_proto udp
  dst_ip 192.168.111.2
  src_ip 192.168.111.1
  dst_port 1
  src_port 1
  skip_hw
  not_in_hw
        action order 1: gact action drop
         random type none pass val 0
         index 1 ref 1 bind 1 installed 33 sec used 33 sec
        Action statistics:
        Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

However, if we revert NULL fixes dump will print general tp information
for same tp twice (once correctly before dumping all filters on the tp, and
second time when called for bogus NULL filter), like this:

filter protocol ip pref 1 flower chain 0
filter protocol ip pref 1 flower chain 0

This is what I meant in previous email by "confuse". Sorry for not being
more clear.

>
> Secondly, if there is something confusing, isn't it all because of
> your parallel algorithm? That is, the retry logic.

Yes, this is a side effect of my implementation. Dump code can see
transient tp instances before first filter is inserted. Doesn't have
anything to do with retry mechanism specifically, just a result of
unlocked execution. I didn't expect some classifiers to call
tcf_walker->fn() with NULL filters when I implemented it.

> I don't see how
> commit d66022cd1623 could be useful in this context, it helps
> to prevent a NULL crash which isn't a concern as long as it is
> checked in tcf_fill_node() as you described.

It prevents ops->walk() from calling tcf_walker->fn() with NULL filter
pointer. Besides my new tcf_proto_is_empty() function ops->walk() is
also used by dump code, which will not crash, but will generate output
described above.