netdev - Re: [PATCH net-next] net: sched: don't release block->lock when dumping chains

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <vbfr2bmohfr.fsf@mellanox.com>
Date:   Mon, 4 Mar 2019 14:14:20 +0000
From:   Vlad Buslov <vladbu@...lanox.com>
To:     Cong Wang <xiyou.wangcong@...il.com>
CC:     Linux Kernel Network Developers <netdev@...r.kernel.org>,
        Jamal Hadi Salim <jhs@...atatu.com>,
        Jiri Pirko <jiri@...nulli.us>,
        David Miller <davem@...emloft.net>
Subject: Re: [PATCH net-next] net: sched: don't release block->lock when
 dumping chains


On Sat 02 Mar 2019 at 00:08, Cong Wang <xiyou.wangcong@...il.com> wrote:
> On Thu, Feb 28, 2019 at 6:53 AM Vlad Buslov <vladbu@...lanox.com> wrote:
>>
>>
>> On Wed 27 Feb 2019 at 23:03, Cong Wang <xiyou.wangcong@...il.com> wrote:
>> > On Tue, Feb 26, 2019 at 8:10 AM Vlad Buslov <vladbu@...lanox.com> wrote:
>> >>
>> >>
>> >> On Tue 26 Feb 2019 at 00:15, Cong Wang <xiyou.wangcong@...il.com> wrote:
>> >> > On Mon, Feb 25, 2019 at 7:45 AM Vlad Buslov <vladbu@...lanox.com> wrote:
>> >> >>
>> >> >> Function tc_dump_chain() obtains and releases block->lock on each iteration
>> >> >> of its inner loop that dumps all chains on block. Outputting chain template
>> >> >> info is fast operation so locking/unlocking mutex multiple times is an
>> >> >> overhead when lock is highly contested. Modify tc_dump_chain() to only
>> >> >> obtain block->lock once and dump all chains without releasing it.
>> >> >>
>> >> >> Signed-off-by: Vlad Buslov <vladbu@...lanox.com>
>> >> >> Suggested-by: Cong Wang <xiyou.wangcong@...il.com>
>> >> >
>> >> > Thanks for the followup!
>> >> >
>> >> > Isn't it similar for __tcf_get_next_proto() in tcf_chain_dump()?
>> >> > And for tc_dump_tfilter()?
>> >>
>> >> Not really. These two dump all tp filters and not just a template, which
>> >> is O(n) on number of filters and can be slow because it calls hw offload
>> >> API for each of them. Our typical use-case involves periodic filter dump
>> >> (to update stats) while multiple concurrent user-space threads are
>> >> updating filters, so it is important for them to be able to execute in
>> >> parallel.
>> >
>> > Hmm, but if these are read-only, you probably don't even need a
>> > mutex, you can just use RCU read lock to protect list iteration
>> > and you still can grab the refcnt in the same way.
>>
>> That is how it worked in my initial implementation. However, it doesn't
>> work with hw offloads because driver callbacks can sleep.
>
> Hmm? You drop RCU read lock after grabbing the refcnt,
> right? If so what's the problem with sleeping?

Okay, I misunderstood your suggestion. In tc_dump_tfilter() we can't use
RCU in __tcf_get_next_chain() because chain reference counters are not
atomic and require protection of block->lock. __tcf_get_next_proto()
requires chain->filter_chain_lock because it checks 'deleting' flag
besides taking reference to tp.