linux-kernel - Re: [PATCH v2] EXP rcu: Move expedited grace period (GP) work to RT kthread


Message-ID: <CAEXW_YTmZnk_kFw48HeyyFTXZzfj1cPdw+BaOra14JiWJh6kNg@mail.gmail.com>
Date:   Wed, 13 Apr 2022 13:21:20 -0400
From:   Joel Fernandes <joel@...lfernandes.org>
To:     "Paul E. McKenney" <paulmck@...nel.org>
Cc:     Hillf Danton <hdanton@...a.com>,
        Kalesh Singh <kaleshsingh@...gle.com>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2] EXP rcu: Move expedited grace period (GP) work to RT kthread_worker

Hi Paul,


On Wed, Apr 13, 2022 at 8:07 AM Paul E. McKenney <paulmck@...nel.org> wrote:
>
> On Wed, Apr 13, 2022 at 07:37:11PM +0800, Hillf Danton wrote:
> > On Sat, 9 Apr 2022 08:56:12 -0700 Paul E. McKenney wrote:
> > > On Sat, Apr 09, 2022 at 03:17:40PM +0800, Hillf Danton wrote:
> > > > On Fri, 8 Apr 2022 10:53:53 -0700 Kalesh Singh wrote
> > > > > Thanks for the discussion everyone.
> > > > >
> > > > > > We didn't fully switch to kthread workers to avoid changing the
> > > > > > behavior for users that don't need these low-latency exp GPs. Another
> > > > > > (and perhaps more important) reason is that kthread_worker offers
> > > > > > less concurrency than workqueues, which Paul reported can pose
> > > > > > issues on systems with a large number of CPUs.
> > > >
> > > > A second ... what issues were reported wrt concurrency, given the output
> > > > of grep -nr workqueue block mm drivers.
> > > >
> > > > Feel free to post a URL link to the issues.
> > >
> > > The issues can be easily seen by inspecting kthread_queue_work() and
> > > > the functions that it invokes.  In contrast, normal workqueues use
> > > per-CPU mechanisms to avoid contention, as can equally easily be seen
> > > by inspecting queue_work_on() and the functions that it invokes.
> >
> > > The worker from kthread_create_worker() roughly matches an unbound workqueue
> > > that can get every CPU overloaded, thus the difference in implementation
> > > details between kthread worker and WQ worker (either bound or unbound) can
> > > be safely ignored if the kthread method works, given that priority is barely
> > > a cure for concurrency issues.
>
> Please look again, this time taking lock contention into account,
> keeping in mind that systems with several hundred CPUs are reasonably
> common and that systems with more than a thousand CPUs are not unheard of.


You are talking about lock contention in the kthread_worker infra,
which unbound WQ does not suffer from, right? I don't think the worker
lock contention will be an issue unless several
synchronize_rcu_expedited() calls are trying to queue work at the same
time. Did I miss something? Considering that synchronize_rcu_expedited()
can block in the normal case (blocking is a pretty heavy operation
involving the scheduler and load balancers), I don't see how
contending on the worker infra locks can be an issue. If it were
call_rcu(), then I could see contention mattering, since that executes
much more often.
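
To make the contention question concrete, here is a rough userspace
model, not kernel code. It assumes that a kthread_worker serializes all
queuers on one worker lock, and that a per-CPU queueing scheme only
makes callers on the same CPU contend with each other. The function
names and MODEL_MAX_CPUS are hypothetical, invented for this sketch. It
only counts how many concurrent queuers could pile up on a single lock
and says nothing about how long the lock is held.

```c
#include <stddef.h>

/* Hypothetical upper bound on CPUs for this model only. */
#define MODEL_MAX_CPUS 1024

/*
 * Single-lock model (assumed shape of a kthread_worker): every
 * concurrent queuer takes the same lock, so the worst case is that
 * all of them contend, regardless of which CPU they run on.
 */
static size_t worst_contention_single_lock(const unsigned int *caller_cpu,
                                           size_t nr_callers)
{
    (void)caller_cpu; /* CPU placement does not matter with one lock */
    return nr_callers;
}

/*
 * Per-CPU model (assumed shape of per-CPU queueing): one lock per CPU,
 * so only callers queueing from the same CPU contend with each other.
 */
static size_t worst_contention_per_cpu_lock(const unsigned int *caller_cpu,
                                            size_t nr_callers)
{
    size_t per_cpu[MODEL_MAX_CPUS] = {0};
    size_t worst = 0;

    for (size_t i = 0; i < nr_callers; i++) {
        unsigned int cpu = caller_cpu[i] % MODEL_MAX_CPUS;

        if (++per_cpu[cpu] > worst)
            worst = per_cpu[cpu];
    }
    return worst;
}
```

In this model the two schemes differ only when several callers queue at
the same instant, which is the case in question for
synchronize_rcu_expedited(). Whether that happens often enough to matter
on large systems is something only a benchmark can answer.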

I think the argument about too many things being RT is stronger though.

Thanks,

Joel


>
>
>                                                         Thanx, Paul
>
> > Hillf
> > >
> > > Please do feel free to take a look.
> > >
> > > If taking a look does not convince you, please construct some in-kernel
> > > benchmarks to test the scalability of these two mechanisms.  Please note
> > > that some care will be required to make sure that you are doing a valid
> > > apples-to-apples comparison.
> > >
> > >                                                     Thanx, Paul
> > >