lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <87ima1ndwz.fsf@notabene.neil.brown.name>
Date:   Fri, 20 Nov 2020 10:07:56 +1100
From:   NeilBrown <neilb@...e.de>
To:     Hillf Danton <hdanton@...a.com>
Cc:     "tj@...nel.org" <tj@...nel.org>,
        Trond Myklebust <trondmy@...merspace.com>,
        "peterz@...radead.org" <peterz@...radead.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH rfc] workqueue: honour cond_resched() more effectively.

On Wed, Nov 18 2020, Hillf Danton wrote:

> On Wed, 18 Nov 2020 16:11:44 +1100 NeilBrown wrote:
>> On Wed, Nov 18 2020, Hillf Danton wrote:
>> ...
>> I don't think this is a good idea.
>
> Let me add a few more words.
>
>> cond_resched() is expected to be called often.  Adding all this extra
>
> They are those only invoked in concurrency-managed worker contexts and
> are thus supposed to be less often than thought; what is more the callers
> know what they are doing if a schedule() follows up, needless to say it
> is an ant-antenna-size add-in to check WORKER_CPU_INTENSIVE given
> 	WARN_ON_ONCE(workqueue_mustnt_use_cpu())
> added in cond_resched().

"supposed to be less often" is the central point here.
Because the facts are that they sometime happen with high frequency
despite what is "supposed" to happen.
Either the assumption that CM-workers don't call cond_resched() is
wrong, or the code that schedules such workers on CM-queues is wrong.

I much prefer the perspective that the assumption is wrong.  If that is
agreed then we need to handle that circumstance without making
cond_resched() more expensive.
Note that adding WARN_ON_ONCE() does not make it more expensive as it is
only enabled with KERNEL_DEBUG (and WQ_WATCHDOG, though the particular
config option could be changed). It isn't needed in production.

If the workqueue maintainers are unmovable in the position that a
CM-workitem must not use excessive CPU ever, and so must not call
cond_resched(), then I can take that back to the NFS maintainers and
negotiate different workqueue settings.  But as I've said, I think this
is requiring the decision to be made in a place that is not well
positioned to make it.

Thanks,
NeilBrown

Download attachment "signature.asc" of type "application/pgp-signature" (854 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ