Message-ID: <87pn4mosks.fsf@notabene.neil.brown.name>
Date:   Tue, 10 Nov 2020 13:26:27 +1100
From:   NeilBrown <neilb@...e.de>
To:     Peter Zijlstra <peterz@...radead.org>,
        Trond Myklebust <trondmy@...merspace.com>
Cc:     "juri.lelli@...hat.com" <juri.lelli@...hat.com>,
        "mingo@...hat.com" <mingo@...hat.com>,
        "jiangshanlai@...il.com" <jiangshanlai@...il.com>,
        "tj@...nel.org" <tj@...nel.org>,
        "mhocko@...e.com" <mhocko@...e.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "vincent.guittot@...aro.org" <vincent.guittot@...aro.org>
Subject: Re: [PATCH rfc] workqueue: honour cond_resched() more effectively.

On Mon, Nov 09 2020, Peter Zijlstra wrote:

> On Mon, Nov 09, 2020 at 01:50:40PM +0000, Trond Myklebust wrote:
>> On Mon, 2020-11-09 at 09:00 +0100, Peter Zijlstra wrote:
>
>> > I'm thinking the real problem is that you're abusing workqueues. Just
>> > don't stuff so much work into it that this becomes a problem. Or
>> > rather,
>> > if you do, don't lie to it about it.
>> 
>> If we can't use workqueues to call iput_final() on an inode, then what
>> is the point of having them at all?
>
> Running short stuff, apparently.

Also running stuff that sleeps.  If it only does work in short bursts, and
sleeps between them, it can run as long as it likes.
It is only sustained bursts that are currently not supported without
explicit code.

>
>> Neil's use case is simply a file that has managed to accumulate a
>> seriously large page cache, and is therefore taking a long time to
>> complete the call to truncate_inode_pages_final(). Are you saying we
>> have to allocate a dedicated thread for every case where this happens?
>
> I'm not saying anything, but you're trying to wreck the scheduler
> because of a workqueue 'feature'. The 'new' workqueues limit concurrency
> by design, if you're then relying on concurrency for things, you're
> using it wrong.
>
> I really don't know what the right answer is here, but I thoroughly hate
> the one proposed.

Oh good - plenty of room for improvement then :-)

I feel strongly that this should work transparently.  Expecting people
to choose the right option to handle cases that don't often come up in
testing is naive.
A warning whenever a bound, non-CPU-intensive worker calls cond_resched()
is trivial to implement and extremely noisy.  As mentioned, I get twenty
just to boot.

One amusing example is rhashtable, which schedules a worker to rehash a
table.  This is expected to be cpu-intensive because it calls
cond_resched(), but it is run with schedule_work() - clearly not
realizing that this will block other scheduled work on that CPU.
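
To make the blocking effect concrete, here is a toy user-space model, not
kernel code.  It assumes a single per-CPU pool that only starts the next
queued item once no concurrency-counted worker is running, and that items
flagged cpu-intensive are excluded from that count.  The names toy_work and
toy_pool_start_times are hypothetical, and real workers may also sleep,
which this sketch ignores.

```c
#include <stddef.h>

/* Hypothetical work item for the toy model (not a kernel structure). */
struct toy_work {
    unsigned int run_ms;  /* time the item runs without ever sleeping */
    int cpu_intensive;    /* non-zero: not counted for concurrency management */
};

/*
 * Compute when each queued item would start on one bound pool.
 * Assumption: a counted (non-cpu-intensive) item that never sleeps holds
 * back every later item until it finishes; a cpu-intensive item does not.
 */
static void toy_pool_start_times(const struct toy_work *w, size_t n,
                                 unsigned int *start_ms)
{
    unsigned int next_start = 0;

    for (size_t i = 0; i < n; i++) {
        start_ms[i] = next_start;
        if (!w[i].cpu_intensive)
            next_start = start_ms[i] + w[i].run_ms;
    }
}
```

In this model a 5000ms counted item queued ahead of a 1ms item delays the
short item by 5000ms, whereas flagging the long item cpu-intensive lets the
short item start immediately.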

An amusing example for the flip-side is crypto/cryptd.c which creates a
WQ_CPU_INTENSIVE workqueue (cryptd) but the cryptd_queue_worker() has
a comment "Only handle one request at a time to avoid hogging crypto
workqueue." !!! The whole point of WQ_CPU_INTENSIVE is that you cannot
hog the workqueue!!

Anyway, I digress....  warning on every cond_resched() generates lots of
warnings, including some from printk.... so any work item that might
ever print a message needs to be CPU_INTENSIVE???
I don't think that scales.

Is there some way the scheduler can help?  Does the scheduler notice
"time to check on that CPU over there" and then:
 - if it is in user-space - force it to schedule
 - if it is in kernel-space (and preempt is disabled), then leave it
   alone
 ??

If so, could there be a third case - if it is a bound, non-cpu-intensive
worker, switch it to cpu-intensive???
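
The three cases can be written down as a small decision function.  This is
only a user-space sketch of the question being asked, not how the scheduler
is actually structured: every type and name here (toy_task,
toy_tick_action and the enums) is hypothetical, and whether the scheduler
makes such a check at all is exactly what is being asked above.

```c
/* Hypothetical description of what a CPU is running at check time. */
enum toy_ctx { TOY_USER_SPACE, TOY_KERNEL_SPACE };

enum toy_action {
    TOY_FORCE_RESCHED,      /* case 1: user-space, force it to schedule */
    TOY_LEAVE_ALONE,        /* case 2: kernel-space, leave it alone */
    TOY_MARK_CPU_INTENSIVE  /* proposed case 3: reclassify the worker */
};

struct toy_task {
    enum toy_ctx ctx;
    int is_bound_worker;    /* non-zero: a bound workqueue worker */
    int is_cpu_intensive;   /* non-zero: already treated as cpu-intensive */
};

/* Pick an action for the task found running on the checked CPU. */
static enum toy_action toy_tick_action(const struct toy_task *t)
{
    if (t->ctx == TOY_USER_SPACE)
        return TOY_FORCE_RESCHED;
    if (t->is_bound_worker && !t->is_cpu_intensive)
        return TOY_MARK_CPU_INTENSIVE;
    return TOY_LEAVE_ALONE;
}
```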

I wonder how long workers typically run - do many run long enough that
the scheduler might want to ask them to take a break?

Thanks,
NeilBrown
