Message-ID: <20141202142627.6f59f693@tlielax.poochiereds.net>
Date:	Tue, 2 Dec 2014 14:26:27 -0500
From:	Jeff Layton <jeff.layton@...marydata.com>
To:	Tejun Heo <tj@...nel.org>
Cc:	linux-nfs@...r.kernel.org, linux-kernel@...r.kernel.org,
	Al Viro <viro@...iv.linux.org.uk>
Subject: Re: [RFC PATCH 00/14] nfsd/sunrpc: add support for a
 workqueue-based nfsd

On Tue, 2 Dec 2014 14:18:14 -0500
Tejun Heo <tj@...nel.org> wrote:

> Hello, Jeff.
> 
> On Tue, Dec 02, 2014 at 01:24:09PM -0500, Jeff Layton wrote:
> > 2) get some insight about the latency from those with a better
> > understanding of the CMWQ code. Any thoughts as to why we might be
> > seeing such high latency here? Any ideas of what we can do about it?
> 
> The latency is probably from concurrency management.  Work items that
> participate in concurrency management (the ones on per-cpu workqueues
> without WQ_CPU_INTENSIVE set) tend to get penalized quite a bit on the
> latency side, as the "run" durations for all such work items end up
> being serialized on the CPU.  Setting WQ_CPU_INTENSIVE on the
> workqueue disables concurrency management, and so does making the
> workqueue unbound.  If strict CPU locality is likely to be beneficial
> and each work item isn't likely to consume a huge amount of CPU
> cycles, WQ_CPU_INTENSIVE is the better fit; otherwise, use WQ_UNBOUND
> to let the scheduler do its thing.
> 
> Thanks.
> 

Thanks Tejun,

I'm already using WQ_UNBOUND workqueues. If that exempts this code from
concurrency management, then that's probably not the problem. The jobs
here aren't terribly CPU-intensive, but they can sleep for a long time
while waiting on I/O, etc...
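
For concreteness, here's a minimal sketch of the two allocation styles
you describe -- "nfsd_wq" and the helper are hypothetical names, not
code from the patch series:

        #include <linux/workqueue.h>

        static struct workqueue_struct *nfsd_wq;

        static int nfsd_create_wq(bool cpu_intensive)
        {
                /*
                 * Both variants opt out of concurrency management:
                 * WQ_UNBOUND lets the scheduler place the worker
                 * threads, while a per-cpu workqueue marked
                 * WQ_CPU_INTENSIVE keeps strict CPU locality.
                 */
                nfsd_wq = alloc_workqueue("nfsd", cpu_intensive ?
                                          WQ_CPU_INTENSIVE : WQ_UNBOUND, 0);
                return nfsd_wq ? 0 : -ENOMEM;
        }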

I don't think we necessarily need CPU locality (though that's nice to
have, of course), but NUMA affinity will likely be important. It looks
like you did some work a year or so ago to make unbound workqueues
prefer to queue work on the same NUMA node, which meshes nicely with
what I think we want here.
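
If we ever need a hard NUMA pin rather than relying on the default
per-node pools, something along these lines ought to work (a hedged
sketch using the workqueue_attrs API; "nfsd_wq_pin_to_node" and "nid"
are hypothetical, and I haven't checked whether apply_workqueue_attrs
is exported to modules):

        #include <linux/workqueue.h>
        #include <linux/topology.h>
        #include <linux/slab.h>

        /* restrict an unbound workqueue's workers to one node's CPUs */
        static int nfsd_wq_pin_to_node(struct workqueue_struct *wq, int nid)
        {
                struct workqueue_attrs *attrs;
                int ret;

                attrs = alloc_workqueue_attrs(GFP_KERNEL);
                if (!attrs)
                        return -ENOMEM;

                cpumask_copy(attrs->cpumask, cpumask_of_node(nid));
                ret = apply_workqueue_attrs(wq, attrs);
                free_workqueue_attrs(attrs);
                return ret;
        }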

I'll keep looking at it -- let me know if you have any other thoughts
on the latency...

Cheers!
-- 
Jeff Layton <jlayton@...marydata.com>
