linux-kernel - Re: [RFC workqueue/driver-core PATCH 1/5] workqueue: Provide queue_work_near to queue work near a given NUMA node


Message-ID: <20181001160142.GE270328@devbig004.ftw2.facebook.com>
Date:   Mon, 1 Oct 2018 09:01:42 -0700
From:   Tejun Heo <tj@...nel.org>
To:     Alexander Duyck <alexander.h.duyck@...ux.intel.com>
Cc:     linux-nvdimm@...ts.01.org, gregkh@...uxfoundation.org,
        linux-pm@...r.kernel.org, linux-kernel@...r.kernel.org,
        akpm@...ux-foundation.org, len.brown@...el.com,
        dave.jiang@...el.com, rafael@...nel.org, vishal.l.verma@...el.com,
        jiangshanlai@...il.com, pavel@....cz, zwisler@...nel.org,
        dan.j.williams@...el.com
Subject: Re: [RFC workqueue/driver-core PATCH 1/5] workqueue: Provide
 queue_work_near to queue work near a given NUMA node

Hello,

On Wed, Sep 26, 2018 at 03:19:21PM -0700, Alexander Duyck wrote:
> On 9/26/2018 3:09 PM, Tejun Heo wrote:
> I could just use queue_work_on probably, but is there any issue if I
> am passing CPU values that are not in the wq_unbound_cpumask? That

That should be fine.  If it can't find any available cpu, it'll fall
back to round-robin.  We probably can improve it so that it can
consider the numa distance when falling back.
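
The fallback described above can be modelled in userspace roughly as
follows.  This is only a sketch under stated assumptions, not the
kernel's code: struct cpu_picker, cpu_allowed(), pick_cpu() and
MODEL_NR_CPUS are hypothetical names, and a plain 64-bit mask stands in
for wq_unbound_cpumask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NR_CPUS 64	/* hypothetical limit for this model */

/* Illustrative userspace model; none of these names are kernel symbols. */
struct cpu_picker {
	uint64_t allowed;	/* bit n set: CPU n may run unbound work */
	int last;		/* last CPU handed out by the fallback, -1 initially */
};

static bool cpu_allowed(const struct cpu_picker *p, int cpu)
{
	return cpu >= 0 && cpu < MODEL_NR_CPUS && ((p->allowed >> cpu) & 1);
}

/*
 * Honour the requested CPU if it is in the allowed mask.  Otherwise
 * walk the mask round-robin, starting after the last CPU handed out.
 * If the mask is empty, return the request unchanged.
 */
static int pick_cpu(struct cpu_picker *p, int requested)
{
	assert(p->last >= -1 && p->last < MODEL_NR_CPUS);

	if (cpu_allowed(p, requested))
		return requested;
	if (p->allowed == 0)
		return requested;

	for (int i = 1; i <= MODEL_NR_CPUS; i++) {
		int cpu = (p->last + i) % MODEL_NR_CPUS;

		if (cpu_allowed(p, cpu)) {
			p->last = cpu;
			return cpu;
		}
	}
	return requested;	/* not reached: the mask is non-empty */
}
```

The round-robin step in this model ignores NUMA distance entirely,
which is the part that could be improved.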

> was mostly my concern. Also for an unbound queue do I need to worry
> about the hotplug lock? I wasn't sure if that was the case or not as

Issuers don't need to worry about them.

> I know it is called out as something to be concerned with using
> queue_work_on, but in __queue_work the value is just used to
> determine which node to grab a work queue from.

It might be better to keep queue_work_on() for per-cpu workqueues and
introduce queue_work_near() as you suggested.  I just don't want it to
duplicate the node selection code.  Would that work?
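
One possible shape for that split, again as a userspace sketch with
hypothetical names (struct topo, select_cpu_near(), model_queue_on(),
model_queue_near(), MODEL_CPU_UNBOUND): the "near" entry point only
picks a CPU on the requested node and hands it to the shared queueing
path, so node selection lives in one place.

```c
#include <stdint.h>

#define MODEL_CPU_UNBOUND (-1)	/* hypothetical "no preference" value */
#define MODEL_NR_NODES 4	/* hypothetical node count */
#define MODEL_NR_CPUS 64	/* hypothetical CPU count */

/* Illustrative topology; not a kernel structure. */
struct topo {
	uint64_t node_cpus[MODEL_NR_NODES];	/* CPUs that belong to each node */
	uint64_t online;			/* CPUs that are currently online */
};

/* Pick an online CPU on @node, preferring the caller's own CPU. */
static int select_cpu_near(const struct topo *t, int node, int this_cpu)
{
	uint64_t candidates;

	if (node < 0 || node >= MODEL_NR_NODES)
		return MODEL_CPU_UNBOUND;

	candidates = t->node_cpus[node] & t->online;
	if (candidates == 0)
		return MODEL_CPU_UNBOUND;

	if (this_cpu >= 0 && this_cpu < MODEL_NR_CPUS &&
	    ((candidates >> this_cpu) & 1))
		return this_cpu;

	/* Otherwise take the lowest-numbered online CPU on the node. */
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if ((candidates >> cpu) & 1)
			return cpu;

	return MODEL_CPU_UNBOUND;	/* not reached: candidates is non-zero */
}

/* Stand-in for the shared queueing path; just reports the CPU it was given. */
static int model_queue_on(int cpu)
{
	return cpu;
}

/* Thin wrapper: choose a CPU near @node, then reuse the common path. */
static int model_queue_near(const struct topo *t, int node, int this_cpu)
{
	return model_queue_on(select_cpu_near(t, node, this_cpu));
}
```

When no online CPU exists on the node, the wrapper passes the
"unbound" value through and leaves the choice to the common path.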

> I forgot to address your question about the advantages. They are
> pretty significant. The test system I was working with was
> initializing 3TB of nvdimm memory per node. If the node is aligned
> it takes something like 24 seconds, whereas an unaligned core can
> take 36 seconds or more.

Oh yeah, sure, NUMA affinity matters quite a bit on memory-heavy
workloads.  I mistakenly thought you were adding NUMA affinity to
per-cpu workqueues.

Thanks.

-- 
tejun