linux-kernel - Re: [PATCH 4/4] nvdimm: Trigger the device probe on a cpu local to the device

Message-ID: <CAPcyv4gm30sT_us0j27jLmNTV_Fug4d8EW4xTmiTMFdwGSjN-A@mail.gmail.com>
Date:   Tue, 11 Sep 2018 22:48:40 -0700
From:   Dan Williams <dan.j.williams@...el.com>
To:     Alexander Duyck <alexander.duyck@...il.com>
Cc:     Linux MM <linux-mm@...ck.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linux-nvdimm <linux-nvdimm@...ts.01.org>,
        pavel.tatashin@...rosoft.com, Michal Hocko <mhocko@...e.com>,
        Dave Jiang <dave.jiang@...el.com>,
        Ingo Molnar <mingo@...nel.org>,
        Dave Hansen <dave.hansen@...el.com>,
        Jérôme Glisse <jglisse@...hat.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Logan Gunthorpe <logang@...tatee.com>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Subject: Re: [PATCH 4/4] nvdimm: Trigger the device probe on a cpu local to
 the device

On Mon, Sep 10, 2018 at 4:44 PM, Alexander Duyck
<alexander.duyck@...il.com> wrote:
> From: Alexander Duyck <alexander.h.duyck@...el.com>
>
> This patch is based off of the pci_call_probe function used to initialize
> PCI devices. The general idea here is to move the probe call to a location
> that is local to the memory being initialized. By doing this we can shave
> significant time off of the total time needed for initialization.
>
> With this patch applied I see a significant reduction in overall init time
> as without it the init varied between 23 and 37 seconds to initialize a 3GB
> node. With this patch applied the variance is only between 23 and 26
> seconds to initialize each node.
>
> I hope to refine this further in the future by combining this logic into
> the async_schedule_domain code that is already in use. By doing that it
> would likely make this functionality redundant.

Yeah, it is a bit sad that we schedule an async thread only to move it
back somewhere else.

Could we trivially achieve the same with an
async_schedule_domain_on_cpu() variant? It seems we can, and the
workqueue core will "Do the right thing".

I now notice that async uses system_unbound_wq while work_on_cpu()
uses system_wq. I don't think we want long-running nvdimm work on
system_wq.
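
For readers following the thread, the CPU selection step that the patch
borrows from pci_call_probe can be modeled in plain userspace C. This is
only an illustrative sketch, not kernel code and not the patch itself:
cpu_mask_t, mask_any_and() and pick_probe_cpu() are hypothetical stand-ins
for struct cpumask, cpumask_any_and() and the node lookup, and the sketch
assumes at most 64 CPUs. It shows the decision being discussed: choose an
online CPU on the device's node, and fall back to probing on the current
CPU when there is none.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct cpumask: one bit per CPU. */
typedef uint64_t cpu_mask_t;

#define NR_CPUS 64

/*
 * Return a CPU that is set in both masks, or NR_CPUS if there is none.
 * This sketch always picks the lowest-numbered match; the "out of range
 * means no CPU" convention mirrors the kernel's "cpu >= nr_cpu_ids" check.
 */
static int mask_any_and(cpu_mask_t a, cpu_mask_t b)
{
    cpu_mask_t both = a & b;

    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (both & ((cpu_mask_t)1 << cpu))
            return cpu;
    return NR_CPUS;
}

/*
 * Pick the CPU on which to run the probe for a device on 'node'.
 * Returns -1 when the caller should simply probe on the current CPU:
 * either the device has no valid node, or no CPU on that node is online.
 */
static int pick_probe_cpu(int node, const cpu_mask_t *node_masks,
                          int nr_nodes, cpu_mask_t online)
{
    int cpu;

    if (node < 0 || node >= nr_nodes)
        return -1;

    cpu = mask_any_and(node_masks[node], online);
    return cpu < NR_CPUS ? cpu : -1;
}
```

With two nodes owning CPUs 0-3 and 4-7, a device on node 1 is probed on
CPU 4 when all CPUs are online, and on the current CPU when node 1 has no
online CPUs. An async_schedule_domain_on_cpu() style variant, as suggested
above, would make the same choice before queueing the work rather than
after the async thread has already started.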