Message-ID: <Y+WUJ/qeyVRxYUhN@kbusch-mbp.dhcp.thefacebook.com>
Date:   Thu, 9 Feb 2023 17:47:35 -0700
From:   Keith Busch <kbusch@...nel.org>
To:     "Patel, Nirmal" <nirmal.patel@...ux.intel.com>
Cc:     Xinghui Li <korantwork@...il.com>,
        Jonathan Derrick <jonathan.derrick@...ux.dev>,
        lpieralisi@...nel.org, linux-pci@...r.kernel.org,
        linux-kernel@...r.kernel.org, Xinghui Li <korantli@...cent.com>
Subject: Re: [PATCH] PCI: vmd: Do not disable MSI-X remapping in VMD 28C0
 controller

On Thu, Feb 09, 2023 at 04:57:59PM -0700, Patel, Nirmal wrote:
> On 2/9/2023 4:05 PM, Keith Busch wrote:
> > On Tue, Feb 07, 2023 at 01:32:20PM -0700, Patel, Nirmal wrote:
> >> On 2/6/2023 8:18 PM, Xinghui Li wrote:
> >>> On Tue, Feb 7, 2023 at 02:28, Keith Busch <kbusch@...nel.org> wrote:
> >>>> I suspect bypass is the better choice if "num_active_cpus() > pci_msix_vec_count(vmd->dev)".
> >>> For this situation, my speculation is that the PCIe nodes are
> >>> over-mounted, and not just because of the CPU-to-drive ratio.
> >>> We considered an online-tunable design because we were concerned
> >>> that I/O with different chunk sizes would suit different MSI-X
> >>> modes. Personally, I think it may get logically complicated if the
> >>> choice is made programmatically.
> >> Also, newer CPUs have more MSI-X vectors (128), which means we can
> >> still get better performance without bypass. It would be better if
> >> users could choose a module parameter based on their requirements.
> >> Thanks.
> > So what? More vectors just push the threshold for when bypass becomes
> > relevant, which is exactly why I suggested it. There has to be an empirical
> > answer to when bypass beats muxing. Why do you want a user tunable if there's a
> > verifiable and automated better choice?
> 
> Makes sense about the automated choice. I am not sure what the exact
> tipping point is. The commit message includes only two cases: one with
> 1 drive and 1 CPU, and a second with 12 drives and 6 CPUs. Also,
> performance gets worse from 8 drives to 12 drives.

That configuration's storage performance overwhelms the CPU with interrupt
context switching. That problem probably inverts when your active CPU count
exceeds your VMD vectors because you'll be funnelling more interrupts into
fewer CPUs, leaving other CPUs idle.
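
Roughly, the kind of automated check I have in mind would look something
like the sketch below. This is illustrative only, not the in-tree vmd.c
logic; the struct vmd_dev handle and the helper name are placeholders:

	#include <linux/cpumask.h>
	#include <linux/pci.h>

	/* Sketch only: vmd_dev stands in for the driver-private struct. */
	static bool vmd_prefer_msix_bypass(struct vmd_dev *vmd)
	{
		int vmd_vectors = pci_msix_vec_count(vmd->dev);

		/*
		 * Once active CPUs outnumber the VMD endpoint's own MSI-X
		 * vectors, remapping funnels child interrupts into fewer
		 * CPUs, so bypass should win.
		 */
		return vmd_vectors > 0 && num_active_cpus() > vmd_vectors;
	}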

> One of the previous comments also mentioned something about FIO changing
> cpus_allowed; will there be an issue if the VMD driver decides to bypass
> the remapping during boot, but the FIO job then changes cpus_allowed?

No. Bypass mode uses managed interrupts for your nvme child devices, which sets
the best possible affinity.
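
For illustration, something along these lines is what a child device ends
up doing when remapping is bypassed. This is a sketch only, not the actual
nvme queue setup; the vector counts and function name are made up:

	#include <linux/interrupt.h>
	#include <linux/pci.h>

	/* Sketch only: a child device allocating managed MSI-X vectors. */
	static int child_alloc_managed_vectors(struct pci_dev *pdev)
	{
		/* Keep one pre-vector (e.g. an admin queue) out of the spread. */
		struct irq_affinity affd = { .pre_vectors = 1 };

		/*
		 * PCI_IRQ_AFFINITY makes these managed interrupts: the irq
		 * core spreads them across CPUs at allocation time, and a
		 * task's cpus_allowed (e.g. fio's) does not move them.
		 */
		return pci_alloc_irq_vectors_affinity(pdev, 1, 64,
				PCI_IRQ_MSIX | PCI_IRQ_AFFINITY, &affd);
	}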
