Message-ID: <83d3944f-8a31-eb31-93db-294906630b0e@grimberg.me>
Date: Thu, 2 Nov 2017 12:08:43 +0200
From: Sagi Grimberg <sagi@...mberg.me>
To: Tariq Toukan <tariqt@...lanox.com>,
Jes Sorensen <jsorensen@...com>,
Saeed Mahameed <saeedm@....mellanox.co.il>
Cc: Networking <netdev@...r.kernel.org>,
Leon Romanovsky <leonro@...lanox.com>,
Saeed Mahameed <saeedm@...lanox.com>,
Kernel Team <kernel-team@...com>,
Christoph Hellwig <hch@....de>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: mlx5 broken affinity
>>> I vaguely remember Nacking Sagi's patch as we knew it would break
>>> mlx5e netdev affinity assumptions.
> I remember that argument. Still the series found its way in.
Of course it made its way in: it was acked by three different
maintainers, and I addressed all of Saeed's comments.
> That series moves affinity decisions to kernel's responsibility.
> As far as I can see, the kernel assigns IRQs to the NUMA nodes one by
> one in increasing index order (starting with the cores of NUMA #0), no
> matter which NUMA node is closer to the NIC.
Well, as we said before, if there is a good argument to do the home node
first we can change the generic code (as it should be given that this is
absolutely not device specific).
> This means that if your NIC is on NUMA #1, and you reduce the number of
> channels, you might end up working only with the cores on the far NUMA.
> Not good!
We deliberated on this before, and concluded that application affinity
and device affinity are equally important. If you have a real use case
that shows otherwise, it's perfectly doable to start from the device home
node.
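
To make the two orderings concrete, here is a rough Python sketch of the behaviour as described in this thread. It is a toy model, not the actual kernel spreading code: the function name `spread_vectors`, the `home_node` parameter, and the two-node topology are all made up for illustration.

```python
def spread_vectors(nodes, nvecs, home_node=None):
    """Assign nvecs vectors to CPUs by walking NUMA nodes in index order.

    nodes:     dict mapping node id -> list of CPU ids (hypothetical topology)
    home_node: if given, start from that node and wrap around (assumption:
               this is what "start from the device home node" would mean)
    """
    order = sorted(nodes)
    if home_node is not None:
        i = order.index(home_node)
        order = order[i:] + order[:i]
    # Flatten the CPUs in node order, then hand them out one vector at a time.
    cpus = [cpu for node in order for cpu in nodes[node]]
    return [cpus[v % len(cpus)] for v in range(nvecs)]


# Hypothetical box: two nodes with four cores each, NIC attached to node 1.
topology = {0: [0, 1, 2, 3], 1: [4, 5, 6, 7]}

print(spread_vectors(topology, 4))               # [0, 1, 2, 3]
print(spread_vectors(topology, 4, home_node=1))  # [4, 5, 6, 7]
```

With four channels on this made-up box, the node-0-first walk lands every vector on the far node, which is the case Tariq is worried about, while starting from the home node keeps all four local.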
>>> And I agree here that user should be able to read
>>> /proc/irq/x/smp_affinity and even modify it if required.
> Totally agree. We should fix that ASAP.
> User must have write access.
I'll let Thomas reply here; I do not fully understand why
pci_alloc_irq_vectors() makes the affinity assignments immutable.
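
For anyone who wants to check what the kernel picked, a small sketch for decoding /proc/irq/x/smp_affinity follows. The file holds a hex CPU bitmask, split into comma-separated 32-bit groups on larger machines. The helper names `parse_smp_affinity` and `read_irq_affinity` are made up for this example, and the sketch covers reading only, since writing is the part under discussion here.

```python
def parse_smp_affinity(mask):
    """Turn a hex mask such as '00000000,000000f0' into a sorted CPU list."""
    value = int(mask.strip().replace(",", ""), 16)
    return [cpu for cpu in range(value.bit_length()) if (value >> cpu) & 1]


def read_irq_affinity(irq):
    """Read and decode the affinity mask of one IRQ (needs a Linux /proc)."""
    with open(f"/proc/irq/{irq}/smp_affinity") as f:
        return parse_smp_affinity(f.read())


print(parse_smp_affinity("000000f0"))           # [4, 5, 6, 7]
print(parse_smp_affinity("00000001,00000000"))  # [32]
```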