Message-ID: <4bae1d1d-d401-115d-91cc-4b7df88b02c5@grimberg.me>
Date: Mon, 13 Nov 2017 23:13:46 +0200
From: Sagi Grimberg <sagi@...mberg.me>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: Jens Axboe <axboe@...com>, Jes Sorensen <jsorensen@...com>,
Tariq Toukan <tariqt@...lanox.com>,
Saeed Mahameed <saeedm@....mellanox.co.il>,
Networking <netdev@...r.kernel.org>,
Leon Romanovsky <leonro@...lanox.com>,
Saeed Mahameed <saeedm@...lanox.com>,
Kernel Team <kernel-team@...com>,
Christoph Hellwig <hch@....de>
Subject: Re: [RFD] Managed interrupt affinities [ Was: mlx5 broken affinity ]
>> Do you know if any exist? Would it make sense to have a survey to
>> understand if anyone relies on it?
>>
>> From what I've seen so far, drivers that were converted simply worked
>> with the non-managed facility and didn't have any special code for it.
>> Perhaps Christoph can comment as he converted most of them.
>>
>> But if there aren't any drivers that absolutely rely on it, maybe it's
>> not a bad idea to allow it by default?
>
> Sure, I was just cautious and I have to admit that I have no insight into
> the driver side details.
Christoph, feel free to chime in :)
Should I construct an email list of the driver maintainers of the
converted drivers?
>>> * When and how is the driver informed about the change?
>>>
>>> When:
>>>
>>> #1 Before the core tries to move the interrupt, so the driver can
>>> veto the move if it cannot allocate new resources or whatever is
>>> required to operate after the move.
>>
>> What would the core do if a driver vetoes a move?
>
> Return the error code from write_affinity(), as it does with any other
> error that causes setting the affinity to fail.
OK, so this would mean that the driver queue no longer has a vector,
correct? So are the semantics that the driver needs to clean up its
resources, or should it expect another callout for that?
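
To pin down what I'm asking, a core-side flow along these lines is what
I imagine (irq_affinity_ops, pre_move and post_move are made-up names,
this is only a sketch of the semantics, not a proposal for the real
signatures):

#include <linux/cpumask.h>
#include <linux/irqdesc.h>

/* Hypothetical per-interrupt ops, names invented for this sketch */
struct irq_affinity_ops {
	int  (*pre_move)(unsigned int irq,
			 const struct cpumask *new_mask, void *priv);
	void (*post_move)(unsigned int irq,
			  const struct cpumask *new_mask, void *priv);
	void *priv;
};

static int irq_do_managed_move(struct irq_desc *desc,
			       const struct cpumask *new_mask,
			       const struct irq_affinity_ops *ops)
{
	int ret;

	if (ops && ops->pre_move) {
		ret = ops->pre_move(irq_desc_get_irq(desc), new_mask,
				    ops->priv);
		if (ret)
			return ret; /* veto propagates out of write_affinity() */
	}

	/* ... core performs the actual vector move here ... */

	if (ops && ops->post_move)
		ops->post_move(irq_desc_get_irq(desc), new_mask, ops->priv);
	return 0;
}

If the veto leaves the old vector in place, the driver keeps its
resources and nothing else is needed; if not, we'd need a separate
teardown callout.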
>> I'm wondering under what conditions a driver will be unable to allocate
>> resources for a move to cpu X but able to allocate for a move to cpu Y.
>
> Node affine memory allocation is the only thing which comes to my mind, or
> some decision not to have a gazillion queues on a single CPU.
Yeah, makes sense.
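
Concretely, for the node-affine case I'd expect the driver-side veto to
look roughly like this (mlx5_queue, its fields and the pre_move hook are
all hypothetical, just to illustrate the idea):

#include <linux/cpumask.h>
#include <linux/kernel.h>	/* swap() */
#include <linux/mm.h>		/* kvzalloc_node(), kvfree() */
#include <linux/topology.h>	/* cpu_to_node() */

struct mlx5_queue {		/* hypothetical, trimmed down */
	void *mem;
	size_t mem_size;
};

static int mlx5_irq_pre_move(unsigned int irq,
			     const struct cpumask *new_mask, void *priv)
{
	struct mlx5_queue *q = priv;
	int node = cpu_to_node(cpumask_first(new_mask));
	void *mem;

	/* Try to get queue memory on the new CPU's node first ... */
	mem = kvzalloc_node(q->mem_size, GFP_KERNEL, node);
	if (!mem)
		return -ENOMEM;	/* ... and veto the move if we can't */

	swap(q->mem, mem);
	kvfree(mem);		/* free the old, now-unused buffer */
	return 0;
}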
>> This looks like it can work to me, but I'm probably not familiar enough
>> to see the full picture here.
>
> On the interrupt core side this is workable, I just need the input from the
> driver^Wsubsystem side on whether this can be implemented sanely.
Can you explain what you mean by "subsystem"? I thought the subsystem
would be the irq subsystem (which means you are the one to provide the
needed input :) ) and that the driver would pass in something like
msi_irq_ops to pci_alloc_irq_vectors() if it supports the requirements
you listed, or NULL to tell the core to leave it alone and do what it
sees fit (or pass msi_irq_ops with a flag that means that). An ops
structure is a very common way for drivers to communicate with a
subsystem core.
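
Roughly what I have in mind (all names are made up, this is only a
sketch of the shape of the interface, not the actual signatures):

#include <linux/cpumask.h>

struct msi_irq_ops {
	/*
	 * Called before the core moves the vector; returning an error
	 * vetoes the move (see #1 above).
	 */
	int  (*pre_move)(unsigned int irq,
			 const struct cpumask *new_mask, void *priv);
	/* Called after the move so the driver can restart the queue. */
	void (*post_move)(unsigned int irq,
			  const struct cpumask *new_mask, void *priv);
	void *priv;
};

A driver that implements the requirements would then do something like
(pci_alloc_irq_vectors_ops() is invented for illustration; the real
entry point would be whatever the core grows for this):

	ret = pci_alloc_irq_vectors_ops(pdev, min_vecs, max_vecs,
					PCI_IRQ_MSIX, &my_ops);

and a driver that wants the current behavior would simply pass NULL.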