netdev - Re: [RFD] Managed interrupt affinities [ Was: mlx5 broken affinity ]

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <4bae1d1d-d401-115d-91cc-4b7df88b02c5@grimberg.me>
Date:   Mon, 13 Nov 2017 23:13:46 +0200
From:   Sagi Grimberg <sagi@...mberg.me>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     Jens Axboe <axboe@...com>, Jes Sorensen <jsorensen@...com>,
        Tariq Toukan <tariqt@...lanox.com>,
        Saeed Mahameed <saeedm@....mellanox.co.il>,
        Networking <netdev@...r.kernel.org>,
        Leon Romanovsky <leonro@...lanox.com>,
        Saeed Mahameed <saeedm@...lanox.com>,
        Kernel Team <kernel-team@...com>,
        Christoph Hellwig <hch@....de>
Subject: Re: [RFD] Managed interrupt affinities [ Was: mlx5 broken affinity ]


>> Do you know if any exist? Would it make sense to have a survey to
>> understand if anyone relies on it?
>>
>>  From what I've seen so far, drivers that were converted simply worked
>> with the non-managed facility and didn't have any special code for it.
>> Perhaps Christoph can comment as he convert most of them.
>>
>> But if there aren't any drivers that absolutely rely on it, maybe its
>> not a bad idea to allow it by default?
> 
> Sure, I was just cautious and I have to admit that I have no insight into
> the driver side details.

Christoph, feel free to chime in :)

Should I construct an email list of the driver maintainers of the
converted drivers?

>>>     * When and how is the driver informed about the change?
>>>
>>>        When:
>>>
>>>          #1 Before the core tries to move the interrupt so it can veto the
>>> 	  move if it cannot allocate new resources or whatever is required
>>> 	  to operate after the move.
>>
>> What would the core do if a driver veto a move?
> 
> Return the error code from write_affinity() as it does with any other error
> which fails to set the affinity.

OK, so this would mean that the driver queue no longer has a vector
correct? so is the semantics that it needs to cleanup its resources or
should it expect another callout for that?

>> I'm wandering in what conditions a driver will be unable to allocate
>> resources for move to cpu X but able to allocate for move to cpu Y.
> 
> Node affine memory allocation is the only thing which comes to my mind, or
> some decision not to have a gazillion of queues on a single CPU.

Yea, makes sense.

>> This looks like it can work to me, but I'm probably not familiar enough
>> to see the full picture here.
> 
> On the interrupt core side this is workable, I just need the input from the
> driver^Wsubsystem side if this can be implemented sanely.

Can you explain what do you mean by "subsystem"? I thought that the
subsystem would be the irq subsystem (which means you are the one to 
provide the needed input :) ) and the driver would pass in something
like msi_irq_ops to pci_alloc_irq_vectors() if it supports the driver
requirements that you listed and NULL to tell the core to leave it alone
and do what it sees fit (or pass msi_irq_ops with flag that means that).

ops structure is a very common way for drivers to communicate with a
subsystem core.