Date:   Fri, 10 Jan 2020 20:43:14 +0100
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Ming Lei <ming.lei@...hat.com>
Cc:     Peter Xu <peterx@...hat.com>, Juri Lelli <juri.lelli@...hat.com>,
        Ming Lei <minlei@...hat.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linux-block@...r.kernel.org
Subject: Re: Kernel-managed IRQ affinity (cont)

Ming,

Ming Lei <ming.lei@...hat.com> writes:
> On Thu, Jan 09, 2020 at 09:02:20PM +0100, Thomas Gleixner wrote:
>> Ming Lei <ming.lei@...hat.com> writes:
>>
>> This is duct tape engineering with absolutely no semantics. I can't even
>> figure out the intent of this 'managed_irq' parameter.
>
> The intent is to isolate the specified CPUs from handling managed
> interrupts.

That's what I figured, but it still does not provide semantics and only
works for specific cases.

> We can do that. The big problem is that in the RT case we can't
> guarantee that IO won't be submitted from isolated CPUs. blk-mq's
> queue mapping relies on the setup affinity, so unknown behavior
> (kernel crash, IO hang, or other issues) may result if we exclude
> isolated CPUs from the interrupt affinity.
>
> That is why I tried to exclude isolated CPUs from the interrupt's
> effective affinity; it turns out that approach is simple and doable.

Yes, it's doable. But it still is inconsistent behaviour. Assume the
following configuration:

  8 CPUs, with CPU0 and CPU1 assigned for housekeeping (CPU2-7 isolated)

With 8 queues the proposed change does nothing because each queue is
mapped to exactly one CPU.

With 4 queues you get the following:

 CPU0,1       queue 0
 CPU2,3       queue 1
 CPU4,5       queue 2
 CPU6,7       queue 3

No effect on the isolated CPUs either.

With 2 queues you get the following:

 CPU0,1,2,3   queue 0
 CPU4,5,6,7   queue 1

So here the isolated CPUs 2 and 3 get the isolation, because queue 0's
effective affinity can be restricted to the housekeeping CPUs 0 and 1,
but CPUs 4-7 do not, because queue 1's mask contains no housekeeping CPU
the interrupt could be steered to. That's perhaps intended, but
definitely not documented.

So you really need to make your mind up and describe what the intended
effect of this is and why you think that the result is correct.

Thanks,

       tglx
