Message-ID: <CAHtQpK5FLJn5qHKDbDqB00W9zthybEz4-DDMEtyPF12thEcxBA@mail.gmail.com>
Date:   Wed, 6 Dec 2017 17:59:01 +0100
From:   Andrey Zhizhikin <andrey.z@...il.com>
To:     Greg KH <gregkh@...uxfoundation.org>
Cc:     linux-kernel@...r.kernel.org
Subject: Re: [PATCH] uio: Allow to take irq bottom-half into irq_handler with
 additional dt-binding

Hello Greg,

Thanks a lot for your prompt reply!

First off, this is my first patch submission to the kernel, so thanks a
lot for your additional guidance here regarding the missing pieces.
Please don't judge me too harshly here. :)

I will add the new DT binding to Documentation and contact the DT
maintainers to have it discussed. However, I was not able to find any
dt-binding documentation for uio drivers in the kernel, so presumably I
would have to create a new entry there...
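
Just to make this concrete, a minimal entry could look roughly like the
text below; the file location and exact wording are only my assumption
at this point, since there is no existing uio binding document to extend:

Documentation/devicetree/bindings/uio/uio_pdrv_genirq.txt (hypothetical)

Optional properties:
- no-threaded-irq : boolean; if present, the generic UIO IRQ handler is
  requested with IRQF_NO_THREAD, so the (small) bottom half runs in
  hard-IRQ context instead of in a threaded handler.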


As for the latency gain, and for running the patch on a system that has
all IRQs in threaded mode:

This patch actually originated on a PREEMPT_RT kernel configuration,
where all IRQs are threaded. I have an ARM-based system running around
20 genirq UIO instances, and I demoted 2 of those from threaded to
non-threaded IRQ handlers without any issues observed on any of the IRQ
handlers. The patch is written exactly with that logic in mind: if the
new property is not found, system behavior is not amended and all IRQs
stay threaded. If needed, a developer can enable this property for
their node, but then they should be well aware of the property's
implications.
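
To make the intended behavior concrete, here is a rough sketch of the
probe-side handling (not the exact hunk from the patch; the helper name
is made up for illustration):

/*
 * Sketch only: the helper name is hypothetical. The point is that the
 * absence of "no-threaded-irq" leaves irq_flags untouched, so the
 * default (possibly threaded) behavior is preserved.
 */
#include <linux/of.h>
#include <linux/interrupt.h>
#include <linux/uio_driver.h>

static void uio_genirq_apply_irq_mode(struct device_node *np,
				      struct uio_info *uioinfo)
{
	if (of_property_read_bool(np, "no-threaded-irq")) {
		/*
		 * Run the top and bottom half back-to-back in hard-IRQ
		 * context instead of waking a separate irq thread.
		 */
		uioinfo->irq_flags |= IRQF_NO_THREAD;
	}
	/* Property absent: nothing changes, the IRQ may stay threaded. */
}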

On average, analyzing with ftrace and kernelshark, I observed a gain of
20-30 usec per chain: irq_handler_entry -> irq/XX-uio -> <user-space
IRQ part>, so I would say the gain is not very significant for an
average user-space task. However, IMHO there are several hidden
benefits to this modification, namely:
- It eliminates a few re-scheduling operations, making the INT ACK code
more robust, and it relieves pressure on the scheduler when the HW
interrupt for this IRQ is signaled at a high enough frequency;
- It makes the top and bottom half execute back-to-back with IRQs off,
making the operation pseudo-atomic;
- The above gain might be significant on systems whose average latency
is of a comparable magnitude.

The worst-case latency measured with cyclictest here is 50 usec, so as
a developer I would consider the above gain worth having in my
system. :)

Please let me know what you think of those points and whether they make
sense to you - otherwise I can drop this patch.

-- 
Regards,
Andrey.

On Wed, Dec 6, 2017 at 4:31 PM, Greg KH <gregkh@...uxfoundation.org> wrote:
> On Wed, Dec 06, 2017 at 03:55:40PM +0100, Andrey Zhizhikin wrote:
>> Certain kernel preemption models use threaded interrupt handlers,
>> which is in general quite beneficial. However, threaded handlers
>> introduce additional scheduler overhead, since the bottom-half thread
>> has to be woken up and scheduled for execution. This can result in
>> additional latency, which in certain cases is not desired.
>>
>> A UIO driver with the generic IRQ handler that wraps a HW block might
>> suffer a small degradation when its bottom half is executed, since the
>> bottom half has to be woken up by the scheduler every time an INT is
>> delivered. For high-rate INT signals, this also brings additional
>> undesired load on the scheduler itself.
>>
>> Since the actual ACK is performed in the top half, and the bottom half
>> of the UIO driver with the generic IRQ handler is relatively
>> lightweight (only a flag is set upon INT reception), it might be
>> beneficial to move this bottom half into the irq_handler itself,
>> rather than have a separate thread service it.
>>
>> This patch aims to address the task above, and in addition introduces
>> a new dt-binding which can be configured on a per-node basis. That
>> way, developers utilizing the UIO driver can decide which UIO
>> instances are critical in terms of interrupt processing and move their
>> corresponding bottom halves into the irq_handler to fight the
>> additional scheduling latency.
>>
>> New DT binding:
>> - no-threaded-irq: when present, request_irq() is called with the
>>   IRQF_NO_THREAD flag set, effectively skipping the threaded interrupt
>>   handler and executing the bottom half in the irq_handler
>>
>> Signed-off-by: Andrey Zhizhikin <andrey.z@...il.com>
>
> For new DT bindings, don't you have to add them to the in-kernel
> documentation and get an ack from the DT maintainers?  Please do that
> here.
>
> Also, how much does this really save in latency/delay by not allowing a
> threaded irq?  What about systems that run all irqs in threaded mode?
> Will that break something here?
>
> thanks,
>
> greg k-h
