Message-ID: <6b88719c-782a-4a63-db9f-bf62734a7874@linaro.org>
Date:   Tue, 3 Sep 2019 08:40:35 +0200
From:   Daniel Lezcano <daniel.lezcano@...aro.org>
To:     Ming Lei <ming.lei@...hat.com>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        LKML <linux-kernel@...r.kernel.org>,
        Long Li <longli@...rosoft.com>, Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Keith Busch <keith.busch@...el.com>, Jens Axboe <axboe@...com>,
        Christoph Hellwig <hch@....de>,
        Sagi Grimberg <sagi@...mberg.me>,
        John Garry <john.garry@...wei.com>,
        Hannes Reinecke <hare@...e.com>,
        linux-nvme@...ts.infradead.org, linux-scsi@...r.kernel.org
Subject: Re: [PATCH 1/4] softirq: implement IRQ flood detection mechanism

On 03/09/2019 08:31, Ming Lei wrote:
> Hi Daniel,
> 
> On Tue, Sep 03, 2019 at 07:59:39AM +0200, Daniel Lezcano wrote:
>>
>> Hi Ming Lei,
>>
>> On 03/09/2019 05:30, Ming Lei wrote:
>>
>> [ ... ]
>>
>>
>>>>> 2) irq/timing doesn't cover softirq
>>>>
>>>> That's solvable, right?
>>>
>>> Yeah, we can extend irq/timing, but that would be ugly, since irq/timing
>>> focuses on hardirq prediction, and softirq isn't involved in that
>>> purpose.
>>>
>>>>  
>>>>> Daniel, could you take a look and see if irq flood detection can be
>>>>> implemented easily by irq/timing.c?
>>>>
>>>> I assume you can take a look as well, right?
>>>
>>> Yeah, I have looked at the code for a while, but I think that irq/timing
>>> would become unnecessarily complicated if it had to cover irq flood
>>> detection, and it is also much less efficient at detecting an IRQ flood.
>>
>> In the series, there is nothing rigorously describing the problem (I can
>> only guess) or why the proposed solution solves it.
>>
>> What is your definition of an 'irq flood'? A high irq load? An irq
>> arriving while we are processing the previous one in the bottom halves?
> 
> So far, it means that handling interrupts & softirqs takes all the
> utilization of one CPU, so processes basically can't run on this CPU, and
> usually some sort of CPU lockup warning will be triggered.

Is it a scheduler problem then?

>> The patch 2/4 description says "however IO completion is only done on
>> one of these submission CPU cores". That describes the bottleneck and
>> then the patch says "Add IRQF_RESCUE_THREAD to create one interrupt
>> thread handler", what is the rationale linking the bottleneck (problem)
>> and the irqf_rescue_thread (solution)?
> 
> The solution is to switch to handling this interrupt in the context of the
> created rescue irq thread when an irq flood is detected, and 'this
> interrupt' means the interrupt requested with IRQF_RESCUE_THREAD.
> 
>>
>> Is it really the solution to track the irq timings to detect a flood?
> 
> The solution tracks the time spent running do_IRQ() for each CPU.
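
To make the idea above concrete, here is a minimal user-space model of
per-CPU interrupt-time tracking. It is only a sketch and is not taken from
the patch series: the class name, the exponentially weighted moving average,
and the 0.25 weight and 0.9 threshold are all assumptions chosen for
illustration.

```python
from dataclasses import dataclass


@dataclass
class CpuIrqLoad:
    """Hypothetical per-CPU accounting of time spent handling interrupts."""

    weight: float = 0.25    # share given to the newest sample (assumed value)
    threshold: float = 0.9  # smoothed load treated as a flood (assumed value)
    load: float = 0.0       # smoothed fraction of time spent in IRQ handling

    def account(self, irq_ns: int, window_ns: int) -> None:
        """Fold one sampling window into the smoothed load."""
        sample = min(irq_ns / window_ns, 1.0)
        # Exponentially weighted moving average: move the smoothed value
        # toward the newest sample by a fixed fraction of the difference.
        self.load += self.weight * (sample - self.load)

    def flooded(self) -> bool:
        """True once interrupt handling dominates this CPU's time."""
        return self.load >= self.threshold
```

In this model a single saturated window is not enough to report a flood;
the load has to stay high over several consecutive windows, which matches
the definition given earlier (interrupt and softirq handling taking all the
utilization of one CPU).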




-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs
