Message-ID: <DF4PR84MB01696549697831B4CEC4A3FDAB150@DF4PR84MB0169.NAMPRD84.PROD.OUTLOOK.COM>
Date:   Thu, 18 Aug 2016 21:08:18 +0000
From:   "Elliott, Robert (Persistent Memory)" <elliott@....com>
To:     Sreekanth Reddy <sreekanth.reddy@...adcom.com>,
        "linux-scsi@...r.kernel.org" <linux-scsi@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "irqbalance@...ts.infradead.org" <irqbalance@...ts.infradead.org>
CC:     Kashyap Desai <kashyap.desai@...adcom.com>,
        Sathya Prakash Veerichetty <sathya.prakash@...adcom.com>,
        Chaitra Basappa <chaitra.basappa@...adcom.com>,
        Suganath Prabu Subramani 
        <suganath-prabu.subramani@...adcom.com>
Subject: RE: Observing Softlockup's while running heavy IOs



> -----Original Message-----
> From: linux-kernel-owner@...r.kernel.org [mailto:linux-kernel-
> owner@...r.kernel.org] On Behalf Of Sreekanth Reddy
> Sent: Thursday, August 18, 2016 12:56 AM
> Subject: Observing Softlockup's while running heavy IOs
> 
> Problem statement:
> Observing softlockups while running heavy IOs on 8 SSD drives
> connected behind our LSI SAS 3004 HBA.
> 
...
> Observing a loop in the IO path, i.e., only one CPU is busy
> processing the interrupts while the other CPUs (in the affinity_hint
> mask) are busy sending the IOs (these CPUs are not receiving any
> interrupts at all). For example, only CPU6 is busy processing the
> interrupts from IRQ 219, and the remaining CPUs, i.e., CPUs 7, 8, 9,
> 10 & 11, are just busy pumping the IOs and never process any IO
> interrupts from IRQ 219. So we are observing softlockups due to
> the existence of this loop in the IO path.
> 
> We might not observe these softlockups if irqbalancer had
> balanced the interrupts among the CPUs enabled in the particular
> IRQ's affinity_hint mask, so that all the CPUs are equally busy
> sending IOs and processing the interrupts. I am not sure how
> irqbalancer balances the load among the CPUs, but here I see that
> only one CPU from the IRQ's affinity_hint mask is busy with
> interrupts and the remaining CPUs don't receive any interrupts
> from this IRQ.
> 
> Please help me with any suggestions/recommendations to solve/limit
> these kinds of softlockups. Also please let me know if I have missed
> any setting in irqbalance.
> 

The CPUs need to be forced to self-throttle by processing the interrupts
for their own submissions, which reduces the time they have left to
submit more IOs.
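
As a rough illustration of why that helps, here is a toy model in
Python. It is only a sketch under made-up assumptions, not kernel
behavior: every name and number in it is hypothetical, the device is
treated as instant, and CPU 0 stands in for the one CPU that takes all
the interrupts (CPU6 in the report above).

```python
def simulate(n_cpus, ticks, budget, submit_cost, complete_cost, self_throttle):
    """Return the total completion backlog after each tick (toy model).

    Each CPU has `budget` time units per tick. Completion (interrupt)
    work runs first; whatever time is left is spent submitting new IOs.
    """
    backlog = [0] * n_cpus  # completions waiting to be processed, per CPU
    history = []
    for _ in range(ticks):
        submitted = [0] * n_cpus
        for cpu in range(n_cpus):
            # Interrupt work preempts submission.
            done = min(backlog[cpu], budget // complete_cost)
            backlog[cpu] -= done
            left = budget - done * complete_cost
            submitted[cpu] = left // submit_cost
        for cpu, n in enumerate(submitted):
            # Either complete on the submitting CPU, or steer it all to CPU 0.
            target = cpu if self_throttle else 0
            backlog[target] += n
        history.append(sum(backlog))
    return history


if __name__ == "__main__":
    steered = simulate(6, 5, 100, 1, 1, self_throttle=False)
    local = simulate(6, 5, 100, 1, 1, self_throttle=True)
    print("all completions on one CPU:", steered)
    print("completions on submitter:  ", local)
```

In this model the backlog on the single interrupt CPU keeps growing
because the other CPUs never lose any submission time, while with
per-submitter completions the backlog stays bounded.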

See https://lkml.org/lkml/2014/9/9/931 for discussion of this
problem when blk-mq was added.
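
To check how one IRQ's interrupts are actually spread across CPUs, the
per-CPU counters in /proc/interrupts can be parsed. The following is a
minimal sketch; the helper names and the sample text are made up for
illustration (the sample is not data from this thread), and it assumes
the usual layout of a CPU header row followed by "IRQ: count count ...".

```python
def irq_counts(interrupts_text, irq):
    """Return {cpu_name: count} for one IRQ from /proc/interrupts text."""
    lines = interrupts_text.splitlines()
    cpus = lines[0].split()  # header row: CPU0 CPU1 ...
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        if label.strip() == str(irq):
            # zip() stops after the per-CPU columns, skipping trailing names.
            return {cpu: int(n) for cpu, n in zip(cpus, rest.split())}
    raise KeyError(f"IRQ {irq} not found")


def busiest_share(counts):
    """Fraction of the IRQ's interrupts handled by its busiest CPU."""
    total = sum(counts.values())
    return max(counts.values()) / total if total else 0.0


# Made-up sample in the /proc/interrupts layout.
SAMPLE = """\
           CPU0       CPU1       CPU2
 219:     90000          0          0   PCI-MSI-edge      example-msix0
 220:       100        120        110   PCI-MSI-edge      example-msix1
"""

if __name__ == "__main__":
    counts = irq_counts(SAMPLE, 219)
    print(counts, busiest_share(counts))
```

On a live system the text would come from reading /proc/interrupts; a
share close to 1.0 means a single CPU is handling nearly all of that
IRQ's interrupts.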


---
Robert Elliott, HPE Persistent Memory


