Date:	Tue, 7 Feb 2012 18:25:18 +0200
From:	Gilad Ben-Yossef <gilad@...yossef.com>
To:	Jan Kara <jack@...e.cz>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	"Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com>,
	linux-fsdevel@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>,
	hare@...e.de, Al Viro <viro@...iv.linux.org.uk>,
	Christoph Hellwig <hch@...radead.org>
Subject: Re: [PATCH] vfs: Avoid IPI storm due to bh LRU invalidation

On Tue, Feb 7, 2012 at 12:25 AM, Jan Kara <jack@...e.cz> wrote:
> On Mon 06-02-12 13:17:17, Andrew Morton wrote:
>> On Mon, 6 Feb 2012 17:47:32 +0100
>> Jan Kara <jack@...e.cz> wrote:
>>
>> > On Mon 06-02-12 21:12:36, Srivatsa S. Bhat wrote:
>> > > On 02/06/2012 07:25 PM, Jan Kara wrote:
>> > >
>> > > > When discovery of lots of disks happen in parallel, we call
>> > > > invalidate_bh_lrus() once for each disk from partitioning code resulting in a
>> > > > storm of IPIs and causing a softlockup detection to fire (it takes several
>> > > > *minutes* for a machine to execute all the invalidate_bh_lrus() calls).
>>
>> Gad.  How many disks are we talking about here?
>  I think something around hundred scsi disks in this case (number of
> physical drives is actually lower but multipathing blows it up). I actually
> saw machines with close to thousand scsi disks (yes, they had names like
> sdabc ;).

LOL. Is that a huge SCSI disk array in your server, or are you just
happy to see me...? :-)
>
...
>> > >
>> > > Something related that you might be interested in:
>> > > https://lkml.org/lkml/2012/2/5/109
>> > >
>> > > (This is part of Gilad's patchset that tries to reduce cross-CPU IPI
>> > > interference.)
>> >   Thanks for the pointer. I didn't know about it. As Hannes wrote, this
>> > need not be enough for our use case as there might indeed be some bhs in
>> > the LRU. But I'd be interested how well the patchset works anyway. Maybe it
>> > would be enough because after all when we invalidate LRUs subsequent
>> > callers will see them empty and not issue IPI? Hannes, can you give a try
>> > to the patches?

I think it's worth a shot, since the mutex just delays the IPIs instead
of canceling them altogether.

A somewhat similar issue in the direct reclaim path of the buddy
allocator, which tries to reclaim per-CPU pages, was causing a massive
storm of IPIs during OOM with concurrent workloads. The IPI noise
patches mitigate 85% of the IPIs sent just by checking whether there
are any per-CPU pages on the CPU you are about to IPI, so maybe the
same kind of logic applies here as well.
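
To make the "check before you IPI" idea concrete, here is a rough
userspace sketch in plain C. It is not kernel code and does not
reproduce the patchset: struct cpu_cache, cpu_has_work(),
build_ipi_mask() and flush_selected() are hypothetical names made up
for illustration, and the "IPI" is just a counter. The point is only
that CPUs whose per-CPU cache is already empty are left out of the
mask, so they are never interrupted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 8 /* hypothetical CPU count for this sketch */

/* Hypothetical stand-in for per-CPU state (e.g. a bh LRU or pcp list). */
struct cpu_cache {
	size_t nr_entries;
};

/* Cheap predicate: does this CPU hold anything worth flushing? */
static bool cpu_has_work(const struct cpu_cache *c)
{
	return c->nr_entries != 0;
}

/* Build a bitmask containing only the CPUs that actually need a flush. */
static unsigned long build_ipi_mask(const struct cpu_cache caches[],
				    size_t ncpus)
{
	unsigned long mask = 0;

	assert(ncpus <= NR_CPUS);
	for (size_t cpu = 0; cpu < ncpus; cpu++)
		if (cpu_has_work(&caches[cpu]))
			mask |= 1UL << cpu;
	return mask;
}

/*
 * Simulate the cross-CPU call: empty the cache of every CPU in the mask
 * and return how many "IPIs" that took.
 */
static size_t flush_selected(struct cpu_cache caches[], size_t ncpus,
			     unsigned long mask)
{
	size_t sent = 0;

	for (size_t cpu = 0; cpu < ncpus; cpu++) {
		if (mask & (1UL << cpu)) {
			caches[cpu].nr_entries = 0;
			sent++;
		}
	}
	return sent;
}
```

With only two of eight caches populated, the first flush touches two
CPUs, and a second caller arriving right after finds an empty mask and
interrupts nobody, which is the effect Jan was hoping for above.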

Thanks,
Gilad

-- 
Gilad Ben-Yossef
Chief Coffee Drinker
gilad@...yossef.com
Israel Cell: +972-52-8260388
US Cell: +1-973-8260388
http://benyossef.com

"If you take a class in large-scale robotics, can you end up in a
situation where the homework eats your dog?"
 -- Jean-Baptiste Queru