Date:	Fri, 15 Jul 2011 16:43:44 -0700
From:	ersatz splatt <ersatzsplatt@...il.com>
To:	Roland Dreier <roland@...estorage.com>
Cc:	Matthew Wilcox <matthew@....cx>, Jens Axboe <axboe@...nel.dk>,
	"Jiang, Dave" <dave.jiang@...el.com>,
	"Williams, Dan J" <dan.j.williams@...el.com>,
	"Foong, Annie" <annie.foong@...el.com>,
	"linux-scsi@...r.kernel.org" <linux-scsi@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"Nadolski, Edmund" <edmund.nadolski@...el.com>,
	"Skirvin, Jeffrey D" <jeffrey.d.skirvin@...el.com>
Subject: Re: rq_affinity doesn't seem to work?

On Thu, Jul 14, 2011 at 10:02 AM, Roland Dreier <roland@...estorage.com> wrote:

> The problem as we've seen it is that on a dual-socket Westmere (Xeon
> 56xx) system, we have two sockets with 6 cores (12 threads) each, all
> sharing L3 cache, and so we end up with all block softirqs on only 2
> out of 24 threads, which is not enough to handle all the IOPS that
> fast storage can provide.
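
For anyone following along, here is my rough paraphrase of the steering
logic in block/blk-softirq.c that produces this behaviour (simplified,
not the literal source; with CONFIG_SCHED_MC, cpu_coregroup_mask() spans
the whole shared-L3 package):

/*
 * Rough paraphrase of the completion steering in block/blk-softirq.c
 * (2.6.39-era) -- simplified sketch, not the literal kernel source.
 */
#include <linux/smp.h>
#include <linux/cpumask.h>
#include <linux/topology.h>

/* Map a CPU to its "completion group" leader: the first CPU of the set
 * sharing the last-level cache.  On Westmere with CONFIG_SCHED_MC that
 * set is the whole socket. */
static int blk_cpu_to_group(int cpu)
{
	return cpumask_first(cpu_coregroup_mask(cpu));
}

/* Decide where to run the completion softirq for a request submitted on
 * submit_cpu (req->cpu).  Cross-group completions are all funnelled to
 * the submitter's group leader, so a two-socket box ends up with only
 * two CPUs doing block softirq work. */
static int completion_cpu(int submit_cpu)
{
	int cpu = smp_processor_id();

	if (submit_cpu == -1)
		return cpu;				/* rq_affinity off: finish here */
	if (blk_cpu_to_group(submit_cpu) == blk_cpu_to_group(cpu))
		return cpu;				/* same cache group: finish here */
	return blk_cpu_to_group(submit_cpu);		/* else IPI the group leader */
}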

I have a dual-socket system with the Tylersburg chipset (roughly the
Westmere generation, I gather).  With two Xeon X5660 packages, I see the
following per-CPU utilization when driving more IOPS potential than the
system can handle:

02:15:00 PM  CPU   %usr  %nice    %sys %iowait   %irq   %soft  %steal  %guest   %idle
02:15:02 PM  all   2.76   0.00   30.40   28.28   0.00   13.74    0.00    0.00   24.81
02:15:02 PM    0   0.00   0.00    0.00    0.00   0.00  100.00    0.00    0.00    0.00
02:15:02 PM    1   0.00   0.00    0.50    0.00   0.00    0.00    0.00    0.00   99.50
02:15:02 PM    2   3.02   0.00   36.68   52.26   0.00    8.04    0.00    0.00    0.00
02:15:02 PM    3   2.50   0.00   36.00   54.50   0.00    7.00    0.00    0.00    0.00
02:15:02 PM    4   5.47   0.00   64.18   18.91   0.00   11.44    0.00    0.00    0.00
02:15:02 PM    5   3.02   0.00   37.69   53.27   0.00    6.03    0.00    0.00    0.00
02:15:02 PM    6   0.00   0.00    0.50    0.00   0.00   91.54    0.00    0.00    7.96
02:15:02 PM    7   0.00   0.00    0.00    0.00   0.00    0.00    0.00    0.00  100.00
02:15:02 PM    8   3.00   0.00   35.50   55.00   0.00    6.50    0.00    0.00    0.00
02:15:02 PM    9   3.02   0.00   39.70   50.25   0.00    7.04    0.00    0.00    0.00
02:15:02 PM   10   3.50   0.00   36.50   53.00   0.00    7.00    0.00    0.00    0.00
02:15:02 PM   11   6.53   0.00   70.85    9.05   0.00   13.57    0.00    0.00    0.00
02:15:02 PM   12   0.00   0.00    0.57    0.00   0.00    0.00    0.00    0.00   99.43
02:15:02 PM   13   3.00   0.00    0.00    0.00   0.00    0.00    0.00    0.00   97.00
02:15:02 PM   14   2.50   0.00   36.50   54.00   0.00    7.00    0.00    0.00    0.00
02:15:02 PM   15   3.52   0.00   36.18   53.77   0.00    6.53    0.00    0.00    0.00
02:15:02 PM   16   5.00   0.00   64.00   21.00   0.00   10.00    0.00    0.00    0.00
02:15:02 PM   17   3.02   0.00   37.19   52.76   0.00    7.04    0.00    0.00    0.00
02:15:02 PM   18   0.00   0.00    0.00    0.00   0.00    0.00    0.00    0.00  100.00
02:15:02 PM   19   0.00   0.00    1.01    0.00   0.00    0.00    0.00    0.00   98.99
02:15:02 PM   20   3.48   0.00   38.31   52.24   0.00    5.97    0.00    0.00    0.00
02:15:02 PM   21   5.50   0.00   63.00   18.50   0.00   13.00    0.00    0.00    0.00
02:15:02 PM   22   2.50   0.00   35.00   54.50   0.00    8.00    0.00    0.00    0.00
02:15:02 PM   23   5.03   0.00   58.79   23.62   0.00   12.56    0.00    0.00    0.00

By "more IOPS potential than the system can handle", I mean that with
about a quarter of the targets I get the same figure.  The HBA is
known to handle more than twice the IOPS I'm seeing.

I'm using 16 targets, with fio driving one target from each core on which
you see %sys activity.  You can see that two additional cores -- 0 and
6 -- are getting weighed down, almost entirely in %soft.  Is that
indicative of the bottleneck?
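
As a sanity check on that theory, here is a quick throwaway program
(illustrative only; it assumes cache index3 is the L3, which should hold
on Westmere) that prints which CPU each core's completions would collapse
onto under a "first CPU sharing the LLC" rule.  If the theory is right,
it should print 0 for one socket's CPUs and 6 for the other's:

/*
 * Illustrative only: for each online CPU, print the first CPU that
 * shares its last-level cache (assumed to be exported as cache index3).
 * Under a "steer completions to the first CPU of the shared-LLC group"
 * rule, that first CPU is where the group's block softirqs end up.
 */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
	for (int cpu = 0; ; cpu++) {
		char path[128], list[256];
		FILE *f;

		snprintf(path, sizeof(path),
			 "/sys/devices/system/cpu/cpu%d/cache/index3/shared_cpu_list",
			 cpu);
		f = fopen(path, "r");
		if (!f)
			break;		/* no more CPUs, or no L3 info exported */
		if (!fgets(list, sizeof(list), f)) {
			fclose(f);
			break;
		}
		fclose(f);

		/* shared_cpu_list looks like "0-5,12-17"; the group leader
		 * is simply the first number in it. */
		printf("cpu%d -> completion cpu %d\n", cpu, atoi(list));
	}
	return 0;
}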

These results are without any of the patches suggested in this e-mail
thread.  I'll have to try them and see if they help.

What is the maximum IOPS figure I should hope for from this system
running the Linux kernel?
Dave Jiang (or anyone else) -- can you share the max IOPS that you are seeing?


> It's not clear to me what the right answer or tradeoffs are here.  It
> might make sense to use only one hyperthread per core for block
> softirqs.  As I understand the Westmere cache topology, there's not
> really an obvious intermediate step -- all the cores in a package
> share the L3, and then each core has its own L2.
>
> Limiting softirqs to 10% of a core seems a bit low, since we seem to
> be able to use more than 100% of a core handling block softirqs, and
> anyway magic numbers like that seem to always be wrong sometimes.
> Perhaps we could use the queue length on the destination CPU as a
> proxy for how busy ksoftirq is?
>
>  - R.
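
Purely as a sketch of that last idea (hypothetical names and threshold,
nothing like this exists in the kernel today): keep a per-CPU count of
pending block completions and fall back to completing locally once the
preferred CPU's backlog gets long:

#include <linux/percpu.h>

#define BLK_DONE_BACKLOG_MAX	32	/* hypothetical threshold */

/* Hypothetical per-CPU count of completions queued for BLOCK_SOFTIRQ. */
static DEFINE_PER_CPU(unsigned int, blk_done_backlog);

/*
 * Pick where to run the completion: prefer the cache-local CPU, but if
 * its backlog of pending completions is already long (a proxy for how
 * busy its ksoftirqd is), complete on the submitting CPU instead.
 */
static int pick_completion_cpu(int preferred_cpu, int local_cpu)
{
	if (per_cpu(blk_done_backlog, preferred_cpu) > BLK_DONE_BACKLOG_MAX)
		return local_cpu;
	return preferred_cpu;
}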
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
