Date:	Wed, 05 Oct 2011 17:13:36 +0900
From:	"Jun'ichi Nomura" <j-nomura@...jp.nec.com>
To:	Lukas Hejtmanek <xhejtman@....muni.cz>
CC:	Mike Snitzer <snitzer@...hat.com>,
	Kiyoshi Ueda <k-ueda@...jp.nec.com>, agk@...hat.com,
	linux-kernel@...r.kernel.org
Subject: Re: request based device mapper in Linux

Hi Lukas,

On 09/30/11 05:57, Lukas Hejtmanek wrote:
> On Mon, Sep 19, 2011 at 02:50:08PM +0900, Jun'ichi Nomura wrote:
>>   2.6.32.36
>>     no-multipath:        2.8 GB/s
>>     multipath:           600 MB/s
>>     multipath w/ patch:  2.8-2.9 GB/s
>>
>>   3.0.3
>>     no-multipath:        ??
>>     multipath:           2.5 GB/s
>>     multipath w/ patch:  2.5 GB/s(?)
>>
>> Have you tried 3.0.3 without multipath?
> 
> yes, 3GB/s and only kswapd0 and kswapd1 are running, no kworker or ksoftirqd..

Hmm.. did you find any difference in your profile this time?

I'm trying to reproduce it myself but have had no success so far
(perhaps the disks on my test machine are not fast enough to saturate the CPU).

Since ksoftirqd showing up in top implies that your CPU4 is handling too many
I/O completions, setting 'rq_affinity = 2' for both the dm and SCSI devices
might be a solution. It distributes block completion softirqs to the
submitting CPUs and possibly reduces the load on the first CPU in the socket.
(See the commit below. It's a new feature in 3.1 and is not available in 3.0.)

  commit 5757a6d76cdf6dda2a492c09b985c015e86779b1
  Author: Dan Williams <dan.j.williams@...el.com>
  Date:   Sat Jul 23 20:44:25 2011 +0200

    block: strict rq_affinity
    
    Some systems benefit from completions always being steered to the strict
    requester cpu rather than the looser "per-socket" steering that
    blk_cpu_to_group() attempts by default. This is because the first
    CPU in the group mask ends up being completely overloaded with work,
    while the others (including the original submitter) has power left
    to spare.
    
    Allow the strict mode to be set by writing '2' to the sysfs control
    file. This is identical to the scheme used for the nomerges file,
    where '2' is a more aggressive setting than just being turned on.
    
    echo 2 > /sys/block/<bdev>/queue/rq_affinity
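
To apply the setting to both the dm device and the devices underneath it,
something like the sketch below might do. It assumes the usual sysfs layout
in which /sys/block/<dm-dev>/slaves/ links to the underlying path devices.
The function name set_strict_affinity, the SYSFS_ROOT variable and the
device name dm-0 are made up for illustration; they are not part of the
kernel or of any existing tool.

```shell
#!/bin/sh
# Sketch only: write '2' to rq_affinity for a dm device and for each of its
# underlying (slave) devices.
# SYSFS_ROOT is a hypothetical override so the function can be tried against
# a fake directory tree; on a real system it stays at /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

set_strict_affinity() {
    dev="$1"    # dm device name, e.g. dm-0 (hypothetical)
    for q in "$SYSFS_ROOT/block/$dev/queue/rq_affinity" \
             "$SYSFS_ROOT/block/$dev"/slaves/*/queue/rq_affinity; do
        # Skip unmatched globs and files that are missing or not writable
        [ -w "$q" ] || continue
        echo 2 > "$q"
        # Read the value back so the result can be checked by eye
        printf '%s = %s\n' "$q" "$(cat "$q")"
    done
}
```

Run as root, e.g. "set_strict_affinity dm-0", with the name of the multipath
device in place of dm-0. The setting does not persist across reboots.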

Thanks,
-- 
Jun'ichi Nomura, NEC Corporation