Date:	Thu, 09 Jan 2014 14:54:10 -0500
From:	Douglas Gilbert <dgilbert@...erlog.com>
To:	Sergey Meirovich <rathamahata@...il.com>,
	James Smart <james.smart@...lex.com>
CC:	Jan Kara <jack@...e.cz>, linux-scsi <linux-scsi@...r.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Gluk <git.user@...il.com>
Subject: Re: Terrible performance of sequential O_DIRECT 4k writes in SAN
 environment. ~3 times slower than Solaris 10 with the same HBA/Storage.

On 14-01-08 08:57 AM, Sergey Meirovich wrote:
> Hi James,
>
> On 7 January 2014 22:57, James Smart <james.smart@...lex.com> wrote:
>> Sergey,
>>
>> The Thor chipset is a bit old - a 4Gig adapter.  Most of our performance
>> improvements, including parallelization, have gone into the 8G and 16G
>> adapters. But you still should have seen performance significantly beyond
>> what you reported.
>
> First of all - thanks a lot!
>
> I took the Thor because we have exactly the same Thors in some of our
> Solaris servers. I've also tried 6 different QLogic adapters (mostly 8G)
> and fnic (10G) as well. Surprisingly enough, the Thor was the fastest
> for sequential 4k writes. Though in most cases the machines were in
> different DCs, and hence each was connected to a different storage array.
>
>>
>> We did a sanity check on some hardware we already had set up with a Thor
>> adapter.  We saw 23555 IOPS and 92.1 MB/s without needing to do much, well
>> beyond what you've reported, and still not up to what we know the card can
>> do.  There are some inefficiencies in the Linux kernel and some locking
>> deltas between our Solaris and Linux drivers - but not enough to account
>> for what you are seeing.
>>
>> I expect the Direct IO filesystem behavior is the root issue.
>
> The strangest thing to me is that the problem shows up with sequential
> writes. For example, one fnic machine zoned to an EMC XtremIO got
> 14.43 MB/sec at 3693.65 requests/sec for sequential 4k. The same fnic
> machine performed rather impressively for random 4k:
> 451.11 MB/sec at 115485.02 requests/sec.

You could bypass O_DIRECT and use ddpt together with
a bsg pass-through (bsg is a little faster than sg
for these purposes). The bsg device node is named after
the H:C:T:L tuple that lsscsi shows.

For example:

# lsscsi -g
[0:0:0:0]    disk    ATA    INTEL SSDSC2CW12 400i  /dev/sda /dev/sg0
[14:0:0:0]   disk    Linux  scsi_debug       0004  -        /dev/sg1

# ddpt if=/dev/bsg/14:0:0:0 bs=512 bpt=128 count=1m
Output file not specified so no copy, just reading input
1048576+0 records in
0+0 records out
time to read data: 0.283566 secs at 1893.28 MB/sec

bs= should match the logical block size of the storage
device, and the size of each SCSI READ is bs * bpt
(512 * 128 = 64 KB in this case).
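
Since your problem is with sequential 4k writes, a similar
pass-through test can be run in the write direction. A sketch
(note: this overwrites data on the target, so only point it at
a scratch device; the bsg node here is just the one from the
lsscsi output above):

# ddpt if=/dev/zero of=/dev/bsg/14:0:0:0 bs=512 bpt=8 count=1m

With bs=512 and bpt=8, each SCSI WRITE carries 4 KB, matching
your workload.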

Such a test should show you whether your performance problem
is at or below the point where pass-through commands are
injected (roughly the block layer and below), or above it
(the O_DIRECT/filesystem path).
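
For the "above" half of the comparison, something like the
following fio run should exercise the same O_DIRECT sequential
4k write path you originally measured (a hypothetical
invocation: substitute your own test device for /dev/sdX, and
note that it overwrites data on it):

# fio --name=seqwr4k --filename=/dev/sdX --direct=1 --rw=write \
      --bs=4k --ioengine=libaio --iodepth=1 --runtime=30 --time_based

If ddpt through bsg is fast while this is slow, the overhead
is above the pass-through injection point.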

Doug Gilbert
