lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20090904023421.GA11316@localhost>
Date:	Fri, 4 Sep 2009 10:34:21 +0800
From:	Wu Fengguang <fengguang.wu@...el.com>
To:	Fernando Silveira <fsilveira@...il.com>
Cc:	LKML <linux-kernel@...r.kernel.org>
Subject: Re: I/O and pdflush

On Fri, Sep 04, 2009 at 10:21:23AM +0800, Wu Fengguang wrote:
> On Fri, Sep 04, 2009 at 04:46:14AM +0800, Fernando Silveira wrote:
> > On Wed, Sep 2, 2009 at 09:50, Wu Fengguang<fengguang.wu@...el.com> wrote:
> > > On Wed, Sep 02, 2009 at 07:29:37PM +0800, Fernando Silveira wrote:
> > >> On Wed, Sep 2, 2009 at 00:05, Wu Fengguang<fengguang.wu@...el.com> wrote:
> > >> > Is it tightly related to two interleaved write streams (the performance drop
> > >> > starts and stops with the second write stream)?  If so, it sounds like a SSD
> > >> > hardware/firmware problem.
> > >>
> > >> By write stream you mean more than one sdc file descriptor writing? I
> > >> don't usually do that, except for some tests (e.g. the "20090831_0908
> > >> iostat_x_1_5-80mbps_3" test). In that last test I did not do that.
> > >>
> > >> The performance drops even when the whole system is idle, with no
> > >> process other than the essential and my program that does nothing but
> > >> open("/dev/sdc", O_WRONLY) and loops in write(fd, buf, 4194304).
> > >
> > > Yes, I mean concurrently writing to two large files.
> > 
> > Sorry, but that cannot be happening in my tests. As I said before, the
> > only processes running are "init", "udevd", "sshd", "getty", "bash"
> > and "ddv". In the "dmesg" logs you can see that the only processes
> > that are writing to the disk are "ddv" and "pdflush". I just cannot
> > see another stream being written.
> > 
> > Please tell me if I'm missing something.
> 
> OK, so performance can go bad even for one single dd command?
> 
> > > Note that the
> > > low level disk write start/stop time is normally not equal to
> > > application(eg. dd) write start/stop time because of delayed write.
> > >
> > > Passing "oflag=direct" to dd can make the two times aligned.
> > 
> > I was aware of that, however I tried it and now the problem seems to
> > be much more difficult to be reproduced. I've being trying for a
> > couple of hours and so far all I could see was a small performance
> > drop from 80MB/s (now almost steady) to 60MB/s for some seconds
> > (approximately 16).
> > 
> > Could this possibly mean something else to you?
> 
> So performance is good on two concurrent O_DIRECT dd commands?
> 
> > > These commands can show the block IO internals and help confirm
> > > whether it's the kernel or hardware's fault:
> > >
> > >        blktrace -w1 /dev/sdc # do this while throughput is 25MB/s
> > >        blkparse sdc
> > 
> > I did that but I couldn't identify whose fault is. I don't understand
> > much about I/O internals (still trying to) and I wonder if you could
> > check it for me, please.
> > 
> > The new trace logs are in this tarball:
> > 
> > http://rootshell.be/~swrh/ssd-tests/log-20090903_1323-MK020903042C0001B-prosar_fcc2-bs_4M-0-blktrace_25mbps.tar.gz
> 
> The kernel seems to do its job fine. I do see one obvious pattern(*).
> The driver seems to complete IO requests _in batches_, which in turn
> make the kernel issue IO requests in batches. Which might choke the
> hardware/firmware in some subtle way.
> 
> In direct dd(s), the io queue is limited in size, so this kind of
> batching effect will not be obvious, which could explain why it
> performs better. 

Fernando, to confirm this, you can try running dd with lowered queue depth:

        echo 32 > /sys/block/sdc/queue/nr_requests

Thanks,
Fengguang

> ---
> 
> (*) `blkparse sdc` shows repeated patterns of
> 
> [thousands of A-Q-M sequences by ddv or pdflush]
>   8,32   0    11992     0.093682688  2417  A   W 210184323 + 1 <- (8,34) 194119323
>   8,32   0    11993     0.093682832  2417  Q   W 210184323 + 1 [ddv]
>   8,32   0    11994     0.093683066  2417  M   W 210184323 + 1 [ddv]
>   8,32   0    11995     0.093683552  2417  A   W 210184324 + 1 <- (8,34) 194119324
>   8,32   0    11996     0.093683702  2417  Q   W 210184324 + 1 [ddv]
>   8,32   0    11997     0.093683930  2417  M   W 210184324 + 1 [ddv]
>   8,32   0    11998     0.093684422  2417  A   W 210184325 + 1 <- (8,34) 194119325
>   8,32   0    11999     0.093684566  2417  Q   W 210184325 + 1 [ddv]
>   8,32   0    12000     0.093684799  2417  M   W 210184325 + 1 [ddv]
>   8,32   0    12001     0.093685285  2417  A   W 210184326 + 1 <- (8,34) 194119326
>   8,32   0    12002     0.093685429  2417  Q   W 210184326 + 1 [ddv]
>   8,32   0    12003     0.093685663  2417  M   W 210184326 + 1 [ddv]
>   8,32   0    12004     0.093686155  2417  A   W 210184327 + 1 <- (8,34) 194119327
>   8,32   0    12005     0.093686299  2417  Q   W 210184327 + 1 [ddv]
>   8,32   0    12006     0.093686533  2417  M   W 210184327 + 1 [ddv]
> [near 100 C-D sequences]
>   8,32   1    12378     0.094161760     0  C   W 210164352 + 128 [0]
>   8,32   1    12379     0.094318451     0  D   W 210164608 + 128 [swapper]
>   8,32   0    12007     0.095964571  2417  C   W 210164480 + 128 [0]
>   8,32   0    12008     0.096097816  2417  D   W 210164736 + 128 [ddv]
>   8,32   0    12009     0.097421849   211  C   W 210164608 + 128 [0]
>   8,32   0    12010     0.097566626   211  D   W 210164864 + 128 [kswapd0]
>   8,32   1    12380     0.099029562     0  C   W 210164736 + 128 [0]
>   8,32   1    12381     0.099199427     0  D   W 210164992 + 128 [swapper]
>   8,32   0    12011     0.101573454  2417  C   W 210164864 + 128 [0]
>   8,32   0    12012     0.101722292  2417  D   W 210165120 + 128 [ddv]
>   8,32   1    12382     0.103317406     0  C   W 210164992 + 128 [0]
>   8,32   1    12383     0.103491195     0  D   W 210165248 + 128 [swapper]
>   8,32   0    12013     0.104819283  2417  C   W 210165120 + 128 [0]
>   8,32   0    12014     0.104966411  2417  D   W 210165376 + 128 [ddv]
>   8,32   0    12015     0.106378658  2417  C   W 210165248 + 128 [0]
>   8,32   0    12016     0.106545098  2417  D   W 210165504 + 128 [ddv]
>   8,32   1    12384     0.108959520     0  C   W 210165376 + 128 [0]
>   8,32   1    12385     0.109088854     0  D   W 210165632 + 128 [swapper]
>   8,32   0    12017     0.110761352     0  C   W 210165504 + 128 [0]
>   8,32   0    12018     0.110881820     0  D   W 210165760 + 128 [swapper]
>   8,32   1    12386     0.112203050     0  C   W 210165632 + 128 [0]
>   8,32   1    12387     0.112310847     0  D   W 210165888 + 128 [swapper]
>   8,32   0    12019     0.113762264     0  C   W 210165760 + 128 [0]
>   8,32   0    12020     0.113868915     0  D   W 210166016 + 128 [swapper]
>   8,32   0    12021     0.116295958     0  C   W 210165888 + 128 [0]
>   8,32   0    12022     0.116405740     0  D   W 210166144 + 128 [swapper]
>   8,32   1    12388     0.118039279     0  C   W 210166016 + 128 [0]
>   8,32   1    12389     0.118146452     0  D   W 210166272 + 128 [swapper]
>   8,32   0    12023     0.119468271     0  C   W 210166144 + 128 [0]
>   8,32   0    12024     0.119569714     0  D   W 210166400 + 128 [swapper]
>   8,32   1    12390     0.120972296     0  C   W 210166272 + 128 [0]
>   8,32   1    12391     0.121081701     0  D   W 210166528 + 128 [swapper]
>   8,32   1    12392     0.123565552     0  C   W 210166400 + 128 [0]
>   8,32   1    12393     0.123674146     0  D   W 210166656 + 128 [swapper]
>   8,32   0    12025     0.125309227     0  C   W 210166528 + 128 [0]
>   8,32   0    12026     0.125417174     0  D   W 210166784 + 128 [swapper]
>   8,32   1    12394     0.126739754     0  C   W 210166656 + 128 [0]
>   8,32   1    12395     0.126845037     0  D   W 210166912 + 128 [swapper]
>   8,32   0    12027     0.128297684     0  C   W 210166784 + 128 [0]
>   8,32   0    12028     0.128407466     0  D   W 210167040 + 128 [swapper]
>   8,32   1    12396     0.130840911     0  C   W 210166912 + 128 [0]
>   8,32   1    12397     0.130947693     0  D   W 210167168 + 128 [swapper]
>   8,32   1    12398     0.132628003     0  C   W 210167040 + 128 [0]
>   8,32   1    12399     0.132732236     0  D   W 210167296 + 128 [swapper]
>   8,32   0    12029     0.134054889     0  C   W 210167168 + 128 [0]
>   8,32   0    12030     0.134160082     0  D   W 210167424 + 128 [swapper]
>   8,32   1    12400     0.135563066     0  C   W 210167296 + 128 [0]
>   8,32   1    12401     0.135664905     0  D   W 210167552 + 128 [swapper]
>   8,32   0    12031     0.138157000     0  C   W 210167424 + 128 [0]
>   8,32   0    12032     0.138263110     0  D   W 210167680 + 128 [swapper]
>   8,32   1    12402     0.139842121     0  C   W 210167552 + 128 [0]
>   8,32   1    12403     0.139945646     0  D   W 210167808 + 128 [swapper]
>   8,32   1    12404     0.141269145   210  C   W 210167680 + 128 [0]
>   8,32   1    12405     0.141377805   210  D   W 210167936 + 128 [pdflush]
>   8,32   0    12033     0.142830026     0  C   W 210167808 + 128 [0]
>   8,32   0    12034     0.142936449     0  D   W 210168064 + 128 [swapper]
>   8,32   1    12406     0.145314280     0  C   W 210167936 + 128 [0]
>   8,32   1    12407     0.145418662     0  D   W 210168192 + 128 [swapper]
>   8,32   0    12035     0.147057325     0  C   W 210168064 + 128 [0]
>   8,32   0    12036     0.147159590     0  D   W 210168320 + 128 [swapper]
>   8,32   1    12408     0.148482753     0  C   W 210168192 + 128 [0]
>   8,32   1    12409     0.148587813     0  D   W 210168448 + 128 [swapper]
>   8,32   1    12410     0.149991158     0  C   W 210168320 + 128 [0]
>   8,32   1    12411     0.150096956     0  D   W 210168576 + 128 [swapper]
>   8,32   0    12037     0.152584743     0  C   W 210168448 + 128 [0]
>   8,32   0    12038     0.152690919     0  D   W 210168704 + 128 [swapper]
>   8,32   1    12412     0.154328700     0  C   W 210168576 + 128 [0]
>   8,32   1    12413     0.154431589     0  D   W 210168832 + 128 [swapper]
>   8,32   0    12039     0.155754320     0  C   W 210168704 + 128 [0]
>   8,32   0    12040     0.155852734     0  D   W 210168960 + 128 [swapper]
>   8,32   1    12414     0.157305657     0  C   W 210168832 + 128 [0]
>   8,32   1    12415     0.157407880     0  D   W 210169088 + 128 [swapper]
>   8,32   1    12416     0.159789046     0  C   W 210168960 + 128 [0]
>   8,32   1    12417     0.159887321     0  D   W 210169216 + 128 [swapper]
>   8,32   0    12041     0.161590753     0  C   W 210169088 + 128 [0]
>   8,32   0    12042     0.161690654     0  D   W 210169344 + 128 [swapper]
>   8,32   1    12418     0.163013019     0  C   W 210169216 + 128 [0]
>   8,32   1    12419     0.163115584     0  D   W 210169472 + 128 [swapper]
>   8,32   0    12043     0.164519096     0  C   W 210169344 + 128 [0]
>   8,32   0    12044     0.164622195     0  D   W 210169600 + 128 [swapper]
>   8,32   1    12420     0.167173282     0  C   W 210169472 + 128 [0]
>   8,32   1    12421     0.167276254     0  D   W 210169728 + 128 [swapper]
>   8,32   1    12422     0.168925764     0  C   W 210169600 + 128 [0]
>   8,32   1    12423     0.169027981     0  D   W 210169856 + 128 [swapper]
>   8,32   0    12045     0.170351065     0  C   W 210169728 + 128 [0]
>   8,32   0    12046     0.170455910     0  D   W 210169984 + 128 [swapper]
>   8,32   1    12424     0.171907579     0  C   W 210169856 + 128 [0]
>   8,32   1    12425     0.172013444     0  D   W 210170112 + 128 [swapper]
>   8,32   0    12047     0.174390555     0  C   W 210169984 + 128 [0]
>   8,32   0    12048     0.174492477     0  D   W 210170240 + 128 [swapper]
>   8,32   1    12426     0.176133642     0  C   W 210170112 + 128 [0]
>   8,32   1    12427     0.176237095     0  D   W 210170368 + 128 [swapper]
>   8,32   1    12428     0.177558824     0  C   W 210170240 + 128 [0]
>   8,32   1    12429     0.177664029     0  D   W 210170496 + 128 [swapper]
>   8,32   0    12049     0.179067469     0  C   W 210170368 + 128 [0]
>   8,32   0    12050     0.179169668     0  D   W 210170624 + 128 [swapper]
>   8,32   1    12430     0.181612304     0  C   W 210170496 + 128 [0]
>   8,32   1    12431     0.181714443     0  D   W 210170752 + 128 [swapper]
>   8,32   0    12051     0.183355913     0  C   W 210170624 + 128 [0]
>   8,32   0    12052     0.183459354     0  D   W 210170880 + 128 [swapper]
>   8,32   1    12432     0.184783537     0  C   W 210170752 + 128 [0]
>   8,32   1    12433     0.184880864     0  D   W 210171008 + 128 [swapper]
>   8,32   0    12053     0.186333961     0  C   W 210170880 + 128 [0]
>   8,32   0    12054     0.186428577     0  D   W 210171136 + 128 [swapper]
>   8,32   0    12055     0.188818497     0  C   W 210171008 + 128 [0]
>   8,32   0    12056     0.188920809     0  D   W 210171264 + 128 [swapper]
>   8,32   1    12434     0.190562105     0  C   W 210171136 + 128 [0]
>   8,32   1    12435     0.190666416     0  D   W 210171392 + 128 [swapper]
>   8,32   0    12057     0.191988452     0  C   W 210171264 + 128 [0]
>   8,32   0    12058     0.192089505     0  D   W 210171520 + 128 [swapper]
>   8,32   1    12436     0.193493089     0  C   W 210171392 + 128 [0]
>   8,32   1    12437     0.193595240     0  D   W 210171648 + 128 [swapper]
>   8,32   1    12438     0.196087478     0  C   W 210171520 + 128 [0]
>   8,32   1    12439     0.196191783     0  D   W 210171776 + 128 [swapper]
>   8,32   0    12059     0.197831496     0  C   W 210171648 + 128 [0]
>   8,32   0    12060     0.197933803     0  D   W 210171904 + 128 [swapper]
>   8,32   1    12440     0.199257319     0  C   W 210171776 + 128 [0]
>   8,32   1    12441     0.199360568     0  D   W 210172032 + 128 [swapper]
>   8,32   0    12061     0.200814049     0  C   W 210171904 + 128 [0]
>   8,32   0    12062     0.200913626     0  D   W 210172160 + 128 [swapper]
>   8,32   1    12442     0.203309419     0  C   W 210172032 + 128 [0]
>   8,32   1    12443     0.203412830     0  D   W 210172288 + 128 [swapper]
>   8,32   0    12063     0.205060570     0  C   W 210172160 + 128 [0]
>   8,32   0    12064     0.205162907     0  D   W 210172416 + 128 [swapper]
>   8,32   0    12065     0.206486099     0  C   W 210172288 + 128 [0]
>   8,32   0    12066     0.206587944     0  D   W 210172544 + 128 [swapper]
>   8,32   1    12444     0.207990905     0  C   W 210172416 + 128 [0]
>   8,32   1    12445     0.208091646     0  D   W 210172672 + 128 [swapper]
>   8,32   0    12067     0.210583968     0  C   W 210172544 + 128 [0]
>   8,32   0    12068     0.210680377     0  D   W 210172800 + 128 [swapper]
>   8,32   1    12446     0.212327398     0  C   W 210172672 + 128 [0]
>   8,32   1    12447     0.212428457     0  D   W 210172928 + 128 [swapper]
>   8,32   0    12069     0.213751590     0  C   W 210172800 + 128 [0]
>   8,32   0    12070     0.213853765     0  D   W 210173056 + 128 [swapper]
>   8,32   0    12071     0.215305974     0  C   W 210172928 + 128 [0]
>   8,32   0    12072     0.215410602     0  D   W 210173184 + 128 [swapper]
>   8,32   1    12448     0.217849597     0  C   W 210173056 + 128 [0]
>   8,32   1    12449     0.217949042     0  D   W 210173312 + 128 [swapper]
>   8,32   0    12073     0.219601719     0  C   W 210173184 + 128 [0]
>   8,32   0    12074     0.219703342     0  D   W 210173440 + 128 [swapper]
>   8,32   1    12450     0.221025743     0  C   W 210173312 + 128 [0]
>   8,32   1    12451     0.221127756     0  D   W 210173568 + 128 [swapper]
>   8,32   0    12075     0.222580660     0  C   W 210173440 + 128 [0]
>   8,32   0    12076     0.222676374     0  D   W 210173696 + 128 [swapper]
>   8,32   0    12077     0.225014766     0  C   W 210173568 + 128 [0]
>   8,32   0    12078     0.225111955     0  D   W 210173824 + 128 [swapper]
>   8,32   1    12452     0.226711310     0  C   W 210173696 + 128 [0]
>   8,32   1    12453     0.226812111     0  D   W 210173952 + 128 [swapper]
>   8,32   0    12079     0.229019147     0  C   W 210173824 + 128 [0]
>   8,32   0    12080     0.229119366     0  D   W 210174080 + 128 [swapper]
>   8,32   1    12454     0.230581959     0  C   W 210173952 + 128 [0]
>   8,32   1    12455     0.230682370     0  D   W 210174208 + 128 [swapper]
>   8,32   0    12081     0.232003758     0  C   W 210174080 + 128 [0]
>   8,32   0    12082     0.232102141     0  D   W 210174336 + 128 [swapper]
>   8,32   0    12083     0.233503956     0  C   W 210174208 + 128 [0]
>   8,32   0    12084     0.233600131     0  D   W 210174464 + 128 [swapper]
>   8,32   1    12456     0.234921110     0  C   W 210174336 + 128 [0]
>   8,32   1    12457     0.235018282     0  D   W 210174592 + 128 [swapper]
>   8,32   0    12085     0.236420624     0  C   W 210174464 + 128 [0]
>   8,32   0    12086     0.236519661     0  D   W 210174720 + 128 [swapper]
>   8,32   1    12458     0.238714991     0  C   W 210174592 + 128 [0]
>   8,32   1    12459     0.238812841     0  D   W 210174848 + 128 [swapper]
>   8,32   0    12087     0.240419071     0  C   W 210174720 + 128 [0]
>   8,32   0    12088     0.240514299     0  D   W 210174976 + 128 [swapper]
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ