Date:   Fri, 29 Nov 2019 08:57:34 +0800
From:   Ming Lei <ming.lei@...hat.com>
To:     Andrea Vai <andrea.vai@...pv.it>
Cc:     "Schmid, Carsten" <Carsten_Schmid@...tor.com>,
        Finn Thain <fthain@...egraphics.com.au>,
        Damien Le Moal <Damien.LeMoal@....com>,
        Alan Stern <stern@...land.harvard.edu>,
        Jens Axboe <axboe@...nel.dk>,
        Johannes Thumshirn <jthumshirn@...e.de>,
        USB list <linux-usb@...r.kernel.org>,
        SCSI development list <linux-scsi@...r.kernel.org>,
        Himanshu Madhani <himanshu.madhani@...ium.com>,
        Hannes Reinecke <hare@...e.com>,
        Omar Sandoval <osandov@...com>,
        "Martin K. Petersen" <martin.petersen@...cle.com>,
        Greg KH <gregkh@...uxfoundation.org>,
        Hans Holmberg <Hans.Holmberg@....com>,
        Kernel development list <linux-kernel@...r.kernel.org>
Subject: Re: AW: Slow I/O on USB media after commit
 f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6

On Thu, Nov 28, 2019 at 06:34:32PM +0100, Andrea Vai wrote:
> On Thu, 28/11/2019 at 17:17 +0800, Ming Lei wrote:
> > On Thu, Nov 28, 2019 at 08:46:57AM +0100, Andrea Vai wrote:
> > > On Wed, 27/11/2019 at 08:14 +0000, Schmid, Carsten wrote:
> > > > > 
> > > > > > Then I started another set of 100 trials and let them run
> > > > > > tonight, and the first 10 trials were around 1000s, then
> > > > > > gradually decreased to ~300s, and finally settled around
> > > > > > 200s with some trials below 70-80s. This is to say, times
> > > > > > are extremely variable, and for the first time I noticed a
> > > > > > sort of "performance increase" over time.
> > > > > >
> > > > > 
> > > > > The sheer volume of testing (probably some terabytes by now)
> > > > > would exercise the wear leveling algorithm in the FTL.
> > > > > 
> > > > But with the "old kernel" the copy operation is still "fast",
> > > > as far as I understood.
> > > > If the FTL (e.g. wear leveling) slowed down, we would see that
> > > > in the old kernel too, right?
> > > > 
> > > > Andrea, can you confirm that the same device used with the old
> > > > fast kernel is still fast today?
> > > 
> > > Yes, it is still fast. I just ran a 100-trial test and got an
> > > average of 70 seconds with a standard deviation of 6 seconds, in
> > > line with the past values of the same kernel.
> > 
> > Then can you collect a trace on the old kernel with the previous
> > script?
> > 
> > #!/bin/sh
> > 
> > MAJ=$1
> > MIN=$2
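> > # block tracepoints report dev_t as (major << 20) | minor, so
> > # rebuild it here from the major/minor numbers passed as arguments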
> > MAJ=$(( $MAJ << 20 ))
> > DEV=$(( $MAJ | $MIN ))
> > 
> > /usr/share/bcc/tools/trace -t -C \
> >     't:block:block_rq_issue (args->dev == '$DEV') "%s %d %d", args->rwbs, args->sector, args->nr_sector' \
> >     't:block:block_rq_insert (args->dev == '$DEV') "%s %d %d", args->rwbs, args->sector, args->nr_sector'
> > 
> > Both trace points and bcc should be available on the old kernel.
> > 
> 
> Trace attached. Produced by: start the trace script (with the pendrive
> already plugged in), wait a few seconds, run the test (1 trial, 1 GB),
> wait for the test to finish, stop the trace.
> 
> The copy took 73 seconds, roughly in line with what was seen before on
> the fast old kernel.

This trace shows a good write IO order, because the writeback IOs are
queued to the block layer serially from the 'cp' task and the
writeback wq.

However, the writeback IO order changes in the current Linus tree,
because the IOs are queued to the block layer concurrently from the
'cp' task and the writeback wq. It might be related to blk-mq's
removal of the queue congestion mechanism.
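
If it helps to confirm this, the trace script can be extended to also
print which task queues each IO, so serial vs. concurrent submission
is visible directly; a sketch of the changed invocation, assuming the
block tracepoints' comm field is usable from bcc on your kernel:

	# also print the submitting task (comm) before rwbs/sector/nr_sector
	/usr/share/bcc/tools/trace -t -C \
	    't:block:block_rq_insert (args->dev == '$DEV') "%s %s %d %d", args->comm, args->rwbs, args->sector, args->nr_sector'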

The performance effect might be seen not only on this specific USB
drive, but on all HDDs as well, I guess.

However, I still can't reproduce it in my VM, even though I built it
with settings similar to Andrea's test machine. Maybe the emulated
disk is much faster than Andrea's.
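
One way I may try to get closer is to throttle the emulated disk so it
behaves more like a slow pendrive; a rough sketch using QEMU's drive
throttling options (image name and numbers are only guesses):

	# cap write bandwidth at ~15 MB/s, roughly a USB 2.0 pendrive
	qemu-system-x86_64 ... \
	    -drive file=usb-disk.img,if=none,id=slowdisk,throttling.bps-write=15000000 \
	    -device usb-storage,drive=slowdisk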

Andrea, can you collect the following log when running the test on the
current new (bad) kernel?

	/usr/share/bcc/tools/stackcount  -K blk_mq_make_request
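
For example, something like this should be enough (the copy command and
paths are just placeholders for your test):

	# count kernel stacks leading to blk_mq_make_request during one trial
	/usr/share/bcc/tools/stackcount -K blk_mq_make_request > stackcount.log &
	STACKPID=$!
	cp /path/to/1GB.file /mnt/pendrive && sync
	kill -INT $STACKPID	# stackcount prints the counted stacks on exit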

Thanks,
Ming
