Message-ID: <20191129023555.GA8620@ming.t460p>
Date: Fri, 29 Nov 2019 10:35:55 +0800
From: Ming Lei <ming.lei@...hat.com>
To: Andrea Vai <andrea.vai@...pv.it>
Cc: "Schmid, Carsten" <Carsten_Schmid@...tor.com>,
Finn Thain <fthain@...egraphics.com.au>,
Damien Le Moal <Damien.LeMoal@....com>,
Alan Stern <stern@...land.harvard.edu>,
Jens Axboe <axboe@...nel.dk>,
Johannes Thumshirn <jthumshirn@...e.de>,
USB list <linux-usb@...r.kernel.org>,
SCSI development list <linux-scsi@...r.kernel.org>,
Himanshu Madhani <himanshu.madhani@...ium.com>,
Hannes Reinecke <hare@...e.com>,
Omar Sandoval <osandov@...com>,
"Martin K. Petersen" <martin.petersen@...cle.com>,
Greg KH <gregkh@...uxfoundation.org>,
Hans Holmberg <Hans.Holmberg@....com>,
Kernel development list <linux-kernel@...r.kernel.org>
Subject: Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
On Fri, Nov 29, 2019 at 08:57:34AM +0800, Ming Lei wrote:
> On Thu, Nov 28, 2019 at 06:34:32PM +0100, Andrea Vai wrote:
> > On Thu, 28/11/2019 at 17.17 +0800, Ming Lei wrote:
> > > On Thu, Nov 28, 2019 at 08:46:57AM +0100, Andrea Vai wrote:
> > > > On Wed, 27/11/2019 at 08.14 +0000, Schmid, Carsten wrote:
> > > > > >
> > > > > > > Then I started another set of 100 trials and let them run
> > > > > tonight, and
> > > > > > > the first 10 trials were around 1000s, then gradually
> > > decreased
> > > > > to
> > > > > > > ~300s, and finally settled around 200s with some trials
> > > below
> > > > > 70-80s.
> > > > > > > This to say, times are extremely variable and for the first
> > > time
> > > > > I
> > > > > > > noticed a sort of "performance increase" with time.
> > > > > > >
> > > > > >
> > > > > > The sheer volume of testing (probably some terabytes by now)
> > > would
> > > > > > exercise the wear leveling algorithm in the FTL.
> > > > > >
> > > > > But with "old kernel" the copy operation still is "fast", as far
> > > as
> > > > > i understood.
> > > > > If FTL (e.g. wear leveling) would slow down, we would see that
> > > also
> > > > > in
> > > > > the old kernel, right?
> > > > >
> > > > > Andrea, can you confirm that the same device used with the old
> > > fast
> > > > > kernel is still fast today?
> > > >
> > > > Yes, it is still fast. Just ran a 100 trials test and got an
> > > average
> > > > of 70 seconds with standard deviation = 6 seconds, aligned with
> > > the
> > > > past values of the same kernel.
> > >
> > > Then can you collect trace on the old kernel via the previous
> > > script?
> > >
> > > #!/bin/sh
> > >
> > > MAJ=$1
> > > MIN=$2
> > > MAJ=$(( $MAJ << 20 ))
> > > DEV=$(( $MAJ | $MIN ))
> > >
> > > /usr/share/bcc/tools/trace -t -C \
> > > 't:block:block_rq_issue (args->dev == '$DEV') "%s %d %d", args->rwbs, args->sector, args->nr_sector' \
> > > 't:block:block_rq_insert (args->dev == '$DEV') "%s %d %d", args->rwbs, args->sector, args->nr_sector'
> > >
> > > Both the two trace points and bcc should be available on the old
> > > kernel.
> > >
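A side note on the script above: the value compared against args->dev is the major number shifted left by 20 bits, OR'ed with the minor number. The same arithmetic as a small Python sketch; the names kernel_dev and split_dev and the example numbers 8 and 16 are made up for illustration and do not come from any tool:

```python
MINORBITS = 20  # same shift the script uses (MAJ << 20)

def kernel_dev(major, minor):
    """Combine major/minor the way the trace script builds $DEV."""
    return (major << MINORBITS) | minor

def split_dev(dev):
    """Inverse of kernel_dev: recover (major, minor) from the combined value."""
    return dev >> MINORBITS, dev & ((1 << MINORBITS) - 1)

# Hypothetical example numbers; use the major/minor of the device under test.
dev = kernel_dev(8, 16)
print(dev, split_dev(dev))
```

The printed value is what the filter (args->dev == $DEV) would compare against for that major/minor pair.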
> >
> > Trace attached. Produced by: start the trace script
> > (with the pendrive already plugged), wait some seconds, run the test
> > (1 trial, 1 GB), wait for the test to finish, stop the trace.
> >
> > The copy took 73 seconds, roughly as already seen before with the fast
> > old kernel.
>
> This trace shows a good write IO order because the writeback IOs are
> queued to block layer serially from the 'cp' task and writeback wq.
>
> However, the writeback IO order changes in the current Linus tree
> because the IOs are queued to the block layer concurrently from the
> 'cp' task and the writeback wq. It might be related to blk-mq killing
> queue_congestion.
>
> The performance effect may not be limited to this specific USB drive;
> it could show up on all HDDs too, I guess.
>
> However, I still can't reproduce it in my VM even though I built it
> with settings similar to Andrea's test machine. Maybe the emulated disk
> is much faster than Andrea's.
>
> Andrea, can you collect the following log when running the test
> on the current new (bad) kernel?
>
> /usr/share/bcc/tools/stackcount -K blk_mq_make_request
Instead, please run the following trace, since insert may also be
called from other paths, such as flushing the plug:
/usr/share/bcc/tools/stackcount -K t:block:block_rq_insert
If you are using python3, you may hit the following failure:
"cannot use a bytes pattern on a string-like object"
If so, apply the following fix to /usr/lib/python3.7/site-packages/bcc/__init__.py:
diff --git a/src/python/bcc/__init__.py b/src/python/bcc/__init__.py
index 6f114de8..bff5f282 100644
--- a/src/python/bcc/__init__.py
+++ b/src/python/bcc/__init__.py
@@ -769,7 +769,7 @@ class BPF(object):
                 evt_dir = os.path.join(cat_dir, event)
                 if os.path.isdir(evt_dir):
                     tp = ("%s:%s" % (category, event))
-                    if re.match(tp_re, tp):
+                    if re.match(tp_re.decode(), tp):
                         results.append(tp)
         return results
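For reference, the failure and the fix can be reproduced outside bcc with a few lines of plain Python 3. This is only a sketch: match_tracepoints and the sample tracepoint list are made up for illustration and are not bcc code; only re.match and bytes.decode are real. Unlike the patch, the helper also accepts a str pattern, which is an extra assumption on my side:

```python
import re

def match_tracepoints(tp_re, names):
    """Return the names matching tp_re, accepting a bytes or str pattern.

    Hypothetical helper, not bcc code: it mirrors the patched check by
    decoding a bytes pattern before matching it against str names.
    """
    if isinstance(tp_re, bytes):
        tp_re = tp_re.decode()
    return [tp for tp in names if re.match(tp_re, tp)]

# Sample names for illustration only.
names = ["block:block_rq_insert", "block:block_rq_issue", "sched:sched_switch"]

# Unpatched behaviour: matching a bytes pattern against a str raises TypeError.
try:
    re.match(b"block:block_rq_insert", names[0])
    raised = False
except TypeError:
    raised = True

# Patched behaviour: the pattern is decoded first, so the match succeeds.
matched = match_tracepoints(b"block:block_rq_insert", names)
print(raised, matched)
```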
Thanks,
Ming