Date:   Mon, 7 Feb 2022 12:11:27 +0100
From:   Ulf Hansson <ulf.hansson@...aro.org>
To:     Ricky WU <ricky_wu@...ltek.com>
Cc:     "tommyhebb@...il.com" <tommyhebb@...il.com>,
        "linux-mmc@...r.kernel.org" <linux-mmc@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v3] mmc: rtsx: improve performance for multi block rw

[...]

> > > > >
> > > > > Do you have any suggestions for testing random I/O? We think
> > > > > random I/O will not change much.
> > > >
> > > > I would probably look into using fio,
> > > > https://fio.readthedocs.io/en/latest/
> > > >
> > >
> > > Filled random I/O data
> > > Before the patch:
> > > CMD (Randread):
> > > sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread
> > > -group_reporting -ioengine=psync -iodepth=1 -size=1G -name=mytest
> > > -bs=1M -rw=randread
> >
> > Thanks for running the tests! Overall, I would not expect an impact on the
> > throughput when using a big blocksize like 1M. This is also pretty clear from
> > the result you have provided.
> >
> > However, especially for random writes and reads, we want to try with smaller
> > block sizes, like 8k or 16k. Would you mind running another round of tests to
> > see how that works out?
> >
>
> Filled random I/O data (8k/16k)

Hi Ricky,

Apologies for the delay! Thanks for running the tests. Let me comment
on them below.

>
> Before(randread)
> 8k:
> Cmd: sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread -group_reporting -ioengine=psync -iodepth=1 -size=1G -name=mytest -bs=8k -rw=randread
> mytest: (g=0): rw=randread, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=psync, iodepth=1
> result:
> Run status group 0 (all jobs):
>    READ: bw=16.5MiB/s (17.3MB/s), 16.5MiB/s-16.5MiB/s (17.3MB/s-17.3MB/s), io=1024MiB (1074MB), run=62019-62019msec
> Disk stats (read/write):
>   mmcblk0: ios=130757/0, merge=0/0, ticks=57751/0, in_queue=57751, util=99.89%
>
> 16k:
> Cmd: sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread -group_reporting -ioengine=psync -iodepth=1 -size=1G -name=mytest -bs=16k -rw=randread
> mytest: (g=0): rw=randread, bs=(R) 16.0KiB-16.0KiB, (W) 16.0KiB-16.0KiB, (T) 16.0KiB-16.0KiB, ioengine=psync, iodepth=1
> result:
> Run status group 0 (all jobs):
>    READ: bw=23.3MiB/s (24.4MB/s), 23.3MiB/s-23.3MiB/s (24.4MB/s-24.4MB/s), io=1024MiB (1074MB), run=44034-44034msec
> Disk stats (read/write):
>   mmcblk0: ios=65333/0, merge=0/0, ticks=39420/0, in_queue=39420, util=99.84%
>
> Before(randwrite)
> 8k:
> Cmd: sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread -group_reporting -ioengine=psync -iodepth=1 -size=100M -name=mytest -bs=8k -rw=randwrite
> mytest: (g=0): rw=randwrite, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=psync, iodepth=1
> result:
> Run status group 0 (all jobs):
>   WRITE: bw=4060KiB/s (4158kB/s), 4060KiB/s-4060KiB/s (4158kB/s-4158kB/s), io=100MiB (105MB), run=25220-25220msec
> Disk stats (read/write):
>   mmcblk0: ios=51/12759, merge=0/0, ticks=80/24154, in_queue=24234, util=99.90%
>
> 16k:
> Cmd: sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread -group_reporting -ioengine=psync -iodepth=1 -size=100M -name=mytest -bs=16k -rw=randwrite
> mytest: (g=0): rw=randwrite, bs=(R) 16.0KiB-16.0KiB, (W) 16.0KiB-16.0KiB, (T) 16.0KiB-16.0KiB, ioengine=psync, iodepth=1
> result:
> Run status group 0 (all jobs):
>   WRITE: bw=7201KiB/s (7373kB/s), 7201KiB/s-7201KiB/s (7373kB/s-7373kB/s), io=100MiB (105MB), run=14221-14221msec
> Disk stats (read/write):
>   mmcblk0: ios=51/6367, merge=0/0, ticks=82/13647, in_queue=13728, util=99.81%
>
>
> After(randread)
> 8k:
> Cmd: sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread -group_reporting -ioengine=psync -iodepth=1 -size=1G -name=mytest -bs=8k -rw=randread
> mytest: (g=0): rw=randread, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=psync, iodepth=1
> result:
> Run status group 0 (all jobs):
>    READ: bw=12.4MiB/s (13.0MB/s), 12.4MiB/s-12.4MiB/s (13.0MB/s-13.0MB/s), io=1024MiB (1074MB), run=82397-82397msec
> Disk stats (read/write):
>   mmcblk0: ios=130640/0, merge=0/0, ticks=74125/0, in_queue=74125, util=99.94%
>
> 16k:
> Cmd: sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread -group_reporting -ioengine=psync -iodepth=1 -size=1G -name=mytest -bs=16k -rw=randread
> mytest: (g=0): rw=randread, bs=(R) 16.0KiB-16.0KiB, (W) 16.0KiB-16.0KiB, (T) 16.0KiB-16.0KiB, ioengine=psync, iodepth=1
> result:
> Run status group 0 (all jobs):
>    READ: bw=20.0MiB/s (21.0MB/s), 20.0MiB/s-20.0MiB/s (21.0MB/s-21.0MB/s), io=1024MiB (1074MB), run=51076-51076msec
> Disk stats (read/write):
>   mmcblk0: ios=65282/0, merge=0/0, ticks=46255/0, in_queue=46254, util=99.87%
>
> After(randwrite)
> 8k:
> Cmd: sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread -group_reporting -ioengine=psync -iodepth=1 -size=100M -name=mytest -bs=8k -rw=randwrite
> mytest: (g=0): rw=randwrite, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=psync, iodepth=1
> result:
> Run status group 0 (all jobs):
>   WRITE: bw=4215KiB/s (4317kB/s), 4215KiB/s-4215KiB/s (4317kB/s-4317kB/s), io=100MiB (105MB), run=24292-24292msec
> Disk stats (read/write):
>   mmcblk0: ios=52/12717, merge=0/0, ticks=86/23182, in_queue=23267, util=99.92%
>
> 16k:
> Cmd: sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread -group_reporting -ioengine=psync -iodepth=1 -size=100M -name=mytest -bs=16k -rw=randwrite
> mytest: (g=0): rw=randwrite, bs=(R) 16.0KiB-16.0KiB, (W) 16.0KiB-16.0KiB, (T) 16.0KiB-16.0KiB, ioengine=psync, iodepth=1
> result:
> Run status group 0 (all jobs):
>   WRITE: bw=6499KiB/s (6655kB/s), 6499KiB/s-6499KiB/s (6655kB/s-6655kB/s), io=100MiB (105MB), run=15756-15756msec
> Disk stats (read/write):
>   mmcblk0: ios=51/6347, merge=0/0, ticks=84/15120, in_queue=15204, util=99.80%

It looks like the rand-read tests above degrade with the new changes,
while the rand-write results are mixed (8k improves slightly, 16k
degrades).

To summarize my view from all the tests you have done at this point
(thanks a lot): it looks like block I/O merging isn't really happening
at the common block layer, at least not to the extent that would
benefit us. You have clearly shown that with the suggested change in
the mmc host driver, which detects whether the "next" request is
sequential to the previous one and thereby allows us to skip a CMD12
and minimize some command overhead.
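
Just to double-check that conclusion, watching the merge counters
while one of the fio runs is going should tell whether the block
layer merges anything at all. Roughly something like this (assuming
sysstat's iostat and the same mmcblk0 device as in your tests; rrqm/s
and wrqm/s are the read/write requests merged per second):

  # in one terminal, start the fio test as before; in another,
  # watch the per-device statistics once per second
  iostat -x 1 mmcblk0

Near-zero rrqm/s and wrqm/s during the run would match the merge=0/0
numbers in the fio disk stats you posted.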

However, according to the latest tests above, you have also shown
that the changes in the mmc host driver don't come without a cost. In
particular, small random reads degrade in performance with these
changes.

That said, it looks to me that rather than trying to improve things
for one specific mmc host driver, it would be better to look at this
from the generic block layer point of view - and investigate why
sequential reads/writes aren't getting merged often enough in the
MMC/SD case. If we can fix the problem there, I assume all mmc host
drivers would benefit.
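
As a starting point for that investigation, a few block layer knobs
are worth a quick look (paths assume the mmcblk0 device from your
setup):

  # 0 means request merging is enabled (1 or 2 disable it)
  cat /sys/block/mmcblk0/queue/nomerges
  # the active I/O scheduler is the one shown in brackets
  cat /sys/block/mmcblk0/queue/scheduler
  # number of requests the queue can hold for merging/sorting
  cat /sys/block/mmcblk0/queue/nr_requests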

BTW, have you tried with different I/O schedulers? If you haven't
tried BFQ, I suggest you do, as it's a good fit for MMC/SD.
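
In case it helps, switching the scheduler is just a sysfs write
(again assuming mmcblk0; if bfq doesn't show up in the scheduler list
it may need a modprobe bfq first):

  # select bfq, then re-run the same fio commands for comparison
  echo bfq | sudo tee /sys/block/mmcblk0/queue/scheduler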

[...]

Kind regards
Uffe
