Message-ID: <Z3zlgBB3ZrGApew7@xsang-OptiPlex-9020>
Date: Tue, 7 Jan 2025 16:27:44 +0800
From: Oliver Sang <oliver.sang@...el.com>
To: Niklas Cassel <cassel@...nel.org>
CC: Christoph Hellwig <hch@....de>, <oe-lkp@...ts.linux.dev>, <lkp@...el.com>,
<linux-kernel@...r.kernel.org>, Jens Axboe <axboe@...nel.dk>,
<linux-block@...r.kernel.org>, <virtualization@...ts.linux.dev>,
<linux-nvme@...ts.infradead.org>, Damien Le Moal <dlemoal@...nel.org>,
<linux-btrfs@...r.kernel.org>, <linux-aio@...ck.org>, <oliver.sang@...el.com>
Subject: Re: [linus:master] [block] e70c301fae: stress-ng.aiol.ops_per_sec
49.6% regression
Hi Niklas,

On Fri, Jan 03, 2025 at 10:09:14AM +0100, Niklas Cassel wrote:
> On Fri, Jan 03, 2025 at 07:49:25AM +0100, Christoph Hellwig wrote:
> > On Thu, Jan 02, 2025 at 10:49:41AM +0100, Niklas Cassel wrote:
> > > > > from below information, it seems an 'ahci' to me. but since I have limited
> > > > > knowledge about storage driver, maybe I'm wrong. if you want more information,
> > > > > please let us know. thanks a lot!
> > > >
> > > > Yes, this looks like ahci. Thanks a lot!
> > >
> > > Did this ever get resolved?
> > >
> > > I haven't seen a patch that seems to address this.
> > >
> > > AHCI (ata_scsi_queuecmd()) only issues a single command, so if there is any
> > > reordering when issuing a batch of commands, my guess is that the problem
> > > also affects SCSI / the problem is in upper layers above AHCI, i.e. SCSI lib
> > > or block layer.
> >
> > I started looking into this before the holidays. blktrace shows perfectly
> > sequential writes without any reordering using ahci, directly on the
> > block device or using xfs and btrfs when using dd. I also started
> > looking into what the test does and got as far as checking out the
> > stress-ng source tree and looking at stress-aiol.c. AFAICS the default
> > submission does simple reads and writes using increasing offsets.
> > So if the test result isn't a fluke either the aio code does some
> > weird reordering or btrfs does.
> >
> > Oliver, did the test also show any interesting results on non-btrfs
> > setups?
> >
>
> One thing that came to mind.
> Some distros (e.g. Fedora and openSUSE) ship with an udev rule that sets
> the I/O scheduler to BFQ for single-queue HDDs.
>
> It could very well be the I/O scheduler that reorders.
>
> Oliver, which I/O scheduler are you using?
> $ cat /sys/block/sdb/queue/scheduler
> none mq-deadline kyber [bfq]
The active scheduler is mq-deadline while our test is running:
# cat /sys/block/sdb/queue/scheduler
none [mq-deadline] kyber bfq
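
For anyone who wants to check this on several machines, here is a minimal sketch of how the active scheduler can be picked out of that sysfs line: the entry in square brackets is the one in use. The helper names (active_scheduler, read_scheduler) are hypothetical and this is not part of our test setup. Falling back to the whole line when no brackets are present is an assumption made for robustness.

```python
import re
from pathlib import Path


def active_scheduler(line: str) -> str:
    """Return the active scheduler from a sysfs scheduler line.

    The kernel marks the scheduler in use with square brackets,
    e.g. "none [mq-deadline] kyber bfq" -> "mq-deadline".
    """
    match = re.search(r"\[([^\]]+)\]", line)
    if match:
        return match.group(1)
    # Assumption: if nothing is bracketed, treat the stripped line
    # itself as the answer (e.g. a bare "none").
    return line.strip()


def read_scheduler(dev: str, sysfs_root: str = "/sys/block") -> str:
    """Read and parse <sysfs_root>/<dev>/queue/scheduler (hypothetical helper)."""
    path = Path(sysfs_root) / dev / "queue" / "scheduler"
    return active_scheduler(path.read_text())


if __name__ == "__main__":
    # The two lines quoted in this thread: the example from the
    # question and the output from our test machine.
    for line in ("none mq-deadline kyber [bfq]", "none [mq-deadline] kyber bfq"):
        print(active_scheduler(line))
```

On a real system, read_scheduler("sdb") would read the same file as the cat command above.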
>
>
> Kind regards,
> Niklas