Message-ID: <CA+1E3r+C2KQENu=fO_+FZoUEvqZrAQcxziwSGt=FVidv85KQxA@mail.gmail.com>
Date: Wed, 19 Aug 2020 16:01:54 +0530
From: Kanchan Joshi <joshiiitr@...il.com>
To: Damien Le Moal <Damien.LeMoal@....com>
Cc: Christoph Hellwig <hch@....de>,
Kanchan Joshi <joshi.k@...sung.com>,
Jens Axboe <axboe@...nel.dk>,
"sagi@...mberg.me" <sagi@...mberg.me>,
Johannes Thumshirn <Johannes.Thumshirn@....com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-nvme@...ts.infradead.org" <linux-nvme@...ts.infradead.org>,
Keith Busch <kbusch@...nel.org>,
Selvakumar S <selvakuma.s1@...sung.com>,
Javier Gonzalez <javier.gonz@...sung.com>,
Nitesh Shetty <nj.shetty@...sung.com>
Subject: Re: [PATCH 1/2] nvme: set io-scheduler requirement for ZNS
On Wed, Aug 19, 2020 at 3:08 PM Damien Le Moal <Damien.LeMoal@....com> wrote:
>
> On 2020/08/19 18:27, Kanchan Joshi wrote:
> > On Tue, Aug 18, 2020 at 12:46 PM Christoph Hellwig <hch@....de> wrote:
> >>
> >> On Tue, Aug 18, 2020 at 10:59:35AM +0530, Kanchan Joshi wrote:
> >>> Set elevator feature ELEVATOR_F_ZBD_SEQ_WRITE required for ZNS.
> >>
> >> No, it is not.
> >
> > Are you saying MQ-Deadline (write-lock) is not needed for writes on ZNS?
> > I see that null-block zoned and SCSI-ZBC both set this requirement. I
> > wonder how it became different for NVMe.
>
> It is not required for an NVMe ZNS drive that has zone append native support.
> zonefs and upcoming btrfs do not use regular writes, removing the requirement
> for zone write locking.
I understand that if a particular user (zonefs, btrfs, etc.) sends zone
appends instead of regular writes, the write lock is not required.
But if that user, or some other user (say F2FS), sends regular
writes, the write lock is needed.
Above the block layer, both opcodes, REQ_OP_WRITE and
REQ_OP_ZONE_APPEND, are available to users. I thought that whether the
write lock is taken is a per-opcode decision, not a per-user one (FS,
MD/DM, user space, etc.). Is that not correct? mq-deadline can
cater to both opcodes, while other schedulers cannot serve
REQ_OP_WRITE well on a zoned device.
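To make the per-opcode reading concrete, here is a minimal sketch of the decision as I understand it from this thread. It is a standalone model, not kernel code: the enum, the struct and both helper names are hypothetical stand-ins, and the emulated-append case follows Damien's description that the emulating driver takes the zone write lock itself.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for REQ_OP_READ, REQ_OP_WRITE, REQ_OP_ZONE_APPEND. */
enum model_op { MODEL_OP_READ, MODEL_OP_WRITE, MODEL_OP_ZONE_APPEND };

/* Hypothetical device description used only by this model. */
struct model_dev {
    bool zoned;              /* device exposes sequential-write zones */
    bool native_zone_append; /* zone append is handled by the drive itself */
};

/*
 * Does the I/O scheduler have to serialize this request per zone?
 * In this model only regular writes to a zoned device need it,
 * regardless of which user (FS, MD/DM, user space) issued them.
 */
static bool sched_must_lock_zone(const struct model_dev *dev, enum model_op op)
{
    return dev->zoned && op == MODEL_OP_WRITE;
}

/*
 * Does the driver take the zone write lock on its own?
 * Assumed true only when zone append is emulated with regular writes.
 */
static bool driver_locks_zone(const struct model_dev *dev, enum model_op op)
{
    return dev->zoned && !dev->native_zone_append &&
           op == MODEL_OP_ZONE_APPEND;
}
```

In this model the answer depends only on the opcode and the device, never on who submitted the request, which is the point of the question above.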
> In the context of your patch series, ELEVATOR_F_ZBD_SEQ_WRITE should be set only
> and only if the drive does not have native zone append support.
Sure, I can keep it that way once I get it right. If it is really not
required for a native-append drive, it should not be set at the place
where I added it.
> And even in that
> case, since for an emulated zone append the zone write lock is taken and
> released by the emulation driver itself, ELEVATOR_F_ZBD_SEQ_WRITE is required
> only if the user will also be issuing regular writes at high QD. And that is
> trivially controllable by the user by simply setting the drive elevator to
> mq-deadline. Conclusion: setting ELEVATOR_F_ZBD_SEQ_WRITE is not needed.
Are we saying applications should switch schedulers based on the write
QD (use any scheduler for QD1 and mq-deadline for QD-N)?
Even if an application does that, it does not know what other
applications would be doing. That seems hard to get right and possible
only in a tightly controlled environment.
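For reference, the per-device switch being discussed is a write to the queue/scheduler sysfs attribute. A minimal sketch follows; the helper name set_elevator() is hypothetical, and the device name in the usage note is only an example.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write a scheduler name (e.g. "mq-deadline")
 * into the given attribute file.
 * Returns 0 on success and -1 on any open, write or close failure.
 */
static int set_elevator(const char *attr_path, const char *name)
{
    FILE *f = fopen(attr_path, "w");
    size_t len = strlen(name);
    int rc = 0;

    if (!f)
        return -1;
    if (fwrite(name, 1, len, f) != len)
        rc = -1;
    /* sysfs may report a rejected value only when the buffer is flushed. */
    if (fclose(f) == EOF)
        rc = -1;
    return rc;
}
```

It would be called as set_elevator("/sys/block/nvme0n1/queue/scheduler", "mq-deadline") and needs sufficient privileges. The setting applies to the whole device, so it affects every application using that device and not only the caller.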
--
Joshi