Message-ID: <yq1k1621x3x.fsf@oracle.com>
Date: Tue, 07 Jan 2020 21:49:06 -0500
From: "Martin K. Petersen" <martin.petersen@...cle.com>
To: Kirill Tkhai <ktkhai@...tuozzo.com>
Cc: "Martin K. Petersen" <martin.petersen@...cle.com>,
"axboe\@kernel.dk" <axboe@...nel.dk>,
"linux-block\@vger.kernel.org" <linux-block@...r.kernel.org>,
"linux-kernel\@vger.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-ext4\@vger.kernel.org" <linux-ext4@...r.kernel.org>,
"tytso\@mit.edu" <tytso@....edu>,
"adilger.kernel\@dilger.ca" <adilger.kernel@...ger.ca>,
"ming.lei\@redhat.com" <ming.lei@...hat.com>,
"osandov\@fb.com" <osandov@...com>,
"jthumshirn\@suse.de" <jthumshirn@...e.de>,
"minwoo.im.dev\@gmail.com" <minwoo.im.dev@...il.com>,
"damien.lemoal\@wdc.com" <damien.lemoal@....com>,
"andrea.parri\@amarulasolutions.com"
<andrea.parri@...rulasolutions.com>,
"hare\@suse.com" <hare@...e.com>, "tj\@kernel.org" <tj@...nel.org>,
"ajay.joshi\@wdc.com" <ajay.joshi@....com>,
"sagi\@grimberg.me" <sagi@...mberg.me>,
"dsterba\@suse.com" <dsterba@...e.com>,
"chaitanya.kulkarni\@wdc.com" <chaitanya.kulkarni@....com>,
"bvanassche\@acm.org" <bvanassche@....org>,
"dhowells\@redhat.com" <dhowells@...hat.com>,
"asml.silence\@gmail.com" <asml.silence@...il.com>
Subject: Re: [PATCH RFC 1/3] block: Add support for REQ_OP_ASSIGN_RANGE operation
Kirill,
>> Correct. We shouldn't go down this path unless a device is thinly
>> provisioned (i.e. max_discard_sectors > 0).
>
> (I assumed it is a typo, and you mean max_allocate_sectors like below).
No, this was in the context of not having an explicit queue limit for
allocation. If a device does not have max_discard_sectors > 0 then it is
not thinly provisioned and therefore attempting allocation makes no
sense.
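
To illustrate the check described above, here is a minimal sketch in plain C. It is not kernel code: struct thin_limits and allocation_makes_sense() are hypothetical names that only model the max_discard_sectors queue limit, on the assumption that discard support is used as the proxy for thin provisioning while no explicit allocation limit exists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the block layer's queue limits. */
struct thin_limits {
	unsigned int max_discard_sectors; /* 0: device cannot discard */
};

/*
 * Assumption for this sketch: without an explicit allocation limit,
 * a non-zero discard limit is the only hint that the device is thinly
 * provisioned, so allocation is only attempted in that case.
 */
bool allocation_makes_sense(const struct thin_limits *lim)
{
	assert(lim != NULL);
	return lim->max_discard_sectors > 0;
}
```

A device reporting max_discard_sectors == 0 would be treated as fully provisioned here, and the allocation request would be skipped.
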
>> I don't like "write_zeroes_can_allocate" because that makes assumptions
>> about WRITE ZEROES being the command of choice. I suggest we call it
>> "max_allocate_sectors" to mirror "max_discard_sectors". I.e. put
>> emphasis on the semantic operation and not the plumbing.
>
> Hm. Do you mean "bool max_allocate_sectors" or "unsigned int max_allocate_sectors"?
unsigned int. At least for SCSI we could have a device which would use
UNMAP for discards and WRITE SAME for allocates. And therefore the range
limit could be different for the two operations. Sadly.
I have a patch in the pipeline which deals with some problems in this
department because some devices have a split brain wrt. their discard
limits.
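
As a rough sketch of why the limit wants to be an unsigned int rather than a bool: if discards and allocates map to different commands, each operation needs its own range limit. The types and names below (range_limits, range_op, range_max_sectors) are hypothetical and only model the idea; max_allocate_sectors is the field proposed in this thread, not an existing one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operation selector for this sketch only. */
enum range_op {
	RANGE_OP_DISCARD,
	RANGE_OP_ALLOCATE,
};

/* Hypothetical, simplified model of per-operation queue limits. */
struct range_limits {
	unsigned int max_discard_sectors;  /* e.g. bounded by UNMAP */
	unsigned int max_allocate_sectors; /* e.g. bounded by WRITE SAME */
};

/* Return the largest range, in sectors, that one request may cover. */
unsigned int range_max_sectors(const struct range_limits *lim,
			       enum range_op op)
{
	assert(lim != NULL);
	switch (op) {
	case RANGE_OP_DISCARD:
		return lim->max_discard_sectors;
	case RANGE_OP_ALLOCATE:
		return lim->max_allocate_sectors;
	}
	return 0; /* unknown operation: nothing may be issued */
}
```

A bool could only say whether allocation is supported; it could not express that the allocate limit differs from the discard limit on the same device.
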
> In the second case we should make all the
> q->limits.max_write_zeroes_sectors dereferencing as switches like the
> below (this is a partial patch and only several places are
> converted to switches as examples):
Something like that, yes.
This is getting a bit messy :( However, I am not sure that scattering
REQ_OP_ALLOCATE all over the I/O stack is particularly attractive
either.
Both REQ_OP_DISCARD and REQ_OP_WRITE_SAME come with some storage
protocol baggage that forces us to have special handling all over the
stack. But REQ_OP_WRITE_ZEROES is fairly clean and simple and, except
for the potentially different block count limit, an allocate operation
would be a carbon copy of the plumbing for write zeroes. A lot of
duplication.
So even though I'm increasingly torn on whether introducing separate
REQ_OP_ALLOCATE plumbing throughout the stack or having a REQ_ALLOCATE
flag for REQ_OP_WRITE_ZEROES is best, I still think I'm leaning towards
the latter. That will also make it easier for me in the SCSI disk
driver.
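
A minimal sketch of the flag-based variant, in plain C rather than kernel code. Everything here is hypothetical: SKETCH_REQ_ALLOCATE stands in for the proposed REQ_ALLOCATE flag with an arbitrary bit value, and struct zeroes_limits only models the two limits involved. The point is that the write zeroes plumbing is reused and only the limit lookup changes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bit; the real flag encoding would differ. */
#define SKETCH_REQ_ALLOCATE (1u << 0)

/* Hypothetical, simplified model of the relevant queue limits. */
struct zeroes_limits {
	unsigned int max_write_zeroes_sectors;
	unsigned int max_allocate_sectors; /* proposed in this thread */
};

/*
 * Pick the block count limit for a write zeroes request. With the
 * allocate flag set, the (potentially different) allocate limit applies;
 * otherwise the ordinary write zeroes limit is used.
 */
unsigned int zeroes_max_sectors(const struct zeroes_limits *lim,
				unsigned int flags)
{
	assert(lim != NULL);
	if (flags & SKETCH_REQ_ALLOCATE)
		return lim->max_allocate_sectors;
	return lim->max_write_zeroes_sectors;
}
```

With a separate operation instead of a flag, the same distinction would have to be repeated at every place that currently switches on the write zeroes operation.
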
--
Martin K. Petersen Oracle Linux Engineering