[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3177256.oiGErgHkdL@opensuse>
Date: Wed, 17 Aug 2022 14:02:54 +0200
From: "Fabio M. De Francesco" <fmdefrancesco@...il.com>
To: Keith Busch <kbusch@...nel.org>, Sagi Grimberg <sagi@...mberg.me>,
Christoph Hellwig <hch@....de>,
Chaitanya Kulkarni <kch@...dia.com>
Cc: linux-nvme@...ts.infradead.org, linux-kernel@...r.kernel.org,
Ira Weiny <ira.weiny@...el.com>
Subject: Re: [RFC PATCH] nvmet-tcp: Don't kmap() pages which can't come from HIGHMEM
On mercoledì 17 agosto 2022 11:44:09 CEST Sagi Grimberg wrote:
> >> Therefore, I have two questions: am I right about thinking that the pages
> >> mapped in nvmet_tcp_map_pdu_iovec() are allocated with GFP_KERNEL?
> >
> > I think you are correct.
>
> It is correct. It is the same model for the linux scsi target, sunrpc
> etc.
I'll try to address the comments from the two last messages from Keith and
Sagi with this email (I replied yesterday to Chaitanya).
First of all: good to know that it is the same model for other subsystem. This
is useful to know. Thanks!
> >> If so, can anyone with more knowledge than mine please say if my changes
> >> make
> >> any sense?
> >
> > I think it does make sense.
Thanks, I'm glad I was not wrong :-)
> > I like the code simplification, though this use
> > was't really paying the kmap penalty since, as you mentioned, this is
never
> > highmem.
Correct, however everybody like code simplification. I added a couple of
sentences to kmap_local_page() documentation in highmem.rst. They clearly
state that, when users know that pages cannot come from Highmem, they may
better prefer page_address().
The changes to nvmet-tcp started with trying to convert kmap() / kunmap() to
kmap_local_page() /kunmap_local(), but it ended up to code shortening and
simplification with a plain use of page_address().
Obviously, due to my little experience with kernel developing and less than
little knowledge of this protocol I had to ask whether or not I was right in
identifying the site of the allocations.
The reasons why I had to use page_address() will be clearer reading what
follows...
> Yes, its the same code-path. Would be great if we still had an
> abstraction that would do the right thing regardless of highmem or
> not like kmap provides though.
It would be great and it is already possible (this is why Thomas Gleixner
created this kmap_local_page() API) but here we have a huge issue. kmap() and
kmap_atomic() have recently been deprecated and they shouldn't any longer be
used in new code: https://lore.kernel.org/all/20220813220034.806698-1-ira.weiny@intel.com/ ("[PATCH] checkpatch: Add kmap and kmap_atomic to the
deprecated list").
kmap_local_page() always does the right thing: users can call it with or
without HIGHMEM enabled, in-atomic (also in interrupts) or in preemptible
contexts, they can take page faults.
It doesn't require global lock for synchronization and doesn't require global
TLB invalidation when the kmap's pool wraps and doesn't block waiting for free
slots.
Nice, isn't it?
However, with nvmet-tcp we cannot easily use kmap_local_page() because it
comes with a major problem: it's local to the thread. If users handed the
kernel virtual addresses returned by this function to other threads, the
pointers would be invalid.
Here kmap() and kunmap() call sites are in two different workqueues.
Therefore, there is no way to convert kmap() to kmap_local_page(), unless this
code is heavily refactored.
Knowing that the pages cannot come from Highmem avoids this refactoring and in
the meantime it allows us to delete the kmap() and kunmap() calls sites.
> > You should also remove the cmd's 'nr_mapped' field while you're at it,
> > otherwise you'll hit the WARN in nvmet_tcp_free_cmd_buffers().
>
> Not remove nr_mapped because we use it to know the iovec entries, but
> we can just remove the WARN statement.
Ah, OK. I'll take care of this too. That was not my first concern when I did
the RFC. The "real" patch must also address this detail.
@Chaitanya:
Since this is a mere simplification and shorten of code, I suppose I can skip
the performance tests. Ira and I have still hundreds of call sites with kmap()
and kmap_atomic() which we should care of, therefore we prefer to leave alone
everything that is not strictly necessary for the deprecated API deletions.
Thanks to you all,
Fabio
Powered by blists - more mailing lists