[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <27280395.gRfpFWEtPU@localhost.localdomain>
Date: Fri, 23 Sep 2022 17:05:43 +0200
From: "Fabio M. De Francesco" <fmdefrancesco@...il.com>
To: Anirudh Venkataramanan <anirudh.venkataramanan@...el.com>,
Ira Weiny <ira.weiny@...el.com>
Cc: Alexander Duyck <alexander.duyck@...il.com>,
Jakub Kicinski <kuba@...nel.org>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Netdev <netdev@...r.kernel.org>,
Maciej Fijalkowski <maciej.fijalkowski@...el.com>,
Jesper Dangaard Brouer <hawk@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
intel-wired-lan <intel-wired-lan@...ts.osuosl.org>,
Alexander Duyck <alexanderduyck@...com>,
John Fastabend <john.fastabend@...il.com>,
Jesse Brandeburg <jesse.brandeburg@...el.com>,
Alexei Starovoitov <ast@...nel.org>, bpf <bpf@...r.kernel.org>,
Paolo Abeni <pabeni@...hat.com>,
LKML <linux-kernel@...r.kernel.org>,
Tony Nguyen <anthony.l.nguyen@...el.com>
Subject: Re: [Intel-wired-lan] [PATCH] ixgbe: Use kmap_local_page in ixgbe_check_lbtest_frame()
Hi Anirudh,
On Friday, September 23, 2022 12:38:02 AM CEST Anirudh Venkataramanan wrote:
> On 9/22/2022 1:58 PM, Alexander Duyck wrote:
> > On Thu, Sep 22, 2022 at 1:07 PM Anirudh Venkataramanan
> > <anirudh.venkataramanan@...el.com> wrote:
> >>
> >>
> >> Following Fabio's patches, I made similar changes for e1000/e1000e and
> >> submitted them to IWL [1].
I saw your patches and they look good to me. I might comment and probably
review them, however I prefer to wait for Ira to do that. Furthermore, looking
again at your patches made me recall that I need to talk with him about
something that is only indirectly related with you work.
Please don't rely on older patches of mine as models for your next patches. In
the last months I changed many things in the way I handle the removal of
kmap() in favour of a plain page_address() or decide to convert to
kmap_local_page(). Obviously I'm talking about pages which cannot come from
ZONE_HIGHMEM.
> >> Yesterday, Ira Weiny pointed me to some feedback from Dave Hansen on the
> >> use of page_address() [2]. My understanding of this feedback is that
> >> it's safer to use kmap_local_page() instead of page_address(), because
> >> you don't always know how the underlying page was allocated.
Your understanding of Dave's message is absolutely correct.
> >> This approach (of using kmap_local_page() instead of page_address())
> >> makes sense to me. Any reason not to go this way?
> >> [1]
> >>
> >> https://patchwork.ozlabs.org/project/intel-wired-lan/patch/
20220919180949.388785-1-anirudh.venkataramanan@...el.com/
> >>
> >> https://patchwork.ozlabs.org/project/intel-wired-lan/patch/
20220919180949.388785-2-anirudh.venkataramanan@...el.com/
> >>
> >> [2]
> >> https://lore.kernel.org/lkml/5d667258-b58b-3d28-3609-e7914c99b31b@intel.com/
> >>
> >> Ani
> >
> > For the two patches you referenced the driver is the one allocating
> > the pages. So in such a case the page_address should be acceptable.
> > Specifically we are falling into alloc_page(GFP_ATOMIC) which should
> > fall into the first case that Dave Hansen called out.
>
> Right. However, I did run into a case in the chelsio inline crypto
> driver where it seems like the pages are allocated outside the driver.
> In such cases, kmap_local_page() would be the right approach, as the
> driver can't make assumptions on how the page was allocated.
The mere fact that we are still discussing this particular topic is my only
fault. I mean that the guidelines about what to do with ZONE_NORMAL or lower
pages is not enough clear. I'll have to improve that paragraph.
For now let me tell you what I'm doing whenever I have to decide between a
conversion from kmap{,_atomic}() to kmap_local_page() or a kmap() removal in
favour of page_address() use.
> ... and this makes me wonder why not just use kmap_local_page() even in
> cases where the page allocation was done in the driver. IMO, this is
> simpler because
>
> a) you don't have to care how a page was allocated. kmap_local_page()
> will create a temporary mapping if required, if not it just becomes a
> wrapper to page_address().
>
> b) should a future patch change the allocation to be from highmem, you
> don't have to change a bunch of page_address() calls to be
> kmap_local_page().
"a" and "b" are good arguments with sound logic. However there are a couple of
cases that you are not yet considering.
As my main rule I prefer the use of kmap_local_page() whenever tracking if
pages can't come from Highmem is complex, especially when allocation is
performed in other translation units of the same driver or, worse, pages come
from different subsystems.
Instead, I don't like to use kmap_local_page() when the allocation is in the
same function and you see immediately that it cannot come from ZONE_HIGHMEM.
Sometimes it's so clear that using kmap_local_page() looks silly to me :-)
For example...
void *silly_alloc_and_map() {
struct *page;
page = alloc_page(GFP_KERNEL);
return kmap_local_page(page);
}
In this case you know without any effort that the page cannot come from
ZONE_HIGHMEM. Therefore, why bother with mapping and unmapping (and perhaps
write a function for unmapping).
While working on the removals or the conversions of kmap(), I noticed that
people tend to forget to call kunmap(). We have a limited amount of kmap()
slots. If the mapping space is fully utilized we'll have the next slot
available only after reboot or unloading and reloading a module.
If I recall correctly, with kmap_local_page() we can map a maximum of 16 pages
per task_struct. Therefore, limits are everywhere and people tend to leak
resources.
To summarize: whenever allocation is easily trackable, and pages cannot come
from ZONE_HIGHMEM, I prefer page_address().
Honestly, if code is well designed I don't care whether or not within 5 days
or 10 years decide to change the allocation. I think it's like to refrain from
deleting unreachable code, variables, partially implemented functions, and so
on just because one day someone may think to make something useful from those
things.
Greg K-H taught me that I must see the code as is now and don't speculate
about possible future scenarios. I agree with him in full :-)
Very different case where I _need_ page_address() are due to the strict rules
of nesting mapping and unmapping-mapping. I recall that I spent days on a
function in Btrfs because I could not map and unmap with the usual Last In -
First Out (LIFO) rule.
A function was so complex and convoluted that nobody could know in advance the
order of execution of the mappings of two pages. Lots of goto, breaks, loops
made impossible to unmap in the correct order at the "clean and exit" label.
I made a first attempt using a two element array as a stack which registered
the mappings and then I used it to unmap in the correct order at exit.
It was intentionally a means to draw the attention of the maintainers. One of
them proposed to split that very complex function in several helpers, and
isolate the mappings one by one. It was OK to me.
After weeks, David Sterba noticed that he knew that one of the pages came from
the page cache and we had to map it, but the second page was allocated inside
Btrfs with GFP_KERNEL. Therefore, the better suited solution was to use
kmap_local_page() for the first and page address() for the second.
My stack based solution was working but nobody should write such an ugly code
just to enforce local mapping :-)
> Is using page_address() directly beneficial in some way?
A possible call chain on 32 bits kernels is the following:
kmap_local_page() ->
__kmap_local_page_prot() {
if (!IS_ENABLED(CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP) && |
PageHighMem(page))
return page_address(page);
....
}
How many instructions can you save calling page_address() directly?
If you don't know, look at the assembly.
Thanks,
Fabio
> Ani
>
>
Powered by blists - more mailing lists