netdev - Re: [RFC 00/12] net: huge page backed page

Message-ID: <20230712100108.00bee44f@kernel.org>
Date: Wed, 12 Jul 2023 10:01:08 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: Jesper Dangaard Brouer <jbrouer@...hat.com>
Cc: Yunsheng Lin <linyunsheng@...wei.com>, brouer@...hat.com,
 netdev@...r.kernel.org, almasrymina@...gle.com, hawk@...nel.org,
 ilias.apalodimas@...aro.org, edumazet@...gle.com, dsahern@...il.com,
 michael.chan@...adcom.com, willemb@...gle.com
Subject: Re: [RFC 00/12] net: huge page backed page_pool

On Wed, 12 Jul 2023 14:43:32 +0200 Jesper Dangaard Brouer wrote:
> On 12/07/2023 13.47, Yunsheng Lin wrote:
> > On 2023/7/12 8:08, Jakub Kicinski wrote:  
> >> Oh, I split the page into individual 4k pages after DMA mapping.
> >> There's no need for the host memory to be a huge page. I mean,
> >> the actual kernel identity mapping is a huge page AFAIU, and the
> >> struct pages are allocated, anyway. We just need it to be a huge
> >> page at DMA mapping time.
> >>
> >> So the pages from the huge page provider only differ from normal
> >> alloc_page() pages by the fact that they are a part of a 1G DMA
> >> mapping.  
> 
> So, Jakub you are saying the PP refcnt's are still done "as usual" on 
> individual pages.

Yes - other than coming from a specific 1G of physical memory 
the resulting pages are really pretty ordinary 4k pages.
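
As an illustration of the point above, here is a minimal userspace sketch of
how per-page DMA addresses could be derived once a 1G region has been mapped
in a single call. The struct and function names (huge_mapping,
huge_page_dma_addr) are hypothetical and are not taken from the RFC patches;
the sketch only shows that each 4k page's DMA address is a fixed offset from
the base of the one large mapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_4K 4096ULL
#define SZ_1G (1ULL << 30)

/* Hypothetical model of one 1G region that was DMA mapped once. */
struct huge_mapping {
	uint64_t dma_base; /* device address returned by the single mapping call */
};

/* Number of 4k pages carved out of one 1G mapping. */
static inline size_t huge_nr_pages(void)
{
	return (size_t)(SZ_1G / SZ_4K);
}

/*
 * DMA address of the idx-th 4k page. No per-page mapping call is made:
 * the address is just an offset into the existing 1G mapping.
 */
static inline uint64_t huge_page_dma_addr(const struct huge_mapping *m,
					  size_t idx)
{
	assert(idx < huge_nr_pages());
	return m->dma_base + (uint64_t)idx * SZ_4K;
}
```

In this model the pages handed out afterwards carry no other special state,
which matches the description of them as ordinary 4k pages.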

> > If it is about DMA mapping, is it possible to use dma_map_sg()
> > to enable a big continuous dma map for a lot of discontinuous
> > 4k pages to avoid allocating big huge page?
> > 
> > As the comment:
> > "The scatter gather list elements are merged together (if possible)
> > and tagged with the appropriate dma address and length."
> > 
> > https://elixir.free-electrons.com/linux/v4.16.18/source/arch/arm/mm/dma-mapping.c#L1805
> >   
> 
> This is interesting for two reasons.
> 
> (1) if this DMA merging helps IOTLB misses (?)

Maybe I misunderstand how IOMMU / virtual addressing works, but I don't
see how one can merge mappings from physically non-contiguous pages.
IOW we can't get 1G-worth of random 4k pages and hope that through some
magic they get strung together and share an IOTLB entry (if that's
where Yunsheng's suggestion was going).
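
The constraint described above can be sketched as a small userspace check.
It assumes a simplified model in which one large translation entry can cover
a set of 4k pages only if they are physically contiguous, start on a boundary
aligned to the entry size, and add up to exactly that size. The helper names
are hypothetical, and real IOMMUs support only specific entry sizes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SZ 4096ULL

/* True if the n 4k pages listed in phys[] are physically contiguous. */
static bool phys_contiguous(const uint64_t *phys, size_t n)
{
	for (size_t i = 1; i < n; i++)
		if (phys[i] != phys[i - 1] + PAGE_SZ)
			return false;
	return true;
}

/*
 * True if the pages could be covered by a single translation entry of
 * entry_sz bytes in this simplified model: exact size, aligned start,
 * and physically contiguous.
 */
static bool fits_one_entry(const uint64_t *phys, size_t n, uint64_t entry_sz)
{
	if (n == 0 || (uint64_t)n * PAGE_SZ != entry_sz)
		return false;
	if (phys[0] % entry_sz != 0)
		return false;
	return phys_contiguous(phys, n);
}
```

Under this model, randomly scattered 4k pages fail the check regardless of
how their device-side addresses are arranged, so they would still need one
entry per page.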

> (2) PP could use dma_map_sg() to amortize dma_map call cost.
> 
> For case (2) __page_pool_alloc_pages_slow() already does bulk allocation
> of pages (alloc_pages_bulk_array_node()), and then loops over the pages
> to DMA map them individually.  It seems like an obvious win to use
> dma_map_sg() here?

That could well be worth investigating!
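
For reference, the "merged together (if possible)" behaviour quoted earlier
can be modelled in userspace as follows. This is a sketch only, not the
implementation of dma_map_sg(): it assumes no IOMMU and merges only pages
that happen to be physically adjacent, and the names (struct seg,
merge_pages) are hypothetical. What a real DMA backend merges depends on the
architecture and on whether an IOMMU is present.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SZ 4096ULL

/* One merged segment: start address and length in bytes. */
struct seg {
	uint64_t addr;
	uint64_t len;
};

/*
 * Coalesce physically adjacent 4k pages into segments, in input order.
 * out[] must have room for n entries. Returns the number of segments.
 */
static size_t merge_pages(const uint64_t *phys, size_t n, struct seg *out)
{
	size_t nseg = 0;

	for (size_t i = 0; i < n; i++) {
		if (nseg && out[nseg - 1].addr + out[nseg - 1].len == phys[i]) {
			/* Page directly follows the previous segment: extend it. */
			out[nseg - 1].len += PAGE_SZ;
		} else {
			/* Start a new segment at this page. */
			out[nseg].addr = phys[i];
			out[nseg].len = PAGE_SZ;
			nseg++;
		}
	}
	return nseg;
}
```

A bulk allocation that returns mostly adjacent pages would collapse into a
few segments here, while scattered pages stay at one segment per page, so any
saving comes from issuing one mapping call for the batch rather than from
merging.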