lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAC_iWjL4YfCOffAZPUun5wggxrqAanjd+8SgmJQN0yyWsvb3sg@mail.gmail.com>
Date:   Fri, 18 Aug 2023 09:12:09 +0300
From:   Ilias Apalodimas <ilias.apalodimas@...aro.org>
To:     Jakub Kicinski <kuba@...nel.org>
Cc:     Mina Almasry <almasrymina@...gle.com>,
        Yunsheng Lin <linyunsheng@...wei.com>, davem@...emloft.net,
        pabeni@...hat.com, netdev@...r.kernel.org,
        linux-kernel@...r.kernel.org,
        Lorenzo Bianconi <lorenzo@...nel.org>,
        Alexander Duyck <alexander.duyck@...il.com>,
        Liang Chen <liangchen.linux@...il.com>,
        Alexander Lobakin <aleksander.lobakin@...el.com>,
        Saeed Mahameed <saeedm@...dia.com>,
        Leon Romanovsky <leon@...nel.org>,
        Eric Dumazet <edumazet@...gle.com>,
        Jesper Dangaard Brouer <hawk@...nel.org>
Subject: Re: [PATCH net-next v7 1/6] page_pool: frag API support for 32-bit
 arch with 64-bit DMA

On Fri, 18 Aug 2023 at 02:57, Jakub Kicinski <kuba@...nel.org> wrote:
>
> On Thu, 17 Aug 2023 19:59:37 +0300 Ilias Apalodimas wrote:
> > > Can we assume the DMA mapping of page pool is page aligned? We should
> > > be, right?
> >
> > Yes
> >
> > > That means we're storing 12 bits of 0 at the lower end.
> > > So even with 32b of space we can easily store addresses for 32b+12b =>
> > > 16TB of memory. "Ought to be enough" to paraphrase Bill G, and the
> > > problem is only in our heads?
> >
> > Do you mean moving the pp_frag_count there?
>
> Right, IIUC we don't have enough space to fit dma_addr_t and the
> refcount, but if we store the dma addr on a shifted u32 instead
> of using dma_addr_t explicitly - the refcount should fit?

struct page looks like this:

unsigned long dma_addr;
union {
      unsigned long dma_addr_upper;
      atomic_long_t pp_frag_count;
};

So, on 32bit platforms with 64bit dma we can't support a frag count at all.
We could either use the lower 12 bits (and have support for 4096 frags
'only') or do what you suggest.
TBH I don't love any of these and since those platforms are rare (or
at least that's what I think), I prefer not supporting them at all.

>
> > I was questioning the need to have PP_FLAG_PAGE_SPLIT_IN_DRIVER
> > overall.  With Yunshengs patches such a platform would allocate a
> > page, so why should we prevent it from splitting it internally?
>
> Splitting it is fine, the problem is that the refcount AKA
> page->pp_frag_count** counts outstanding PP-aware references
> and page->refcount counts PP-unaware references.
>
> If we want to use page->refcount directly we'd need to unmap
> the page whenever drivers calls page_pool_defrag_page().
> But the driver may assume the page is still mapped afterwards.

What I am suggesting here is to not add the new
PP_FLAG_PAGE_SPLIT_IN_DRIVER flag.  If a driver wants to split pages
internally it should create a pool without
PP_FLAG_DMA_SYNC_DEV to begin with.  The only responsibility the
driver would have is to elevate the page refcnt so page pool would not
try to free/recycle it.  Since it won't be able to allocate fragments
we don't have to worry about the rest.

> We can change the API to make this behavior explicit. Although
> IMHO that's putting the burden of rare platforms on non-rare
> platforms which we should avoid.

Yep, agree here.

>
> ** I said it before and I will keep saying this until someone gets
>    angry at me - I really think we should rename this field because
>    the association with frags is a coincidence.

I had similar confusions everytime I had to re-read our code hence git
show 4d4266e3fd32
Any suggestions?

Thanks
/Ilias

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ