Date: Tue, 14 Nov 2023 20:49:08 +0800
From: Yunsheng Lin <linyunsheng@...wei.com>
To: Mina Almasry <almasrymina@...gle.com>
CC: Jakub Kicinski <kuba@...nel.org>, <davem@...emloft.net>,
	<pabeni@...hat.com>, <netdev@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>, Willem de Bruijn <willemb@...gle.com>,
	Kaiyuan Zhang <kaiyuanz@...gle.com>, Jesper Dangaard Brouer
	<hawk@...nel.org>, Ilias Apalodimas <ilias.apalodimas@...aro.org>, Eric
 Dumazet <edumazet@...gle.com>, Christian König
	<christian.koenig@....com>, Jason Gunthorpe <jgg@...dia.com>, Matthew Wilcox
	<willy@...radead.org>, Linux-MM <linux-mm@...ck.org>
Subject: Re: [PATCH RFC 3/8] memory-provider: dmabuf devmem memory provider

On 2023/11/14 20:21, Mina Almasry wrote:
> On Tue, Nov 14, 2023 at 12:23 AM Yunsheng Lin <linyunsheng@...wei.com> wrote:
>>
>> +cc Christian, Jason and Willy
>>
>> On 2023/11/14 7:05, Jakub Kicinski wrote:
>>> On Mon, 13 Nov 2023 05:42:16 -0800 Mina Almasry wrote:
>>>> You're doing exactly what I think you're doing, and what was nacked in RFC v1.
>>>>
>>>> You've converted 'struct page_pool_iov' to essentially become a
>>>> duplicate of 'struct page'. Then, you're casting page_pool_iov* into
>>>> struct page* in mp_dmabuf_devmem_alloc_pages(), then, you're calling
>>>> mm APIs like page_ref_*() on the page_pool_iov* because you've fooled
>>>> the mm stack into thinking dma-buf memory is a struct page.
>>
>> Yes, something like above, but I am not sure about the 'fooled the mm
>> stack into thinking dma-buf memory is a struct page' part, because:
>> 1. We never let the 'struct page' for devmem leak out of the net stack,
>>    thanks to the 'not kmap()able and not readable' checking in your patchset.
> 
> RFC never used dma-buf pages outside the net stack, so that is the same.
> 
> You are not able to get rid of the 'not kmap()able and not readable'
> checking with this approach, because dma-buf memory is fundamentally
> unkmapable and unreadable. This approach would still need
> skb_frags_not_readable checks in net stack, so that is also the same.

Yes, I agree that the checking is still needed whatever the proposal is.

> 
>> 2. We initialize page->_refcount for devmem to one and it remains as one,
>>    we will never call page_ref_inc()/page_ref_dec()/get_page()/put_page(),
>>    instead, we use page pool's pp_frag_count to do reference counting for
>>    devmem page in patch 6.
>>
> 
> I'm not sure that moves the needle in terms of allowing dma-buf
> memory to look like struct pages.
> 
>>>>
>>>> RFC v1 was almost exactly the same, except instead of creating a
>>>> duplicate definition of struct page, it just allocated 'struct page'
>>>> instead of allocating another struct that is identical to struct page
>>>> and casting it into struct page.
>>
>> Perhaps it is more accurate to say this is something between RFC v1 and
>> RFC v3, in order to decouple 'struct page' for devmem from mm subsystem,
>> but still have most unified handling for both normal memory and devmem
>> in page pool and net stack.
>>
>> The main difference between this patchset and RFC v1:
>> 1. The mm subsystem is not supposed to see the 'struct page' for devmem
>>    in this patchset, I guess we could say it is decoupled from the mm
>>    subsystem even though we still call PageTail()/page_ref_count()/
>>    page_is_pfmemalloc() on 'struct page' for devmem.
>>
> 
> In this patchset you pretty much allocate a struct page for your
> dma-buf memory, and then cast it into a struct page, so all the mm
> calls in page_pool.c are seeing a struct page when it's really dma-buf
> memory.
> 
> 'even though we still call
> PageTail()/page_ref_count()/page_is_pfmemalloc() on 'struct page' for
> devmem' is basically making dma-buf memory look like struct pages.
> 
> Actually because you put the 'struct page for devmem' in
> skb->bv_frag, the net stack will grab the 'struct page' for devmem
> using skb_frag_page() then call things like page_address(), kmap,
> get_page, put_page, etc, etc, etc.

Yes, as above, skb_frags_not_readable() checking is still needed for
kmap() and page_address().

Calls to get_page() and put_page() are avoided in page_pool_frag_ref()
and napi_pp_put_page() for a devmem page, because the following check is
true for a devmem page:
(pp_iov->pp_magic & ~0x3UL) == PP_SIGNATURE
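
To make the check concrete, here is a minimal standalone sketch of the
signature test. It is not the patchset code: 'struct pp_iov_sketch',
'PP_SIGNATURE_SKETCH' and 'is_pp_owned()' are hypothetical names, and the
0x40 value is only a stand-in, since the kernel derives PP_SIGNATURE from
a poison-pointer constant. The idea is that the low two bits of pp_magic
are masked off before the comparison, so flag bits stored there do not
defeat the match.

```c
#include <stdbool.h>

/* Stand-in value for illustration only; not the kernel's PP_SIGNATURE. */
#define PP_SIGNATURE_SKETCH 0x40UL

/* Hypothetical, trimmed-down stand-in for the patchset's page_pool_iov. */
struct pp_iov_sketch {
	unsigned long pp_magic;
};

/*
 * Return true when the entry carries the page pool signature.
 * The low two bits are ignored so that flag bits kept in pp_magic
 * do not break the comparison.
 */
static inline bool is_pp_owned(const struct pp_iov_sketch *iov)
{
	return (iov->pp_magic & ~0x3UL) == PP_SIGNATURE_SKETCH;
}
```

A caller would take the page pool reference-counting path when
is_pp_owned() returns true and fall back to get_page()/put_page()
otherwise.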

> 
>> The main difference between this patchset and RFC v3:
>> 1. It reuses the 'struct page' to have more unified handling between
>>    normal page and devmem page for net stack.
> 
> This is what was nacked in RFC v1.
> 
>> 2. It relies on the page->pp_frag_count to do reference counting.
>>
> 
> I don't see you change any of the page_ref_* calls in page_pool.c, for
> example this one:
> 
> https://elixir.bootlin.com/linux/latest/source/net/core/page_pool.c#L601
> 
> So the reference the page_pool is seeing is actually page->_refcount,
> not page->pp_frag_count? I'm confused here. Is this a bug in the
> patchset?

For devmem, page->_refcount is the same as page_pool_iov->_refcount, which
is ensured by 'PAGE_POOL_MATCH(_refcount, _refcount);'.
page_pool_iov->_refcount is set to one in mp_dmabuf_devmem_alloc_pages()
by calling 'refcount_set(&ppiov->_refcount, 1)' and always remains one.

So the 'page_ref_count(page) == 1' check is always true for a devmem page.
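
As a rough illustration of the layout argument, the sketch below uses
two hypothetical structs and a compile-time offset check in the spirit
of PAGE_POOL_MATCH(). The struct layouts, the '_sketch' names and the
helper functions are assumptions made for this example and are not taken
from the patchset; only the idea (same offset for _refcount, value set
to one once and never changed) comes from the description above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily trimmed stand-in for struct page. */
struct page_sketch {
	unsigned long flags;
	unsigned long pp_magic;
	long pp_frag_count;
	int _refcount;
};

/* Hypothetical stand-in for page_pool_iov with a mirrored layout. */
struct page_pool_iov_sketch {
	unsigned long res0;
	unsigned long pp_magic;
	long pp_frag_count;
	int _refcount;
};

/* Fail the build if a field sits at different offsets in the two structs. */
#define PAGE_POOL_MATCH_SKETCH(pg, iov)					\
	_Static_assert(offsetof(struct page_sketch, pg) ==		\
		       offsetof(struct page_pool_iov_sketch, iov),	\
		       "field offsets must match")

PAGE_POOL_MATCH_SKETCH(pp_magic, pp_magic);
PAGE_POOL_MATCH_SKETCH(_refcount, _refcount);

/* Set the count to one at allocation time; nothing changes it later. */
static inline void devmem_iov_init(struct page_pool_iov_sketch *iov)
{
	iov->_refcount = 1;
}

/*
 * Read the count at the offset that struct page_sketch uses, which is
 * what code holding a 'struct page *' view of the object would see.
 */
static inline bool refcount_is_one_via_page_view(const struct page_pool_iov_sketch *iov)
{
	int ref;

	memcpy(&ref, (const char *)iov + offsetof(struct page_sketch, _refcount),
	       sizeof(ref));
	return ref == 1;
}
```

With matching offsets, a reader going through the page view observes
the value written through the iov view, so a check equivalent to
'page_ref_count(page) == 1' holds as long as nothing modifies the count.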
