Message-ID: <0d87c81b-6bf3-4e46-a2d1-a2f9f3a551dd@gmail.com>
Date: Thu, 15 Aug 2024 19:48:02 +0100
From: Usama Arif <usamaarif642@...il.com>
To: Yu Zhao <yuzhao@...gle.com>, Kefeng Wang <wangkefeng.wang@...wei.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Muchun Song <muchun.song@...ux.dev>,
"Matthew Wilcox (Oracle)" <willy@...radead.org>, Zi Yan <ziy@...dia.com>,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH mm-unstable v1 2/3] mm/cma: add cma_alloc_folio()
On 15/08/2024 19:25, Yu Zhao wrote:
> On Thu, Aug 15, 2024 at 12:04 PM Yu Zhao <yuzhao@...gle.com> wrote:
>>
>> On Thu, Aug 15, 2024 at 8:41 AM Kefeng Wang <wangkefeng.wang@...wei.com> wrote:
>>>
>>>
>>>
>>> On 2024/8/12 5:21, Yu Zhao wrote:
>>>> With alloc_contig_range() and free_contig_range() supporting large
>>>> folios, CMA can allocate and free large folios too, by
>>>> cma_alloc_folio() and cma_release().
>>>>
>>>> Signed-off-by: Yu Zhao <yuzhao@...gle.com>
>>>> ---
>>>> include/linux/cma.h | 1 +
>>>> mm/cma.c | 47 ++++++++++++++++++++++++++++++---------------
>>>> 2 files changed, 33 insertions(+), 15 deletions(-)
>>>>
>>>> diff --git a/include/linux/cma.h b/include/linux/cma.h
>>>> index 9db877506ea8..086553fbda73 100644
>>>> --- a/include/linux/cma.h
>>>> +++ b/include/linux/cma.h
>>>> @@ -46,6 +46,7 @@ extern int cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
>>>> struct cma **res_cma);
>>>> extern struct page *cma_alloc(struct cma *cma, unsigned long count, unsigned int align,
>>>> bool no_warn);
>>>> +extern struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp);
>>>> extern bool cma_pages_valid(struct cma *cma, const struct page *pages, unsigned long count);
>>>> extern bool cma_release(struct cma *cma, const struct page *pages, unsigned long count);
>>>>
>>>> diff --git a/mm/cma.c b/mm/cma.c
>>>> index 95d6950e177b..46feb06db8e7 100644
>>>> --- a/mm/cma.c
>>>> +++ b/mm/cma.c
>>>> @@ -403,18 +403,8 @@ static void cma_debug_show_areas(struct cma *cma)
>>>> spin_unlock_irq(&cma->lock);
>>>> }
>>>>
>>>> -/**
>>>> - * cma_alloc() - allocate pages from contiguous area
>>>> - * @cma: Contiguous memory region for which the allocation is performed.
>>>> - * @count: Requested number of pages.
>>>> - * @align: Requested alignment of pages (in PAGE_SIZE order).
>>>> - * @no_warn: Avoid printing message about failed allocation
>>>> - *
>>>> - * This function allocates part of contiguous memory on specific
>>>> - * contiguous memory area.
>>>> - */
>>>> -struct page *cma_alloc(struct cma *cma, unsigned long count,
>>>> - unsigned int align, bool no_warn)
>>>> +static struct page *__cma_alloc(struct cma *cma, unsigned long count,
>>>> + unsigned int align, gfp_t gfp)
>>>> {
>>>> unsigned long mask, offset;
>>>> unsigned long pfn = -1;
>>>> @@ -463,8 +453,7 @@ struct page *cma_alloc(struct cma *cma, unsigned long count,
>>>>
>>>> pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit);
>>>> mutex_lock(&cma_mutex);
>>>> - ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA,
>>>> - GFP_KERNEL | (no_warn ? __GFP_NOWARN : 0));
>>>> + ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA, gfp);
>>>> mutex_unlock(&cma_mutex);
>>>> if (ret == 0) {
>>>> page = pfn_to_page(pfn);
>>>> @@ -494,7 +483,7 @@ struct page *cma_alloc(struct cma *cma, unsigned long count,
>>>> page_kasan_tag_reset(nth_page(page, i));
>>>> }
>>>>
>>>> - if (ret && !no_warn) {
>>>> + if (ret && !(gfp & __GFP_NOWARN)) {
>>>> pr_err_ratelimited("%s: %s: alloc failed, req-size: %lu pages, ret: %d\n",
>>>> __func__, cma->name, count, ret);
>>>> cma_debug_show_areas(cma);
>>>> @@ -513,6 +502,34 @@ struct page *cma_alloc(struct cma *cma, unsigned long count,
>>>> return page;
>>>> }
>>>>
>>>> +/**
>>>> + * cma_alloc() - allocate pages from contiguous area
>>>> + * @cma: Contiguous memory region for which the allocation is performed.
>>>> + * @count: Requested number of pages.
>>>> + * @align: Requested alignment of pages (in PAGE_SIZE order).
>>>> + * @no_warn: Avoid printing message about failed allocation
>>>> + *
>>>> + * This function allocates part of contiguous memory on specific
>>>> + * contiguous memory area.
>>>> + */
>>>> +struct page *cma_alloc(struct cma *cma, unsigned long count,
>>>> + unsigned int align, bool no_warn)
>>>> +{
>>>> + return __cma_alloc(cma, count, align, GFP_KERNEL | (no_warn ? __GFP_NOWARN : 0));
>>>> +}
>>>> +
>>>> +struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp)
>>>> +{
>>>> + struct page *page;
>>>> +
>>>> + if (WARN_ON(order && !(gfp & __GFP_COMP)))
>>>> + return NULL;
>>>> +
>>>> + page = __cma_alloc(cma, 1 << order, order, gfp);
>>>> +
>>>> + return page ? page_folio(page) : NULL;
>>>
>>> We don't set large_rmappable for the CMA-allocated folio, which is
>>> inconsistent with other folio allocations, e.g.
>>> folio_alloc()/folio_alloc_mpol(). There is no issue for HugeTLB,
>>> since a HugeTLB folio must not be large_rmappable, but once we use
>>> this for mTHP/THP it needs some extra handling. Maybe we should set
>>> large_rmappable here and clear it in init_new_hugetlb_folio()?
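
(To make the suggestion above concrete: a rough, untested sketch of one way it could look. Whether folio_set_large_rmappable() can be called from mm/cma.c, and how hugetlb would later drop the flag in init_new_hugetlb_folio(), are assumptions of this sketch, not part of the posted patch.)

struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp)
{
	struct page *page;
	struct folio *folio;

	if (WARN_ON(order && !(gfp & __GFP_COMP)))
		return NULL;

	page = __cma_alloc(cma, 1 << order, order, gfp);
	if (!page)
		return NULL;

	folio = page_folio(page);
	/*
	 * Assumption per the comment above: treat the large folio like
	 * folio_alloc()/folio_alloc_mpol() do, so mTHP/THP callers get a
	 * deferred-splittable folio without extra handling on their side.
	 */
	if (order)
		folio_set_large_rmappable(folio);

	return folio;
}

hugetlb would then be the odd one out and would have to undo this in init_new_hugetlb_folio(), which is the extra handling discussed below.
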
>>
>> I want to hear what Matthew thinks about this.
>>
>> My opinion is that we don't want to couple large rmappable (or
>> deferred splittable) with __GFP_COMP, or, for that matter, with large
>> folios, because the former are specific to THPs whereas the latter can
>> potentially work for most types of high-order allocations.
>>
>> Again, IMO, if we want to seriously answer the question of
>> "Can we get rid of non-compound multi-page allocations?" [1]
>> then we should start planning to decouple large rmappable from the
>> generic folio allocation API.
>>
>> [1] https://lpc.events/event/18/sessions/184/#20240920
>
> Also along similar lines, Usama is trying to add PG_partially_mapped
> [1], and I have explicitly asked him not to introduce that flag to
> hugeTLB unless there are good reasons (none ATM).
>
> [1] https://lore.kernel.org/CAOUHufbmgwZwzUuHVvEDMqPGcsxE2hEreRZ4PhK5yz27GdK-Tw@mail.gmail.com/
PG_partially_mapped won't be cleared for hugeTLB in the next revision of the series, as suggested by Yu.
It's also not in the fix patch I posted at https://lore.kernel.org/all/4acdf2b7-ed65-4087-9806-8f4a187b4eb5@gmail.com/
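
For reference, a minimal, untested sketch of how a caller could use the new API from the quoted patch to allocate and free a large folio; the helper names, order value and GFP flags here are only illustrative:

#include <linux/cma.h>
#include <linux/gfp.h>
#include <linux/mm.h>

static struct folio *example_cma_alloc(struct cma *cma, int order)
{
	/* A non-zero order requires __GFP_COMP, otherwise cma_alloc_folio() refuses. */
	return cma_alloc_folio(cma, order, GFP_KERNEL | __GFP_COMP | __GFP_NOWARN);
}

static void example_cma_free(struct cma *cma, struct folio *folio, int order)
{
	/* The series keeps cma_release() for freeing: head page plus page count. */
	cma_release(cma, &folio->page, 1 << order);
}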