Message-ID: <854d4ec8-1eb9-3595-b867-3e50f5a0e6a8@redhat.com>
Date:   Wed, 17 Feb 2021 17:51:27 +0100
From:   David Hildenbrand <david@...hat.com>
To:     Minchan Kim <minchan@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>
Cc:     linux-mm <linux-mm@...ck.org>, LKML <linux-kernel@...r.kernel.org>,
        mhocko@...e.com, joaodias@...gle.com
Subject: Re: [PATCH] mm: be more verbose for alloc_contig_range failures

On 17.02.21 17:36, Minchan Kim wrote:
> alloc_contig_range is usually used on a CMA area or the movable zone.
> It's critical if page migration fails on those areas, so dump more
> debugging messages, as memory_hotplug does, unless the user
> specifies __GFP_NOWARN.
> 
> Signed-off-by: Minchan Kim <minchan@...nel.org>
> ---
>   mm/page_alloc.c | 16 +++++++++++++++-
>   1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 0b55c9c95364..67f3ee3a1528 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -8486,6 +8486,15 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
>   				NULL, (unsigned long)&mtc, cc->mode, MR_CONTIG_RANGE);
>   	}
>   	if (ret < 0) {
> +		if (!(cc->gfp_mask & __GFP_NOWARN)) {
> +			struct page *page;
> +
> +			list_for_each_entry(page, &cc->migratepages, lru) {
> +				pr_warn("migrating pfn %lx failed ret:%d ",
> +						page_to_pfn(page), ret);
> +				dump_page(page, "migration failure");
> +			}

This can create *a lot* of noise. For example, until huge pages are 
actually considered, we will choke on each and every huge page - and 
might do so over and over again.

This might be helpful for debugging, but it is unacceptable for 
production systems for now, I think. Maybe for now, do it based on 
CONFIG_DEBUG_VM.
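Something like the following, perhaps (just a sketch, not tested - the 
loop body is taken from your patch as-is):

```c
	if (ret < 0) {
		if (IS_ENABLED(CONFIG_DEBUG_VM) &&
		    !(cc->gfp_mask & __GFP_NOWARN)) {
			struct page *page;

			/* Only dump failed pages on debug builds, so
			 * production kernels stay quiet. */
			list_for_each_entry(page, &cc->migratepages, lru) {
				pr_warn("migrating pfn %lx failed ret:%d ",
						page_to_pfn(page), ret);
				dump_page(page, "migration failure");
			}
		}
		putback_movable_pages(&cc->migratepages);
		return ret;
	}
```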

> +		}
>   		putback_movable_pages(&cc->migratepages);
>   		return ret;
>   	}
> @@ -8728,6 +8737,8 @@ struct page *alloc_contig_pages(unsigned long nr_pages, gfp_t gfp_mask,
>   		pfn = ALIGN(zone->zone_start_pfn, nr_pages);
>   		while (zone_spans_last_pfn(zone, pfn, nr_pages)) {
>   			if (pfn_range_valid_contig(zone, pfn, nr_pages)) {
> +				unsigned long gfp_flags;
> +
>   				/*
>   				 * We release the zone lock here because
>   				 * alloc_contig_range() will also lock the zone
> @@ -8736,8 +8747,11 @@ struct page *alloc_contig_pages(unsigned long nr_pages, gfp_t gfp_mask,
>   				 * and cause alloc_contig_range() to fail...
>   				 */
>   				spin_unlock_irqrestore(&zone->lock, flags);
> +
> +				if (zone_idx(zone) != ZONE_MOVABLE)
> +					gfp_flags = gfp_mask | __GFP_NOWARN;

This feels wrong. It might be better to make that decision inside 
__alloc_contig_migrate_range() based on cc->zone.
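Roughly (again, only a sketch of the idea, assuming we only want the 
verbose output for ZONE_MOVABLE and leave CMA handling for later):

```c
	if (ret < 0) {
		/* Decide here, based on the zone we are operating on,
		 * instead of having callers mangle gfp_mask. */
		if (!(cc->gfp_mask & __GFP_NOWARN) &&
		    zone_idx(cc->zone) == ZONE_MOVABLE) {
			/* ... dump the failed pages ... */
		}
		putback_movable_pages(&cc->migratepages);
		return ret;
	}
```

That way alloc_contig_pages() and other callers don't each have to know 
when warnings are appropriate.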

-- 
Thanks,

David / dhildenb
