Message-ID: <xa1td1hcwhpk.fsf@mina86.com>
Date:   Thu, 01 Dec 2016 02:39:35 +0100
From:   Michal Nazarewicz <mina86@...a86.com>
To:     "Robin H. Johnson" <robbat2@...too.org>,
        Michal Hocko <mhocko@...nel.org>
Cc:     "Robin H. Johnson" <robbat2@...is-terrarum.net>,
        linux-kernel@...r.kernel.org, robbat2@...too.org,
        linux-mm@...ck.org
Subject: Re: PROBLEM-PERSISTS: dmesg spam: alloc_contig_range: [XX, YY) PFNs busy

On Wed, Nov 30 2016, Robin H. Johnson wrote:
> (I'm going to respond directly to this email with the stack trace.)
>
> On Wed, Nov 30, 2016 at 02:28:49PM +0100, Michal Hocko wrote:
>> > On the other hand, if this didn’t happen and now happens all the time,
>> > this indicates a regression in CMA’s capability to allocate pages so
>> > just rate limiting the output would hide the potential actual issue.
>> 
>> Or there might be just a much larger demand on those large blocks, no?
> >> But seriously, dumping those messages again and again into the log (see
> >> the 2.5_GB_/h) is just insane. So there really should be some
> >> throttling.
>> 
> >> Does the following help you, Robin? At least to not get swamped by those
> >> messages.
> Here's what I whipped up based on that, to ensure that dump_stack got
> rate-limited at the same pass as PFNs-busy. It dropped the dmesg spew to
> ~25MB/hour (and is suppressing ~43 entries/second right now).
>
> commit 6ad4037e18ec2199f8755274d8a745a9904241a1
> Author: Robin H. Johnson <robbat2@...too.org>
> Date:   Wed Nov 30 10:32:57 2016 -0800
>
>     mm: ratelimit & trace PFNs busy.
>     
>     Signed-off-by: Robin H. Johnson <robbat2@...too.org>

Acked-by: Michal Nazarewicz <mina86@...a86.com>

> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 6de9440e3ae2..3c28ec3d18f8 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -7289,8 +7289,15 @@ int alloc_contig_range(unsigned long start, unsigned long end,
>  
>  	/* Make sure the range is really isolated. */
>  	if (test_pages_isolated(outer_start, end, false)) {
> -		pr_info("%s: [%lx, %lx) PFNs busy\n",
> -			__func__, outer_start, end);
> +		static DEFINE_RATELIMIT_STATE(ratelimit_pfn_busy,
> +					DEFAULT_RATELIMIT_INTERVAL,
> +					DEFAULT_RATELIMIT_BURST);
> +		if (__ratelimit(&ratelimit_pfn_busy)) {
> +			pr_info("%s: [%lx, %lx) PFNs busy\n",
> +				__func__, outer_start, end);

I’m thinking out loud here, but maybe it would be useful to include
a count of how many times this message has been suppressed?

> +			dump_stack();

Perhaps do it only if CONFIG_CMA_DEBUG is enabled?

+			if (IS_ENABLED(CONFIG_CMA_DEBUG))
+				dump_stack();

> +		}
> +
>  		ret = -EBUSY;
>  		goto done;
>  	}
>
> -- 
> Robin Hugh Johnson
> Gentoo Linux: Dev, Infra Lead, Foundation Trustee & Treasurer
> E-Mail   : robbat2@...too.org
> GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
> GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
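
To make the suppressed-count idea above concrete, here is a minimal
userspace sketch. It is not kernel code and not part of the patch:
struct rl_state, rl_allow() and report_busy() are hypothetical names made
up for illustration, time is a plain tick counter supplied by the caller,
and the window/burst logic is a simplification I am assuming rather than
a copy of what the kernel's rate limiter does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a rate-limit state; not the kernel's struct. */
struct rl_state {
	long interval;	/* window length, in caller-defined ticks */
	int burst;	/* messages allowed per window, must be >= 1 */
	long begin;	/* tick at which the current window started */
	int printed;	/* messages emitted in the current window */
	int suppressed;	/* messages dropped since the last report */
};

/*
 * Returns true if the caller may print at tick `now`. When a new window
 * opens, the number of messages dropped since the last report is stored
 * in *dropped (0 otherwise) so the caller can mention it in its message.
 */
static bool rl_allow(struct rl_state *rs, long now, int *dropped)
{
	assert(rs->burst >= 1);
	*dropped = 0;
	if (now - rs->begin >= rs->interval) {
		/* New window: hand back the suppressed count and reset. */
		*dropped = rs->suppressed;
		rs->suppressed = 0;
		rs->printed = 0;
		rs->begin = now;
	}
	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}
	rs->suppressed++;
	return false;
}

/* Prints the "PFNs busy" line if allowed; returns whether it printed. */
static bool report_busy(struct rl_state *rs, long now,
			unsigned long start, unsigned long end)
{
	int dropped;

	if (!rl_allow(rs, now, &dropped))
		return false;
	printf("%s: [%lx, %lx) PFNs busy", __func__, start, end);
	if (dropped)
		printf(" (%d similar messages suppressed)", dropped);
	printf("\n");
	return true;
}
```

With interval = 5 and burst = 2, calls at ticks 0 and 1 print, calls at
ticks 2 and 3 are dropped, and the call at tick 5 prints again with the
dropped count appended. The point is only that the count travels with the
next message that does get through, so the log still shows how often the
condition was hit.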

-- 
Best regards
ミハウ “𝓶𝓲𝓷𝓪86” ナザレヴイツ
«If at first you don’t succeed, give up skydiving»
