lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160926081200.GB27030@dhcp22.suse.cz>
Date:   Mon, 26 Sep 2016 10:12:01 +0200
From:   Michal Hocko <mhocko@...nel.org>
To:     Dave Hansen <dave.hansen@...el.com>
Cc:     linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
        Johannes Weiner <hannes@...xchg.org>,
        Mel Gorman <mgorman@...e.de>,
        Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] mm: warn about allocations which stall for too long

On Fri 23-09-16 10:34:01, Dave Hansen wrote:
> On 09/23/2016 01:15 AM, Michal Hocko wrote:
> > +	/* Make sure we know about allocations which stall for too long */
> > +	if (!(gfp_mask & __GFP_NOWARN) && time_after(jiffies, alloc_start + stall_timeout)) {
> > +		pr_warn("%s: page alloction stalls for %ums: order:%u mode:%#x(%pGg)\n",
> > +				current->comm, jiffies_to_msecs(jiffies-alloc_start),
> > +				order, gfp_mask, &gfp_mask);
> > +		stall_timeout += 10 * HZ;
> > +		dump_stack();
> > +	}
> 
> This would make an awesome tracepoint.  There's probably still plenty of
> value to having it in dmesg, but the configurability of tracepoints is
> hard to beat.

Currently we only have trace_mm_page_alloc in __alloc_pages_nodemask. I
think we want to add another one to mark the beginning of the allocation
so that we can track allocation latencies per allocation context and
ideally drop them down into sources - congestion waits, reclaim path,
slab reclaim etc. Janani Ravichandran is working on a script to do that
http://lkml.kernel.org/r/20160911222411.GA2854@janani-Inspiron-3521

But this sounds a bit orthogonal to my proposal here because I would
really like to warn unconditionally when an allocation stalls for
unreasonably long. Tracepoints are not an ideal tool for that because
you have to start collecting tracing output before this situations
happen. Moreover in my experience I often had to replace my local
debugging trace_printks by regular printks because the prior ones just
got lost under a heavy memory pressure.
-- 
Michal Hocko
SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ