linux-kernel - Re: [PATCH v2] mm: vmscan: fix the page state calculation in too_many


Message-ID: <20150121143920.GD23700@dhcp22.suse.cz>
Date:	Wed, 21 Jan 2015 15:39:20 +0100
From:	Michal Hocko <mhocko@...e.cz>
To:	Vinayak Menon <vinmenon@...eaurora.org>
Cc:	Christoph Lameter <cl@...ux.com>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org, akpm@...ux-foundation.org,
	hannes@...xchg.org, vdavydov@...allels.com, mgorman@...e.de,
	minchan@...nel.org
Subject: Re: [PATCH v2] mm: vmscan: fix the page state calculation in
 too_many_isolated

On Mon 19-01-15 09:57:08, Vinayak Menon wrote:
> On 01/18/2015 01:18 AM, Christoph Lameter wrote:
> >On Sat, 17 Jan 2015, Vinayak Menon wrote:
> >
> >>which had not updated the vmstat_diff. This CPU was in idle for around 30
> >>secs. When I looked at the tvec base for this CPU, the timer associated with
> >>vmstat_update had its expiry time less than current jiffies. This timer had
> >>its deferrable flag set, and was tied to the next non-deferrable timer in the
> >
> >We can remove the deferrable flag now since the vmstat threads are only
> >activated as necessary with the recent changes. Looks like this could fix
> >your issue?
> >
> 
> Yes, this should fix my issue.

Does it? I would much rather avoid getting into an un-synced state at
all than patch up the one specific place which happens to show the
problem right now.

> But I think we may need the fix in too_many_isolated, since there can still
> be a delay of a few seconds (HZ by default and even more because of reasons
> pointed out by Michal) which will result in reclaimers unnecessarily
> entering congestion_wait. No?

I think we can solve this as well. We can stick vmstat_shepherd into a
kernel thread with a loop with the configured timeout and then create a
mask of CPUs which need the update and run vmstat_update from
IPI context (smp_call_function_many).
We would have to drop cond_resched from refresh_cpu_vm_stats, of
course. Walking nr_zones x NR_VM_ZONE_STAT_ITEMS counters in IPI context
shouldn't be excessive, but I haven't measured that, so I might easily
be wrong.
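
To make the idea concrete, here is a rough userspace model of one
shepherd pass. It is only a sketch, not kernel code: every model_* name,
NR_CPUS and NR_ITEMS are made up for illustration, the "IPI" is a plain
function call standing in for smp_call_function_many, and the fold loop
stands in for a refresh_cpu_vm_stats without cond_resched.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS  4   /* hypothetical machine size */
#define NR_ITEMS 3   /* stand-in for nr_zones x NR_VM_ZONE_STAT_ITEMS */

static long global_stat[NR_ITEMS];        /* the globally visible counters */
static long cpu_diff[NR_CPUS][NR_ITEMS];  /* per-CPU deltas not yet folded */

/* Fast path: a CPU only touches its own delta, so the global goes stale. */
static void model_mod_state(int cpu, int item, long delta)
{
    cpu_diff[cpu][item] += delta;
}

/* Shepherd, step 1: build a mask of the CPUs that have pending deltas. */
static unsigned long model_need_update_mask(void)
{
    unsigned long mask = 0;

    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        for (int item = 0; item < NR_ITEMS; item++) {
            if (cpu_diff[cpu][item]) {
                mask |= 1UL << cpu;
                break;
            }
        }
    }
    return mask;
}

/* What would run in IPI context: a bounded loop that never sleeps. */
static void model_fold_cpu(int cpu)
{
    for (int item = 0; item < NR_ITEMS; item++) {
        global_stat[item] += cpu_diff[cpu][item];
        cpu_diff[cpu][item] = 0;
    }
}

/*
 * Shepherd, step 2: "interrupt" only the CPUs in the mask.
 * Returns how many CPUs were disturbed; idle, clean CPUs are left alone.
 */
static int model_shepherd_pass(void)
{
    unsigned long mask = model_need_update_mask();
    int disturbed = 0;

    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        if (mask & (1UL << cpu)) {
            model_fold_cpu(cpu);
            disturbed++;
        }
    }
    return disturbed;
}
```

In the real thing the shepherd kthread would run such a pass once per
configured interval, so a CPU sitting idle with a deferred timer could no
longer hold back its deltas, and CPUs with nothing pending would not be
woken at all.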

Anyway, that should work more reliably than the current scheme and
should help to reduce pointless wakeups which the original patchset was
addressing.  Or am I missing something?

-- 
Michal Hocko
SUSE Labs