linux-kernel - Re: [PATCH v2] mm: vmscan: fix the page state calculation in too_many

Message-ID: <54C11444.2020300@suse.cz>
Date:	Thu, 22 Jan 2015 16:16:20 +0100
From:	Vlastimil Babka <vbabka@...e.cz>
To:	Michal Hocko <mhocko@...e.cz>,
	Vinayak Menon <vinmenon@...eaurora.org>
CC:	Christoph Lameter <cl@...ux.com>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org, akpm@...ux-foundation.org,
	hannes@...xchg.org, vdavydov@...allels.com, mgorman@...e.de,
	minchan@...nel.org
Subject: Re: [PATCH v2] mm: vmscan: fix the page state calculation in too_many_isolated

On 01/21/2015 03:39 PM, Michal Hocko wrote:
> On Mon 19-01-15 09:57:08, Vinayak Menon wrote:
>> On 01/18/2015 01:18 AM, Christoph Lameter wrote:
>>> On Sat, 17 Jan 2015, Vinayak Menon wrote:
>>>
>>>> which had not updated the vmstat_diff. This CPU was in idle for around 30
>>>> secs. When I looked at the tvec base for this CPU, the timer associated with
>>>> vmstat_update had its expiry time less than current jiffies. This timer had
>>>> its deferrable flag set, and was tied to the next non-deferrable timer in the
>>>
>>> We can remove the deferrable flag now since the vmstat threads are only
>>> activated as necessary with the recent changes. Looks like this could fix
>>> your issue?
>>>
>>
>> Yes, this should fix my issue.
>
> Does it? Because I would prefer not getting into un-synced state much
> more than playing around one specific place which shows the problems
> right now.
>
>> But I think we may need the fix in too_many_isolated, since there can still
>> be a delay of a few seconds (HZ by default and even more because of reasons
>> pointed out by Michal) which will result in reclaimers unnecessarily
>> entering congestion_wait. No?
>
> I think we can solve this as well. We can stick vmstat_shepherd into a
> kernel thread with a loop with the configured timeout and then create a
> mask of CPUs which need the update and run vmstat_update from
> IPI context (smp_call_function_many).
> We would have to drop cond_resched from refresh_cpu_vm_stats of
> course. The nr_zones x NR_VM_ZONE_STAT_ITEMS in the IPI context
> shouldn't be excessive but I haven't measured that so I might be easily
> wrong.
>
> Anyway, that should work more reliably than the current scheme and
> should help to reduce pointless wakeups which the original patchset was
> addressing.  Or am I missing something?
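
The shepherd scheme quoted above could be modelled roughly as follows. This
is a userspace sketch only, not kernel code: NR_CPUS, global_count, cpu_diff,
fold_cpu, cpus_needing_update and shepherd_pass are hypothetical names, and
the plain loop in shepherd_pass stands in for what the proposal would do from
IPI context with smp_call_function_many.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 4                 /* hypothetical CPU count for this model */

static long global_count;         /* stands in for a global zone counter */
static long cpu_diff[NR_CPUS];    /* stands in for the per-CPU pending deltas */

/* Fold one CPU's pending delta into the global counter. */
static void fold_cpu(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    global_count += cpu_diff[cpu];
    cpu_diff[cpu] = 0;
}

/* Build a bitmask of the CPUs that currently hold a non-zero delta. */
static uint32_t cpus_needing_update(void)
{
    uint32_t mask = 0;

    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (cpu_diff[cpu] != 0)
            mask |= 1u << cpu;
    return mask;
}

/*
 * One shepherd pass: collect the mask, then fold only the CPUs in it.
 * In this model the fold runs inline; in the proposal it would run on
 * each target CPU. Returns the mask that was processed.
 */
static uint32_t shepherd_pass(void)
{
    uint32_t mask = cpus_needing_update();

    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (mask & (1u << cpu))
            fold_cpu(cpu);
    return mask;
}
```

CPUs with nothing pending are never touched, which is the property that
avoids the pointless wakeups.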

Maybe to further reduce wakeups, a CPU could check and update its 
counters before going idle? (unless that already happens)
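
A rough userspace model of that idea, under the assumption that an idle CPU
never gets another chance to fold its delta. All names here (NR_CPUS,
global_isolated, cpu_diff, cpu_idle, mod_isolated, enter_idle, read_isolated)
are hypothetical and only illustrate the effect on a reader of the global
counter:

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4                   /* hypothetical CPU count for this model */

static long global_isolated;        /* what a reader of the global counter sees */
static long cpu_diff[NR_CPUS];      /* per-CPU deltas not yet folded */
static bool cpu_idle[NR_CPUS];

/* Account a change on the local CPU only; the global counter lags behind. */
static void mod_isolated(int cpu, long delta)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    cpu_diff[cpu] += delta;
}

/* Enter idle, optionally folding the pending delta first. */
static void enter_idle(int cpu, bool fold_first)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    if (fold_first) {
        global_isolated += cpu_diff[cpu];
        cpu_diff[cpu] = 0;
    }
    cpu_idle[cpu] = true;
}

/* Cheap read that ignores per-CPU deltas, as a global counter read does. */
static long read_isolated(void)
{
    return global_isolated;
}
```

Without the fold, the delta stays stranded on the idle CPU and the global
read is stale until something else updates it; with the fold, no later
wakeup is needed for that CPU.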
