Message-ID: <7eddcc58-f65f-0be9-60e8-2de013365909@linux.microsoft.com>
Date: Thu, 17 Sep 2020 11:03:56 -0700
From: Vijay Balakrishna <vijayb@...ux.microsoft.com>
To: Michal Hocko <mhocko@...e.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
Oleg Nesterov <oleg@...hat.com>,
Song Liu <songliubraving@...com>,
Andrea Arcangeli <aarcange@...hat.com>,
Pavel Tatashin <pasha.tatashin@...een.com>,
Allen Pais <apais@...rosoft.com>, linux-kernel@...r.kernel.org,
linux-mm@...ck.org
Subject: Re: [[PATCH]] mm: khugepaged: recalculate min_free_kbytes after
memory hotplug as expected by khugepaged
On 9/17/2020 5:12 AM, Michal Hocko wrote:
> On Wed 16-09-20 11:28:40, Vijay Balakrishna wrote:
> [...]
>> OOM splat below. I see we had kmem leak detection turned on here. We
>> haven't run stress with kmem leak detection since uncovering the low
>> min_free_kbytes. During the investigation we wanted to make sure there
>> were no kmem leaks, and we didn't find any significant leaks.
>>
>> [330319.766059] systemd invoked oom-killer:
>> gfp_mask=0x40cc0(GFP_KERNEL|__GFP_COMP), order=1, oom_score_adj=0
>
> [...]
>> [330319.861064] Mem-Info:
>> [330319.863519] active_anon:60744 inactive_anon:109226 isolated_anon:0
>> active_file:6418 inactive_file:3869 isolated_file:2
>> unevictable:0 dirty:8 writeback:1 unstable:0
>> slab_reclaimable:34660 slab_unreclaimable:795718
>> mapped:1256 shmem:165765 pagetables:689 bounce:0
>> free:340962 free_pcp:4672 free_cma:0
>
> The memory consumption is predominantly in slab (unreclaimable). Only
> ~8% of the memory is on LRUs (anonymous + file). Slab (both reclaimable
> and unreclaimable) is ~40%. So there is still a lot of memory
> unaccounted (direct users of the page allocator). This would partially
> explain why the oom killer is not able to make progress and eventually
> panics because it is the kernel which is blowing the memory consumption.
>
> There is still ~1G free memory but the problem is that this is a
> GFP_KERNEL request which is not allowed to consume Movable memory.
> Zone normal is depleted and therefore it cannot satisfy this request
> even when there are some order-1 pages available.
>
>> [330319.928124] Node 0 Normal free:12652kB min:14344kB low:19092kB
>> high:23840kB active_anon:55340kB inactive_anon:60276kB active_file:60kB
>> inactive_file:128kB unevictable:0kB writepending:4kB present:6220656kB
>> managed:4750196kB mlocked:0kB kernel_stack:9568kB pagetables:2756kB
>> bounce:0kB free_pcp:10056kB local_pcp:1376kB free_cma:0kB
> [...]
>> [330319.996879] Node 0 Normal: 3138*4kB (UME) 38*8kB (UM) 0*16kB 0*32kB
>> 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 12856kB
>
> I do not see the state of swap in the oom splat so I assume you have
> swap disabled. If that is the case then the memory reclaim cannot really
> do much for this request. There is almost no page cache to reclaim.
No swap configured in our system.
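For reference, the percentages quoted above can be re-derived from the Mem-Info counters in the splat. This is only a rough sketch: it assumes 4 KiB pages and treats the 8GB total mentioned later in this message as 8 GiB, neither of which is stated in the splat itself.

```python
# Re-derive the figures discussed above from the quoted Mem-Info counters.
# Assumptions (not stated in the splat): 4 KiB pages, 8 GiB total memory.
PAGE_KB = 4
TOTAL_KB = 8 * 1024 * 1024

# Page counts copied from the Mem-Info lines in the OOM splat.
mem_info = {
    "active_anon": 60744,
    "inactive_anon": 109226,
    "active_file": 6418,
    "inactive_file": 3869,
    "slab_reclaimable": 34660,
    "slab_unreclaimable": 795718,
    "free": 340962,
}


def percent_of_total(pages):
    """Share of total memory, in percent, for a count of pages."""
    return 100.0 * pages * PAGE_KB / TOTAL_KB


lru_pages = sum(mem_info[k] for k in
                ("active_anon", "inactive_anon", "active_file", "inactive_file"))
slab_pages = mem_info["slab_reclaimable"] + mem_info["slab_unreclaimable"]

lru_pct = percent_of_total(lru_pages)
slab_pct = percent_of_total(slab_pages)
free_gib = mem_info["free"] * PAGE_KB / (1024 * 1024)

# Zone Normal values (kB) from the "Node 0 Normal" line of the splat.
normal_free_kb, normal_min_kb = 12652, 14344
normal_below_min = normal_free_kb < normal_min_kb

print(f"LRU {lru_pct:.1f}%  slab {slab_pct:.1f}%  free {free_gib:.2f} GiB")
print("Zone Normal below min watermark:", normal_below_min)
```

Under those assumptions the numbers come out close to the ~8% LRU, ~40% slab and ~1G free figures, and Zone Normal is below its min watermark.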
>
> That being said I do not see how an increased min_free_kbytes could help
> for this particular OOM situation. If there is really any relation it is
> more of an unintended side effect.
I haven't had a chance to rerun stress with kmem leak detection to know
if we still see OOM kills after restoring min_free_kbytes.
>
> [...]
>>>> Extreme values can damage your system. Setting min_free_kbytes to an
>>>> extremely low value prevents the system from reclaiming memory, which can
>>>> result in system hangs and OOM-killing processes. However, setting
>>>> min_free_kbytes too high (for example, to 5–10% of total system memory)
>>>> causes the system to enter an out-of-memory state immediately, resulting in
>>>> the system spending too much time reclaiming memory.
>>>
>>> The auto tuned value should never reach such a low value to cause
>>> problems.
>>
>> The auto tuned value is incorrect after a memory hotplug operation; in our
>> use case memory hot add occurs very early during boot.
>
> Define incorrect. What are the actual values? Have you tried to increase
> the value manually after the hotplug?
In our case (an SoC with 8GB of memory), min_free_kbytes changes as follows:
- the system first tunes it to 22528
- we perform memory hot add very early in boot
- min_free_kbytes is now 8703
Before looking at the code, I first manually restored min_free_kbytes soon
after boot, reran stress, and didn't notice the symptoms I mentioned in the
change log.
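
A rough sketch of the two calculations that appear to be involved, modelled on init_per_zone_wmark_min() in mm/page_alloc.c and set_recommended_min_free_kbytes() in mm/khugepaged.c. The page size, pageblock size, zone count, lowmem figure and upper clamp used here are assumptions for illustration, not values confirmed in this thread:

```python
from math import isqrt

# Assumptions for illustration: 4 KiB pages and 2 MiB pageblocks.
PAGE_KB = 4
PAGEBLOCK_PAGES = 512
MIGRATE_PCPTYPES = 3  # unmovable, movable, reclaimable


def default_min_free_kbytes(lowmem_kbytes, lower=128, upper=262144):
    """Generic default: sqrt(lowmem_kbytes * 16), clamped.

    The upper clamp has differed between kernel versions; 262144 is an
    assumption and does not matter for the sizes used below.
    """
    return max(lower, min(upper, isqrt(lowmem_kbytes * 16)))


def khugepaged_min_free_kbytes(nr_zones, lowmem_pages):
    """khugepaged recommendation, capped at 5% of lowmem pages."""
    pages = PAGEBLOCK_PAGES * nr_zones * 2
    pages += PAGEBLOCK_PAGES * nr_zones * MIGRATE_PCPTYPES * MIGRATE_PCPTYPES
    pages = min(pages, lowmem_pages // 20)
    return pages * PAGE_KB


# Hypothetical inputs: one populated zone, and the Zone Normal "managed"
# size from the OOM splat (4750196 kB) standing in for lowmem.
lowmem_kb = 4750196
print(khugepaged_min_free_kbytes(1, lowmem_kb // PAGE_KB))  # 22528
print(default_min_free_kbytes(lowmem_kb))                   # 8717
```

With a single populated zone the khugepaged formula gives exactly 22528, and the generic formula gives a value of the same magnitude as the 8703 we see after hot add, which would be consistent with the generic value replacing the khugepaged one. The exact lowmem figure the kernel uses is not shown in this thread, so the second number is only approximate.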
Thanks,
Vijay