Date:   Thu, 17 Sep 2020 11:03:56 -0700
From:   Vijay Balakrishna <vijayb@...ux.microsoft.com>
To:     Michal Hocko <mhocko@...e.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        Oleg Nesterov <oleg@...hat.com>,
        Song Liu <songliubraving@...com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Pavel Tatashin <pasha.tatashin@...een.com>,
        Allen Pais <apais@...rosoft.com>, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org
Subject: Re: [[PATCH]] mm: khugepaged: recalculate min_free_kbytes after
 memory hotplug as expected by khugepaged



On 9/17/2020 5:12 AM, Michal Hocko wrote:
> On Wed 16-09-20 11:28:40, Vijay Balakrishna wrote:
> [...]
>> OOM splat below.  I see we had kmem leak detection turned on here.  We
>> haven't run the stress test with kmem leak detection since uncovering the
>> low min_free_kbytes.  During the investigation we wanted to make sure
>> there were no kmem leaks; we didn't find any significant ones.
>>
>> [330319.766059] systemd invoked oom-killer:
>> gfp_mask=0x40cc0(GFP_KERNEL|__GFP_COMP), order=1, oom_score_adj=0
> 
> [...]
>> [330319.861064] Mem-Info:
>> [330319.863519] active_anon:60744 inactive_anon:109226 isolated_anon:0
>>                   active_file:6418 inactive_file:3869 isolated_file:2
>>                   unevictable:0 dirty:8 writeback:1 unstable:0
>>                   slab_reclaimable:34660 slab_unreclaimable:795718
>>                   mapped:1256 shmem:165765 pagetables:689 bounce:0
>>                   free:340962 free_pcp:4672 free_cma:0
> 
> The memory consumption is predominantly in slab (unreclaimable). Only
> ~8% of the memory is on LRUs (anonymous + file). Slab (both reclaimable
> and unreclaimable) is ~40%. So there is still a lot of memory
> unaccounted for (direct users of the page allocator). This would partially
> explain why the oom killer is not able to make progress and eventually
> panics, because it is the kernel which is blowing up the memory consumption.
> 
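
For reference, a rough back-of-the-envelope check of those percentages
from the Mem-Info counters above (the counters are in 4 kB pages; I use
the nominal 8 GB of RAM on our SoC as the total, so the percentages are
only approximate):

#include <stdio.h>

int main(void)
{
	const unsigned long page_kb = 4;		/* 4 kB pages */
	const unsigned long total_kb = 8UL << 20;	/* nominal 8 GB, in kB */

	/* counters copied from the Mem-Info dump above */
	unsigned long lru_pages = 60744 + 109226	/* active_anon + inactive_anon */
				+ 6418 + 3869;		/* active_file + inactive_file */
	unsigned long slab_pages = 34660 + 795718;	/* reclaimable + unreclaimable */

	printf("LRU : %lu kB (~%lu%%)\n", lru_pages * page_kb,
	       lru_pages * page_kb * 100 / total_kb);
	printf("slab: %lu kB (~%lu%%)\n", slab_pages * page_kb,
	       slab_pages * page_kb * 100 / total_kb);
	return 0;
}

That prints roughly 8% for the LRUs and 39% for slab; most of the rest is
either free memory or not visible in these counters at all, i.e. the
direct page-allocator users mentioned above.
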
> There is still ~1G free memory but the problem is that this is a
> GFP_KERNEL request which is not allowed to consume Movable memory.
> Zone normal is depleted and therefore it cannot satisfy this request
> even when there are some order-1 pages available.
> 
>> [330319.928124] Node 0 Normal free:12652kB min:14344kB low:19092kB
>> high:23840kB active_anon:55340kB inactive_anon:60276kB active_file:60kB
>> inactive_file:128kB unevictable:0kB writepending:4kB present:6220656kB
>> managed:4750196kB mlocked:0kB kernel_stack:9568kB pagetables:2756kB
>> bounce:0kB free_pcp:10056kB local_pcp:1376kB free_cma:0kB
> [...]
>> [330319.996879] Node 0 Normal: 3138*4kB (UME) 38*8kB (UM) 0*16kB 0*32kB
>> 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 12856kB
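
As a side note, the way I read it the order-1 blocks listed above (38*8kB)
don't help, because the allocation first has to clear the zone's min
watermark and Normal is already below it (free 12652 kB < min 14344 kB).
A much-simplified sketch of that check, loosely modeled on
__zone_watermark_ok() (the real code also accounts for lowmem_reserve,
allocation flags and CMA, so this is only an approximation):

#include <stdbool.h>
#include <stdio.h>

/* Heavily simplified watermark check: ignores lowmem_reserve, ALLOC_*
 * flags and CMA.  "has_free_block" stands in for scanning the per-order
 * free lists for a usable migratetype. */
static bool zone_watermark_ok_simplified(unsigned long free_kb,
					 unsigned long min_kb,
					 unsigned int order,
					 bool has_free_block)
{
	if (free_kb <= min_kb)
		return false;		/* below the min watermark */
	if (order == 0)
		return true;
	return has_free_block;		/* higher orders also need a big-enough block */
}

int main(void)
{
	/* Node 0 Normal from the splat: free 12652 kB, min 14344 kB,
	 * 38 free order-1 (8 kB) blocks available. */
	bool ok = zone_watermark_ok_simplified(12652, 14344, 1, true);

	printf("order-1 GFP_KERNEL from ZONE_NORMAL: %s\n",
	       ok ? "ok" : "fails the watermark check");
	return 0;
}

And since the request is GFP_KERNEL it cannot fall back to ZONE_MOVABLE,
which is where the remaining free memory lives.
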
> 
> I do not see the state of swap in the oom splat so I assume you have
> swap disabled. If that is the case then the memory reclaim cannot really
> do much for this request. There is almost no page cache to reclaim.

No swap configured in our system.
> 
> That being said I do not see how an increased min_free_kbytes could help
> in this particular OOM situation. If there is really any relation it is
> more of an unintended side effect.

I haven't had a chance to rerun the stress test with kmem leak detection 
to know whether we still see OOM kills after restoring min_free_kbytes.
> 
> [...]
>>>> Extreme values can damage your system. Setting min_free_kbytes to an
>>>> extremely low value prevents the system from reclaiming memory, which can
>>>> result in system hangs and OOM-killing processes. However, setting
>>>> min_free_kbytes too high (for example, to 5–10% of total system memory)
>>>> causes the system to enter an out-of-memory state immediately, resulting in
>>>> the system spending too much time reclaiming memory.
>>>
>>> The auto tuned value should never drop to a value low enough to cause
>>> problems.
>>
>> The auto tuned value is incorrect after the memory hotplug operation; in
>> our use case the memory hot add occurs very early during boot.
>   
> Define incorrect. What are the actual values? Have you tried to increase
> the value manually after the hotplug?

In our case, on an SoC with 8GB of memory, the system tuned min_free_kbytes
as follows (a rough reconstruction of both numbers is sketched below):
- first to 22528
- then we perform a memory hot add very early in boot
- after that, min_free_kbytes is 8703
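
If I have the formulas right, both numbers are consistent with the two
places that set min_free_kbytes: khugepaged's
set_recommended_min_free_kbytes() and the generic init_per_zone_wmark_min()
that runs again on memory hotplug.  A rough reconstruction (the 4 kB page
size, 2 MB pageblocks, MIGRATE_PCPTYPES == 3, a single populated zone when
khugepaged tuned the value, and ~4.5 GiB of usable lowmem are assumptions
about our SoC; the real code additionally clamps the result and honours a
user-set value):

#include <stdio.h>

/* toy stand-in for the kernel's int_sqrt() */
static unsigned long isqrt(unsigned long x)
{
	unsigned long r = 0;

	while ((r + 1) * (r + 1) <= x)
		r++;
	return r;
}

int main(void)
{
	/* khugepaged-style estimate: 2 pageblocks per populated zone for
	 * fragmentation avoidance plus MIGRATE_PCPTYPES^2 pageblocks per
	 * zone, converted to kB */
	unsigned long pageblock_pages = 512;	/* 2 MB / 4 kB */
	unsigned long nr_zones = 1;		/* populated zones at the time */
	unsigned long pcptypes = 3;		/* MIGRATE_PCPTYPES */
	unsigned long rec_pages = pageblock_pages * nr_zones * 2 +
				  pageblock_pages * nr_zones * pcptypes * pcptypes;

	printf("khugepaged recommended min_free_kbytes:  %lu\n",
	       rec_pages * 4);				/* -> 22528 */

	/* init_per_zone_wmark_min()-style estimate, redone on hotplug:
	 * int_sqrt(lowmem_kbytes * 16) */
	unsigned long lowmem_kbytes = 4734000;		/* ~4.5 GiB usable lowmem (assumed) */

	printf("recalculated min_free_kbytes on hotplug: %lu\n",
	       isqrt(lowmem_kbytes * 16));		/* -> 8703 */
	return 0;
}

So khugepaged raises min_free_kbytes early on, and the hotplug path later
overwrites it with the much smaller sqrt-based value, which is what the
patch addresses by recalculating it the way khugepaged expects.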

Before looking at the code, I first manually restored min_free_kbytes soon
after boot, reran the stress test, and didn't notice the symptoms I
mentioned in the change log.
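
For reference, restoring it manually amounts to a write to the
vm.min_free_kbytes sysctl; roughly equivalent to the sketch below
(22528 being the pre-hotplug value quoted above):

#include <stdio.h>

int main(void)
{
	/* same effect as: echo 22528 > /proc/sys/vm/min_free_kbytes */
	FILE *f = fopen("/proc/sys/vm/min_free_kbytes", "w");

	if (!f) {
		perror("/proc/sys/vm/min_free_kbytes");
		return 1;
	}
	fprintf(f, "%d\n", 22528);
	fclose(f);
	return 0;
}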

Thanks,
Vijay
