Message-ID: <20161209173018.GA31809@dhcp22.suse.cz>
Date:   Fri, 9 Dec 2016 18:30:18 +0100
From:   Michal Hocko <mhocko@...nel.org>
To:     Gerhard Wiesinger <lists@...singer.com>
Cc:     linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: Still OOM problems with 4.9er kernels

On Fri 09-12-16 17:58:14, Gerhard Wiesinger wrote:
> On 09.12.2016 17:09, Michal Hocko wrote:
[...]
> > > [97883.882611] Mem-Info:
> > > [97883.883747] active_anon:2915 inactive_anon:3376 isolated_anon:0
> > >                  active_file:3902 inactive_file:3639 isolated_file:0
> > >                  unevictable:0 dirty:205 writeback:0 unstable:0
> > >                  slab_reclaimable:9856 slab_unreclaimable:9682
> > >                  mapped:3722 shmem:59 pagetables:2080 bounce:0
> > >                  free:748 free_pcp:15 free_cma:0
> > there is still some page cache which doesn't seem to be neither dirty
> > nor under writeback. So it should be theoretically reclaimable but for
> > some reason we cannot seem to reclaim that memory.
> > There is still some anonymous memory and free swap so we could reclaim
> > it as well but it all seems pretty down and the memory pressure is
> > really large
> 
> Yes, it might be large on the update situation, but that should be handled
> by a virtual memory system by the kernel, right?

Well, this is what we try to do, and we call it memory reclaim. But if we
are not able to reclaim anything then we eventually have to give up and
trigger the OOM killer. Now the information that 4.4 made a difference is
interesting, because I do not see any major differences in the reclaim
code between the 4.3 and 4.4 kernels. The reason might also be somewhere
else entirely, e.g. some subsystem consuming much more memory than before.
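If you want to check the "some subsystem grew" theory, one quick way (my
suggestion, not something we have data for yet) is to snapshot the largest
slab caches on both the good and the bad kernel and compare:

```shell
# Show the ten largest slab caches by approximate memory footprint.
# /proc/slabinfo columns (after the two header lines) are:
#   name  active_objs  num_objs  objsize  objperslab  pagesperslab ...
# so active_objs * objsize gives a rough byte count per cache.
# Reading /proc/slabinfo typically requires root.
awk 'NR > 2 { print $2 * $4, $1 }' /proc/slabinfo | sort -rn | head -10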

Just curious, which filesystem are you using? Could you also try some
additional debugging? Enabling the reclaim related tracepoints should
tell us more:
mount -t tracefs none /trace
echo 1 > /trace/events/vmscan/enable
echo 1 > /trace/events/writeback/writeback_congestion_wait/enable
cat /trace/trace_pipe > trace.log
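
Once trace.log has collected some data, a rough way to see which reclaim
events fire most often is to count them by name. This is just a sketch of
mine, assuming the default trace_pipe line format where the event name is
the colon-terminated field right after the timestamp:

```shell
# Count vmscan/writeback events in the collected trace. trace_pipe lines
# look roughly like:
#   <task>-<pid> [cpu] <flags> <timestamp>: <event>: <details>
# so we scan each line for the first field starting with "mm_vmscan" or
# "writeback_congestion" and tally it.
awk '{ for (i = 1; i <= NF; i++)
         if ($i ~ /^mm_vmscan|^writeback_congestion/) { count[$i]++; break } }
     END { for (e in count) print count[e], e }' trace.log | sort -rn
```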

Collecting /proc/vmstat over time might be helpful as well
mkdir logs
while true
do
	cp /proc/vmstat logs/vmstat.$(date +%s)
	sleep 1s
done
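
The snapshots are mostly useful as deltas. A minimal helper of mine (not
part of the collection loop above) to print how a single counter such as
pgscan_direct changed between two snapshot files:

```shell
# Print the change in one vmstat counter between two snapshot files.
# Each snapshot line is "counter_name value".
# Usage: vmstat_delta pgscan_direct logs/vmstat.OLD logs/vmstat.NEW
vmstat_delta() {
    key=$1; old=$2; new=$3
    a=$(awk -v k="$key" '$1 == k { print $2 }' "$old")
    b=$(awk -v k="$key" '$1 == k { print $2 }' "$new")
    echo "$key: $((b - a))"
}
```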
-- 
Michal Hocko
SUSE Labs
