linux-kernel - Re: doing lots of disk writes causes oom killer to kill processes

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAOMqctQyS2SFraqJpzE0sRFcihFpMHRhT+3QuZhxft=SUXYVDw@mail.gmail.com>
Date:	Mon, 26 Aug 2013 15:51:24 +0200
From:	Michal Suchanek <hramrach@...il.com>
To:	Hillf Danton <dhillf@...il.com>
Cc:	LKML <linux-kernel@...r.kernel.org>, Linux-MM <linux-mm@...ck.org>
Subject: Re: doing lots of disk writes causes oom killer to kill processes

On 12 March 2013 03:15, Hillf Danton <dhillf@...il.com> wrote:
>>On 11 March 2013 13:15, Michal Suchanek <hramrach@...il.com> wrote:
>>>On 8 February 2013 17:31, Michal Suchanek <hramrach@...il.com> wrote:
>>> Hello,
>>>
>>> I am dealing with VM disk images and performing something like wiping
>>> free space to prepare image for compressing and storing on server or
>>> copying it to external USB disk causes
>>>
>>> 1) system lockup in order of a few tens of seconds when all CPU cores
>>> are 100% used by system and the machine is basicaly unusable
>>>
>>> 2) oom killer killing processes
>>>
>>> This all on system with 8G ram so there should be plenty space to work with.
>>>
>>> This happens with kernels 3.6.4 or 3.7.1
>>>
>>> With earlier kernel versions (some 3.0 or 3.2 kernels) this was not a
>>> problem even with less ram.
>>>
>>> I have  vm.swappiness = 0 set for a long  time already.
>>>
>>>
>>I did some testing with 3.7.1 and with swappiness as much as 75 the
>>kernel still causes all cores to loop somewhere in system when writing
>>lots of data to disk.
>>
>>With swappiness as much as 90 processes still get killed on large disk writes.
>>
>>Given that the max is 100 the interval in which mm works at all is
>>going to be very narrow, less than 10% of the paramater range. This is
>>a severe regression as is the cpu time consumed by the kernel.
>>
>>The io scheduler is the default cfq.
>>
>>If you have any idea what to try other than downgrading to an earlier
>>unaffected kernel I would like to hear.
>>
> Can you try commit 3cf23841b4b7(mm/vmscan.c: avoid possible
> deadlock caused by too_many_isolated())?
>
> Or try 3.8 and/or 3.9, additionally?
>

Hello,

with deadline IO scheduler I experience this issue less often but it
still happens.

I am on 3.9.6 Debian kernel so 3.8 did not fix this problem.

Do you have some idea what to log so that useful information about the
lockup is gathered?

Thanks

Michal
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/