Date:	Mon, 11 Jul 2016 16:47:17 +0200
From:	Matthias Dahl <ml_linux-kernel@...ary-island.eu>
To:	Mike Snitzer <snitzer@...hat.com>
Cc:	linux-mm@...ck.org, dm-devel@...hat.com,
	linux-kernel@...r.kernel.org
Subject: Re: [dm-devel] [4.7.0rc6] Page Allocation Failures with dm-crypt

Hello Mike...

On 2016-07-11 15:30, Mike Snitzer wrote:

> But that is expected given you're doing an unbounded buffered write to
> the device.  What isn't expected, to me anyway, is that the mm subsystem
> (or the default knobs for buffered writeback) would be so aggressive
> about delaying writeback.

Ok. But, and please correct me if I am wrong, I was under the impression
that only the file caches/buffers would be affected. What I am seeing
here is different: if I use free to monitor the memory usage, it is the
used memory that grows until it consumes all memory, not the
buffers/file caches.

Also, if I use dd directly on the device without dm-crypt in between,
there is no problem. Sure, the buffers grow hugely as well, but only
those.
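
For anyone wanting to reproduce the distinction described above between
used memory and buffers/file caches, here is a minimal sketch that splits
/proc/meminfo into those buckets. The function names are made up for
illustration, and the "used" figure only approximates what free prints,
since the exact formula differs between procps versions.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Name: value [unit]"
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])  # values are in kB
    return fields


def summarize(fields):
    """Separate 'used' memory from buffers/page cache (all in KiB).

    Assumption: used = MemTotal - MemFree - Buffers - Cached, which is
    close to, but not necessarily identical to, the output of free.
    """
    cache = fields.get("Buffers", 0) + fields.get("Cached", 0)
    return {
        "used_kib": fields["MemTotal"] - fields["MemFree"] - cache,
        "buffers_cache_kib": cache,
        "dirty_kib": fields.get("Dirty", 0),
        "writeback_kib": fields.get("Writeback", 0),
    }
```

Calling summarize(parse_meminfo(open("/proc/meminfo").read())) once a
second while dd is running should show whether it is used_kib or
buffers_cache_kib that grows.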

> Why are you doing this test anyway?  Such a large buffered write doesn't
> seem to accurately model any application I'm aware of (but obviously it
> should still "work").

It is not a test per se. I simply wanted to fill the partition with
noise, and doing it this way is faster than using urandom or anything
else. ;-) That is how I stumbled over this issue in the first place.

> Now that is weird.  Are you (or the distro you're using) setting any mm
> subsystem tunables to really broken values?

You can see those in my initial mail. I attached the kernel warnings,
all sysctl tunables and more. Maybe that helps.

> What is your raid10's full stripesize?

4 disks in RAID10, with a stripe size of 64k.

> Is your dd IO size of 512K somehow triggering excess R-M-W cycles which
> is exacerbating the problem?

The partitions are properly aligned. And as you can see, with that
stripe size, there is no issue.
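
The arithmetic behind that can be sketched as follows. This is only an
illustration under assumptions not stated in this mail: that the 64k
figure is the per-disk chunk size and that the array keeps two copies of
each block. The helper names are hypothetical.

```python
KIB = 1024


def full_stripe_bytes(disks, chunk_bytes, copies=2):
    """Data bytes covered by one full stripe of a RAID10 array.

    Assumption: every chunk is stored `copies` times, so only
    disks / copies chunks per stripe carry distinct data.
    """
    if disks % copies != 0:
        raise ValueError("this sketch only handles disks divisible by copies")
    return (disks // copies) * chunk_bytes


def is_stripe_aligned(io_bytes, stripe_bytes):
    """True if an I/O of io_bytes covers a whole number of stripes."""
    return io_bytes % stripe_bytes == 0


# 4 disks, 64 KiB chunks, 2 copies: 2 * 64 KiB of data per full stripe.
stripe = full_stripe_bytes(disks=4, chunk_bytes=64 * KIB)
aligned = is_stripe_aligned(512 * KIB, stripe)
```

Under these assumptions a 512K dd block covers a whole number of full
stripes. The same holds if 64k is instead taken as the full stripe size,
since 512K is a multiple of 64k as well.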

In the meantime I did some further tests: I created an ext2 filesystem
on the partition and a 60GiB container image on it. I used that image
with dm-crypt, with the same parameters as before. No matter what I do
here, I cannot trigger the same behavior.

Maybe it is an interaction issue between dm-crypt and the s/w RAID. But
at this point, I have no idea how to further diagnose/test it. If you
can point me in any direction that would be great...

With Kind Regards from Germany
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu
  services: custom software [desktop, mobile, web], server administration
