Date:	Fri, 21 Jun 2013 14:44:34 +0000
From:	Christoph Lameter <cl@...ux.com>
To:	Roland Dreier <roland@...nel.org>
cc:	Ingo Molnar <mingo@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Motohiro KOSAKI <kosaki.motohiro@...il.com>,
	penberg@...nel.org,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>
Subject: Re: [PATCH] mm: Revert pinned_vm braindamage

On Thu, 20 Jun 2013, Roland Dreier wrote:

> Christoph, your argument would be a lot more convincing if you stopped
> repeating this nonsense.  Sure, in a strict sense, it might be true

Well, this is about tracking pages that need to stay resident, and since
the kernel does the pinning through the IB subsystem, it is trackable
right there.  No nonsense and no need for a separate pinning system call.

> that the IB subsystem in the kernel is the code that actually pins
> memory, but given that unprivileged userspace can tell the kernel to
> pin arbitrary parts of its memory for any amount of time, is that
> relevant?  And in fact taking your "initiate" word choice above, I
> don't even think your statement is true -- userspace initiates the
> pinning by, for example, doing an IB memory registration (libibverbs
> ibv_reg_mr() call), which turns into a system call, which leads to the
> kernel trying to pin pages.  The pages aren't unpinned until userspace
> unregisters the memory (or causes a cleanup by closing the context
> fd).

In some sense userspace initiates everything, since the kernel's purpose
is to run applications. So you could say that everything is user initiated
if you wanted to.

However, the user visible mechanism here is a registration of memory with
the IB subsystem for RDMA. The primary intent is not to pin the pages but
to make memory available for remote I/O. The pages are pinned *because*
otherwise remote RDMA operations could corrupt memory due to the kernel
moving/evicting memory.
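
As a rough illustration of that distinction, here is a toy model in which
the user-visible operation is "register memory for remote I/O" and the
pinning only happens as a side effect of it. This is not kernel or
libibverbs code: ToyProcess, register_memory(), deregister_memory() and the
per-process page limit are all made-up names and assumptions for the sketch.

```python
class ToyProcess:
    """Toy stand-in for per-process pin accounting (hypothetical, not kernel code)."""

    def __init__(self, pin_limit_pages):
        # Assumed per-process cap on pinned pages, for illustration only.
        self.pin_limit_pages = pin_limit_pages
        self.pinned_pages = 0
        self.regions = {}
        self._next_handle = 1

    def register_memory(self, npages):
        """Make npages available for remote I/O; pinning is a side effect."""
        if self.pinned_pages + npages > self.pin_limit_pages:
            raise MemoryError("registration would exceed the pin limit")
        handle = self._next_handle
        self._next_handle += 1
        self.regions[handle] = npages
        # The pages stay pinned for as long as the registration exists.
        self.pinned_pages += npages
        return handle

    def deregister_memory(self, handle):
        """Drop the registration; the pages become unpinned again."""
        self.pinned_pages -= self.regions.pop(handle)
```

The caller never asks for pinning as such. It asks for a registration, and
the accounting can be done at the point where the registration is made.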

> Here's an argument by analogy.  Would it make any sense for me to say
> userspace can't mlock memory, because only the kernel can set
> VM_LOCKED on a vma?  Of course not.  Userspace has the mlock() system
> call, and although the actual work happens in the kernel, we clearly
> want to be able to limit the amount of memory locked by the kernel ON
> BEHALF OF USERSPACE.

I would think that mlock() is a memory management function, and therefore
the app/user directly says that the memory is not to be evicted.
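
For contrast, a minimal sketch of that direct case: userspace itself asks
for one page to stay resident. It assumes a Linux-like system whose libc
exports mlock() and munlock(), reached here through ctypes. The helper name
try_mlock_one_page() is made up for the example, and whether the call
succeeds depends on RLIMIT_MEMLOCK and privileges, so no output is shown.

```python
import ctypes
import ctypes.util
import errno
import mmap


def try_mlock_one_page():
    """Map one anonymous page, ask the kernel to lock it, report the result.

    Returns (rc, err): rc is mlock()'s return value, err is None on success
    or the errno name (e.g. "ENOMEM") on failure.
    """
    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    libc.mlock.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
    libc.mlock.restype = ctypes.c_int
    libc.munlock.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
    libc.munlock.restype = ctypes.c_int

    size = mmap.PAGESIZE
    buf = mmap.mmap(-1, size)  # anonymous, page-aligned mapping
    view = ctypes.c_char.from_buffer(buf)
    try:
        addr = ctypes.addressof(view)
        rc = libc.mlock(addr, size)  # the explicit "do not evict" request
        if rc != 0:
            code = ctypes.get_errno()
            return rc, errno.errorcode.get(code, str(code))
        libc.munlock(addr, size)
        return rc, None
    finally:
        del view      # release the buffer export before closing the map
        buf.close()
```

Here the request to keep memory resident is the whole point of the call,
not a consequence of setting up I/O.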

This is different for the IB subsystem, which deals with I/O and only
indirectly with memory. If we had a different mechanism to prevent
reclaim etc., then we would not need to pin the pages.

Actually, there is such a mechanism that could be used here. If you had a
reserved memory region that is not mapped by the kernel (boot time alloc,
device memory), then you could use VM_PFNMAP to refer to that region and
the kernel would not be able to do reclaim on that memory. No pinning would
be necessary if the IB subsystem registered that type of memory.


