linux-kernel - Re: [PATCH] ib_umem_release should decrement mm->pinned_vm from ib_umem

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <20140828215135.GA11602@sbohrermbp13-local.rgmadvisors.com>
Date:	Thu, 28 Aug 2014 16:51:35 -0500
From:	Shawn Bohrer <shawn.bohrer@...il.com>
To:	Haggai Eran <haggaie@...lanox.com>
Cc:	Shachar Raindel <raindel@...lanox.com>,
	Roland Dreier <roland@...nel.org>,
	Christoph Lameter <cl@...ux.com>,
	Sean Hefty <sean.hefty@...el.com>,
	Hal Rosenstock <hal.rosenstock@...il.com>,
	"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"tomk@...advisors.com" <tomk@...advisors.com>,
	Shawn Bohrer <sbohrer@...advisors.com>,
	Yishai Hadas <yishaih@...lanox.com>,
	Or Gerlitz <ogerlitz@...lanox.com>
Subject: Re: [PATCH] ib_umem_release should decrement mm->pinned_vm from
 ib_umem_get

On Thu, Aug 28, 2014 at 02:48:19PM +0300, Haggai Eran wrote:
> On 26/08/2014 00:07, Shawn Bohrer wrote:
> >>>> The following patch fixes the issue by storing the mm_struct of the
> >> > 
> >> > You are doing more than just storing the mm_struct - you are taking
> >> > a reference to the process' mm.  This can lead to a massive resource
> >> > leakage. The reason is bit complex: The destruction flow for IB
> >> > uverbs is based upon releasing the file handle for it. Once the file
> >> > handle is released, all MRs, QPs, CQs, PDs, etc. that the process
> >> > allocated are released.  For the kernel to release the file handle,
> >> > the kernel reference count to it needs to reach zero.  Most IB
> >> > implementations expose some hardware registers to the application by
> >> > allowing it to mmap the uverbs device file.  This mmap takes a
> >> > reference to uverbs device file handle that the application opened.
> >> > This reference is dropped when the process mm is released during the
> >> > process destruction.  Your code takes a reference to the mm that
> >> > will only be released when the parent MR/QP is released.
> >> > 
> >> > Now, we have a deadlock - the mm is waiting for the MR to be
> >> > destroyed, the MR is waiting for the file handle to be destroyed,
> >> > and the file handle is waiting for the mm to be destroyed.
> >> > 
> >> > The proper solution is to keep a reference to the task_pid (using
> >> > get_task_pid), and use this pid to get the task_struct and from it
> >> > the mm_struct during the destruction flow.
> >  
> > I'll put together a patch using get_task_pid() and see if I can
> > test/reproduce the issue.  This may take a couple of days since we
> > have to test this in production at the moment.
> >  
> 
> Hi,
> 
> I just wanted to point out that while working on the on demand paging patches
> we also needed to keep a reference to the task pid (to make sure we always 
> handle page faults on behalf of the correct mm struct). You can find the 
> relevant code in the patch titled "IB/core: Add support for on demand paging 
> regions" [1].
> 
> Haggai
> 
> [1] https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg20552.html

Haggai,

I looked over the on demand paging patch and I'm not sure if you are
suggesting that it already fixes my issue, or that I should use it as
a reference for my code.  In any case I just sent a v2 of a patch that
appears to fix my issue.

--
Shawn
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/