Message-ID: <20140828215135.GA11602@sbohrermbp13-local.rgmadvisors.com>
Date: Thu, 28 Aug 2014 16:51:35 -0500
From: Shawn Bohrer <shawn.bohrer@...il.com>
To: Haggai Eran <haggaie@...lanox.com>
Cc: Shachar Raindel <raindel@...lanox.com>,
Roland Dreier <roland@...nel.org>,
Christoph Lameter <cl@...ux.com>,
Sean Hefty <sean.hefty@...el.com>,
Hal Rosenstock <hal.rosenstock@...il.com>,
"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"tomk@...advisors.com" <tomk@...advisors.com>,
Shawn Bohrer <sbohrer@...advisors.com>,
Yishai Hadas <yishaih@...lanox.com>,
Or Gerlitz <ogerlitz@...lanox.com>
Subject: Re: [PATCH] ib_umem_release should decrement mm->pinned_vm from
ib_umem_get
On Thu, Aug 28, 2014 at 02:48:19PM +0300, Haggai Eran wrote:
> On 26/08/2014 00:07, Shawn Bohrer wrote:
> >>>> The following patch fixes the issue by storing the mm_struct of the
> >> >
> >> > You are doing more than just storing the mm_struct - you are taking
> >> > a reference to the process' mm. This can lead to a massive resource
> >> > leakage. The reason is a bit complex: The destruction flow for IB
> >> > uverbs is based upon releasing the file handle for it. Once the file
> >> > handle is released, all MRs, QPs, CQs, PDs, etc. that the process
> >> > allocated are released. For the kernel to release the file handle,
> >> > the kernel reference count to it needs to reach zero. Most IB
> >> > implementations expose some hardware registers to the application by
> >> > allowing it to mmap the uverbs device file. This mmap takes a
> >> > reference to the uverbs device file handle that the application opened.
> >> > This reference is dropped when the process mm is released during the
> >> > process destruction. Your code takes a reference to the mm that
> >> > will only be released when the parent MR/QP is released.
> >> >
> >> > Now, we have a deadlock - the mm is waiting for the MR to be
> >> > destroyed, the MR is waiting for the file handle to be destroyed,
> >> > and the file handle is waiting for the mm to be destroyed.
> >> >
> >> > The proper solution is to keep a reference to the task_pid (using
> >> > get_task_pid), and use this pid to get the task_struct and from it
> >> > the mm_struct during the destruction flow.
> >
> > I'll put together a patch using get_task_pid() and see if I can
> > test/reproduce the issue. This may take a couple of days since we
> > have to test this in production at the moment.
> >
>
> Hi,
>
> I just wanted to point out that while working on the on demand paging patches
> we also needed to keep a reference to the task pid (to make sure we always
> handle page faults on behalf of the correct mm struct). You can find the
> relevant code in the patch titled "IB/core: Add support for on demand paging
> regions" [1].
>
> Haggai
>
> [1] https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg20552.html

Haggai,

I looked over the on demand paging patch and I'm not sure whether you are
suggesting that it already fixes my issue or that I should use it as
a reference for my code. In any case, I just sent a v2 of a patch that
appears to fix my issue.
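
For anyone following the thread, the reference cycle described in the quoted
review, and the reason holding a pid instead of the mm avoids it, can be
modelled in userspace roughly as below. This is a toy sketch, not kernel code:
struct model, mm_put(), file_put() and run_exit() are hypothetical names made
up for illustration, and the model assumes exactly one MR, one mmap of the
uverbs file, and one open file descriptor.

```c
#include <stdbool.h>

/* Toy userspace model of the mm / uverbs file / MR reference counts.
 * All names here are hypothetical; none of this is real kernel API. */
struct model {
    int  mm_users;       /* references held on the process mm */
    int  file_refs;      /* references held on the uverbs file */
    bool mr_holds_mm;    /* true: the MR took a reference on the mm */
    bool mm_freed;
    bool file_released;
    bool mr_destroyed;
};

static void file_put(struct model *m);

/* Drop one mm reference. Tearing down the mm removes the device mmap,
 * which drops the reference that mapping held on the uverbs file. */
static void mm_put(struct model *m)
{
    if (--m->mm_users == 0) {
        m->mm_freed = true;
        file_put(m);
    }
}

/* Drop one file reference. The final put releases the file, which
 * destroys the MR and with it any mm reference the MR was holding. */
static void file_put(struct model *m)
{
    if (--m->file_refs == 0) {
        m->file_released = true;
        m->mr_destroyed = true;
        if (m->mr_holds_mm)
            mm_put(m);
    }
}

/* Simulate process exit: the task drops its own mm reference and
 * closes its file descriptor. Returns the final state of the model. */
static struct model run_exit(bool mr_holds_mm)
{
    struct model m = {
        .mm_users    = 1 + (mr_holds_mm ? 1 : 0), /* task (+ MR) */
        .file_refs   = 2,                         /* fd + mmap */
        .mr_holds_mm = mr_holds_mm,
    };

    mm_put(&m);    /* task lets go of its mm */
    file_put(&m);  /* task closes the uverbs fd */
    return m;
}
```

With run_exit(true) nothing is ever freed: the mm keeps the extra reference
from the MR, the file keeps the reference from the mmap, and the MR is only
destroyed when the file is released. With run_exit(false), which stands in for
the MR keeping only a pid, the mm goes away first, then the file, then the MR.
In that case the release path would presumably have to tolerate the mm
already being gone when it looks the task up by pid.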
--
Shawn