lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <53FF1703.3010204@mellanox.com>
Date:	Thu, 28 Aug 2014 14:48:19 +0300
From:	Haggai Eran <haggaie@...lanox.com>
To:	Shawn Bohrer <shawn.bohrer@...il.com>,
	Shachar Raindel <raindel@...lanox.com>
CC:	Roland Dreier <roland@...nel.org>,
	Christoph Lameter <cl@...ux.com>,
	"Sean Hefty" <sean.hefty@...el.com>,
	Hal Rosenstock <hal.rosenstock@...il.com>,
	"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"tomk@...advisors.com" <tomk@...advisors.com>,
	Shawn Bohrer <sbohrer@...advisors.com>,
	Yishai Hadas <yishaih@...lanox.com>,
	Or Gerlitz <ogerlitz@...lanox.com>
Subject: Re: [PATCH] ib_umem_release should decrement mm->pinned_vm from ib_umem_get

On 26/08/2014 00:07, Shawn Bohrer wrote:
>>>> The following patch fixes the issue by storing the mm_struct of the
>> > 
>> > You are doing more than just storing the mm_struct - you are taking
>> > a reference to the process' mm.  This can lead to a massive resource
>> > leakage. The reason is bit complex: The destruction flow for IB
>> > uverbs is based upon releasing the file handle for it. Once the file
>> > handle is released, all MRs, QPs, CQs, PDs, etc. that the process
>> > allocated are released.  For the kernel to release the file handle,
>> > the kernel reference count to it needs to reach zero.  Most IB
>> > implementations expose some hardware registers to the application by
>> > allowing it to mmap the uverbs device file.  This mmap takes a
>> > reference to uverbs device file handle that the application opened.
>> > This reference is dropped when the process mm is released during the
>> > process destruction.  Your code takes a reference to the mm that
>> > will only be released when the parent MR/QP is released.
>> > 
>> > Now, we have a deadlock - the mm is waiting for the MR to be
>> > destroyed, the MR is waiting for the file handle to be destroyed,
>> > and the file handle is waiting for the mm to be destroyed.
>> > 
>> > The proper solution is to keep a reference to the task_pid (using
>> > get_task_pid), and use this pid to get the task_struct and from it
>> > the mm_struct during the destruction flow.
>  
> I'll put together a patch using get_task_pid() and see if I can
> test/reproduce the issue.  This may take a couple of days since we
> have to test this in production at the moment.
>  

Hi,

I just wanted to point out that while working on the on demand paging patches
we also needed to keep a reference to the task pid (to make sure we always 
handle page faults on behalf of the correct mm struct). You can find the 
relevant code in the patch titled "IB/core: Add support for on demand paging 
regions" [1].

Haggai

[1] https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg20552.html
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ