lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <531F1550.3080401@mellanox.com>
Date:	Tue, 11 Mar 2014 15:53:20 +0200
From:	Or Gerlitz <ogerlitz@...lanox.com>
To:	Roland Dreier <roland@...nel.org>, Jiri Kosina <jkosina@...e.cz>
CC:	Sagi Grimberg <sagig@...lanox.com>,
	Amir Vadai <amirv@...lanox.com>,
	Eli Cohen <eli@....mellanox.co.il>,
	Eugenia Emantayev <eugenia@...lanox.com>,
	"David S. Miller" <davem@...emloft.net>,
	Mel Gorman <mgorman@...e.de>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	Saeed Mahameed <saeedm@...lanox.com>,
	Shlomo Pongratz <shlomop@...lanox.com>
Subject: Re: [PATCH] mlx4: Use GFP_NOFS calls during the ipoib TX path when
 creating the QP

On 05/03/2014 21:25, Roland Dreier wrote:
> It's quite clear that this is a general problem with IPoIB connected
> mode on any IB device.  In connected mode, a packet send can trigger
> establishing a new connection, which will allocate a new QP, which in
> particular will allocate memory for the QP in the low-level IB device
> driver.  Currently I'm positive that every driver will do GFP_KERNEL
> allocations when allocating a QP (ehca does both a GFP_KERNEL
> kmem_cache allocation and vmalloc in internal_create_qp(), mlx5 and
> mthca are similar to mlx4 and qib does vmalloc() in qib_create_qp()).
> So this patch needs to be extended to the other 4 IB device drivers in
> the tree.
>
> Also, I don't think GFP_NOFS is enough -- it seems we need GFP_NOIO,
> since we could be swapping to a block device over iSCSI over IPoIB-CM,
> so even non-FS stuff could deadlock.
>
> I don't think it makes any sense to have a "do_not_deadlock" module
> parameter, especially one that defaults to "false."  If this is the
> right thing to do, then we should just unconditionally do it.
>
> It does seem that only using GFP_NOIO when we really need to would be
> a very difficult problem--how can we carry information about whether a
> particular packet is involved in freeing memory through all the layers
> of, say, NFS, TCP, IPSEC, bonding, &c?

Agree with all the above... next,

If we don't have away to nicely overcome the layer violations here, 
let's change IPoIB so they always ask the
IB driver to allocate QPs used for Connected Mode in a GFP_NOIO manner, 
to be practical I suggest the following:

1. Add new QP creation flag IB_QP_CREATE_USE_GFP to the existing 
creation flags of struct ib_qp_init_attr
and a new "gfp_t gfp" field to that structure too

2. in the IPoIB CM code, do the vzalloc allocation for new connection in 
GFP_NOIO manner and issue
the call to create QP with setting the IB_QP_CREATE_USE_GFP flag and 
GFO_NOIO to the gfp field

3. If the QP creation fails, with -EINVAL, issue a warning and retry the 
QP creation attempt without the GFP setting

4. implement in the mlx4 driver the support for GFP directives on QP 
creation

5. for the rest of the IB drivers, return -EINVAL if 
IB_QP_CREATE_USE_GFP is set

This will allow to provide working solution for mlx4 users and gradually 
add support for the rest of the IB drivers.

as for proper patch planning

patch #1 / items 1 and 5
patch #2 / item 4
patch #3 / item 3

Re item 5 -- I made a check and the ehca, ipath and mthca driver already 
return -EINVAL if provided with any creation flag, so you only need to 
patch the qib driver in qib_create_qp() to do that as well which is trivial.

As for the rest of the code, you practically have it all by now, just 
need to port the mlx4 changes you did to the suggested framework, remove 
the module param (which you don't like either) and add the new gfp_t 
field to ib_qp_init_attr

So sounds like a plan that makes sense?

Or.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ