Message-ID: <531F1550.3080401@mellanox.com>
Date: Tue, 11 Mar 2014 15:53:20 +0200
From: Or Gerlitz <ogerlitz@...lanox.com>
To: Roland Dreier <roland@...nel.org>, Jiri Kosina <jkosina@...e.cz>
CC: Sagi Grimberg <sagig@...lanox.com>,
Amir Vadai <amirv@...lanox.com>,
Eli Cohen <eli@....mellanox.co.il>,
Eugenia Emantayev <eugenia@...lanox.com>,
"David S. Miller" <davem@...emloft.net>,
Mel Gorman <mgorman@...e.de>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
Saeed Mahameed <saeedm@...lanox.com>,
Shlomo Pongratz <shlomop@...lanox.com>
Subject: Re: [PATCH] mlx4: Use GFP_NOFS calls during the ipoib TX path when
creating the QP
On 05/03/2014 21:25, Roland Dreier wrote:
> It's quite clear that this is a general problem with IPoIB connected
> mode on any IB device. In connected mode, a packet send can trigger
> establishing a new connection, which will allocate a new QP, which in
> particular will allocate memory for the QP in the low-level IB device
> driver. Currently I'm positive that every driver will do GFP_KERNEL
> allocations when allocating a QP (ehca does both a GFP_KERNEL
> kmem_cache allocation and vmalloc in internal_create_qp(), mlx5 and
> mthca are similar to mlx4 and qib does vmalloc() in qib_create_qp()).
> So this patch needs to be extended to the other 4 IB device drivers in
> the tree.
>
> Also, I don't think GFP_NOFS is enough -- it seems we need GFP_NOIO,
> since we could be swapping to a block device over iSCSI over IPoIB-CM,
> so even non-FS stuff could deadlock.
>
> I don't think it makes any sense to have a "do_not_deadlock" module
> parameter, especially one that defaults to "false." If this is the
> right thing to do, then we should just unconditionally do it.
>
> It does seem that only using GFP_NOIO when we really need to would be
> a very difficult problem--how can we carry information about whether a
> particular packet is involved in freeing memory through all the layers
> of, say, NFS, TCP, IPSEC, bonding, &c?
Agree with all the above... next:
If we don't have a way to nicely overcome the layer violations here,
let's change IPoIB so it always asks the IB driver to allocate QPs used
for Connected Mode in a GFP_NOIO manner. To be practical, I suggest the
following:
1. Add a new QP creation flag, IB_QP_CREATE_USE_GFP, to the existing
creation flags of struct ib_qp_init_attr, and a new "gfp_t gfp" field
to that structure as well.
2. In the IPoIB CM code, do the vzalloc() allocation for a new
connection in a GFP_NOIO manner, and issue the QP creation call with
the IB_QP_CREATE_USE_GFP flag set and GFP_NOIO in the gfp field.
3. If the QP creation fails with -EINVAL, issue a warning and retry
the QP creation attempt without the GFP setting.
4. Implement in the mlx4 driver the support for GFP directives on QP
creation.
5. For the rest of the IB drivers, return -EINVAL if
IB_QP_CREATE_USE_GFP is set.
This will allow us to provide a working solution for mlx4 users and
gradually add support in the rest of the IB drivers.
As for proper patch planning:
patch #1 / items 1 and 5
patch #2 / item 4
patch #3 / item 3
Re item 5 -- I made a check, and the ehca, ipath and mthca drivers
already return -EINVAL if provided with any creation flag, so you only
need to patch the qib driver in qib_create_qp() to do that as well,
which is trivial.
As for the rest of the code, you practically have it all by now; you
just need to port the mlx4 changes you did to the suggested framework,
remove the module param (which you don't like either), and add the new
gfp_t field to ib_qp_init_attr.
So, does this sound like a plan that makes sense?
Or.