Message-ID: <CAJ3xEMjomTc5JNNEknk-JMBUtvAcb5xY8La4zT4mKG2Ad=V85Q@mail.gmail.com>
Date: Tue, 3 May 2016 10:45:43 +0300
From: Or Gerlitz <gerlitz.or@...il.com>
To: Haggai Abramovsky <hagaya@...lanox.com>,
"David S. Miller" <davem@...emloft.net>
Cc: Linux Netdev List <netdev@...r.kernel.org>,
Sinan Kaya <okaya@...eaurora.org>,
Timur Tabi <timur@...eaurora.org>,
Eran Ben Elisha <eranbe@...lanox.com>,
Yishai Hadas <yishaih@...lanox.com>,
Tal Alon <talal@...lanox.com>,
Saeed Mahameed <saeedm@...lanox.com>
Subject: Re: [PATCH v1 net] net/mlx4: Avoid wrong virtual mappings
On Wed, Apr 27, 2016 at 10:07 AM, Haggai Abramovsky <hagaya@...lanox.com> wrote:
> The dma_alloc_coherent() function returns a virtual address which can
> be used for coherent access to the underlying memory. On some
> architectures, like arm64, undefined behavior results if this memory is
> also accessed via virtual mappings that are not coherent. Because of
> their undefined nature, operations like virt_to_page() return garbage
> when passed virtual addresses obtained from dma_alloc_coherent(). Any
> subsequent mappings via vmap() of the garbage page values are unusable
> and result in bad things like bus errors (synchronous aborts in ARM64
> speak).
>
> The mlx4 driver contains code that does the equivalent of
> vmap(virt_to_page(dma_alloc_coherent())), which results in an oops when
> the device is opened.
>
> Prevent the Ethernet driver from running this problematic code by
> forcing it to allocate contiguous memory. The InfiniBand driver first
> tries to allocate contiguous memory and, on failure, falls back to
> working with fragmented memory.
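
[Illustration: the allocation policy described in the quoted text
(contiguous first, fragments only for a caller that can cope with them)
can be sketched in plain userspace C. This is not the mlx4 code.
struct wqe_buf, alloc_contig(), buf_alloc(), buf_free(), FRAG_SIZE and
the max_contig/allow_frags parameters are all hypothetical names, and
malloc() merely stands in for the kernel's coherent DMA allocator.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define FRAG_SIZE 4096u /* hypothetical fragment size (one page) */

/* Hypothetical descriptor: one contiguous chunk or several fragments. */
struct wqe_buf {
    size_t nfrags;    /* 1 when the buffer is contiguous */
    size_t frag_size; /* size of each fragment in bytes */
    void **frags;     /* array of nfrags pointers */
};

/* Stand-in for a contiguous allocator. It fails above max_contig to
 * mimic a system whose memory is too fragmented for large requests. */
static void *alloc_contig(size_t size, size_t max_contig)
{
    return size <= max_contig ? malloc(size) : NULL;
}

/* Release every fragment and reset the descriptor. */
static void buf_free(struct wqe_buf *buf)
{
    if (!buf->frags)
        return;
    for (size_t i = 0; i < buf->nfrags; i++)
        free(buf->frags[i]);
    free(buf->frags);
    buf->frags = NULL;
    buf->nfrags = 0;
}

/* Try one contiguous allocation first. Fall back to page-sized
 * fragments only if the caller allows it (the "IB" case); otherwise
 * fail (the "Ethernet" case). Returns 0 on success, -1 on failure. */
static int buf_alloc(struct wqe_buf *buf, size_t size,
                     size_t max_contig, int allow_frags)
{
    assert(size > 0);
    buf->nfrags = 0;
    buf->frag_size = 0;
    buf->frags = NULL;

    void *p = alloc_contig(size, max_contig);
    if (p) {
        buf->frags = malloc(sizeof(*buf->frags));
        if (!buf->frags) {
            free(p);
            return -1;
        }
        buf->frags[0] = p;
        buf->nfrags = 1;
        buf->frag_size = size;
        return 0;
    }

    if (!allow_frags)
        return -1;

    size_t n = (size + FRAG_SIZE - 1) / FRAG_SIZE; /* round up */
    buf->frags = calloc(n, sizeof(*buf->frags));
    if (!buf->frags)
        return -1;
    buf->frag_size = FRAG_SIZE;
    for (size_t i = 0; i < n; i++) {
        buf->frags[i] = malloc(FRAG_SIZE);
        if (!buf->frags[i]) {
            buf_free(buf); /* frees the i fragments allocated so far */
            return -1;
        }
        buf->nfrags = i + 1;
    }
    return 0;
}
```

[In this sketch a caller that cannot handle fragments passes
allow_frags = 0 and simply sees the allocation fail, while a caller
that can handle them passes allow_frags = 1 and must then walk
buf->frags instead of assuming a single linear mapping.]
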
Dave,
The patch changes the driver to do a single allocation for potentially
very large HW WQE descriptor buffers, such as those used by the RDMA
(mlx5_ib) driver. The IB driver does have the means to cope with
fragmented allocations, and in RDMA use cases QPs are set up frequently,
not only at system startup but throughout the system's lifetime (e.g.,
every now and then on production systems). Given all of the above, we
prefer that the patch go to net-next and not net. This way (1) the code
becomes correct, and (2) we have the chance to do whatever investigation
is needed and, if required, add a follow-up fix for 4.7-rc.
Or.
> Signed-off-by: Haggai Abramovsky <hagaya@...lanox.com>
> Signed-off-by: Yishai Hadas <yishaih@...lanox.com>
> Reported-by: David Daney <david.daney@...ium.com>
> Tested-by: Sinan Kaya <okaya@...eaurora.org>