Date:   Mon, 16 Jul 2018 19:46:05 +0300
From:   Max Gurtovoy <maxg@...lanox.com>
To:     Sagi Grimberg <sagi@...mberg.me>, Leon Romanovsky <leon@...nel.org>
CC:     Doug Ledford <dledford@...hat.com>,
        Jason Gunthorpe <jgg@...lanox.com>,
        RDMA mailing list <linux-rdma@...r.kernel.org>,
        Saeed Mahameed <saeedm@...lanox.com>,
        Steve Wise <swise@...ngridcomputing.com>,
        linux-netdev <netdev@...r.kernel.org>
Subject: Re: [PATCH mlx5-next] RDMA/mlx5: Don't use cached IRQ affinity mask



On 7/16/2018 5:59 PM, Sagi Grimberg wrote:
> 
>> Hi,
>> I've tested this patch and seems problematic at this moment.
> 
> Problematic how? what are you seeing?

Connection failures and same error Steve saw:

[Mon Jul 16 16:19:11 2018] nvme nvme0: Connect command failed, error wo/DNR bit: -16402
[Mon Jul 16 16:19:11 2018] nvme nvme0: failed to connect queue: 2 ret=-18


> 
>> maybe this is because of the bug that Steve mentioned in the NVMe 
>> mailing list. Sagi mentioned that we should fix it in the NVMe/RDMA 
>> initiator and I'll run his suggestion as well.
> 
> Is your device irq affinity linear?

When it's linear and the balancer is stopped the patch works.

> 
>> BTW, when I run the blk_mq_map_queues it works for every irq affinity.
> 
> But its probably not aligned to the device vector affinity.

but I guess it's better in some cases.

I've checked the situation before Leon's patch, with all the vectors
set to CPU 0. In this case (I think that this was the initial report by
Steve), we use the affinity_hint (Israel's and Saeed's patches, where we
use dev->priv.irq_info[vector].mask) and it worked fine.

Steve,
Can you share your configuration (kernel, HCA, affinity map, connect
command, lscpu)?
I want to reproduce it in my lab.
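
To make "linear" affinity easy to compare between setups, here is a
minimal sketch. It is not taken from the patch or the driver. It assumes
"linear" means that the i-th completion vector is pinned to exactly CPU i,
and that the per-vector strings are in the kernel cpulist format (as read
from /proc/irq/<irq>/smp_affinity_list). The function names and the
vector-to-cpulist dictionary are hypothetical.

```python
from typing import Dict, List


def parse_cpulist(text: str) -> List[int]:
    """Parse a kernel cpulist string such as "0-3,8" into sorted CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def is_linear(affinity: Dict[int, str]) -> bool:
    """Assumed definition: the i-th vector is pinned to exactly CPU i."""
    return all(
        parse_cpulist(affinity[vec]) == [i]
        for i, vec in enumerate(sorted(affinity))
    )


# Hypothetical maps: one linear, one with every vector moved to CPU 0.
linear_map = {0: "0", 1: "1", 2: "2", 3: "3"}
all_on_cpu0 = {0: "0", 1: "0", 2: "0", 3: "0"}
```

With these inputs, is_linear(linear_map) is True and
is_linear(all_on_cpu0) is False, which matches the two cases discussed
above.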

-Max.
