lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <894e18a4-4504-df48-6429-a04c222ca064@kernel.dk>
Date:   Mon, 19 Sep 2022 08:10:31 -0600
From:   Jens Axboe <axboe@...nel.dk>
To:     Liu Song <liusong@...ux.alibaba.com>, kbusch@...nel.org,
        hch@....de, sagi@...mberg.me
Cc:     linux-nvme@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH] nvme: request remote is usually not involved for nvme
 devices

On 9/18/22 10:10 AM, Liu Song wrote:
> 
> On 2022/9/18 00:50, Jens Axboe wrote:
>> On 9/17/22 10:40 AM, Liu Song wrote:
>>> From: Liu Song <liusong@...ux.alibaba.com>
>>>
>>> NVMe devices usually have a 1:1 mapping between "ctx" and "hctx",
>>> so when "nr_ctx" is equal to 1, there is no possibility of remote
>>> request, so the corresponding process can be simplified.
>> If the worry is the call overhead of blk_mq_complete_request_remote(),
>> why don't we just make that available as an inline instead? That seems
>> vastly superior to providing a random shortcut in a driver to avoid
>> calling it.
> 
> Hi
> 
> This is what I think about it. If it is an SSD with only one hw queue,
> remote requests will appear occasionally. As a real multi-queue
> device, nvme usually does not have remote requests, so we don't need
> to care about it. So even if "blk_mq_complete_request_remote" is
> called, there is a high probability that it will return false, and the
> cost of changing the call to "blk_mq_complete_request_remote" to
> determine whether "req->mq_hctx->nr_ctx" is 1 is not only a function
> call, but more judgments in "blk_mq_complete_request_remote". If
> "blk_mq_complete_request_remote" is decorated as inline, it also
> depends on the compiler, there is uncertainty.

I'm not disagreeing with any of that, my point is just that you're
hacking around this in the nvme driver. This is problematic whenever
core changes happen, because now we have to touch individual drivers.
While the expectation is that there are no remote IPI completions for
NVMe, queue starved devices do exist and those do see remote
completions.

This optimization belongs in the blk-mq core, not in nvme. I do think it
makes sense, you just need to solve it in blk-mq rather than in the nvme
driver.

-- 
Jens Axboe

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ