Message-ID: <e2574365-cb5b-4376-aa8e-adf05b788337@suse.de>
Date: Fri, 21 Jun 2024 08:17:21 +0200
From: Hannes Reinecke <hare@...e.de>
To: John Garry <john.g.garry@...cle.com>, axboe@...nel.dk, kbusch@...nel.org,
hch@....de, sagi@...mberg.me, jejb@...ux.ibm.com,
martin.petersen@...cle.com, viro@...iv.linux.org.uk, brauner@...nel.org,
dchinner@...hat.com, jack@...e.cz
Cc: djwong@...nel.org, linux-block@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-nvme@...ts.infradead.org,
linux-fsdevel@...r.kernel.org, tytso@....edu, jbongio@...gle.com,
linux-scsi@...r.kernel.org, ojaswin@...ux.ibm.com, linux-aio@...ck.org,
linux-btrfs@...r.kernel.org, io-uring@...r.kernel.org, nilay@...ux.ibm.com,
ritesh.list@...il.com, willy@...radead.org, agk@...hat.com,
snitzer@...nel.org, mpatocka@...hat.com, dm-devel@...ts.linux.dev,
Alan Adamson <alan.adamson@...cle.com>
Subject: Re: [Patch v9 10/10] nvme: Atomic write support
On 6/20/24 14:53, John Garry wrote:
> From: Alan Adamson <alan.adamson@...cle.com>
>
> Add support to set block layer request_queue atomic write limits. The
> limits will be derived from either the namespace or controller atomic
> parameters.
>
> NVMe atomic-related parameters are grouped into "normal" and "power-fail"
> (or PF) classes of parameter. For atomic write support, only PF parameters
> are of interest. The "normal" parameters are concerned with racing reads
> and writes (which also applies to PF). See NVM Command Set Specification
> Revision 1.0d section 2.1.4 for reference.
>
> Whether to use per namespace or controller atomic parameters is decided by
> NSFEAT bit 1 - see Figure 97: Identify – Identify Namespace Data
> Structure, NVM Command Set.
>
> NVMe namespaces may define an atomic boundary, whereby no atomic guarantees
> are provided for a write which straddles this per-lba space boundary. The
> block layer merging policy is such that no merges may occur in which the
> resultant request would straddle such a boundary.
>
> Unlike SCSI, NVMe specifies no granularity or alignment rules, apart from
> the atomic boundary rule. In addition, again unlike SCSI, there is no
> dedicated atomic write command - a write which adheres to the atomic size
> limit and boundary is implicitly atomic.
>
> If NSFEAT bit 1 is set, the following parameters are of interest:
> - NAWUPF (Namespace Atomic Write Unit Power Fail)
> - NABSPF (Namespace Atomic Boundary Size Power Fail)
> - NABO (Namespace Atomic Boundary Offset)
>
> and we set request_queue limits as follows:
> - atomic_write_unit_max = rounddown_pow_of_two(NAWUPF)
> - atomic_write_max_bytes = NAWUPF
> - atomic_write_boundary = NABSPF
>
> In the unlikely scenario that NABO is non-zero, atomic writes will not be
> supported at all, as dealing with this adds extra complexity. This
> policy may change in future.
>
> In all cases, atomic_write_unit_min is set to the logical block size.
>
> If NSFEAT bit 1 is unset, the following parameter is of interest:
> - AWUPF (Atomic Write Unit Power Fail)
>
> and we set request_queue limits as follows:
> - atomic_write_unit_max = rounddown_pow_of_two(AWUPF)
> - atomic_write_max_bytes = AWUPF
> - atomic_write_boundary = 0
>
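For anyone following along, the two cases above could be sketched in plain userspace C roughly as below. This is only an illustration of the mapping described in the commit message, not the code from the patch: struct atomic_limits, rounddown_pow2() and derive_atomic_limits() are hypothetical names, and the inputs are assumed to have already been converted from the raw Identify fields into bytes. Applying the NABO check only in the per-namespace case is likewise an assumption taken from where the commit message mentions it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container; not the kernel's struct queue_limits. */
struct atomic_limits {
	uint32_t unit_min;	/* bytes */
	uint32_t unit_max;	/* bytes, always a power of two */
	uint32_t max_bytes;	/* bytes */
	uint32_t boundary;	/* bytes, 0 means no boundary */
};

/* Largest power of two that is <= v; v must be non-zero. */
static uint32_t rounddown_pow2(uint32_t v)
{
	uint32_t p = 1;

	assert(v != 0);
	while (p <= v / 2)
		p *= 2;
	return p;
}

/*
 * ns_atomics:        NSFEAT bit 1 is set
 * ns_atomic_bytes:   NAWUPF, assumed already converted to bytes
 * ns_boundary_bytes: NABSPF, assumed already converted to bytes
 * nabo:              NABO, only tested for zero here
 * ctrl_atomic_bytes: AWUPF, assumed already converted to bytes
 * lba_size:          logical block size in bytes
 *
 * Returns false when atomic writes would not be advertised.
 */
static bool derive_atomic_limits(bool ns_atomics, uint32_t ns_atomic_bytes,
				 uint32_t ns_boundary_bytes, uint32_t nabo,
				 uint32_t ctrl_atomic_bytes, uint32_t lba_size,
				 struct atomic_limits *lim)
{
	uint32_t atomic_bytes, boundary;

	if (ns_atomics) {
		/* A non-zero boundary offset is not handled at all. */
		if (nabo != 0)
			return false;
		atomic_bytes = ns_atomic_bytes;
		boundary = ns_boundary_bytes;
	} else {
		atomic_bytes = ctrl_atomic_bytes;
		boundary = 0;
	}

	/* Guard added for this sketch: need at least one logical block. */
	if (lba_size == 0 || atomic_bytes < lba_size)
		return false;

	lim->unit_min = lba_size;
	lim->unit_max = rounddown_pow2(atomic_bytes);
	lim->max_bytes = atomic_bytes;
	lim->boundary = boundary;
	return true;
}
```

With a 512-byte logical block size and a 12288-byte per-namespace atomic unit, the sketch yields unit_max = 8192 while max_bytes stays at 12288, which is the reason the two limits are kept separate.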
> A new function, nvme_valid_atomic_write(), is also called from the
> submission path to verify that a request which has been submitted to the
> driver will actually be executed atomically. This is needed because, as
> mentioned, there is no dedicated NVMe atomic write command (which could
> return an error for a command which exceeds the controller atomic write limits).
>
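The kind of check such a submission-path helper has to make can be sketched as follows. This is a standalone illustration under stated assumptions, not the body of nvme_valid_atomic_write(): atomic_write_ok() is a hypothetical name, all quantities are in bytes, and max_bytes stands for whichever atomic size limit the driver compares against. Division is used rather than masking so that the sketch does not assume a power-of-two boundary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true if a write of len bytes starting at byte offset start
 * fits within the atomic size limit and does not straddle an atomic
 * boundary. A boundary of 0 means the device reports no boundary.
 */
static bool atomic_write_ok(uint64_t start, uint32_t len,
			    uint32_t max_bytes, uint32_t boundary)
{
	assert(len > 0);

	/* Too large to be covered by the atomic guarantee. */
	if (len > max_bytes)
		return false;

	if (boundary) {
		uint64_t end = start + len - 1;

		/* First and last byte must fall in the same boundary window. */
		if (start / boundary != end / boundary)
			return false;
	}

	return true;
}
```

For example, with a 64 KiB boundary an 8 KiB write starting at offset 61440 ends at byte 69631 and so crosses into the next window, whereas a 4 KiB write at the same offset ends at byte 65535 and stays inside the first one.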
> Note on NABSPF:
> There seems to be some vagueness in the spec as to whether NABSPF applies
> for NSFEAT bit 1 being unset. Figure 97 does not explicitly mention NABSPF
> and how it is affected by bit 1. However, Figure 4 does say to check Figure
> 97 for info about per-namespace parameters, which NABSPF is, so it is
> implied. However, nvme_update_disk_info() currently does check namespace
> parameter NABO regardless of this bit.
>
> Signed-off-by: Alan Adamson <alan.adamson@...cle.com>
> Reviewed-by: Keith Busch <kbusch@...nel.org>
> Reviewed-by: Martin K. Petersen <martin.petersen@...cle.com>
> jpg: total rewrite
> Signed-off-by: John Garry <john.g.garry@...cle.com>
> ---
> drivers/nvme/host/core.c | 52 ++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 52 insertions(+)
>
Reviewed-by: Hannes Reinecke <hare@...e.de>
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@...e.de +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich