lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <Yy1zkMH0f9ski4Sg@infradead.org>
Date:   Fri, 23 Sep 2022 01:51:28 -0700
From:   Christoph Hellwig <hch@...radead.org>
To:     Daniil Lunev <dlunev@...gle.com>
Cc:     Christoph Hellwig <hch@...radead.org>,
        Sarthak Kukreti <sarthakkukreti@...omium.org>,
        Stefan Hajnoczi <stefanha@...hat.com>, dm-devel@...hat.com,
        linux-block@...r.kernel.org, linux-ext4@...r.kernel.org,
        linux-kernel@...r.kernel.org,
        virtualization@...ts.linux-foundation.org,
        Jens Axboe <axboe@...nel.dk>,
        "Michael S . Tsirkin" <mst@...hat.com>,
        Jason Wang <jasowang@...hat.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Alasdair Kergon <agk@...hat.com>,
        Mike Snitzer <snitzer@...nel.org>,
        Theodore Ts'o <tytso@....edu>,
        Andreas Dilger <adilger.kernel@...ger.ca>,
        Bart Van Assche <bvanassche@...gle.com>,
        Evan Green <evgreen@...gle.com>,
        Gwendal Grignou <gwendal@...gle.com>
Subject: Re: [PATCH RFC 0/8] Introduce provisioning primitives for thinly
 provisioned storage

On Wed, Sep 21, 2022 at 07:48:50AM +1000, Daniil Lunev wrote:
> > There is no such thing as WRITE UNAVAILABLE in NVMe.
> Apologize, that is WRITE UNCORRECTABLE. Chapter 3.2.7 of
> NVM Express NVM Command Set Specification 1.0b

Write uncorrectable is a very different thing, and the equivalent of the
horribly misnamed SCSI WRITE LONG COMMAND.  It injects an unrecoverable
error, and does not provision anything.

> * Each application is potentially allowed to consume the entirety
>   of the disk space - there is no strict size limit for application
> * Applications need to pre-allocate space sometime, for which
>   they use fallocate. Once the operation succeeded, the application
>   assumed the space is guaranteed to be there for it.
> * Since filesystems on the volumes are independent, filesystem
>   level enforcement of size constraints is impossible and the only
>   common level is the thin pool, thus, each fallocate has to find its
>   representation in thin pool one way or another - otherwise you
>   may end up in the situation, where FS thinks it has allocated space
>   but when it tries to actually write it, the thin pool is already
>   exhausted.
> * Hole-Punching fallocate will not reach the thin pool, so the only
>   solution presently is zero-writing pre-allocate.

To me it sounds like you want a non-thin pool in dm-thin and/or
guaranted space reservations for it.

> * Thus, a provisioning block operation allows an interface specific
>   operation that guarantees the presence of the block in the
>   mapped space. LVM Thin-pool itself is the primary target for our
>   use case but the argument is that this operation maps well to
>   other interfaces which allow thinly provisioned units.

I think where you are trying to go here is badly mistaken.  With flash
(or hard drive SMR) there is no such thing as provisioning LBAs.  Every
write is out of place, and a one time space allocation does not help
you at all.  So fundamentally what you try to here just goes against
the actual physics of modern storage media.  While there are some
layers that keep up a pretence, trying to that an an exposed API
level is a really bad idea.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ