linux-kernel - Re: [RFC PATCH] block: xfs: dm thin: train XFS to give up on retrying IO if thinp is out of space

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <55AE6670.40903@redhat.com>
Date:	Tue, 21 Jul 2015 10:34:08 -0500
From:	Eric Sandeen <sandeen@...hat.com>
To:	Dave Chinner <david@...morbit.com>,
	Mike Snitzer <snitzer@...hat.com>
CC:	axboe@...nel.dk, hch@....de, linux-kernel@...r.kernel.org,
	linux-fsdevel@...r.kernel.org, dm-devel@...hat.com, xfs@....sgi.com
Subject: Re: [RFC PATCH] block: xfs: dm thin: train XFS to give up on retrying
 IO if thinp is out of space

On 7/20/15 5:36 PM, Dave Chinner wrote:
> On Mon, Jul 20, 2015 at 11:18:49AM -0400, Mike Snitzer wrote:
>> If XFS fails to write metadata it will retry the write indefinitely
>> (with the hope that the write will succeed at some point in the future).
>>
>> Others can possibly speak to historic reason(s) why this is a sane
>> default for XFS.  But when XFS is deployed ontop of DM thin provisioning
>> this infinite retry is very unwelcome -- especially if DM thinp was
>> configured to be automatically extended with free space but the admin
>> hasn't provided (or restored) adequate free space.
>>
>> To fix this infinite retry a new bdev_has_space () hook is added to XFS
>> to break out of its metadata retry loop if the underlying block device
>> reports it no longer has free space.  DM thin provisioning is now
>> trained to respond accordingly, which enables XFS to not cause a cascade
>> of tasks blocked on IO waiting for XFS's infinite retry.
>>
>> All other block devices, which don't implement a .has_space method in
>> block_device_operations, will always return true for bdev_has_space().
>>
>> With this change XFS will fail the metadata IO, force shutdown, and the
>> XFS filesystem may be unmounted.  This enables an admin to recover from
>> their oversight, of not having provided enough free space, without
>> having to force a hard reset of the system to get XFS to unwedge.
>>
>> Signed-off-by: Mike Snitzer <snitzer@...hat.com>
> 
> Shouldn't dm-thinp just return the bio with ENOSPC as it's error?
> The scsi layers already do this for hardware thinp ENOSPC failures,
> so dm-thinp should behave exactly the same (i.e. via
> __scsi_error_from_host_byte()). The behaviour of the filesystem
> should be the same in all cases - making it conditional on whether
> the thinp implementation can be polled for available space is wrong
> as most hardware thinp can't be polled by the kernel forthis info..
> 
> 
> If dm-thinp just returns ENOSPC from on the BIO like other hardware
> thinp devices, then it is up to the filesystem to handle that
> appropriately.  i.e. whether an ENOSPC IO error is fatal to the
> filesystem is determined by filesystem configuration and context of
> the IO error, not whether the block device has no space (which we
> should already know from the ENOSPC error delivered by IO
> completion).

The issue we had discussed previously is that there is no agreement
across block devices about whether ENOSPC is a permanent or temporary
condition.  Asking the admin to tune  the fs to each block device's
behavior sucks, IMHO.

This interface could at least be defined to reflect a permanent and
unambiguous state...

-Eric

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/