Date:	Tue, 21 Jul 2015 08:36:10 +1000
From:	Dave Chinner <david@...morbit.com>
To:	Mike Snitzer <snitzer@...hat.com>
Cc:	axboe@...nel.dk, hch@....de, sandeen@...hat.com,
	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
	dm-devel@...hat.com, xfs@....sgi.com
Subject: Re: [RFC PATCH] block: xfs: dm thin: train XFS to give up on
 retrying IO if thinp is out of space

On Mon, Jul 20, 2015 at 11:18:49AM -0400, Mike Snitzer wrote:
> If XFS fails to write metadata it will retry the write indefinitely
> (with the hope that the write will succeed at some point in the future).
> 
> Others can possibly speak to historic reason(s) why this is a sane
> default for XFS.  But when XFS is deployed ontop of DM thin provisioning
> this infinite retry is very unwelcome -- especially if DM thinp was
> configured to be automatically extended with free space but the admin
> hasn't provided (or restored) adequate free space.
> 
> To fix this infinite retry a new bdev_has_space () hook is added to XFS
> to break out of its metadata retry loop if the underlying block device
> reports it no longer has free space.  DM thin provisioning is now
> trained to respond accordingly, which enables XFS to not cause a cascade
> of tasks blocked on IO waiting for XFS's infinite retry.
> 
> All other block devices, which don't implement a .has_space method in
> block_device_operations, will always return true for bdev_has_space().
> 
> With this change XFS will fail the metadata IO, force shutdown, and the
> XFS filesystem may be unmounted.  This enables an admin to recover from
> their oversight, of not having provided enough free space, without
> having to force a hard reset of the system to get XFS to unwedge.
> 
> Signed-off-by: Mike Snitzer <snitzer@...hat.com>

Shouldn't dm-thinp just return the bio with ENOSPC as its error?
The scsi layers already do this for hardware thinp ENOSPC failures,
so dm-thinp should behave exactly the same (i.e. via
__scsi_error_from_host_byte()). The behaviour of the filesystem
should be the same in all cases - making it conditional on whether
the thinp implementation can be polled for available space is wrong
as most hardware thinp can't be polled by the kernel for this info.

If dm-thinp just returns ENOSPC on the bio like other hardware
thinp devices, then it is up to the filesystem to handle that
appropriately.  i.e. whether an ENOSPC IO error is fatal to the
filesystem is determined by filesystem configuration and context of
the IO error, not whether the block device has no space (which we
should already know from the ENOSPC error delivered by IO
completion).
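
As a rough userspace sketch of that split (not actual XFS or dm-thin
code), the decision could look something like the following. Every name
here (fs_error_policy, io_context, handle_write_error) is hypothetical
and made up for illustration, and the sketch uses positive errno values
where the kernel would pass negative ones. The point is only that the
decision takes the completion error, the IO context and the filesystem
configuration as inputs, and never queries the block device for free
space.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical: what kind of write the failed IO was. */
enum io_context { IO_DATA, IO_METADATA };

/* Hypothetical per-filesystem configuration, e.g. chosen by the admin. */
struct fs_error_policy {
	bool enospc_is_fatal;	/* treat ENOSPC on metadata as fatal? */
	int  max_retries;	/* metadata retry limit; < 0 means retry forever */
};

/* What the filesystem does with the failed write. */
enum io_action { IO_RETRY, IO_FAIL, IO_SHUTDOWN };

/*
 * Decide how to react to a write error delivered by IO completion.
 * Note there is no "does the device have space?" query: the ENOSPC
 * error itself already carries that information.
 */
static enum io_action
handle_write_error(const struct fs_error_policy *policy,
		   enum io_context ctx, int error, int retries_so_far)
{
	/* Data writes: report the error to the caller, no retry loop. */
	if (ctx == IO_DATA)
		return IO_FAIL;

	/* Metadata ENOSPC: fatal only if the configuration says so. */
	if (error == ENOSPC && policy->enospc_is_fatal)
		return IO_SHUTDOWN;

	/* Otherwise retry until the configured limit, then give up. */
	if (policy->max_retries < 0 || retries_so_far < policy->max_retries)
		return IO_RETRY;

	return IO_SHUTDOWN;
}
```

With a policy of { .enospc_is_fatal = true, .max_retries = -1 }, a
metadata write failing with ENOSPC shuts the filesystem down at once,
while the same write failing with EIO keeps the retry-forever behaviour.
The same code path applies whether the ENOSPC came from dm-thinp or from
hardware thinp via the scsi layers.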

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com