linux-kernel - Re: [PATCH v3] libata, freezer: avoid block device removal while system is frozen

Message-ID: <4198915.dARHSKzc5x@vostro.rjw.lan>
Date:	Wed, 18 Dec 2013 23:08:41 +0100
From:	"Rafael J. Wysocki" <rjw@...ysocki.net>
To:	Tejun Heo <tj@...nel.org>
Cc:	"Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
	Jens Axboe <axboe@...nel.dk>, tomaz.solc@...lix.org,
	aaron.lu@...el.com, linux-kernel@...r.kernel.org,
	Oleg Nesterov <oleg@...hat.com>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Fengguang Wu <fengguang.wu@...el.com>
Subject: Re: [PATCH v3] libata, freezer: avoid block device removal while system is frozen

On Wednesday, December 18, 2013 07:07:32 AM Tejun Heo wrote:
> Freezable kthreads and workqueues are fundamentally problematic in
> that they effectively introduce a big kernel lock widely used in the
> kernel and have already been the culprit of several deadlock
> scenarios.  This is the latest occurrence.
> 
> During resume, libata rescans all the ports and revalidates all
> pre-existing devices.  If it determines that a device has gone
> missing, the device is removed from the system which involves
> invalidating block device and flushing bdi while holding driver core
> layer locks.  Unfortunately, this can race with the rest of device
> resume.  Because freezable kthreads and workqueues are thawed after
> device resume is complete and block device removal depends on
> freezable workqueues and kthreads (e.g. bdi_wq, jbd2) to make
> progress, this can lead to deadlock - block device removal can't
> proceed because kthreads are frozen and kthreads can't be thawed
> because device resume is blocked behind block device removal.
> 
> 839a8e8660b6 ("writeback: replace custom worker pool implementation
> with unbound workqueue") made this particular deadlock scenario more
> visible but the underlying problem has always been there - the
> original forker task and jbd2 are freezable too.  In fact, this is
> highly likely just one of many possible deadlock scenarios given that
> freezer behaves as a big kernel lock and we don't have any debug
> mechanism around it.
> 
> I believe the right thing to do is getting rid of freezable kthreads
> and workqueues.

I agree.  It may be useful to block them over suspend/resume, but that doesn't
have to be done through the freezer.

> This is something fundamentally broken.  For now,
> implement a funny workaround in libata - just avoid doing block device
> hot[un]plug while the system is frozen.  Kernel engineering at its
> finest.  :(
> 
> v2: Add EXPORT_SYMBOL_GPL(pm_freezing) for cases where libata is built
>     as a module.
> 
> v3: Comment updated and polling interval changed to 10ms as suggested
>     by Rafael.

This one is fine by me, FWIW.

Thanks!

> Signed-off-by: Tejun Heo <tj@...nel.org>
> Reported-by: Tomaž Šolc <tomaz.solc@...lix.org>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=62801
> Link: http://lkml.kernel.org/r/20131213174932.GA27070@htj.dyndns.org
> Cc: "Rafael J. Wysocki" <rjw@...ysocki.net>
> Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
> Cc: Len Brown <len.brown@...el.com>
> Cc: Oleg Nesterov <oleg@...hat.com>
> Cc: stable@...r.kernel.org
> ---
>  drivers/ata/libata-scsi.c |   19 +++++++++++++++++++
>  kernel/freezer.c          |    6 ++++++
>  2 files changed, 25 insertions(+)
> 
> diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
> index db6dfcf..f519868 100644
> --- a/drivers/ata/libata-scsi.c
> +++ b/drivers/ata/libata-scsi.c
> @@ -3871,6 +3871,25 @@ void ata_scsi_hotplug(struct work_struct *work)
>  		return;
>  	}
>  
> +	/*
> +	 * XXX - UGLY HACK
> +	 *
> +	 * The block layer suspend/resume path is fundamentally broken due
> +	 * to freezable kthreads and workqueue and may deadlock if a block
> +	 * device gets removed while resume is in progress.  I don't know
> +	 * what the solution is short of removing freezable kthreads and
> +	 * workqueues altogether.
> +	 *
> +	 * The following is an ugly hack to avoid kicking off device
> +	 * removal while freezer is active.  This is a joke but does avoid
> +	 * this particular deadlock scenario.
> +	 *
> +	 * https://bugzilla.kernel.org/show_bug.cgi?id=62801
> +	 * http://marc.info/?l=linux-kernel&m=138695698516487
> +	 */
> +	while (pm_freezing)
> +		msleep(10);
> +
>  	DPRINTK("ENTER\n");
>  	mutex_lock(&ap->scsi_scan_mutex);
>  
> diff --git a/kernel/freezer.c b/kernel/freezer.c
> index b462fa1..aa6a8aa 100644
> --- a/kernel/freezer.c
> +++ b/kernel/freezer.c
> @@ -19,6 +19,12 @@ EXPORT_SYMBOL(system_freezing_cnt);
>  bool pm_freezing;
>  bool pm_nosig_freezing;
>  
> +/*
> + * Temporary export for the deadlock workaround in ata_scsi_hotplug().
> + * Remove once the hack becomes unnecessary.
> + */
> +EXPORT_SYMBOL_GPL(pm_freezing);
> +
>  /* protects freezing and frozen transitions */
>  static DEFINE_SPINLOCK(freezer_lock);
>  
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
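
For illustration only, the polling gate that the patch adds to
ata_scsi_hotplug() can be mimicked in userspace.  This is a rough sketch,
not kernel code: freezing_flag stands in for pm_freezing, sleep_ms() for
msleep(), and thaw_thread() for the part of resume that clears the flag.
All of these names, plus wait_until_thawed() and run_gate_demo(), are
hypothetical, and the use of pthreads and C11 atomics is an assumption of
the sketch (the patch itself reads a plain bool).

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical userspace stand-in for the kernel's pm_freezing flag. */
static atomic_bool freezing_flag = true;

/* Sleep for roughly ms milliseconds; stand-in for the kernel's msleep(). */
static void sleep_ms(long ms)
{
	struct timespec ts = {
		.tv_sec = ms / 1000,
		.tv_nsec = (ms % 1000) * 1000000L,
	};

	nanosleep(&ts, NULL);
}

/* Plays the role of the resume path: clears the flag after a delay. */
static void *thaw_thread(void *arg)
{
	(void)arg;
	sleep_ms(50);
	atomic_store(&freezing_flag, false);
	return NULL;
}

/*
 * Plays the role of the hotplug work item: poll every 10 ms until the
 * flag clears, then let the caller proceed.  Returns how many times
 * the loop slept.
 */
static int wait_until_thawed(void)
{
	int polls = 0;

	while (atomic_load(&freezing_flag)) {
		sleep_ms(10);
		polls++;
	}
	return polls;
}

/* Runs one freeze/thaw cycle and returns the number of polls. */
int run_gate_demo(void)
{
	pthread_t thawer;
	int rc, polls;

	atomic_store(&freezing_flag, true);
	rc = pthread_create(&thawer, NULL, thaw_thread, NULL);
	assert(rc == 0);
	(void)rc;

	polls = wait_until_thawed();
	pthread_join(thawer, NULL);
	return polls;
}
```

The waiter holds no lock while it polls, which is why the gate sits ahead
of mutex_lock(&ap->scsi_scan_mutex) in the patch: the work item only
delays its own start and does not block whoever clears the flag.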
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.