linux-kernel - Re: [PATCH v7 0/6] solve deadlock caused by memory allocation with I/O


Message-ID: <CACVXFVOipr0VMyPQaZTLckxTaPan7ZneERUqZ1S_mYo11A5AeA@mail.gmail.com>
Date:	Thu, 17 Jan 2013 09:28:14 +0800
From:	Ming Lei <ming.lei@...onical.com>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	linux-kernel@...r.kernel.org,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	linux-usb@...r.kernel.org, linux-pm@...r.kernel.org,
	linux-mm@...ck.org, Alan Stern <stern@...land.harvard.edu>,
	Oliver Neukum <oneukum@...e.de>,
	Minchan Kim <minchan@...nel.org>,
	"Rafael J. Wysocki" <rjw@...k.pl>, Jens Axboe <axboe@...nel.dk>,
	"David S. Miller" <davem@...emloft.net>
Subject: Re: [PATCH v7 0/6] solve deadlock caused by memory allocation with I/O

On Thu, Jan 17, 2013 at 7:37 AM, Andrew Morton
<akpm@...ux-foundation.org> wrote:
> On Sat,  5 Jan 2013 10:25:38 +0800
> Ming Lei <ming.lei@...onical.com> wrote:
>
>> This patchset tries to solve a deadlock problem which might be caused
>> by memory allocation with block I/O during runtime PM and the block
>> device error handling path. Traditionally, the problem is addressed by
>> passing GFP_NOIO statically to mm, but that is not an effective
>> solution; see the detailed description in patch 1's commit log.
>>
>> This patch set introduces one process flag and tries to fix the deadlock
>> problem on block device/network device during runtime PM or usb bus reset.
>
> The patchset doesn't look like the worst thing I've ever applied ;)
>
> One thing I'm wondering: during suspend and resume, why are GFP_KERNEL
> allocation attempts even getting down to the device layer?  Presumably
> the page scanner is encountering dirty pagecache or dirty swapcache
> pages?
>
> If so, I wonder if we could avoid the whole problem by appropriately
> syncing all dirty memory back to storage before starting to turn devices
> off?

The patchset addresses a possible deadlock caused by GFP_KERNEL
allocations during runtime suspend/resume, which is done per
block/network device. I am not sure that syncing all dirty memory is
suitable or necessary during per-storage/network device runtime
resume/suspend:

      - sys_sync is very slow, and runtime PM operations are frequent

      - it is not efficient: in theory only the dirty memory backed by
        the affected device needs to be synced, not all of it

      - we would still need some synchronization to prevent access to
        the storage between sys_sync and device suspend. The system
        sleep case is the same: pm_restrict_gfp_mask is still needed
        even though sys_sync has already been done inside enter_state().
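
To illustrate the last point, here is a rough userspace sketch of the
system sleep idea: a global mask strips the I/O bit from every request
while devices are going down, whether or not a sync ran first. All
demo_* names and bit values are made up for the sketch and are not the
kernel's definitions.

```c
#include <assert.h>

/* Illustrative allocation bits; the values are NOT the kernel's. */
#define DEMO_GFP_WAIT   0x1u  /* caller may sleep */
#define DEMO_GFP_IO     0x2u  /* allocation may start block I/O */
#define DEMO_GFP_KERNEL (DEMO_GFP_WAIT | DEMO_GFP_IO)

/* Global mask applied to every allocation request. */
static unsigned int demo_allowed_mask = DEMO_GFP_KERNEL;
static unsigned int demo_saved_mask;

/* Before devices are suspended: forbid I/O for all callers. */
static void demo_restrict_gfp_mask(void)
{
    demo_saved_mask = demo_allowed_mask;
    demo_allowed_mask &= ~DEMO_GFP_IO;
}

/* After devices are resumed: allow I/O again. */
static void demo_restore_gfp_mask(void)
{
    demo_allowed_mask = demo_saved_mask;
}

/* What the allocator is actually allowed to do for a request. */
static unsigned int demo_allowed(unsigned int requested)
{
    return requested & demo_allowed_mask;
}

/* Sleep sequence: the restriction is still needed after the sync,
 * because new dirty pages can appear before the devices are off. */
static void demo_sleep_sequence(void)
{
    /* (a sync of dirty memory would happen here) */
    demo_restrict_gfp_mask();
    assert((demo_allowed(DEMO_GFP_KERNEL) & DEMO_GFP_IO) == 0);
    /* (devices are suspended and later resumed here) */
    demo_restore_gfp_mask();
    assert(demo_allowed(DEMO_GFP_KERNEL) == DEMO_GFP_KERNEL);
}
```

A global mask like this is acceptable for system sleep because the
whole system is going down anyway. For runtime PM of one device it
would penalize every other allocation in the system.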

So it looks like the approach in the patchset is simpler and more
efficient. :-)

Also, with the patchset, we can avoid many GFP_NOIO allocations,
which are fragile and not easy to use.
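
As a rough illustration of why a per-process flag is easier to use than
passing GFP_NOIO by hand, here is a minimal userspace sketch. The demo_*
names, the struct and the bit values are hypothetical stand-ins chosen
for this sketch; they are not the identifiers used in the patches.

```c
#include <assert.h>

/* Illustrative allocation bits; the values are NOT the kernel's. */
#define DEMO_GFP_WAIT   0x1u  /* caller may sleep */
#define DEMO_GFP_IO     0x2u  /* allocation may start block I/O */
#define DEMO_GFP_KERNEL (DEMO_GFP_WAIT | DEMO_GFP_IO)

/* Hypothetical per-task flag meaning "no I/O in allocations". */
#define DEMO_PF_NOIO    0x1u

struct demo_task {
    unsigned int flags;
};

/* Set the flag and return its previous state so calls can nest. */
static unsigned int demo_noio_save(struct demo_task *t)
{
    unsigned int old = t->flags & DEMO_PF_NOIO;

    t->flags |= DEMO_PF_NOIO;
    return old;
}

/* Put the flag back to the state returned by demo_noio_save(). */
static void demo_noio_restore(struct demo_task *t, unsigned int old)
{
    assert((old & ~DEMO_PF_NOIO) == 0);
    t->flags = (t->flags & ~DEMO_PF_NOIO) | old;
}

/* The allocator drops the I/O bit when the current task is flagged. */
static unsigned int demo_effective_gfp(const struct demo_task *t,
                                       unsigned int requested)
{
    if (t->flags & DEMO_PF_NOIO)
        requested &= ~DEMO_GFP_IO;
    return requested;
}

/* A helper deep in some call chain that always asks for
 * DEMO_GFP_KERNEL and knows nothing about runtime PM. */
static unsigned int demo_helper_gfp(const struct demo_task *t)
{
    return demo_effective_gfp(t, DEMO_GFP_KERNEL);
}
```

The point of the sketch is that demo_helper_gfp() does not have to be
changed: the resume or reset path brackets its work with the save and
restore calls, and every allocation made underneath is covered. With
static GFP_NOIO, each allocation site in the call chain has to be found
and converted.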

Thanks,
--
Ming Lei