[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <200611202318.29207.rjw@sisk.pl>
Date: Mon, 20 Nov 2006 23:18:27 +0100
From: "Rafael J. Wysocki" <rjw@...k.pl>
To: nigel@...pend2.net
Cc: David Chinner <dgc@....com>, Andrew Morton <akpm@...l.org>,
LKML <linux-kernel@...r.kernel.org>, Pavel Machek <pavel@....cz>
Subject: Re: [PATCH -mm 0/2] Use freezeable workqueues to avoid suspend-related XFS corruptions
Hi,
On Monday, 20 November 2006 23:03, Nigel Cunningham wrote:
> Hi.
>
> On Mon, 2006-11-20 at 21:40 +0100, Rafael J. Wysocki wrote:
> > On Monday, 20 November 2006 12:02, Nigel Cunningham wrote:
> > > If I put a BUG_ON() in submit_bio for non suspend2 I/O, it catches this
> > > trace:
> > >
> > > submit_bio
> > > xfs_buf_iorequest
> > > xlog_bdstrat_cb
> > > xlog_state_release_iclog
> > > xlog_state_sync_all
> > > xfs_log_force
> > > xfs_syncsub
> > > xfs_sync
> > > vfs_sync
> > > vfs_sync_worker
> > > xfssyncd
> > > keventd_create_kthread
> >
> > Well, this trace apparently comes from xfssyncd wich explicitly calls
> > try_to_freeze(). When does this happen?
>
> I'm sorry. Since sending the email I looked at this more, deciding in
> the end that I had a false positive. I was resetting the flag that
> enabled the check after thawing kernel threads instead of before.
>
> I've looked back in logs to this time last year, when we were doing the
> modifications that lead to using bdev freezing. This is one instance of
> the BUG_ON() triggering then, from a different path. I don't believe
> it's bogus :)
>
> Nov 12 11:43:06 home-nb kernel: CPU: 0
> Nov 12 11:43:06 home-nb kernel: EIP: 0060:[submit_bio+244/256] Not tainted VLI
> Nov 12 11:43:06 home-nb kernel: EIP: 0060:[<c025ff54>] Not tainted VLI
> Nov 12 11:43:06 home-nb kernel: EFLAGS: 00010246 (2.6.14d)
> Nov 12 11:43:06 home-nb kernel: EIP is at submit_bio+0xf4/0x100
> Nov 12 11:43:06 home-nb kernel: eax: 00000003 ebx: 00001000 ecx: 00000202 edx: 00000000
> Nov 12 11:43:06 home-nb kernel: esi: c63700c0 edi: 00000001 ebp: df86dec8 esp: df86de80
> Nov 12 11:43:07 home-nb kernel: ds: 007b es: 007b ss: 0068
> Nov 12 11:43:07 home-nb kernel: Process xfsbufd (pid: 780, threadinfo=df86c000 task=c17cca50)
> Nov 12 11:43:07 home-nb kernel: Stack: 0000001c 00000008 00000000 00000000 df86deb4 c016b942 dff45440 00000010
> Nov 12 11:43:08 home-nb kernel: 00001000 00001000 00000001 df86dec8 c016bee3 dfcb74f8 c63700c0 00001000
> Nov 12 11:43:08 home-nb kernel: 00000000 00000000 df86df14 e0b4dd52 00000001 c63700c0 00001000 00000000
> Nov 12 11:43:08 home-nb kernel: Call Trace:
> Nov 12 11:43:08 home-nb kernel: [show_stack+127/160] show_stack+0x7f/0xa0
> Nov 12 11:43:08 home-nb kernel: [<c010357f>] show_stack+0x7f/0xa0
> Nov 12 11:43:08 home-nb kernel: [show_registers+344/464] show_registers+0x158/0x1d0
> Nov 12 11:43:09 home-nb kernel: [<c0103718>] show_registers+0x158/0x1d0
> Nov 12 11:43:09 home-nb kernel: [die+204/368] die+0xcc/0x170
> Nov 12 11:43:09 home-nb kernel: [<c01038fc>] die+0xcc/0x170
> Nov 12 11:43:09 home-nb kernel: [do_trap+137/208] do_trap+0x89/0xd0
> Nov 12 11:43:09 home-nb kernel: [<c03138c9>] do_trap+0x89/0xd0
> Nov 12 11:43:09 home-nb kernel: [do_invalid_op+184/208] do_invalid_op+0xb8/0xd0
> Nov 12 11:43:10 home-nb kernel: [<c0103c48>] do_invalid_op+0xb8/0xd0
> Nov 12 11:43:10 home-nb kernel: [error_code+79/84] error_code+0x4f/0x54
> Nov 12 11:43:10 home-nb kernel: [<c010323b>] error_code+0x4f/0x54
> Nov 12 11:43:10 home-nb kernel: [pg0+544066898/1069077504] _pagebuf_ioapply+0x1a2/0x2f0 [xfs]
> Nov 12 11:43:10 home-nb kernel: [<e0b4dd52>] _pagebuf_ioapply+0x1a2/0x2f0 [xfs]
> Nov 12 11:43:10 home-nb kernel: [pg0+544067295/1069077504] pagebuf_iorequest+0x3f/0x180 [xfs]
> Nov 12 11:43:10 home-nb kernel: [<e0b4dedf>] pagebuf_iorequest+0x3f/0x180 [xfs]
> Nov 12 11:43:11 home-nb kernel: [pg0+544091130/1069077504] xfs_bdstrat_cb+0x3a/0x50 [xfs]
> Nov 12 11:43:11 home-nb kernel: [<e0b53bfa>] xfs_bdstrat_cb+0x3a/0x50 [xfs]
> Nov 12 11:43:11 home-nb kernel: [pg0+544069743/1069077504] xfsbufd+0x12f/0x1c0 [xfs]
> Nov 12 11:43:11 home-nb kernel: [<e0b4e86f>] xfsbufd+0x12f/0x1c0 [xfs]
> Nov 12 11:43:11 home-nb kernel: [kthread+182/256] kthread+0xb6/0x100
> Nov 12 11:43:11 home-nb kernel: [<c012c5c6>] kthread+0xb6/0x100
> Nov 12 11:43:11 home-nb kernel: [kernel_thread_helper+5/12] kernel_thread_helper+0x5/0xc
> Nov 12 11:43:12 home-nb kernel: [<c01013a9>] kernel_thread_helper+0x5/0xc
> Nov 12 11:43:12 home-nb kernel: Code: 24 0c 8b 81 9c 00 00 00 89 5c 24 04 c7 04 24 70 c7 34 c0 89 44 24 08 e8 1b 99 eb ff e9 63 ff ff ff f6 46 11 01 0f 85 66 ff ff ff <0f> 0b e9 5f ff ff ff 90 8d 74 26 00 55 89 e5 57 56 53 83 ec 2c
>
> Looking at the start of the xfsbufd function, this now supports
> freezing, so it shouldn't be an issue anymore, right?
Yes, it properly freezes itself so this kind of failure should not happen.
> > > So, it would appear that freezing kthreads without freezing bdevs should
> > > be a possible solution. It may however leave some I/O unsynced
> > > pre-resume and therefore result in possible dataloss if no resume
> > > occurs. I therefore wonder whether it's better to stick with bdev
> > > freezing
> >
> > It looks like xfs is the only filesystem that really implements bdev freezing,
> > so for other filesystems it's irrelevant. However, it may affect the dm, and
> > I'm not sure what happens if someone tries to use a swap file on a dm
> > device for the suspend _after_ we have frozen bdevs.
> >
> > > or create some variant wherein XFS is taught to fully flush
> > > pending writes and not create new I/O.
> >
> > I think we should prevent filesystems from submitting any new I/O after
> > processes have been frozen, this way or another.
>
> Yes. We should have some means of properly telling them to put things in
> a sane state (however you define sane). It seems to belong between
> freezing userspace and freezing kernelspace because we otherwise might
> have problems with processes blocking, waiting for I/O and never
> entering the refrigerator. But doing it there also creates problems with
> (*cough*) fuse. Gah!
>
> I would suggest that other filesystems be encouraged to implement bdev
> freezing.
I think I/O can only be submitted from the process context. Thus if we freeze
all (and I mean _all_) threads that are used by filesystems, including worker
threads, we should effectively prevent fs-related I/O from being submitted
after tasks have been frozen.
This can be done with the help of create_freezeable_workqueue() introduced in
this patch and I'd like to implement it (and there are only a few filesystems
that use work queues).
The freezing of bdevs might be a good solution if:
(1) we were sure it wouldn't interact with dm in a wrong way,
(2) _all_ of the filesystems implemented it.
For now, neither (1) nor (2) are satisfied and we need to know we're safe
_now_.
Greetings,
Rafael
--
You never change things by fighting the existing reality.
R. Buckminster Fuller
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists