Date:	Mon, 14 Jun 2010 08:47:52 +1000
From:	Dave Chinner <david@...morbit.com>
To:	Ilia Mirkin <imirkin@...m.mit.edu>
Cc:	Roman Kononov <roman@...arylife.net>, xfs@....sgi.com,
	linux-kernel@...r.kernel.org
Subject: Re: WARNING in xfs_lwr.c, xfs_write()

On Sat, Jun 12, 2010 at 01:00:52AM -0400, Ilia Mirkin wrote:
> Sorry to pick up an old-ish thread, but I have a similar situation:
> 
> On Sun, May 23, 2010 at 9:19 PM, Dave Chinner <david@...morbit.com> wrote:
> > On Sun, May 23, 2010 at 09:23:44AM -0500, Roman Kononov wrote:
> >> On 2010-05-23, 20:18:56 +1000, Dave Chinner <david@...morbit.com> wrote:
> >> > Can you find out what the application is triggering this?
> 
> I noticed this happening with mysql and xtrabackup -- the latter opens
> up mysql's files while mysql is still running (and modifying its own
> files) and backs them up in a (hopefully) safe way.

That's not safe at all - there's no guarantee you'll end up with a
consistent database image doing backups like this. Have you ever
tried to restore and use one of these backups?

> mysql had been
> running on the machine without any such warnings for a while before we
> ran the backup, so I'm pretty sure that the backup is involved,
> although its process is never listed. Specifically the warning is:
> 
> [2584257.839386] ------------[ cut here ]------------
> [2584257.839395] WARNING: at fs/xfs/linux-2.6/xfs_lrw.c:651 xfs_write+0x3dc/0x784()
> [2584257.839398] Hardware name: PowerEdge R710
> [2584257.839399] Modules linked in: nfsd cifs iTCO_wdt iTCO_vendor_support
> [2584257.839406] Pid: 7761, comm: mysqld Not tainted 2.6.33-gentoo-r2 #1
> [2584257.839407] Call Trace:
> [2584257.839411]  [<ffffffff8120da46>] ? xfs_write+0x3dc/0x784
> [2584257.839415]  [<ffffffff81038733>] warn_slowpath_common+0x77/0xa4
> [2584257.839417]  [<ffffffff8103876f>] warn_slowpath_null+0xf/0x11
> [2584257.839419]  [<ffffffff8120da46>] xfs_write+0x3dc/0x784
> [2584257.839424]  [<ffffffff810033ce>] ? apic_timer_interrupt+0xe/0x20
> [2584257.839427]  [<ffffffff8120a51a>] xfs_file_aio_write+0x5a/0x5c
> [2584257.839430]  [<ffffffff810d7cbe>] do_sync_write+0xc0/0x106
> [2584257.839435]  [<ffffffff810ff862>] ? __fsnotify_parent+0xc7/0xd3
> [2584257.839437]  [<ffffffff810d8624>] vfs_write+0xab/0x105
> [2584257.839439]  [<ffffffff810d86da>] sys_pwrite64+0x5c/0x7d
> [2584257.839442]  [<ffffffff81002a6b>] system_call_fastpath+0x16/0x1b
> [2584257.839444] ---[ end trace 8b0c2a6e5e86745f ]---
> 
> > Yes, it should be safe, but the kernel code can't know whether this
> > is true or not - there are no specific interlocks with direct IO to
> > prevent concurrent buffered IO to the same region while a direct IO
> > is in progress. XFS makes best-effort attempts to maintain coherency
> > but does not provide any guarantees, hence the warning when known race
> > conditions are tripped.
> 
> Would it be safe to remove the warning at
> fs/xfs/linux-2.6/xfs_lrw.c:651 (which looks like it has moved to
> xfs_file.c in 2.6.34)? It seems undesirable to get a long stream of
> these (51 in this particular instance) every time we run a backup...

You can if you want, but then you won't know when your backup or
database might have been corrupted, right?
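
If the volume of log noise is the real concern, one option is to tally
the warnings instead of patching them out. A rough sketch in Python,
assuming the kernel log text has the same shape as the trace quoted
above; the name count_warnings and the regular expression are purely
illustrative and not part of any existing tool:

```python
import re
from collections import Counter

# Matches the "WARNING: at <file>:<line>" header of a kernel warning.
# Assumption: the header format matches the 2.6.33 trace quoted in this
# thread; other kernel versions may print it differently.
WARN_RE = re.compile(r"WARNING: at (\S+):(\d+)")


def count_warnings(log_text):
    """Return a Counter mapping (source file, line) to occurrence count."""
    counts = Counter()
    for match in WARN_RE.finditer(log_text):
        counts[(match.group(1), int(match.group(2)))] += 1
    return counts
```

Feeding it the saved output of dmesg after a backup run gives one count
per warning site, which is easier to watch over time than the raw
traces.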

> IOW, is the warning purely something along the lines of "Userspace is
> doing something wonky, but the underlying FS will still be fine no
> matter what" kind of deal, or could there be an actual problem with
> the XFS metadata itself?

Nothing will go wrong with the filesystem metadata - as I said
earlier in the thread, this is a warning to tell us that data
corruption is possible due to userspace doing something stupid, not
a filesystem bug.
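
To see which side of the mix is doing direct IO, the status flags of an
open descriptor can be inspected. A minimal sketch in Python, assuming
a Linux system; uses_direct_io is a hypothetical helper name, and it
only reports on descriptors owned by the calling process:

```python
import fcntl
import os

# os.O_DIRECT is only defined on platforms that support it (e.g. Linux).
# Assumption: where it is missing, treat direct IO as never in use.
O_DIRECT = getattr(os, "O_DIRECT", 0)


def uses_direct_io(fd):
    """Return True if the descriptor's status flags include O_DIRECT."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    return bool(O_DIRECT and (flags & O_DIRECT))
```

A descriptor opened without O_DIRECT goes through the page cache, so
finding both kinds open on the same file at the same time is the
situation the warning is about.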

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com