Message-ID: <2103902716.11642129.1359619270232.JavaMail.root@redhat.com>
Date: Thu, 31 Jan 2013 03:01:10 -0500 (EST)
From: CAI Qian <caiqian@...hat.com>
To: Dave Chinner <david@...morbit.com>
Cc: xfs@....sgi.com, linux-xfs@...r.kernel.org,
linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: 3.8-rc5 xfs corruption
----- Original Message -----
> From: "Dave Chinner" <david@...morbit.com>
> To: "CAI Qian" <caiqian@...hat.com>
> Cc: xfs@....sgi.com, linux-xfs@...r.kernel.org, "linux-kernel" <linux-kernel@...r.kernel.org>
> Sent: Thursday, January 31, 2013 12:07:48 PM
> Subject: Re: 3.8-rc5 xfs corruption
>
> On Wed, Jan 30, 2013 at 10:16:47PM -0500, CAI Qian wrote:
> > Hello,
> >
> > (Sorry to post to the xfs mailing lists, but I am unsure which one
> > is best for this.)
>
> Trimmed to just xfs@....sgi.com.
Thanks for the quick response, Dave.
>
> > I have seen something like this once during testing on a system
> > with an EMC VNX FC/multipath back-end.
>
> This is a trace from the verifier code that was added in 3.8-rc1 so
> I doubt it has anything to do with any problem you've seen in the
> past....
>
> Can you tell us what workload you were running and what hardware you
> are using as per:
>
> http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
This was the system:
- AMD Opteron(tm) Processor 4130 (1 socket, 4 cores)
- PowerEdge R415
- 8 GB memory
- mptsas local disks
Software version:
- xfsprogs-3.1.10
The workload ran fs_mark, syscall tests, some nfs/cifs connectathon
tests, memory and libhugetlbfs tests, and some dynamic debug
(Documentation/dynamic-debug-howto.txt) tests.
>
> As it is, if you mounted the filesystem after this problem was
> detected, log recovery probably propagated it to disk. I'd suggest
> that you run xfs_repair -n on the device and post the output so we
> can see if any corruption has actually made it to disk. If no
> corruption made it to disk, it's possible that we've got the
> incorrect verifier attached to the buffer.
I no longer have access to the system, so I can only reserve it again
later if needed.
Regards,
CAI Qian
>
> > [ 3025.063024] ffff8801a0d50000: 2e 2e 2f 2e 2e 2f 75 73 72 2f 6c 69 62 2f 6d 6f ../../usr/lib/mo
>
> The start of the block contains a path, and the only type of block
> that can contain this format of metadata is a remote symlink block.
> Remote symlink blocks don't have a verifier attached to them as
> there is nothing that can currently be used to verify them as
> correct.
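
For readers following the dump line quoted above, here is a minimal
Python sketch of how its hex bytes map to the path text in the ASCII
column. It is only an illustration: the function name decode_dump_line
and the assumption of 16 bytes per dump line are mine, not taken from
the kernel source.

```python
def decode_dump_line(line: str, width: int = 16) -> str:
    """Decode the hex bytes of one dump line into printable ASCII.

    Assumes the layout "<address>: <hex bytes> <ascii column>" with
    `width` bytes per line (16 here; an assumption for illustration).
    """
    # Drop everything up to and including the "address: " prefix.
    _, payload = line.split(": ", 1)
    # Keep only the hex byte tokens and ignore the trailing ASCII column.
    hex_tokens = payload.split()[:width]
    data = bytes.fromhex(" ".join(hex_tokens))
    # Show non-printable bytes as "." like a typical hex dump does.
    return "".join(chr(b) if 32 <= b < 127 else "." for b in data)


line = ("ffff8801a0d50000: 2e 2e 2f 2e 2e 2f 75 73 72 2f 6c "
        "69 62 2f 6d 6f ../../usr/lib/mo")
print(decode_dump_line(line))
```

The decoded bytes form the beginning of a relative path, which matches
the observation that the block starts with symlink target data rather
than a structured metadata header.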
>
> I can't see exactly how this can occur as stale buffers have the
> verifier ops cleared before being returned to the new user, and
> newly allocated xfs_bufs are zeroed before being initialised. I
> really need to know what you are doing to be able to get to the
> bottom of it....
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@...morbit.com
>