Message-ID: <20200227162843.n2qjuka2rjc44qcv@matt-gen-desktop-p01.matt.pallissard.net>
Date: Thu, 27 Feb 2020 08:28:43 -0800
From: "Pallissard, Matthew" <matt@...lissard.net>
To: linux-kernel@...r.kernel.org
Subject: possible nfsv3 write corruption
Forgive me if this is the wrong list.
OK, I have this super infrequent data corruption on write that seems to be limited to NFSv3 async mounts. I have not tested NFSv4 yet. I _think_ I've narrowed it down to kernels in the range 5.1.4 <= X < 5.5.0 (maybe earlier). Some users reported random data corruption; a bit of testing shows that it's reproducible and the corruption is nearly identical every time.
I'd like to get to the bottom of this so I can guarantee that a kernel upgrade will resolve the issue.
What happens is that, every several hundred GiB [ish], the first half of a 64-bit segment ends up corrupted. Here is some example output from a test. My test writes a few GiB, alternating between 64 bits of `0`'s and 64 bits of `1`'s, then reads the file back in and checks the contents. Re-reading the file shows that it's corrupted on write, not read.
> 2020-02-14 11:04:34 crit found mis-match on word segment 11911168 / 33554432!
> 2020-02-14 11:04:34 crit found mis-match on byte 7, 188 != 255
> 2020-02-14 11:04:34 crit found mis-match on byte 6, 0 != 255
> 2020-02-14 11:04:34 crit found mis-match on byte 5, 16 != 255
> 2020-02-14 11:04:34 crit found mis-match on byte 4, 128 != 255
> 2020-02-14 11:04:34 crit 1011110000000000000100001000000011111111111111111111111111111111
> 2020-02-14 13:38:11 crit found mis-match on word segment 1982464 / 33554432!
> 2020-02-14 13:38:11 crit found mis-match on byte 7, 188 != 255
> 2020-02-14 13:38:11 crit found mis-match on byte 6, 0 != 255
> 2020-02-14 13:38:11 crit found mis-match on byte 5, 16 != 255
> 2020-02-14 13:38:11 crit found mis-match on byte 4, 128 != 255
> 2020-02-14 13:38:11 crit 1011110000000000000100001000000011111111111111111111111111111111
Knowns:
* does not appear to happen on the CentOS/EL 3.10 series kernel
* does not appear to happen on a 5.5 series kernel
  * I'm re-running all my tests now to confirm this.
* not hardware dependent
* not processor dependent
  * I tested 3 different Intel processors
* appears to only happen on NFSv3 async mounts
  * local disk and `-o sync` NFSv3 mounts have been tested
* it happens on random 64-bit segments
* it's *always* the same 4 bytes that are corrupted
  * while often identical, the corrupted bytes are not always identical
  * the identical corruption pattern can appear on separate computers
* it's *always* on words that are written with `1`'s <- this is the part I find most interesting
* whether or not I explicitly call `fflush` and `sync` has no effect on the results
* usually takes ~80-2000 GiB to reproduce, sometimes more or less, but it's infrequent
  * I've been writing 2 GiB files
  * sometimes I never hit the corruption case
  * I've yet to see more than one corrupted segment in a file
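For clarity, the two NFSv3 mount variants I compared look roughly like this (server and export names are hypothetical):

```shell
# NFSv3 async mount (async is the default) - this is where corruption shows up
mount -t nfs -o vers=3,async nfsserver:/export/home /mnt/home

# NFSv3 sync mount - no corruption observed here
mount -t nfs -o vers=3,sync nfsserver:/export/home /mnt/home
```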
A little bit about the build/run environments:

The software:
CentOS 7
CentOS glibc 2.17
clang 9 / lld

The hardware:
Dell PowerEdge R620
Dell PowerEdge C6320
Dell PowerEdge C6420
Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz
Intel(R) Xeon(R) CPU E5-2660 v4 @ 2.00GHz
Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
* I did compile locally on every box. I also tested every compiled binary on every box. It didn't seem to affect the results.
* I don't have a tcpdump of this yet. I'm hoping to get that started before the end of the week.
* I read and write to the same file every time, unlinking it before writing again
* I have not tried dropping the cache between any of the steps.
* I have engaged our storage vendor to see what they have to say. They're pretty good at getting useful metrics and insight so if there is anything I should have them gather server-side please let me know.
If anyone has any insight, or additional testing I can perform, I would *greatly* appreciate it. I would be thrilled if this turned out to be some dumb configuration option or other operational thing performed incorrectly.
Thank you for your time.
Matt Pallissard