Message-ID: <17967.8536.788076.851546@notabene.brown>
Date:	Wed, 25 Apr 2007 19:37:28 +1000
From:	Neil Brown <neilb@...e.de>
To:	Jens Axboe <jens.axboe@...cle.com>
Cc:	Brad Campbell <brad@...p.net.au>,
	Chuck Ebbert <cebbert@...hat.com>,
	lkml <linux-kernel@...r.kernel.org>
Subject: Re: [OOPS] 2.6.21-rc6-git5 in cfq_dispatch_insert

On Wednesday April 25, jens.axboe@...cle.com wrote:
> 
> That's pretty close to where I think the problem is (the front merging
> and cfq_reposition_rq_rb()). The issue with that is that you'd only get
> aliases for O_DIRECT and/or raw IO, and that doesn't seem to be the case
> here. Given that front merges are equally not very likely, I'd be
> surprised if something like that has ever happened.

Well, it certainly doesn't happen very often....
And I can imagine a filesystem genuinely wanting to read the same
block twice - maybe a block that contained packed tails of two
different files.
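
A rough, untested user-space sketch of the O_DIRECT duplicate-read
pattern Jens mentions (the /dev/md0 path and the 4k block size are just
placeholders, not taken from the report):

/* Two concurrent O_DIRECT reads of the same 4k block: the sort of I/O
 * that can leave two requests with identical sector keys in the
 * elevator.  Illustration only. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static int fd;

static void *read_block(void *arg)
{
	void *buf;

	/* O_DIRECT wants aligned buffers */
	if (posix_memalign(&buf, 4096, 4096))
		return NULL;
	if (pread(fd, buf, 4096, 0) < 0)	/* same offset from both threads */
		perror("pread");
	free(buf);
	return NULL;
}

int main(int argc, char **argv)
{
	pthread_t t1, t2;

	fd = open(argc > 1 ? argv[1] : "/dev/md0", O_RDONLY | O_DIRECT);
	if (fd < 0) {
		perror("open");
		return 1;
	}
	pthread_create(&t1, NULL, read_block, NULL);
	pthread_create(&t2, NULL, read_block, NULL);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	close(fd);
	return 0;
}

(build with gcc -pthread)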
> 
> BUT... That may explain why we are only seeing it on md. Would md
> ever be issuing such requests that trigger this condition?

Can someone remind me which raid level(s) was/were involved?

I think one was raid0 - that just passes requests on from the
filesystem, so md would only issue requests like that if the
filesystem did.
I guess it could happen with raid4/5/6.  A read request that is
properly aligned (and we do encourage proper alignment) will be passed
directly to the underlying device.  A concurrent write elsewhere could
require the same block to be read into the stripe-cache to enable a
parity calculation.  So you could get two reads at the same block
address.
Getting a front-merge would probably require two stripe-heads to be
processed in reverse-sector order, and raid5 tends to preserve the
order of incoming requests (though that isn't firmly enforced).
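
Very roughly, and ignoring the rotating parity layout, the two paths
end up at the same member-disk block like this (toy numbers, untested):

/* Toy raid5 mapping: a pass-through read and a stripe-cache parity
 * pre-read for a concurrent write land on the same member block.
 * The parity disk and rotating layouts are deliberately ignored. */
#include <stdio.h>

#define DATA_DISKS	3	/* 3 data + 1 parity, parity not modelled */
#define CHUNK_SECTORS	128	/* 64k chunks, 512-byte sectors */

struct target { int disk; long sector; };

/* Map an array sector onto a data disk and member sector. */
static struct target map_data(long array_sector)
{
	long chunk = array_sector / CHUNK_SECTORS;
	struct target t;

	t.disk = (int)(chunk % DATA_DISKS);
	t.sector = (chunk / DATA_DISKS) * CHUNK_SECTORS
		 + array_sector % CHUNK_SECTORS;
	return t;
}

int main(void)
{
	/* A properly aligned filesystem read of chunk 0 is passed straight
	 * through to the member device. */
	struct target fs_read = map_data(0);

	/* A concurrent write to chunk 1 is in the same stripe, so the
	 * stripe cache has to pre-read another data chunk of that stripe,
	 * chunk 0 again, to compute parity. */
	long write_sector = 1 * CHUNK_SECTORS;
	long stripe_start = (write_sector / (DATA_DISKS * CHUNK_SECTORS))
			  * DATA_DISKS * CHUNK_SECTORS;
	struct target preread = map_data(stripe_start);

	printf("fs read     : disk %d sector %ld\n", fs_read.disk, fs_read.sector);
	printf("parity read : disk %d sector %ld\n", preread.disk, preread.sector);
	if (fs_read.disk == preread.disk && fs_read.sector == preread.sector)
		printf("same block queued twice: an alias the elevator can see\n");
	return 0;
}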

raid1 is much like raid0 (with totally different code) in that the
request pattern seen by the underlying device matches the request
pattern generated by the filesystem.

If I read the debugging output correctly, the request which I
hypothesise was the subject of a front-merge is a 'sync' request.
raid5 does not generate sync requests to fill the stripe cache (maybe
it should?) so I really think it must have come directly from the
filesystem.

(just checked previous email for more detail of when it hits)
The fact that it hits degraded arrays more easily is interesting.
Maybe we try to read a block on the missing device and so schedule
reads for the other devices.  Then we try to read a block on a good
device and issue a request for exactly the same block that raid5 asked
for.  That still doesn't explain the 'sync' and the 'front merge'.
But that is quite possible, just maybe not common.
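
Again with toy numbers, and ignoring the real layout, the degraded
case would look roughly like this (untested sketch):

/* Degraded raid5: a read that maps to the failed member forces reads
 * of the matching chunk on every surviving member, and a later
 * filesystem read of one of those chunks duplicates a queued block. */
#include <stdio.h>

#define NDISKS		4	/* 3 data + parity */
#define FAILED_DISK	1
#define CHUNK_SECTORS	128

int main(void)
{
	long stripe = 7;				/* arbitrary stripe */
	long member_sector = stripe * CHUNK_SECTORS;	/* same offset on every member */
	int disk;

	/* The failed member can't be read, so md reconstructs the data by
	 * reading the corresponding chunk from every surviving member. */
	for (disk = 0; disk < NDISKS; disk++) {
		if (disk == FAILED_DISK)
			continue;
		printf("reconstruction read: disk %d sector %ld\n",
		       disk, member_sector);
	}

	/* Shortly afterwards the filesystem reads a block that lives on one
	 * of those surviving members at the same stripe offset, so the same
	 * member block gets queued a second time. */
	printf("filesystem read    : disk %d sector %ld  (duplicate)\n",
	       (FAILED_DISK + 1) % NDISKS, member_sector);
	return 0;
}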

It doesn't help us understand the raid0 example though.  Maybe it is
just a 'can happen, but only rarely' thing.

NeilBrown
