Date:	Fri, 6 May 2011 11:49:06 +1000
From:	Dave Chinner <david@...morbit.com>
To:	Christoph Hellwig <hch@...radead.org>
Cc:	linux-kernel@...r.kernel.org,
	Markus Trippelsdorf <markus@...ppelsdorf.de>,
	Bruno Prémont <bonbons@...ux-vserver.org>,
	xfs-masters@....sgi.com, xfs@....sgi.com,
	Alex Elder <aelder@....com>, Dave Chinner <dchinner@...hat.com>
Subject: Re: 2.6.39-rc3, 2.6.39-rc4: XFS lockup - regression since 2.6.38

On Thu, May 05, 2011 at 08:39:59AM -0400, Christoph Hellwig wrote:
> > The third problem is that updating the push target is not safe on 32
> > bit machines. We cannot copy a 64 bit LSN without the possibility of
> > corrupting the result when racing with another updating thread. We
> > have a function to do this update safely without needing to care about
> > 32/64 bit issues - xfs_trans_ail_copy_lsn() - so use that when
> > updating the AIL push target.
> 
> But reading xa_target without xa_lock isn't safe on 32-bit either, is it?

Not sure - I think it depends on the platform. I don't think we
protect LSN reads in any specific way on 32 bit platforms.
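
As a rough userspace illustration of the locked-copy idea behind
xfs_trans_ail_copy_lsn(): where a 64 bit store is not a single access,
the copy is done under a lock so that nobody sees half-updated words.
This is only a sketch. struct ail_sketch, ail_copy_lsn() and the
pthread mutex standing in for xa_lock are assumptions made for the
example, not the kernel code.

```c
#include <pthread.h>
#include <stdint.h>

typedef int64_t lsn_t;			/* stand-in for xfs_lsn_t */

struct ail_sketch {
	pthread_mutex_t	lock;		/* stand-in for ailp->xa_lock */
	lsn_t		target;		/* stand-in for ailp->xa_target */
};

/*
 * Copy a 64 bit LSN. Where a 64 bit load/store may be split into two
 * 32 bit accesses, take the lock so both halves are copied as a unit.
 * Assumption: pointer width is used as a proxy for the machine word size.
 */
static void
ail_copy_lsn(struct ail_sketch *ailp, lsn_t *dst, const lsn_t *src)
{
#if UINTPTR_MAX < UINT64_MAX
	pthread_mutex_lock(&ailp->lock);
	*dst = *src;
	pthread_mutex_unlock(&ailp->lock);
#else
	(void)ailp;			/* plain copy is a single access here */
	*dst = *src;
#endif
}
```

Note that the lock only helps if every writer of the field goes through
the same helper; an unlocked reader can still observe a mixed value.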

In this case, I don't think it matters so much on read, because if
we get a race with a write that mixes upper/lower words of the
target we will eventually hit the stop condition and we won't get a
match. That will trigger the requeue code and we'll start the push
again.

The problem with getting such a race on the target write is that we
could get a cycle/block pair that is beyond the current head of the
log and we'd never be able to push the AIL again as all push
thresholds are truncated to the current head LSN on disk...
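
To make the failure mode concrete, here is a small sketch of how mixing
the upper and lower words of two valid targets can produce an LSN beyond
the head of the log. It assumes the usual layout of the cycle in the
upper 32 bits and the block in the lower 32 bits; the helper names
(make_lsn, lsn_cmp, torn_lsn) and the numbers used are made up for the
example.

```c
#include <stdint.h>

typedef int64_t lsn_t;	/* stand-in for xfs_lsn_t */

/* Pack a cycle/block pair: cycle in the upper word, block in the lower. */
static lsn_t
make_lsn(uint32_t cycle, uint32_t block)
{
	return (lsn_t)(((uint64_t)cycle << 32) | block);
}

static uint32_t
lsn_cycle(lsn_t lsn)
{
	return (uint32_t)((uint64_t)lsn >> 32);
}

static uint32_t
lsn_block(lsn_t lsn)
{
	return (uint32_t)lsn;
}

/* Compare cycle first, then block. Returns <0, 0 or >0. */
static int
lsn_cmp(lsn_t a, lsn_t b)
{
	if (lsn_cycle(a) != lsn_cycle(b))
		return lsn_cycle(a) < lsn_cycle(b) ? -1 : 1;
	if (lsn_block(a) != lsn_block(b))
		return lsn_block(a) < lsn_block(b) ? -1 : 1;
	return 0;
}

/* A torn value: upper word taken from one LSN, lower word from another. */
static lsn_t
torn_lsn(lsn_t upper_from, lsn_t lower_from)
{
	return make_lsn(lsn_cycle(upper_from), lsn_block(lower_from));
}
```

With an old target of cycle 5/block 1000, a new target of cycle 6/block
10 and a log head at cycle 6/block 50, both targets are at or before the
head, but torn_lsn(new, old) is cycle 6/block 1000, which compares as
beyond the head.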

> For the first read it can trivially be moved into the critical
> section a few lines below, and the second one should probably use
> XFS_LSN_CMP.
> 
> > @@ -482,19 +481,24 @@ xfs_ail_worker(
> >  	/* assume we have more work to do in a short while */
> >  	tout = 10;
> >  	if (!count) {
> > +out_done:
> 
> Jumping into conditionals is really ugly.  By initializing count a bit
> earlier you can just jump in front of the if/else clauses.  And while
> you're there maybe moving the tout = 10; into an else clause would
> also make the code more readable.
> an uninitialized use of tout.

Ok, I'll rework that.
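
One possible shape for that rework, as a standalone sketch rather than
the actual xfs_ail_worker() code: count is initialised before any jump,
the label sits in front of the if/else, and tout = 10 moves into the
else clause so every path assigns it. The function name, its arguments
and the value 50 are hypothetical; only tout = 10 and the out_done label
come from the patch quoted above.

```c
/* Hypothetical helper: pick the requeue timeout for the push worker. */
static long
pick_timeout(int nothing_to_do, int items_pushed)
{
	long	tout;
	int	count = 0;	/* initialised before any jump to out_done */

	if (nothing_to_do)
		goto out_done;

	count = items_pushed;

out_done:
	if (!count) {
		/* nothing was pushed: back off (value is an assumption) */
		tout = 50;
	} else {
		/* assume we have more work to do in a short while */
		tout = 10;
	}
	return tout;
}
```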

> > +		if (ailp->xa_target == target ||
> > +		    (test_and_set_bit(XFS_AIL_PUSHING_BIT, &ailp->xa_flags)))
> 
> no need for braces around the test_and_set_bit call.

*nod*. Left over from developing the fix...

I'll split all these and post them to the xfs-list for review...

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com