lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 02 Mar 2011 19:15:04 -0500
From:	Jens Axboe <axboe@...nel.dk>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
CC:	Anton Altaparmakov <aia21@....ac.uk>,
	Christoph Hellwig <hch@....de>,
	linux-fsdevel <linux-fsdevel@...r.kernel.org>,
	LKML <linux-kernel@...r.kernel.org>,
	George Spelvin <linux@...izon.com>
Subject: Re: a major regression in recent kernels? - was: Re: Null pointer
  OOPS in sync_inodes_sb+0xa9/0x104

On 2011-03-02 13:31, Linus Torvalds wrote:
> Hmm. Adding some more people to the cc - in particular Jens. I'm not
> sure that he'd be on the fsdevel list (although maybe he is).
> 
> The whole "backing_dev_info" has been a total disaster. The thing is
> crap. It violates all the normal kernel memory management rules ("Thou
> shalt use reference counts and free only when it goes to zero") and
> the whole thing has been a constant source of "oh, that driver didn't
> set it, but we changed all the code to require it to be correct".
> 
> And the reason we set it to NULL when the device goes away is exactly
> that it's not ref-counted correctly, so we really _have_ to set it to
> NULL, because it's not going to be around.
> 
> (And the reverse of that is why all kernel data structures should use
> refcounts, and not some external lifetime notion)
> 
> Gaah. The caller has already done the check for a NULL s_bdi:
> 
>         /*
>          * This should be safe, as we require bdi backing to actually
>          * write out data in the first place
>          */
>         if (!sb->s_bdi || sb->s_bdi == &noop_backing_dev_info)
>                 return 0;
> 
> before it calls into "sync_inodes_sb(sb);", but since that stupid
> disconnect event can happen at any time, that doesn't protect
> anything. The s_bdi field clearly just becomes NULL after the check.
> 
> Jens? This was introduced in 592b09a42fc3 ("backing-dev: ensure that a
> removed bdi no longer has super_block referencing it") over a year
> ago, but the _real_ problem goes back all the way to commit
> 32a88aa1b6df which introduced that broken "s_bdi" without refcounts to
> begin with.
> 
> Guys, any idea on how to fix this? The hacky way might be to introduce
> a dummy backing_dev_info and instead of setting s_bdi to NULL, we set
> it to the dummy one that doesn't do anything. It's still hacky as
> hell, though. The real problem really is that having pointers to
> structures without refcounts is just _always_ wrong.
> 
> See Documentation/CodingStyle: "Chapter 11: Data structures":
> 
>   "Remember: if another thread can find your data structure, and you don't
>    have a reference count on it, you almost certainly have a bug."
> 
> wiser words have seldom been spoken.

Agree, from the ->s_bdi point of view it has been and is a mess. I guess
the hope was to avoid adding real reference counting for the bdi. I just
took a quick look at it, and I don't think it'll be too problematic to
add. The bdi is mostly just a settings container.

We already pretty much have a dummy backing_dev_info,
default_backing_dev_info. 2.6.38 and stable backport would be to use
that and going forward ensure we probably reference the bdi when it's
attached to the super block.

I've got a flight coming up tomorrow, I'll cook up both patches for
this.


-- 
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ