linux-kernel - Re: [PATCH] fs: Fix mod_timer crash when removing USB sticks

Message-ID: <CACBanvpzOdC4ns-pg1f92ptxrCJ2O=_oJhpKFD4NOB0hyF_+aA@mail.gmail.com>
Date:	Sun, 18 Mar 2012 15:25:43 -0700
From:	Mandeep Singh Baines <msb@...omium.org>
To:	Alan Stern <stern@...land.harvard.edu>
Cc:	"Ted Ts'o" <tytso@....edu>, Theodore Tso <tytso@...gle.com>,
	Greg KH <greg@...ah.com>, Paul Taysom <taysom@...gle.com>,
	Paul Taysom <taysom@...omium.org>,
	Jens Axboe <axboe@...nel.dk>, Andrew Morton <akpm@...gle.com>,
	linux-usb@...r.kernel.org, linux-kernel@...r.kernel.org,
	Alexander Viro <viro@...iv.linux.org.uk>,
	linux-fsdevel@...r.kernel.org, stable@...nel.org
Subject: Re: [PATCH] fs: Fix mod_timer crash when removing USB sticks

On Sun, Mar 18, 2012 at 1:23 PM, Alan Stern <stern@...land.harvard.edu> wrote:
> On Sat, 17 Mar 2012, Ted Ts'o wrote:
>
>> I can't help thinking that the fact that we're constantly playing
>> whack-a-mole trying to fix various random crashes when devices
>> disappear that perhaps we should consider if there's a better way to
>> do things.
>
> Indeed, as Jens's patch mentions, proper reference counting for the BDI
> stuff hasn't been implemented yet.  Obviously it will require somebody
> who really does know the code (i.e., not me).
>
> For example, when Paul's patch assigns &default_backing_dev_info, is
> the assignment synchronized by any sort of lock?  I can't tell -- but
> if it isn't then the possibility of a race will still exist.
>

I think it's safe without a lock (assuming the assignment is atomic), but it
wouldn't hurt to take i_lock around it. That would also give you a barrier,
which is needed to propagate the assignment to other CPUs.
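
To make the atomicity and barrier point concrete, here is a rough userspace
sketch in C11, not the actual patch. The names struct bdi, struct mapping,
default_bdi, detach_bdi() and mapping_bdi() are hypothetical stand-ins for
the kernel structures, and the release/acquire pair is only an analogy for
the ordering that taking a lock would provide in the kernel.

```c
#include <stdatomic.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
struct bdi {
	const char *name;
};

struct mapping {
	_Atomic(struct bdi *) backing_dev_info;
};

/* Plays the role of default_backing_dev_info: it is never freed. */
static struct bdi default_bdi = { "default" };

/*
 * Writer side, run when the device disappears: repoint the mapping at
 * the always-valid default. The release store is a single atomic pointer
 * write, and it orders earlier writes before the new pointer is published.
 */
static void detach_bdi(struct mapping *m)
{
	atomic_store_explicit(&m->backing_dev_info, &default_bdi,
			      memory_order_release);
}

/*
 * Reader side: load the pointer exactly once and use only the local copy,
 * so a concurrent detach_bdi() cannot change it halfway through.
 */
static struct bdi *mapping_bdi(struct mapping *m)
{
	return atomic_load_explicit(&m->backing_dev_info,
				    memory_order_acquire);
}
```

A caller would initialize backing_dev_info to the device's own bdi, call
detach_bdi() on removal, and from then on mapping_bdi() returns &default_bdi.
Note that this only covers readers that load the pointer after the store; a
reader that already holds the old pointer is still exposed, which is where
the missing reference counting comes in.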

This is not a perfect fix, but it's pretty safe and is nice in that it works
independently of filesystem or bus type.

Regards,
Mandeep

>> The fact that at the file system layer I have **no** idea that a
>> device has disappeared, and just blindly going on trying to write to a
>> device which is gone just seems a little crazy to me...  why shouldn't
>> block layer inform the upper layers about something as fundamental as,
>> "the device is gone and is never coming back"?
>
> Playing devil's advocate...  What would you do differently if you did
> know the device was gone?  All I/O operations will fail regardless, and
> presumably with an error code like -ENODEV.  Pretty much all you could
> do would be to fail them a little earlier.
>
>> > I suspect Paul's patch is the right thing to do.  It might even make
>> > the ext4 fix unnecessary, although I don't understand the details well
>> > enough to verify it.  Maybe Paul can check -- the commit I'm referring
>> > to is 7c2e70879fc0949b4220ee61b7c4553f6976a94d (ext4: add ext4-specific
>> > kludge to avoid an oops after the disk disappears).
>>
>> I have no idea either, because it's not obvious to me what data
>> structures can be relied upon, and what can't, and when things are
>> supposed to get freed on sudden device disconnects.  The fact that
>> none of us are sure is part of what makes me think that the current
>> scheme is, perhaps, non-optimal...
>
> That's why someone like Jens or Al needs to take a close look at this
> (hint, hint).
>
> Alan Stern
>