Message-ID: <20220615100432.gd7jeeyjk3qyayyi@quack3.lan>
Date: Wed, 15 Jun 2022 12:04:32 +0200
From: Jan Kara <jack@...e.cz>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Thomas Backlund <tmb@....nu>, Jan Kara <jack@...e.cz>,
Suzuki K Poulose <suzuki.poulose@....com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Guenter Roeck <linux@...ck-us.net>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
stable <stable@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Shuah Khan <shuah@...nel.org>, patches@...nelci.org,
lkft-triage@...ts.linaro.org, Pavel Machek <pavel@...x.de>,
Jon Hunter <jonathanh@...dia.com>,
Florian Fainelli <f.fainelli@...il.com>,
Sudip Mukherjee <sudipm.mukherjee@...il.com>,
Slade Watkins <slade@...dewatkins.com>
Subject: Re: [PATCH 5.15 000/251] 5.15.47-rc2 review
On Tue 14-06-22 11:51:35, Linus Torvalds wrote:
> On Tue, Jun 14, 2022 at 11:20 AM Thomas Backlund <tmb@....nu> wrote:
> >
> > I "think" this is the suggested fix:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git/commit/?h=for_next&id=46b6418e26c7c26f98ff9c2c2310bce5ae2aa4dd
>
> Ugh, this is just too ugly for words.
>
> That's not a fix. That's a "hide the problem" patch.

I agree it is papering over the real problem. I consider that a stopgap
solution so that machines can boot until we find a cleaner solution.

> Now, admittedly clearly the "hide the problem" code already existed,
> and was just moved earlier, but I really think this whole "we're
> calling __mark_inode_dirty() on an inode that isn't even *initialized*
> yet" is a much deeper issue, and shouldn't have some hacky work-around
> in __mark_inode_dirty() that just happens to make it work.
>
> I don't mind that patch per se - moving the code is fine.
>
> But I *do* mind the patch when the reason is to hide that wrong
> ordering of operations.
>
> Now, maybe a proper fix might be to say that new_inode_pseudo() should
> always initialize i_state to I_DIRTY_ALL or something like that. The
> comment already says that they cannot participate in writeback, so
> maybe they should be disabled that way (ie a pseudo inode is always
> dirty and marking it dirty does nothing).

Sadly it is not so simple. Firstly, new_inode_pseudo() gets used for all
inodes (through new_inode()). Secondly, tmpfs allocates fully standard
inodes through new_inode(), like any other filesystem. We could check the
writeback capabilities of sb->s_bdi in new_inode_pseudo(), but that
would not work for inodes that will become block device inodes, because
blockdev_superblock has noop_backing_dev_info, so we'd have to special-case
that. Overall it looks a bit hairy for my taste.

> And then you get rid of the noop_backing_dev_info entirely.

And this would be even more difficult because there are other places that
expect there's *some* bdi associated with each sb.

> Or just make sure that noop_backing_dev_info is fully initialized
> before it's used.
>
> Because I think the real problem here is that things have a pointer to
> an uninitialized backing_dev_info.

I fully agree with this. IMHO we need to initialize noop_backing_dev_info
earlier, but early init is not exactly my comfort zone, so I have to verify
whether the various things in cgwb_bdi_init() are safe to call so early...

Honza
--
Jan Kara <jack@...e.com>
SUSE Labs, CR