linux-kernel - Re: [PATCH 5.15 000/251] 5.15.47-rc2 review


Message-ID: <20220615100432.gd7jeeyjk3qyayyi@quack3.lan>
Date:   Wed, 15 Jun 2022 12:04:32 +0200
From:   Jan Kara <jack@...e.cz>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Thomas Backlund <tmb@....nu>, Jan Kara <jack@...e.cz>,
        Suzuki K Poulose <suzuki.poulose@....com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Guenter Roeck <linux@...ck-us.net>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        stable <stable@...r.kernel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Shuah Khan <shuah@...nel.org>, patches@...nelci.org,
        lkft-triage@...ts.linaro.org, Pavel Machek <pavel@...x.de>,
        Jon Hunter <jonathanh@...dia.com>,
        Florian Fainelli <f.fainelli@...il.com>,
        Sudip Mukherjee <sudipm.mukherjee@...il.com>,
        Slade Watkins <slade@...dewatkins.com>
Subject: Re: [PATCH 5.15 000/251] 5.15.47-rc2 review

On Tue 14-06-22 11:51:35, Linus Torvalds wrote:
> On Tue, Jun 14, 2022 at 11:20 AM Thomas Backlund <tmb@....nu> wrote:
> >
> > I "think" this is the suggested fix:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git/commit/?h=for_next&id=46b6418e26c7c26f98ff9c2c2310bce5ae2aa4dd
> 
> Ugh, this is just too ugly for words.
> 
> That's not a fix. That's a "hide the problem" patch.

I agree it is papering over the real problem. I consider that a stopgap
solution so that machines can boot until we find a cleaner solution.

> Now, admittedly clearly the "hide the problem" code already existed,
> and was just moved earlier, but I really think this whole "we're
> calling __mark_inode_dirty() on an inode that isn't even *initialized*
> yet" is a much deeper issue, and shouldn't have some hacky work-around
> in __mark_inode_dirty() that just happens to make it work.
> 
> I don't mind that patch per se - moving the code is fine.
> 
> But I *do* mind the patch when the reason is to hide that wrong
> ordering of operations.
> 
> Now, maybe a proper fix might be to say that new_inode_pseudo() should
> always initialize i_state to I_DIRTY_ALL or something like that. The
> comment already says that they cannot participate in writeback, so
> maybe they should be disabled that way (ie a pseudo inode is always
> dirty and marking it dirty does nothing).

Sadly it is not so simple. Firstly, new_inode_pseudo() gets used for all
inodes (through new_inode()). Secondly, tmpfs allocates fully standard
inodes through new_inode(), like any other filesystem. We could check the
writeback capabilities of sb->s_bdi in new_inode_pseudo(), but that
would not work for inodes that will become block device inodes, because
blockdev_superblock has noop_backing_dev_info, so we'd have to special-case
that. Overall it looks a bit hairy for my taste.
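
To make the quoted "a pseudo inode is always dirty and marking it dirty does
nothing" idea concrete, here is a minimal userspace sketch. It is only a model:
struct model_inode, the model_* functions, and the flag values are hypothetical
stand-ins, not the kernel's definitions, and the early return is only loosely
modeled on the "flags already set" check in __mark_inode_dirty(). It does not
address the objections above (new_inode() sharing the same path, tmpfs, block
device inodes).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag values only; the kernel's real definitions differ. */
#define I_DIRTY_SYNC     (1u << 0)
#define I_DIRTY_DATASYNC (1u << 1)
#define I_DIRTY_PAGES    (1u << 2)
#define I_DIRTY_ALL      (I_DIRTY_SYNC | I_DIRTY_DATASYNC | I_DIRTY_PAGES)

/* Hypothetical, heavily simplified inode: just the state bits and a counter. */
struct model_inode {
	unsigned int i_state;
	int writeback_touched;	/* how often marking dirty reached writeback state */
};

/* Ordinary allocation: the inode starts clean. */
void model_new_inode(struct model_inode *inode)
{
	assert(inode != NULL);
	inode->i_state = 0;
	inode->writeback_touched = 0;
}

/* The quoted proposal: a pseudo inode starts with every dirty bit set. */
void model_new_inode_pseudo_always_dirty(struct model_inode *inode)
{
	model_new_inode(inode);
	inode->i_state = I_DIRTY_ALL;
}

/*
 * Returns true if the call had to touch writeback state, false if all
 * requested bits were already set and the call was a no-op.
 */
bool model_mark_inode_dirty(struct model_inode *inode, unsigned int flags)
{
	assert(inode != NULL);
	if ((inode->i_state & flags) == flags)
		return false;	/* already dirty in every requested way */
	inode->i_state |= flags;
	inode->writeback_touched++;
	return true;
}
```

In this model, an inode set up with model_new_inode_pseudo_always_dirty()
never reaches the writeback bookkeeping, whatever flags are passed, while an
inode from model_new_inode() reaches it on the first call for a given flag.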

> And then you get rid of the noop_backing_dev_info entirely.

And this would be even more difficult because there are other places that
expect there's *some* bdi associated with each sb.

> Or just make sure that noop_backing_dev_info is fully initialized
> before it's used.
> 
> Because I think the real problem here is that things have a pointer to
> an uninitialized backing_dev_info.

I fully agree with this. IMHO we need to initialize noop_backing_dev_info
earlier, but early init is not exactly my comfort zone, so I have to verify
whether the various things done in cgwb_bdi_init() are safe to call so early...
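
The ordering problem itself can be sketched in userspace as follows. This is
an assumption-laden toy, not kernel code: struct model_bdi and the model_bdi_*
functions are hypothetical names, and the "initialized" flag stands in for
whatever state real bdi initialization sets up. It only shows why a pointer to
a not-yet-initialized backing_dev_info is the root issue, and that running the
init step before the first user removes the need for a guard in the dirtying
path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a backing_dev_info; the fields are illustrative. */
struct model_bdi {
	bool initialized;	/* set once the init step has run */
	int dirty_queued;	/* inodes queued for writeback on this bdi */
};

/* Models the init step that must run before anyone uses the bdi. */
int model_bdi_init(struct model_bdi *bdi)
{
	assert(bdi != NULL);
	bdi->dirty_queued = 0;
	bdi->initialized = true;
	return 0;
}

/*
 * Models the dirtying path touching bdi state.
 * Returns 0 on success, -1 if the bdi has not been initialized yet.
 */
int model_bdi_queue_dirty(struct model_bdi *bdi)
{
	assert(bdi != NULL);
	if (!bdi->initialized)
		return -1;	/* caller arrived before init: the ordering bug */
	bdi->dirty_queued++;
	return 0;
}
```

With a zero-initialized struct model_bdi, calling model_bdi_queue_dirty()
first fails, and calling it after model_bdi_init() succeeds. In the model the
fix is purely a matter of call order.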

								Honza
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR