[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CACT4Y+YRVMtVXezKgRZm=p7EcRmc6Mt4Dnnn1MmoPqK06YicPw@mail.gmail.com>
Date: Tue, 14 Mar 2023 11:05:22 +0100
From: Dmitry Vyukov <dvyukov@...gle.com>
To: "Theodore Ts'o" <tytso@....edu>
Cc: Jan Kara <jack@...e.cz>, Tudor Ambarus <tudor.ambarus@...aro.org>,
syzbot <syzbot+8785e41224a3afd04321@...kaller.appspotmail.com>,
adilger.kernel@...ger.ca, linux-ext4@...r.kernel.org,
linux-kernel@...r.kernel.org, llvm@...ts.linux.dev,
nathan@...nel.org, ndesaulniers@...gle.com,
syzkaller-bugs@...glegroups.com, trix@...hat.com,
Lee Jones <joneslee@...gle.com>
Subject: Re: [syzbot] [ext4?] KASAN: slab-out-of-bounds Read in ext4_group_desc_csum
On Tue, 14 Mar 2023 at 10:45, Dmitry Vyukov <dvyukov@...gle.com> wrote:
>
> On Tue, 14 Mar 2023 at 03:26, Theodore Ts'o <tytso@....edu> wrote:
> >
> > On Mon, Mar 13, 2023 at 03:53:57PM +0100, Dmitry Vyukov wrote:
> > > > Long-term we are moving ext4 in a direction where we can disallow block
> > > > device modifications while the fs is mounted but we are not there yet. I've
> > > > discussed some shorter-term solution to avoid such known problems with syzbot
> > > > developers and what seems plausible would be a kconfig option to disallow
> > > > writing to a block device when it is exclusively open by someone else.
> > > > But so far I didn't get to trying whether this would reasonably work. Would
> > > > you be interested in having a look into this?
> > >
> > > Does this affect only the loop device or also USB storage devices?
> > > Say, if the USB device returns different contents during mount and on
> > > subsequent reads?
> >
> > Modifying the block device while the file system is mounted is
> > something that we have to allow for now because tune2fs uses it to
> > modify the superblock. It has historically also been used (rarely) by
> > people who know what they are doing to do surgery on a mounted file
> > system. If we create a way for tune2fs to be able to update the
> > superblock via some kind of ioctl, we could disallow modifying the
> > block device while the file system is mounted. Of course, it would
> > require waiting at least 5-6 years since sometimes people will update
> > the kernel without updating userspace. We'd also need to check to
> > make sure there aren't boot loader installer (such as grub-install)
> > that depend on being able to modify the block device while the root
> > file system is mounted, at least in some rare cases.
> >
> > The "how" to exclude mounted file systems is relatively easy. The
> > kernel already knows when the file system is mounted, and it is
> > already a supported feature that a userspace application that wants to
> > be careful can open a block device with O_EXCL, and if it is in use by
> > the kernel --- mounted by a file system, being used by dm-thin, et. al
> > -- the open(2) system call will fail. From the open(2) man page.
> >
> > In general, the behavior of O_EXCL is undefined if it is used without
> > O_CREAT. There is one exception: on Linux 2.6 and later, O_EXCL can
> > be used without O_CREAT if pathname refers to a block device. If the
> > block device is in use by the system (e.g., mounted), open() fails
> > with the error EBUSY.
> >
> > Something which the syzbot could to do today is to simply use O_EXCL
> > whenever trying to open a block device. This would avoid a class of
> > syzbot false positives, since normally it requires root privileges
> > and/or an experienced sysadmin to try to modify a block device while
> > it is mounted and/or in use by LVM.
> >
> > - Ted
> >
> > P.S. Trivia note: Aproximately month after I started work at VA Linux
> > Systems, a sysadmin intern which was given the root password to
> > sourceforge.net, while trying to fix a disk-to-disk backup, ran
> > mkfs.ext3 on /dev/hdXX, which was also being used as one-half of a
> > RAID 0 setup on which open source code critical to the community
> > (including, for example, OpenGL) was mounted and serving. The intern
> > got about 50% the way through zeroing the inode table on /dev/hdXX
> > before the file system noticed and threw an error, at which point
> > wiser heads stopped what the intern was doing and tried to clean up
> > the mess. Of course, there were no backups, since that was what the
> > intern was trying to fix!
> >
> > There are a couple of things that we could learn from this incident.
> > One was that giving the root password to an untrained intern not
> > familiar with the setup on the serving system was... an unfortunate
> > choice. Another was that adding the above-mentioned O_EXCL feature
> > and teaching mkfs to use it was an obvious post-mortem action item to
> > prevent this kind of problem in the future...
>
> I am struggling to make my mind re how to think about this case.
>
> "root" is very overloaded, but generally it does not mean "randomly
> corrupting memory". Normally it gives access to system-wide changes
> but with the same protection/consistency guarantees as for
> unprivileged system calls.
>
> There are, of course, things like /dev/{mem,kmem}. But at the same
> time there is also lockdown LSM and more distros today enable it.
>
> Btw, should this "prohibit writes to mounted device" be part of
> LOCKDOWN_INTEGRITY? It looks like it gives capabilities similar to
> /dev/{mem,kmem}.
>
> Disabling in testing something that's enabled in production is
> generally not very useful.
> So one option is to do nothing about this for now.
> If it's a true recognized issue that is in the process of fixing,
> syzbot will just show that it's still present. One of the goals of
> syzbot is to show the current state of things in an objective manner.
> If some kernel developers are aware of an issue, it does not mean that
> most distros/users are aware.
>
> It makes sense to disable in testing things that are also recommended
> to be disabled in production settings.
> And LOCKDOWN_INTEGRITY may play such a role: we include this
> restriction into LOCKDOWN_INTEGRITY and enable it on syzbot.
> Though, unfortunately, we still don't enable it because it prohibits
> access to debugfs, which is required for fuzzing. Need to ask lockdown
> maintainers what they think about
> LOCKDOWN_TEST_ONLY_DONT_ENABLE_IN_PROD_INTEGRITY which would whitelist
> debugfs.
Asked lockdown maintainers about adding this it lockdown and adding
special mode for fuzzing:
https://lore.kernel.org/all/CACT4Y+Z-9KCgKwkktvdJwNJZxxeA1f74zkP7KD6c=OmKXxXfjw@mail.gmail.com/
Powered by blists - more mailing lists