Message-ID: <YzyxP8o7V7Q6xaS7@magnolia>
Date: Tue, 4 Oct 2022 15:18:39 -0700
From: "Darrick J. Wong" <djwong@...nel.org>
To: Lukas Czerner <lczerner@...hat.com>
Cc: zhanchengbin <zhanchengbin1@...wei.com>,
Theodore Ts'o <tytso@....edu>, linux-ext4@...r.kernel.org,
liuzhiqiang26@...wei.com, linfeilong <linfeilong@...wei.com>,
kzak@...hat.com, util-linux@...r.kernel.org
Subject: Re: [bug report] misc/fsck.c: Processes may kill other processes.
[cc util-linux and karel zak]
TLDR: util-linux's fsck program has an interesting bug in it where if
someone runs "fsck -N", it will set up a fsck_instance context for each
filesystem with inst->pid = -1. If someone sends the fsck process a
SIGINT/SIGTERM before it finishes enumerating filesystems, it will try
to kill all the fsck instances via "kill(inst->pid, ...);" which will
terminate every process on the system.
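For readers unfamiliar with why pid = -1 is so dangerous here, a minimal
sketch of how kill(2) interprets its first argument, as specified by POSIX.
classify_kill_target() and the enum are hypothetical names used only for
illustration; they are not part of fsck. Strictly speaking, pid == -1 means
"every process the caller has permission to signal" (minus an unspecified set
of system processes), which for an fsck run as root is nearly everything.

```c
#include <sys/types.h>

/* Hypothetical helper, not from fsck: names the set of processes that
 * kill(pid, sig) would signal for a given pid value, per POSIX. */
enum kill_target {
	TARGET_ONE_PROCESS,   /* pid > 0: exactly that process */
	TARGET_OWN_GROUP,     /* pid == 0: the caller's process group */
	TARGET_EVERYTHING,    /* pid == -1: all processes the caller may signal */
	TARGET_OTHER_GROUP    /* pid < -1: the process group with id -pid */
};

static enum kill_target classify_kill_target(pid_t pid)
{
	if (pid > 0)
		return TARGET_ONE_PROCESS;
	if (pid == 0)
		return TARGET_OWN_GROUP;
	if (pid == -1)
		return TARGET_EVERYTHING;
	return TARGET_OTHER_GROUP;
}
```

So the placeholder value that "fsck -N" stores in inst->pid falls into the
TARGET_EVERYTHING case rather than being rejected as an invalid pid.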
On Fri, Sep 30, 2022 at 09:20:42AM +0200, Lukas Czerner wrote:
> On Fri, Sep 30, 2022 at 09:42:52AM +0800, zhanchengbin wrote:
> >
> >
> > On 2022/9/29 19:28, Lukas Czerner wrote:
> > > Hi,
> > >
> > > indeed we'd like to avoid killing the instance that was not ran because
> > > of noexecute. Can you try the following patch?
> > >
> > > Thanks!
> > > -Lukas
> >
> > Yes, you're right, I think we can fix it in this way.
> >
> > diff --git a/misc/fsck.c b/misc/fsck.c
> > index 1f6ec7d9..91edbf17 100644
> > --- a/misc/fsck.c
> > +++ b/misc/fsck.c
> > @@ -547,6 +547,8 @@ static int kill_all(int signum)
> > for (inst = instance_list; inst; inst = inst->next) {
> > if (inst->flags & FLAG_DONE)
> > continue;
> > + if (inst->pid == -1)
> > + continue;
>
> Yeah, that works as well although I find the "if (noexecute)" condition
> more obvious. We can do both. Also rather than checking for -1 we can
> check for <= 0 since anything other than real pid at this point is a bug.
>
> Feel free to send a proper patch.
I was about to ask why we even care about misc/fsck.c because it's
clearly a weird old program that has been bitrotting for years and
has likely been replaced by some other code in util-linux. Then I thought I had
better check util-linux, and...
https://git.kernel.org/pub/scm/utils/util-linux/util-linux.git/tree/disk-utils/fsck.c
/*
* fsck --- A generic, parallelizing front-end for the fsck program.
* It will automatically try to run fsck programs in parallel if the
* devices are on separate spindles. It is based on the same ideas as
* the generic front end for fsck by David Engel and Fred van Kempen,
* but it has been completely rewritten from scratch to support
* parallel execution.
*
* Written by Theodore Ts'o, <tytso@....edu>
LOL, it's the same source code, and I think it has the same bug, since
"noexecute" mode sets pid = -1 at line 688:
	/* Fork and execute the correct program. */
	if (noexecute)
		pid = -1;
and then sets inst->pid = pid at line 703:
	inst->pid = pid;
and kill_all() passes that to kill() at line 733:
	for (inst = instance_list; inst; inst = inst->next) {
		if (inst->flags & FLAG_DONE)
			continue;
		kill(inst->pid, signum);
		n++;
	}
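As a rough sketch of the guard discussed in this thread (skip any instance
whose pid is not a real pid, using the "<= 0" check Lukas suggested), the loop
could look something like the following. This is not the actual util-linux or
e2fsprogs code: the struct is cut down to three fields, the FLAG_DONE value is
assumed, and kill_all_guarded(), kill_fn and record_only() are hypothetical
names. The signal sender is passed in as a function pointer only so the loop
can be exercised without signalling anything.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>

#define FLAG_DONE 1	/* assumed value, for illustration only */

/* Cut-down stand-in for fsck's instance record. */
struct fsck_instance {
	pid_t pid;
	int flags;
	struct fsck_instance *next;
};

/* Same shape as kill(2), so the real kill can be passed directly. */
typedef int (*kill_fn)(pid_t pid, int signum);

/* Signal every unfinished instance that has a real pid; return the
 * number of instances signalled. */
static int kill_all_guarded(struct fsck_instance *list, int signum,
			    kill_fn send)
{
	struct fsck_instance *inst;
	int n = 0;

	for (inst = list; inst; inst = inst->next) {
		if (inst->flags & FLAG_DONE)
			continue;
		/* pid <= 0 is never a child we forked; with noexecute
		 * it is the -1 placeholder. */
		if (inst->pid <= 0)
			continue;
		send(inst->pid, signum);
		n++;
	}
	return n;
}

/* Stand-in sender: records the last pid instead of sending a signal. */
static pid_t last_signalled;

static int record_only(pid_t pid, int signum)
{
	(void)signum;
	last_signalled = pid;
	return 0;
}
```

In a real program the call would presumably be
kill_all_guarded(instance_list, signum, kill); passing record_only instead
lets the skip logic be checked safely.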
From that I conclude that this is a real bug in util-linux, and we
ought to be talking to them about this. Evidently this has been broken
since e2fsprogs commit 33922999 in January 1999, though it was only
added to util-linux in commit 607c2a72952f in February 2009.
--D
> Thanks!
> -Lukas
>
> > kill(inst->pid, signum);
> > n++;
> > }
> > >
> > >
> > > diff --git a/misc/fsck.c b/misc/fsck.c
> > > index 1f6ec7d9..8fae7730 100644
> > > --- a/misc/fsck.c
> > > +++ b/misc/fsck.c
> > > @@ -497,9 +497,10 @@ static int execute(const char *type, const char *device, const char *mntpt,
> > > }
> > > /* Fork and execute the correct program. */
> > > - if (noexecute)
> > > + if (noexecute) {
> > > pid = -1;
> > > - else if ((pid = fork()) < 0) {
> > > + inst->flags |= FLAG_DONE;
> > > + } else if ((pid = fork()) < 0) {
> > > perror("fork");
> > > free(inst);
> > > return errno;
> > > @@ -544,6 +545,9 @@ static int kill_all(int signum)
> > > struct fsck_instance *inst;
> > > int n = 0;
> > > + if (noexecute)
> > > + return 0;
> > > +
> > > for (inst = instance_list; inst; inst = inst->next) {
> > > if (inst->flags & FLAG_DONE)
> > > continue;
> > regards,
> > Zhan Chengbin
> >
>