Message-ID: <Y6P2MUcTGU9LIrDg@bombadil.infradead.org>
Date: Wed, 21 Dec 2022 22:16:17 -0800
From: Luis Chamberlain <mcgrof@...nel.org>
To: Schspa Shi <schspa@...il.com>
Cc: mingo@...hat.com, peterz@...radead.org, juri.lelli@...hat.com,
vincent.guittot@...aro.org, dietmar.eggemann@....com,
rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
bristot@...hat.com, vschneid@...hat.com,
linux-kernel@...r.kernel.org,
syzbot+10d19d528d9755d9af22@...kaller.appspotmail.com,
syzbot+70d5d5d83d03db2c813d@...kaller.appspotmail.com,
syzbot+83cb0411d0fcf0a30fc1@...kaller.appspotmail.com
Subject: Re: [PATCH] umh: fix UAF when the process is being killed
On Thu, Dec 22, 2022 at 01:45:46PM +0800, Schspa Shi wrote:
>
> Schspa Shi <schspa@...il.com> writes:
>
> > Luis Chamberlain <mcgrof@...nel.org> writes:
> >
> >> Peter, Ingo, Steven, I would like your review.
> >>
> >> On Tue, Dec 13, 2022 at 03:03:53PM -0800, Luis Chamberlain wrote:
> >>> On Mon, Dec 12, 2022 at 09:38:31PM +0800, Schspa Shi wrote:
> >>> > I'd like to upload a V2 patch with the new solution if you prefer the
> >>> > following approach.
> >>> >
> >>> > diff --git a/kernel/umh.c b/kernel/umh.c
> >>> > index 850631518665..8023f11fcfc0 100644
> >>> > --- a/kernel/umh.c
> >>> > +++ b/kernel/umh.c
> >>> > @@ -452,6 +452,11 @@ int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait)
> >>> > /* umh_complete() will see NULL and free sub_info */
> >>> > if (xchg(&sub_info->complete, NULL))
> >>> > goto unlock;
> >>> > + /*
> >>> > + * kthreadd (or new kernel thread) will call complete()
> >>> > + * shortly.
> >>> > + */
> >>> > + wait_for_completion(&done);
> >>> > }
> >>>
> >>> Yes much better. Did you verify it fixes the splat found by the bots?
> >>
> >> Wait, I'm not sure yet why this would fix it... I first started thinking
> >> that this may be a good example of a Coccinelle SmPL rule, something like:
> >>
> >> DECLARE_COMPLETION_ONSTACK(done);
> >> foo *foo;
> >> ...
> >> foo->completion = &done;
> >> ...
> >> queue_work(system_unbound_wq, &foo->work);
> >> ....
> >> ret = wait_for_completion_state(&done, state);
> >> ...
> >> if (!ret)
> >> S
> >> ...
> >> +wait_for_completion(&done);
> >>
> >> But that is pretty complex, and while it may be useful to know how many
> >> patterns we have like this, it begs the question whether generalizing
> >> the handling of the -ERESTARTSYS condition inside the callers is the
> >> best approach. What do folks think?
> >>
> >> The rationale here is that if you queue work and give it access to a
> >> completion variable that lives on the stack, the queued work can
> >> obviously end up calling complete() on that on-stack variable. The
> >> issue seems to be that when wait_for_completion_state() returns
> >> -ERESTARTSYS, the already scheduled work may still be *about* to run
> >> and complete the on-stack completion. So we race the end of the
> >> routine against the complete() on the on-stack variable.
> >>
> >> It makes me wonder whether the wait_for_completion() above is really
> >> doing something more, or whether it just helps with timing and is
> >> still error prone.
> >>
> >> The queued work will handle the completion as follows:
> >>
> >> static void umh_complete(struct subprocess_info *sub_info)
> >> {
> >> struct completion *comp = xchg(&sub_info->complete, NULL);
> >> /*
> >> * See call_usermodehelper_exec(). If xchg() returns NULL
> >> * we own sub_info, the UMH_KILLABLE caller has gone away
> >> * or the caller used UMH_NO_WAIT.
> >> */
> >> if (comp)
> >> complete(comp);
> >> else
> >> call_usermodehelper_freeinfo(sub_info);
> >> }
> >>
> >> So the race is between the process getting -ERESTARTSYS with the
> >> completion on-stack and the above running complete(comp). Why would
> >> sprinkling wait_for_completion(&done) *after*
> >> wait_for_completion_state(&done, state) fix this UAF?
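
To make the race concrete, here is a minimal sketch of the pattern being
described, with made-up names (my_req / my_worker / my_caller) and
wait_for_completion_killable() for brevity; it follows the shape of the
SmPL sketch above and is only an illustration, not the actual umh code:

#include <linux/atomic.h>
#include <linux/completion.h>
#include <linux/container_of.h>
#include <linux/slab.h>
#include <linux/workqueue.h>

struct my_req {
	struct completion *complete;	/* points at the caller's stack */
	struct work_struct work;
};

static void my_worker(struct work_struct *work)
{
	struct my_req *req = container_of(work, struct my_req, work);
	struct completion *comp = xchg(&req->complete, NULL);

	if (comp)
		complete(comp);		/* may touch a dead stack frame */
	else
		kfree(req);		/* caller disowned it, we free */
}

static int my_caller(struct my_req *req)
{
	DECLARE_COMPLETION_ONSTACK(done);
	int ret;

	req->complete = &done;
	INIT_WORK(&req->work, my_worker);
	queue_work(system_unbound_wq, &req->work);

	ret = wait_for_completion_killable(&done);
	if (ret) {
		/* Interrupted: try to disown the completion. */
		if (xchg(&req->complete, NULL))
			return ret;	/* worker will see NULL and free req */
		/*
		 * The worker already took the pointer and is about to call
		 * complete(&done).  Returning now reuses this stack frame
		 * underneath it -- that is the UAF.  The proposed fix is
		 * to wait here until that complete() has finished.
		 */
		wait_for_completion(&done);
	}
	kfree(req);
	return ret;
}
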
> >
> > The wait_for_completion(&done) is added for the case where
> > xchg(&sub_info->complete, NULL) returns NULL. When it returns NULL, it
> > means umh_complete() is using the completion variable at the same time
> > and will call complete() on it very shortly.
> >
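
Put as a timeline, this is the window the extra wait_for_completion()
closes; it only restates the explanation above as an interleaving:

/*
 * caller (UMH_KILLABLE)                   kthreadd / queued work
 * ---------------------                   ----------------------
 * wait_for_completion_state(&done, ...)
 *   returns -ERESTARTSYS
 *                                         umh_complete():
 *                                           comp = xchg(&sub_info->complete,
 *                                                       NULL);
 *                                           comp != NULL, so complete(comp)
 *                                           will be called in a moment
 * xchg(&sub_info->complete, NULL)
 *   returns NULL: the other side won
 * old code: return; the stack frame
 *   holding 'done' is reused
 *                                           complete(comp)       <-- UAF
 * patched:  wait_for_completion(&done)
 *   blocks until that complete() has
 *   finished, only then return
 */
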
> Hi Luis:
>
> Is there any further progress on this problem? Does the above
> explanation resolve your doubts?
I think it would be useful, to prove your work, for you to hunt with
Coccinelle SmPL for a similar flaw and see how rampant this issue is,
and then also try to create the same UAF there and show how your change
fixes it.
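
As a very rough, untested starting point for that hunt, a search-only rule
along the lines below could enumerate the killable/interruptible completion
waits; the exact rule and report text are my guesses, not a finished
script, and may need syntax tweaks:

virtual report

@r@
expression E;
position p;
@@

(
 wait_for_completion_killable@p(E)
|
 wait_for_completion_interruptible@p(E)
|
 wait_for_completion_state@p(E, ...)
)

@script:python depends on report@
p << r.p;
@@

coccilib.report.print_report(p[0],
    "interruptible/killable wait on a completion; check for the on-stack completion vs. late complete() race")

Each hit would still need manual inspection (is the completion on-stack,
can the waiter bail out while the work that completes it is already
queued), and something like make coccicheck MODE=report COCCI=<rule>.cocci
should run it tree-wide.
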
Luis