linux-kernel - Re: [PATCH] cgroup_freezer: Freezing and task move race fix

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <4C6252CF.1090100@inria.fr>
Date:	Wed, 11 Aug 2010 09:35:43 +0200
From:	Tomasz Buchert <Tomasz.Buchert@...ia.fr>
To:	Matt Helsley <matthltc@...ibm.com>
CC:	Paul Menage <menage@...gle.com>, Li Zefan <lizf@...fujitsu.com>,
	containers@...ts.linux-foundation.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] cgroup_freezer: Freezing and task move race fix

Matt Helsley a écrit :
> On Wed, Aug 11, 2010 at 12:18:44AM +0200, Tomasz Buchert wrote:
>> Matt Helsley a écrit :
>>> On Tue, Aug 10, 2010 at 09:53:21PM +0200, Tomasz Buchert wrote:
>>>> Writing 'FROZEN' to freezer.state file does not
>>>> forbid the task to be moved away from its cgroup
>>>> (for a very short time). Nevertheless the moved task
>>>> can become frozen OUTSIDE its cgroup which puts
>>>> discussed task in a permanent 'D' state.
>>>>
>>>> This patch forbids migration of either FROZEN
>>>> or FREEZING tasks.
>>>>
>>>> This behavior was observed and easily reproduced on
>>>> a single core laptop. Program and instructions how
>>>> to reproduce the bug can be fetched from:
>>>> http://pentium.hopto.org/~thinred/repos/linux-misc/freezer_bug.c
>>> Thanks for the report and the test code.
>>>
>>> I'm will try to reproduce this race in the next few hours and analyze
>>> it since I'm not sure the patch really fixes the race -- it may only
>>> make the race trigger less frequently.
>>>
>>> At the very least the patch won't break the current code since it's
>>> essentially a more-strict version of is_task_frozen_enough() -- it lets
>>> fewer tasks attach/detach to/from frozen cgroups.
>>>
>>> Cheers,
>>> 	-Matt Helsley
>> Hi Matt!
>> I am a novice if it comes to the kernel and I find the cgroup_freezer
>> code especially complicated, so definetely this may be not enough to fix that.
>> Notice also that if you uncomment the line 55 in my testcase this will also
>> trigger the race! This, however, makes sense since process may not be in the cgroup anymore
>> and consequently won't be thawed.
> 
> OK, I triggered it with that. Interesting.
> 

Good!

>> I think that this patch fixes these problems because it does the flag checking in a right order:
>> first freezing() is used and then frozen() which assures (see frozen_process()) that
>> the race will not happen. Right? :)
> 
> I see what you mean. It still seems like it wouldn't actually fix the race -- just make it
> harder to trigger. I think you're saying this is what happens without the patch:
> 
> Time	"bug" goes through these states		cgroup code checks for these states
> -----------------------------------------------------------------------------------
> |	freezing
> |						is_frozen? Nope.
> |	frozen
> |						is_freezing? Nope.
> |						<move>
> V
> 
My first scenario was a bit different:
Time	"bug" goes through these states		cgroup code checks for these states
-----------------------------------------------------------------------------------
|	freezing
|						is_task_frozen_enough? Nope.
|						<move>
|	frozen
V
but the problem is the same.

> But, without having carefully investigated the details, this could just as easily happen
> with your patch:
> 
> Time	"bug" goes through these states		cgroup code checks for these states
> -----------------------------------------------------------------------------------
> |						is_freezing? Nope.
> |						is_frozen? Nope.
> |	freezing
> |						<move>
> |	frozen
> V
> 

This can't happen as far as I know because there is cgroup_lock around the code in freezer_write()
and freezer_can_attach().
The task can't enter 'freezing' state when can_attach is executed.

> or:
> 
> Time	"bug" goes through these states		cgroup code checks for these states
> -----------------------------------------------------------------------------------
> |						is_freezing? Nope.
> |						is_frozen? Nope.
> |	freezing
> |	frozen
> |						<move>
> V
> 

Same thing here.

> Time	"bug" goes through these states		cgroup code checks for these states
> -----------------------------------------------------------------------------------
> |						is_freezing? Nope.
> |	freezing
> |						is_frozen? Nope.
> |						<move>
> |	frozen
> V
> 

Again.

> or:
> 
> Time	"bug" goes through these states		cgroup code checks for these states
> -----------------------------------------------------------------------------------
> |						is_freezing? Nope.
> |	freezing
> |						is_frozen? Nope.
> |	frozen
> |						<move>
> V
> 
> (even with 1 cpu/core)

Well, once more.

> 
> Your patch only improves things in the sense that it works for the first
> example. We need to prevent the latter cases as well.
> 
> Cheers,
> 	-Matt

What do you think?

Tomasz
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/