linux-kernel - Re: IO scheduler based IO Controller V2

Message-ID: <4A027348.6000808@cn.fujitsu.com>
Date:	Thu, 07 May 2009 13:36:08 +0800
From:	Li Zefan <lizf@...fujitsu.com>
To:	Vivek Goyal <vgoyal@...hat.com>
CC:	Gui Jianfeng <guijianfeng@...fujitsu.com>, nauman@...gle.com,
	dpshah@...gle.com, mikew@...gle.com, fchecconi@...il.com,
	paolo.valente@...more.it, jens.axboe@...cle.com,
	ryov@...inux.co.jp, fernando@....ntt.co.jp, s-uchida@...jp.nec.com,
	taka@...inux.co.jp, jmoyer@...hat.com, dhaval@...ux.vnet.ibm.com,
	balbir@...ux.vnet.ibm.com, linux-kernel@...r.kernel.org,
	containers@...ts.linux-foundation.org, righi.andrea@...il.com,
	agk@...hat.com, dm-devel@...hat.com, snitzer@...hat.com,
	m-ikeda@...jp.nec.com, akpm@...ux-foundation.org
Subject: Re: IO scheduler based IO Controller V2

Vivek Goyal wrote:
> On Wed, May 06, 2009 at 04:11:05PM +0800, Gui Jianfeng wrote:
>> Vivek Goyal wrote:
>>> Hi All,
>>>
>>> Here is the V2 of the IO controller patches generated on top of 2.6.30-rc4.
>>> First version of the patches was posted here.
>> Hi Vivek,
>>
>> I did some simple tests for V2 and triggered a kernel panic.
>> The following script can reproduce this bug. It seems that the cgroup
>> is already removed, but the IO Controller still tries to access it.
>>
> 
> Hi Gui,
> 
> Thanks for the report. I use cgroup_path() for debugging. I guess that
> cgroup_path() was passed a null cgrp pointer, which is why it crashed.
> 
> If so, it is strange though. I call cgroup_path() only after
> grabbing a reference to the css object. (I am assuming that if I have a valid
> reference to the css object then css->cgrp can't be null).
> 

Yes, css->cgrp shouldn't be NULL. I suspect we hit a bug in cgroup here.
The code dealing with css refcnt and cgroup rmdir has changed quite a lot,
and is much more complex than it was.

> Anyway, can you please try out the following patch and see if it fixes your
> crash?
...
> BTW, I tried the following equivalent script and I can't see the crash on
> my system. Are you able to hit it regularly?
> 

I modified the script like this:

======================
#!/bin/sh
echo 1 > /proc/sys/vm/drop_caches
mkdir /cgroup 2> /dev/null
mount -t cgroup -o io,blkio io /cgroup
mkdir /cgroup/test1
mkdir /cgroup/test2
echo 100 > /cgroup/test1/io.weight
echo 500 > /cgroup/test2/io.weight

dd if=/dev/zero bs=4096 count=128000 of=500M.1 &
pid1=$!
echo $pid1 > /cgroup/test1/tasks

dd if=/dev/zero bs=4096 count=128000 of=500M.2 &
pid2=$!
echo $pid2 > /cgroup/test2/tasks

sleep 5
kill -9 $pid1
kill -9 $pid2

# Retry until both groups have been removed; count tracks successful rmdirs.
count=0
while [ "$count" -ne 2 ]
do
        rmdir /cgroup/test1 > /dev/null 2>&1
        if [ $? -eq 0 ]; then
                count=$(( count + 1 ))
        fi

        rmdir /cgroup/test2 > /dev/null 2>&1
        if [ $? -eq 0 ]; then
                count=$(( count + 1 ))
        fi
done

umount /cgroup
rmdir /cgroup
======================

I ran this script and got a lockdep BUG. Full log and my config are attached.
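
The removal loop in the script above retries rmdir until both groups are gone, so it spins forever if a group never becomes removable. A bounded variant could look like the sketch below. The name rmdir_retry, the attempt limit and the sleep interval are all hypothetical and were not part of the test that produced the log.

```shell
#!/bin/sh
# Hypothetical helper (not from the original script): retry rmdir a limited
# number of times instead of looping until it succeeds.
#   $1 = directory to remove
#   $2 = maximum number of attempts (default 50, an arbitrary choice)
#   $3 = seconds to sleep between attempts (default 1)
rmdir_retry() {
        dir=$1
        tries=${2:-50}
        delay=${3:-1}
        while [ "$tries" -gt 0 ]; do
                # Success: the directory is gone, stop retrying.
                if rmdir "$dir" 2>/dev/null; then
                        return 0
                fi
                tries=$(( tries - 1 ))
                sleep "$delay"
        done
        # Attempts exhausted: report failure to the caller.
        return 1
}
```

It would be called as `rmdir_retry /cgroup/test1 || echo "test1 still busy"`, which lets the script reach umount or bail out instead of hanging.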

Actually this can be triggered with the following steps on my box:
# mount -t cgroup -o blkio,io xxx /mnt
# mkdir /mnt/0
# echo $$ > /mnt/0/tasks
# echo 3 > /proc/sys/vm/drop_caches
# echo $$ > /mnt/tasks
# rmdir /mnt/0

And when I ran the script a second time, my box froze
and I had to reset it.
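
For anyone who wants to replay the short sequence of steps above, here is a minimal sketch that wraps them in a function. The names run, repro and DRY_RUN are hypothetical additions. With DRY_RUN=1 (the default) the commands are only printed, since actually running them needs root and may hang the machine. The sketch writes to /proc/sys/vm/drop_caches, the same file the longer script uses.

```shell
#!/bin/sh
# Hypothetical wrapper around the manual steps; run(), repro() and DRY_RUN
# are illustration-only names, not part of the original report.

# Print the command when DRY_RUN=1 (default), otherwise execute it.
run() {
        if [ "${DRY_RUN:-1}" = "1" ]; then
                printf '%s\n' "$*"
        else
                sh -c "$*"
        fi
}

# $1 = mount point for the cgroup hierarchy (default /mnt).
# $$ expands to the PID of the calling shell, so that shell is the task
# moved into the child group and back to the root group.
repro() {
        mnt=${1:-/mnt}
        run "mount -t cgroup -o blkio,io xxx $mnt" &&
        run "mkdir $mnt/0" &&
        run "echo $$ > $mnt/0/tasks" &&
        run "echo 3 > /proc/sys/vm/drop_caches" &&
        run "echo $$ > $mnt/tasks" &&
        run "rmdir $mnt/0"
}
```

Calling `repro /mnt` prints the six commands; `DRY_RUN=0; repro /mnt` as root would execute them, stopping at the first step that fails.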

> Instead of killing the tasks I also tried moving the tasks into root cgroup
> and then deleting test1 and test2 groups, that also did not produce any crash.
> (Hit a different bug though after 5-6 attempts :-)
> 
> As I mentioned in the patchset, currently we do have issues with group
> refcounting and cgroup/group going away. Hopefully in the next version they
> all should be fixed up. But still, it is nice to hear back...
> 

View attachment "myconfig" of type "text/plain" (64514 bytes)

View attachment "dmesg.txt" of type "text/plain" (90539 bytes)