Message-ID: <20130107230218.GB3411@BohrerMBP.rgmadvisors.com>
Date: Mon, 7 Jan 2013 17:02:18 -0600
From: Shawn Bohrer <sbohrer@...advisors.com>
To: linux-kernel@...r.kernel.org
Cc: mingo@...e.hu, peterz@...radead.org
Subject: Re: kernel BUG at kernel/sched_rt.c:493!

On Mon, Jan 07, 2013 at 11:58:18AM -0600, Shawn Bohrer wrote:
> On Sat, Jan 05, 2013 at 11:46:32AM -0600, Shawn Bohrer wrote:
> > I've tried reproducing the issue, but so far I've been unsuccessful
> > but I believe that is because my RT tasks aren't using enough CPU to
> > cause borrowing from the other runqueues. Normally our RT tasks use
> > very little CPU so I'm not entirely sure what conditions caused them
> > to run into throttling on the day that this happened.
>
> I've managed to reproduce this a couple of times now on 3.1.9. I'll give
> this a try later with a more recent kernel. Here is what I've done to
> reproduce the issue.
>
>
> # Setup in shell 1
> root@...box39:/cgroup/cpuset# mkdir package0
> root@...box39:/cgroup/cpuset# echo 0 > package0/cpuset.mems
> root@...box39:/cgroup/cpuset# echo 0,2,4,6 > package0/cpuset.cpus
> root@...box39:/cgroup/cpuset# cat cpuset.sched_load_balance
> 1
> root@...box39:/cgroup/cpuset# cat package0/cpuset.sched_load_balance
> 1
> root@...box39:/cgroup/cpuset# cat sysdefault/cpuset.sched_load_balance
> 1
> root@...box39:/cgroup/cpuset# echo 1,3,5,7 > sysdefault/cpuset.cpus
> root@...box39:/cgroup/cpuset# echo 0 > sysdefault/cpuset.mems
> root@...box39:/cgroup/cpuset# echo $$ > package0/tasks
>
> # Setup in shell 2
> root@...box39:~# cd /cgroup/cpuset/
> root@...box39:/cgroup/cpuset# chrt -f -p 60 $$
> root@...box39:/cgroup/cpuset# echo $$ > sysdefault/tasks
>
> # In shell 1
> root@...box39:/cgroup/cpuset# chrt -f 1 /root/burn.sh &
> root@...box39:/cgroup/cpuset# chrt -f 1 /root/burn.sh &
>
> # In shell 2
> root@...box39:/cgroup/cpuset# echo 0 > cpuset.sched_load_balance
> root@...box39:/cgroup/cpuset# echo 1 > cpuset.sched_load_balance
> root@...box39:/cgroup/cpuset# echo 0 > cpuset.sched_load_balance
> root@...box39:/cgroup/cpuset# echo 1 > cpuset.sched_load_balance
>
> I haven't found the exact magic combination but I've been going back
> and forth adding/killing burn.sh processes and toggling
> cpuset.sched_load_balance and in a couple of minutes I can usually get
> the machine to trigger the bug.
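
The manual toggling quoted above could be scripted along the following
lines. This is only a sketch: the function name toggle_load_balance, the
iteration count, and the delay are hypothetical and not part of the
original steps. The file path in the usage comment assumes the
/cgroup/cpuset mount point shown in the transcript.

```shell
#!/bin/sh
# Hypothetical helper: flip a cpuset's sched_load_balance flag back and
# forth. Arguments: path to the flag file, number of 0/1 cycles, and an
# optional delay in seconds between writes (default 1, an assumption).
toggle_load_balance() {
    lb_file=$1
    count=$2
    delay=${3:-1}
    i=0
    while [ "$i" -lt "$count" ]; do
        echo 0 > "$lb_file"   # disable load balancing
        sleep "$delay"
        echo 1 > "$lb_file"   # re-enable load balancing
        sleep "$delay"
        i=$((i + 1))
    done
}

# Example, as root, with burn.sh instances already running:
#   toggle_load_balance /cgroup/cpuset/cpuset.sched_load_balance 100
```

Adding and killing burn.sh processes between cycles is left out here,
since the exact combination that triggers the bug is not known.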

I've also managed to reproduce this on 3.8.0-rc2, so it appears the bug
is still present in the latest kernel.

Also, re-reading my instructions above, I should mention that
/root/burn.sh is just a simple busy loop:

    while true; do : ; done

I've also had to make the kworker threads SCHED_FIFO with a higher
priority than burn.sh; otherwise, as expected, I can lock up the system
because some xfs threads get starved.

Let me know if anyone needs any more information or wants me to try
anything, since it appears I can trigger this fairly easily now.

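
For anyone repeating the kworker adjustment mentioned above, one possible
way to do it is sketched here. The helper name fifo_cmds and the priority
value 2 are assumptions; the only requirement stated above is a priority
higher than burn.sh, which runs at SCHED_FIFO priority 1 in the
transcript. The helper only prints chrt commands, so they can be
inspected before being run.

```shell
#!/bin/sh
# Hypothetical helper: read PIDs on stdin and print one chrt command per
# PID that would switch it to SCHED_FIFO at the given priority.
fifo_cmds() {
    prio=$1
    while read -r pid; do
        echo "chrt -f -p $prio $pid"
    done
}

# Example, as root (priority 2 is an assumed value, just above burn.sh):
#   pgrep kworker | fifo_cmds 2 | sh
```

kworker threads are created on demand, so threads spawned after this
runs would presumably still be at their default policy.
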
--
Shawn