Message-ID: <20160204183259.GF29586@e106622-lin>
Date: Thu, 4 Feb 2016 18:32:59 +0000
From: Juri Lelli <juri.lelli@....com>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...nel.org>,
LKML <linux-kernel@...r.kernel.org>,
Clark Williams <williams@...hat.com>,
John Kacur <jkacur@...hat.com>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
Juri Lelli <juri.lelli@...il.com>
Subject: Re: [BUG] Corrupted SCHED_DEADLINE bandwidth with cpusets
On 04/02/16 12:31, Steven Rostedt wrote:
> On Thu, 4 Feb 2016 16:30:49 +0000
> Juri Lelli <juri.lelli@....com> wrote:
>
> > I've actually changed this approach a bit, and things seem better here.
> > Could you please give this a try? (You can also fetch the same branch).
>
> It appears to fix the one issue I pointed out, but it doesn't fix the
> issue with cpusets.
>
> # burn&
> # TASK=$!
> # schedtool -E -t 2000000:20000000 $TASK
> # grep dl /proc/sched_debug
> dl_rq[0]:
> .dl_nr_running : 0
> .dl_bw->bw : 996147
> .dl_bw->total_bw : 104857
> dl_rq[1]:
> .dl_nr_running : 0
> .dl_bw->bw : 996147
> .dl_bw->total_bw : 104857
> dl_rq[2]:
> .dl_nr_running : 0
> .dl_bw->bw : 996147
> .dl_bw->total_bw : 104857
> dl_rq[3]:
> .dl_nr_running : 0
> .dl_bw->bw : 996147
> .dl_bw->total_bw : 104857
> dl_rq[4]:
> .dl_nr_running : 0
> .dl_bw->bw : 996147
> .dl_bw->total_bw : 104857
> dl_rq[5]:
> .dl_nr_running : 0
> .dl_bw->bw : 996147
> .dl_bw->total_bw : 104857
> dl_rq[6]:
> .dl_nr_running : 0
> .dl_bw->bw : 996147
> .dl_bw->total_bw : 104857
> dl_rq[7]:
> .dl_nr_running : 0
> .dl_bw->bw : 996147
> .dl_bw->total_bw : 104857
>
> # mkdir /sys/fs/cgroup/cpuset/my_cpuset
> # echo 1 > /sys/fs/cgroup/cpuset/my_cpuset/cpuset.cpus
> # grep dl /proc/sched_debug
> dl_rq[0]:
> .dl_nr_running : 0
> .dl_bw->bw : 996147
> .dl_bw->total_bw : 209714
> dl_rq[1]:
> .dl_nr_running : 0
> .dl_bw->bw : 996147
> .dl_bw->total_bw : 209714
> dl_rq[2]:
> .dl_nr_running : 0
> .dl_bw->bw : 996147
> .dl_bw->total_bw : 209714
> dl_rq[3]:
> .dl_nr_running : 0
> .dl_bw->bw : 996147
> .dl_bw->total_bw : 209714
> dl_rq[4]:
> .dl_nr_running : 0
> .dl_bw->bw : 996147
> .dl_bw->total_bw : 209714
> dl_rq[5]:
> .dl_nr_running : 0
> .dl_bw->bw : 996147
> .dl_bw->total_bw : 209714
> dl_rq[6]:
> .dl_nr_running : 0
> .dl_bw->bw : 996147
> .dl_bw->total_bw : 209714
> dl_rq[7]:
> .dl_nr_running : 0
> .dl_bw->bw : 996147
> .dl_bw->total_bw : 209714
>
> It appears to add double the bandwidth.
>
Mmm.. IIUC that's because we don't destroy any root_domain in this case,
since sched_load_balance of the parent is still set, so we end up adding
the task's bandwidth to the existing root_domain a second time. I could
fix that with some flag indicating when we actually destroy
root_domain(s), but I fear it would make this solution even uglier than
it already is :/. More thinking required.
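
FWIW, the numbers above are just the usual <<20 fixed-point utilizations,
which makes the double accounting easy to spot. A quick user-space sketch
(to_ratio() here only mirrors the scheduler's runtime << 20 / period
conversion, and the 996147 limit assumes the default
sched_rt_runtime_us/sched_rt_period_us of 950000/1000000):

#include <stdio.h>
#include <stdint.h>

/* Same fixed-point conversion the scheduler uses: (runtime << 20) / period. */
static uint64_t to_ratio(uint64_t period, uint64_t runtime)
{
	return (runtime << 20) / period;
}

int main(void)
{
	/* schedtool -E -t 2000000:20000000 -> runtime/period = 1/10 */
	uint64_t task_bw = to_ratio(20000000ULL, 2000000ULL);
	/* default admission limit: 950000 out of every 1000000 */
	uint64_t limit = to_ratio(1000000ULL, 950000ULL);

	printf("task bw      : %llu\n", (unsigned long long)task_bw);       /* 104857 */
	printf("dl_bw->bw    : %llu\n", (unsigned long long)limit);         /* 996147 */
	printf("accounted 2x : %llu\n", (unsigned long long)(2 * task_bw)); /* 209714 */

	return 0;
}

So the task contributes 104857, and after the cpuset is created every
runqueue reports exactly 2 * 104857 = 209714, i.e. the same bandwidth
accounted twice against the same root_domain.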
Thanks for testing.
Best,
- Juri