Message-ID: <CAKfTPtCAZ7r+Wra4mogLd+=GVo_71dtUbpPieRyoCU3dHXQa6g@mail.gmail.com>
Date:   Wed, 4 Mar 2020 18:51:42 +0100
From:   Vincent Guittot <vincent.guittot@...aro.org>
To:     Christian Borntraeger <borntraeger@...ibm.com>
Cc:     Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: 5.6-rc3: WARNING: CPU: 48 PID: 17435 at kernel/sched/fair.c:380 enqueue_task_fair+0x328/0x440

On Wed, 4 Mar 2020 at 18:42, Christian Borntraeger
<borntraeger@...ibm.com> wrote:
>
>
>
> On 04.03.20 16:26, Vincent Guittot wrote:
> > On Tue, 3 Mar 2020 at 08:55, Vincent Guittot <vincent.guittot@...aro.org> wrote:
> >>
> >> On Tue, 3 Mar 2020 at 08:37, Christian Borntraeger
> >> <borntraeger@...ibm.com> wrote:
> >>>
> >>>
> >>>
> > [...]
> >>>>>> ---
> >>>>>>  kernel/sched/fair.c | 2 +-
> >>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>>>
> >>>>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> >>>>>> index 3c8a379c357e..beb773c23e7d 100644
> >>>>>> --- a/kernel/sched/fair.c
> >>>>>> +++ b/kernel/sched/fair.c
> >>>>>> @@ -4035,8 +4035,8 @@ enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
> >>>>>>             __enqueue_entity(cfs_rq, se);
> >>>>>>     se->on_rq = 1;
> >>>>>>
> >>>>>> +   list_add_leaf_cfs_rq(cfs_rq);
> >>>>>>     if (cfs_rq->nr_running == 1) {
> >>>>>> -           list_add_leaf_cfs_rq(cfs_rq);
> >>>>>>             check_enqueue_throttle(cfs_rq);
> >>>>>>     }
> >>>>>>  }
> >>>>>
> >>>>> Now running for 3 hours. I have not seen the issue yet. I can tell tomorrow if this fixes
> >>>>> the issue.
> >>>>
> >>>>
> >>>> Still running fine. I can tell for sure tomorrow, but I have the impression that this makes the
> >>>> WARN_ON go away.
> >>>
> >>> So I guess this change "fixed" the issue. If you want me to test additional patches, let me know.
> >>
> >> Thanks for the test. For now, I don't have any other patch to test. I
> >> have to look more deeply into how this situation happens.
> >> I will let you know if I have another patch to test.
> >
> > So I haven't been able to figure out how we reach this situation yet.
> > In the meantime I'm going to make a clean patch with the fix above.
> >
> > Is it OK if I add a Reported-by and a Tested-by from you?
>
> Sure.
> I just realized that this system has something special. Some months ago I created 2 slices:
> $ head /etc/systemd/system/*.slice
> ==> /etc/systemd/system/machine-production.slice <==
> [Unit]
> Description=VM production
> Before=slices.target
> Wants=machine.slice
> [Slice]
> CPUQuota=2000%
> CPUWeight=1000
>
> ==> /etc/systemd/system/machine-test.slice <==
> [Unit]
> Description=VM production
> Before=slices.target
> Wants=machine.slice
> [Slice]
> CPUQuota=300%
> CPUWeight=100
>
>
> And the guests are then put into these slices. That also means that this test will never use more than
> 2300% of CPU, no matter how many CPUs the system has.

Thanks for the information. I will try to see how this could impact the enqueue.
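
For reference, a rough sketch of the arithmetic behind the 2300% figure quoted above. It assumes that systemd translates CPUQuota= into a CFS bandwidth quota using its default period of 100 ms (CPUQuotaPeriodSec= is not set in the units shown) and that the two slices are throttled independently, so their caps simply add up. The helper names are hypothetical and exist only for illustration; this is not kernel or systemd code.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: default bandwidth period of 100 ms, in microseconds. */
#define CFS_PERIOD_US 100000ULL

/*
 * Convert a CPUQuota= percentage (100 == one full CPU) into the
 * runtime, in microseconds, that the group may consume per period.
 */
uint64_t quota_us_from_percent(uint64_t percent)
{
	return percent * CFS_PERIOD_US / 100;
}

/*
 * Sum the CPUQuota= percentages of independently throttled slices.
 * The result is an upper bound on their combined CPU usage.
 */
uint64_t combined_quota_percent(const uint64_t *percents, size_t n)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += percents[i];
	return sum;
}
```

With the values from the two slices, quota_us_from_percent(2000) gives 2000000 us and quota_us_from_percent(300) gives 300000 us per 100 ms period, and the combined cap is 2300%, i.e. at most 23 CPUs' worth of runtime regardless of how many CPUs are online.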

>
