[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZXLgwLehNbaHy3yb@DESKTOP-2CCOB1S.>
Date: Fri, 8 Dec 2023 10:24:16 +0100
From: Tobias Huschle <huschle@...ux.ibm.com>
To: "Michael S. Tsirkin" <mst@...hat.com>
Cc: Abel Wu <wuyun.abel@...edance.com>, Peter Zijlstra <peterz@...radead.org>,
Linux Kernel <linux-kernel@...r.kernel.org>, kvm@...r.kernel.org,
virtualization@...ts.linux.dev, netdev@...r.kernel.org,
jasowang@...hat.com
Subject: Re: Re: Re: EEVDF/vhost regression (bisected to 86bfbb7ce4f6
sched/fair: Add lag based placement)
On Thu, Dec 07, 2023 at 01:48:40AM -0500, Michael S. Tsirkin wrote:
> On Thu, Dec 07, 2023 at 07:22:12AM +0100, Tobias Huschle wrote:
> > 3. vhost looping endlessly, waiting for kworker to be scheduled
> >
> > I dug a little deeper on what the vhost is doing. I'm not an expert on
> > virtio whatsoever, so these are just educated guesses that maybe
> > someone can verify/correct. Please bear with me probably messing up
> > the terminology.
> >
> > - vhost is looping through available queues.
> > - vhost wants to wake up a kworker to process a found queue.
> > - kworker does something with that queue and terminates quickly.
> >
> > What I found by throwing in some very noisy trace statements was that,
> > if the kworker is not woken up, the vhost just keeps looping accross
> > all available queues (and seems to repeat itself). So it essentially
> > relies on the scheduler to schedule the kworker fast enough. Otherwise
> > it will just keep on looping until it is migrated off the CPU.
>
>
> Normally it takes the buffers off the queue and is done with it.
> I am guessing that at the same time guest is running on some other
> CPU and keeps adding available buffers?
>
It seems to do just that, there are multiple other vhost instances
involved which might keep filling up thoses queues.
Unfortunately, this makes the problematic vhost instance to stay on
the CPU and prevents said kworker to get scheduled. The kworker is
explicitly woken up by vhost, so it wants it to do something.
At this point it seems that there is an assumption about the scheduler
in place which is no longer fulfilled by EEVDF. From the discussion so
far, it seems like EEVDF does what is intended to do.
Shouldn't there be a more explicit mechanism in use that allows the
kworker to be scheduled in favor of the vhost?
It is also concerning that the vhost seems cannot be preempted by the
scheduler while executing that loop.
Powered by blists - more mailing lists