Message-ID: <9b69e534-73a6-4ac7-af92-808c985f82a1@oracle.com>
Date: Fri, 8 Dec 2023 11:28:57 -0600
From: Mike Christie <michael.christie@...cle.com>
To: Tobias Huschle <huschle@...ux.ibm.com>,
"Michael S. Tsirkin" <mst@...hat.com>
Cc: Abel Wu <wuyun.abel@...edance.com>,
Peter Zijlstra <peterz@...radead.org>,
Linux Kernel <linux-kernel@...r.kernel.org>,
kvm@...r.kernel.org, virtualization@...ts.linux.dev,
netdev@...r.kernel.org, jasowang@...hat.com
Subject: Re: EEVDF/vhost regression (bisected to 86bfbb7ce4f6 sched/fair: Add
lag based placement)
On 12/8/23 3:24 AM, Tobias Huschle wrote:
> On Thu, Dec 07, 2023 at 01:48:40AM -0500, Michael S. Tsirkin wrote:
>> On Thu, Dec 07, 2023 at 07:22:12AM +0100, Tobias Huschle wrote:
>>> 3. vhost looping endlessly, waiting for kworker to be scheduled
>>>
>>> I dug a little deeper into what the vhost is doing. I'm not an expert on
>>> virtio whatsoever, so these are just educated guesses that maybe
>>> someone can verify/correct. Please bear with me probably messing up
>>> the terminology.
>>>
>>> - vhost is looping through available queues.
>>> - vhost wants to wake up a kworker to process a found queue.
>>> - kworker does something with that queue and terminates quickly.
>>>
>>> What I found by throwing in some very noisy trace statements was that,
>>> if the kworker is not woken up, the vhost just keeps looping across
>>> all available queues (and seems to repeat itself). So it essentially
>>> relies on the scheduler to schedule the kworker fast enough. Otherwise
>>> it will just keep on looping until it is migrated off the CPU.
>>
>>
>> Normally it takes the buffers off the queue and is done with it.
>> I am guessing that at the same time the guest is running on some other
>> CPU and keeps adding available buffers?
>>
>
> It seems to do just that; there are multiple other vhost instances
> involved which might keep filling up those queues.
>
> Unfortunately, this makes the problematic vhost instance stay on
> the CPU and prevents said kworker from getting scheduled. The kworker is
> explicitly woken up by vhost, so vhost wants it to do something.
>
> At this point it seems that there is an assumption about the scheduler
> in place which is no longer fulfilled by EEVDF. From the discussion so
> far, it seems like EEVDF does what it is intended to do.
>
> Shouldn't there be a more explicit mechanism in use that allows the
> kworker to be scheduled in favor of the vhost?
>
> It is also concerning that the vhost seemingly cannot be preempted by the
> scheduler while executing that loop.
>
Hey,
I recently noticed this change:
commit 05bfb338fa8dd40b008ce443e397fc374f6bd107
Author: Josh Poimboeuf <jpoimboe@...nel.org>
Date:   Fri Feb 24 08:50:01 2023 -0800

    vhost: Fix livepatch timeouts in vhost_worker()
We used to do:
while (1)
        for each vhost work item in list
                execute work item
                if (need_resched())
                        schedule();
and after that patch we do:
while (1)
        for each vhost work item in list
                execute work item
                cond_resched()
Would the need_resched check we used to have give you what
you wanted?
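
In case it helps map those lines back to code, here is a rough C sketch
of the queueing side and the two worker-loop variants as I understand
them. This is simplified and approximate, not the verbatim
drivers/vhost/vhost.c; names like dev->work_list and dev->worker differ
a bit between kernel versions, and kthread setup, kcov and flag/locking
details are omitted:

        /* Queueing side (sketch): vhost hands a work item to its worker
         * thread and relies on the scheduler to run that thread soon. */
        void vhost_work_queue(struct vhost_dev *dev, struct vhost_work *work)
        {
                if (!test_and_set_bit(VHOST_WORK_QUEUED, &work->flags)) {
                        llist_add(&work->node, &dev->work_list);
                        wake_up_process(dev->worker);   /* wake the worker task */
                }
        }

        /* Worker loop before 05bfb338fa8d (sketch): explicit
         * need_resched()/schedule() after each work item. */
        for (;;) {
                node = llist_del_all(&dev->work_list);  /* grab pending work */
                llist_for_each_entry_safe(work, work_next, node, node) {
                        work->fn(work);                 /* execute work item */
                        if (need_resched())
                                schedule();             /* explicit reschedule point */
                }
        }

        /* Worker loop after 05bfb338fa8d (sketch): cond_resched() instead. */
        for (;;) {
                node = llist_del_all(&dev->work_list);
                llist_for_each_entry_safe(work, work_next, node, node) {
                        work->fn(work);
                        cond_resched();                 /* may reschedule here if needed */
                }
        }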