Message-ID: <5f16511c-1cb7-e40f-e9aa-87ee97d5a266@redhat.com>
Date:   Tue, 27 Sep 2022 11:52:14 -0400
From:   Waiman Long <longman@...hat.com>
To:     Mukesh Ojha <quic_mojha@...cinc.com>,
        Peter Zijlstra <peterz@...radead.org>, mingo@...hat.com,
        will@...nel.org
Cc:     linux-kernel@...r.kernel.org, "<boqun.feng"@gmail.com
Subject: Re: locking/rwsem: RT throttling issue due to RT task hogging the cpu

On 9/27/22 11:30, Mukesh Ojha wrote:
> Hi Waiman,
>
> Thanks for the reply.
>
> On 9/27/2022 8:56 PM, Waiman Long wrote:
>> On 9/27/22 11:25, Waiman Long wrote:
>>>
>>> On 9/20/22 12:19, Mukesh Ojha wrote:
>>>> Hi,
>>>>
>>>> We are observing an issue where sem->owner is not set and 
>>>> sem->count=6 [1], which means both the RWSEM_FLAG_WAITERS and 
>>>> RWSEM_FLAG_HANDOFF bits are set. If we unfold sem->wait_list, 
>>>> we see the waiting processes in the order shown in [2]: [a] is 
>>>> waiting for write, [b] and [c] are waiting for read, and [d] is 
>>>> the RT task for which waiter.handoff_set=true. [d] is running 
>>>> continuously on cpu7 and never lets the first write waiter [a] 
>>>> run on cpu7.
>>>>
>>>> [1]
>>>>
>>>>   sem = 0xFFFFFFD57DDC6680 -> (
>>>>     count = (counter = 6),
>>>>     owner = (counter = 0),
>>>>
>>>> [2]
>>>>
>>>> [a] kworker/7:0 pid: 32516 ==> [b] iptables-restor pid: 18625 ==> 
>>>> [c] HwBinder:1544_3 pid: 2024 ==> [d] RenderEngine pid: 2032 cpu: 7 
>>>> prio: 97 (RT task)
>>>>
>>>>
>>>> Some time back, Waiman suggested the change below, which could help
>>>> the RT task leave the cpu.
>>>>
>>>> https://lore.kernel.org/all/8c33f989-8870-08c6-db12-521de634b34e@redhat.com/ 
>>>>
>>>>
>>> Sorry for the late reply. There is now an alternative way of dealing 
>>> with this RT task hogging issue with the commit 48dfb5d2560d 
>>> ("locking/rwsem: Disable preemption while trying for rwsem lock"). 
>>> Could you try it to see if it can address your problem?
>>
>> FYI, this commit is in the tip tree. It is not in the mainline yet.
>
>
> I am the one who posted that patch, so I am aware of it. In that issue 
> sem->count was 7, whereas here it is 6, and the current issue occurs 
> even with the fix 48dfb5d2560d ("locking/rwsem: Disable preemption 
> while trying for rwsem lock") applied.

Thanks for the quick reply. So it doesn't completely fix this RT hogging 
issue. It is harder than I thought. Will look further into this.

Cheers,
Longman
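
For readers comparing the two sem->count values mentioned in this thread (6 and 7), here is a minimal userspace sketch of how the low bits decode. The bit positions are the ones defined in kernel/locking/rwsem.c; the struct and function names are hypothetical and are not kernel APIs. The sketch also assumes the top read-fail bit is clear, so the reader field is only meaningful for small values like the ones discussed here.

```c
#include <assert.h>
#include <stdio.h>

/* Flag bits in the low byte of sem->count (kernel/locking/rwsem.c). */
#define RWSEM_WRITER_LOCKED (1UL << 0)
#define RWSEM_FLAG_WAITERS  (1UL << 1)
#define RWSEM_FLAG_HANDOFF  (1UL << 2)
#define RWSEM_READER_SHIFT  8   /* reader count starts at bit 8 */

/* Hypothetical container for the decoded fields (not a kernel type). */
struct rwsem_count_bits {
    int writer_locked;
    int waiters;
    int handoff;
    unsigned long readers;  /* assumes the top read-fail bit is clear */
};

/* Hypothetical helper: split a raw count value into its fields. */
struct rwsem_count_bits rwsem_count_decode(unsigned long count)
{
    struct rwsem_count_bits b;

    b.writer_locked = !!(count & RWSEM_WRITER_LOCKED);
    b.waiters       = !!(count & RWSEM_FLAG_WAITERS);
    b.handoff       = !!(count & RWSEM_FLAG_HANDOFF);
    b.readers       = count >> RWSEM_READER_SHIFT;
    return b;
}

/* Hypothetical helper: print the decoded fields on one line. */
void rwsem_count_print(unsigned long count)
{
    struct rwsem_count_bits b = rwsem_count_decode(count);

    assert(b.writer_locked == 0 || b.writer_locked == 1);
    printf("count=%lu writer_locked=%d waiters=%d handoff=%d readers=%lu\n",
           count, b.writer_locked, b.waiters, b.handoff, b.readers);
}
```

Calling rwsem_count_print(6) and rwsem_count_print(7) from a small main() shows the difference: with count=6 the writer-locked bit is clear while the waiters and handoff bits are set, which matches owner=0 in the dump above; with count=7 the writer-locked bit is set as well.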
