Date:   Fri, 2 Apr 2021 12:33:38 -0700
From:   Josh Hunt <johunt@...mai.com>
To:     Jiri Kosina <jikos@...nel.org>,
        John Fastabend <john.fastabend@...il.com>
Cc:     Cong Wang <xiyou.wangcong@...il.com>,
        Paolo Abeni <pabeni@...hat.com>,
        Kehuan Feng <kehuan.feng@...il.com>,
        Hillf Danton <hdanton@...a.com>,
        Jike Song <albcamus@...il.com>,
        Jonas Bonn <jonas.bonn@...rounds.com>,
        Michael Zhivich <mzhivich@...mai.com>,
        David Miller <davem@...emloft.net>,
        LKML <linux-kernel@...r.kernel.org>,
        Michal Kubecek <mkubecek@...e.cz>,
        Netdev <netdev@...r.kernel.org>
Subject: Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

On 4/2/21 12:25 PM, Jiri Kosina wrote:
> On Thu, 3 Sep 2020, John Fastabend wrote:
> 
>>>> At this point I fear we could consider reverting the NOLOCK stuff.
>>>> I personally would hate doing so, but it looks like NOLOCK benefits are
>>>> outweighed by its issues.
>>>
>>> I agree, NOLOCK brings more pains than gains. There are many race
>>> conditions hidden in generic qdisc layer, another one is enqueue vs.
>>> reset which is being discussed in another thread.
>>
>> Sure. Seems they crept in over time. I had some plans to write a
>> lockless HTB implementation. But with fq+EDT with BPF it seems that
>> it is no longer needed, we have a more generic/better solution.  So
>> I dropped it. Also most folks should really be using fq, fq_codel,
>> etc. by default anyways. Using pfifo_fast alone is not ideal IMO.
> 
> Half a year later, we still have the NOLOCK implementation
> present, and pfifo_fast still does set the TCQ_F_NOLOCK flag on itself.
> 
> And we've just been bitten by this very same race, which appears to be
> still unfixed, with a single packet being stuck in the pfifo_fast qdisc
> basically indefinitely due to the very race that this whole thread began
> with back in 2019.
> 
> Unless there are
> 
> 	(a) any nice ideas how to solve this in an elegant way without
> 	    (re-)introducing extra spinlock (Cong's fix) or
> 
> 	(b) any objections to revert as per the argumentation above
> 
> I'll be happy to send a revert of the whole NOLOCK implementation next
> week.
> 

Jiri

If you have a reproducer, can you try
https://lkml.org/lkml/2021/3/24/1485 ? If that doesn't work, I think your
suggestion of reverting NOLOCK makes sense. We've moved to using
fq as our default qdisc now because of this bug.

Josh
