Message-ID: <1d5fc498-c783-4857-b8e5-851e00561898@candelatech.com>
Date:   Thu, 30 Sep 2021 09:44:34 -0700
From:   Ben Greear <greearb@...delatech.com>
To:     Eric Dumazet <eric.dumazet@...il.com>,
        netdev <netdev@...r.kernel.org>
Subject: Re: 5.15-rc3+ crash in fq-codel?

On 9/29/21 6:36 PM, Ben Greear wrote:
> On 9/29/21 5:40 PM, Eric Dumazet wrote:
>>
>>
>> On 9/29/21 5:29 PM, Eric Dumazet wrote:
>>>
>>>
>>> On 9/29/21 5:04 PM, Ben Greear wrote:
>>>> On 9/29/21 4:48 PM, Ben Greear wrote:
>>>>> On 9/29/21 4:42 PM, Eric Dumazet wrote:
>>>>>>
>>>>>>
>>>>>> On 9/29/21 4:28 PM, Eric Dumazet wrote:
>>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Actually the bug seems to be in pktgen, vs NET_XMIT_CN
>>>>>>>
>>>>>>> You probably would hit the same issues with other qdisc also using NET_XMIT_CN
>>>>>>>
>>>>>>
>>>>>> I would try the following patch :
>>>>>>
>>>>>> diff --git a/net/core/pktgen.c b/net/core/pktgen.c
>>>>>> index a3d74e2704c42e3bec1aa502b911c1b952a56cf1..0a2d9534f8d08d1da5dfc68c631f3a07f95c6f77 100644
>>>>>> --- a/net/core/pktgen.c
>>>>>> +++ b/net/core/pktgen.c
>>>>>> @@ -3567,6 +3567,7 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
>>>>>>           case NET_XMIT_DROP:
>>>>>>           case NET_XMIT_CN:
>>>>>>                   /* skb has been consumed */
>>>>>> +               pkt_dev->last_ok = 1;
>>>>>>                   pkt_dev->errors++;
>>>>>>                   break;
>>>>>>           default: /* Drivers are not supposed to return other values! */
>>>>
>>>> While patching my variant of pktgen, I took a look at the 'default' case.  I think
>>>> it should probably go above NET_XMIT_DROP and fallthrough into the consumed pkt path?
>>>>
>>>> Although, probably not a big deal since only bugs elsewhere would hit that path, and
>>>> we don't really know if skb would be consumed in that case or not.
>>>>
>>>
>>> This is probably dead code after commit
>>>
>>> commit f466dba1832f05006cf6caa9be41fb98d11cb848    pktgen: ndo_start_xmit can return NET_XMIT_xxx values
>>>
>>> So this does not really matter anymore.
>>>
>>>
>>
>> Alternative would be the following patch.
>> NET_XMIT_CN means the packet has been queued for transmit,
>> but that we might have dropped prior packets.
>>
>> Probably not a big deal to make the difference in pktgen.
>>
>> diff --git a/net/core/pktgen.c b/net/core/pktgen.c
>> index a3d74e2704c42e3bec1aa502b911c1b952a56cf1..5c612cbc74c790f64aff5ce602843378284c7119 100644
>> --- a/net/core/pktgen.c
>> +++ b/net/core/pktgen.c
>> @@ -3557,6 +3557,7 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
>>          switch (ret) {
>>          case NETDEV_TX_OK:
>> +       case NET_XMIT_CN:
>>                  pkt_dev->last_ok = 1;
>>                  pkt_dev->sofar++;
>>                  pkt_dev->seq_num++;
>> @@ -3565,8 +3566,8 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
>>                          goto xmit_more;
>>                  break;
>>          case NET_XMIT_DROP:
>> -       case NET_XMIT_CN:
>>                  /* skb has been consumed */
>> +               pkt_dev->last_ok = 1;
>>                  pkt_dev->errors++;
>>                  break;
>>          default: /* Drivers are not supposed to return other values! */
>>
> 
> Yes, I like that the XMIT_CN then means to increment the seq_num, though for my own purposes,
> I wouldn't want to increment the sofar++ in that case (and maybe not do other logic in that case),
> since we know at least something dropped.
> 
> For fq-codel, seems that XMIT_CN could mean that the attempted packet actually was queued
> for xmit, but at least some other packets were purged.
> 
> Thanks,
> Ben
> 

This does fix the crash for me (my patch in my tree is slightly different, but same idea).

Thanks,
Ben
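
For readers following along, here is a minimal userspace sketch of the accounting proposed in the second quoted patch. It is not the code from either tree. The struct model_dev and the function model_account() are hypothetical stand-ins for struct pktgen_dev and the switch in pktgen_xmit(). The numeric return codes are what I believe include/linux/netdevice.h defines and should be treated as assumptions. The idea modelled is that NET_XMIT_CN counts as a successful send, while NET_XMIT_DROP counts as an error but still sets last_ok, presumably so that pktgen does not try to resend an skb that has already been consumed.

```c
/* Return codes as I understand them from include/linux/netdevice.h
 * (assumed values, reproduced here only so the sketch compiles alone). */
enum {
	NETDEV_TX_OK   = 0x00,
	NET_XMIT_DROP  = 0x01,
	NET_XMIT_CN    = 0x02,
	NETDEV_TX_BUSY = 0x10,
};

/* Hypothetical stand-in for the few struct pktgen_dev fields
 * that the quoted diff touches. */
struct model_dev {
	int last_ok;            /* 1: last skb is gone, do not retry it */
	unsigned long sofar;    /* packets counted as sent */
	unsigned long seq_num;
	unsigned long errors;
};

/* Models only the per-return-code accounting of the alternative patch. */
static void model_account(struct model_dev *d, int ret)
{
	switch (ret) {
	case NETDEV_TX_OK:
	case NET_XMIT_CN:
		/* CN: this packet was queued, earlier ones may have dropped */
		d->last_ok = 1;
		d->sofar++;
		d->seq_num++;
		break;
	case NET_XMIT_DROP:
		/* skb has been consumed: count the error, do not retry */
		d->last_ok = 1;
		d->errors++;
		break;
	default:
		/* simplified model of the busy path: retry the same skb */
		d->last_ok = 0;
		break;
	}
}
```

Feeding this model NETDEV_TX_OK, NET_XMIT_CN and NET_XMIT_DROP in that order leaves sofar == 2, seq_num == 2, errors == 1 and last_ok == 1. A variant that does not bump sofar on NET_XMIT_CN, as discussed in the quoted reply, would split the first two cases.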
