lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c419633d-f1d6-9c60-1c89-d8fefd24d4ff@gmail.com>
Date:   Sat, 26 May 2018 12:43:04 -0700
From:   John Fastabend <john.fastabend@...il.com>
To:     Song Liu <liu.song.a23@...il.com>,
        Song Liu <songliubraving@...com>, ast@...com
Cc:     netdev@...r.kernel.org, kernel-team@...com,
        "David S . Miller" <davem@...emloft.net>
Subject: Re: [PATCH net] net: sched: check netif_xmit_frozen_or_stopped() in
 sch_direct_xmit()

On 05/25/2018 12:46 PM, Song Liu wrote:
> On Fri, May 25, 2018 at 11:11 AM, Song Liu <songliubraving@...com> wrote:
>> Summary:
>>
>> At the end of sch_direct_xmit(), we are in the else path of
>> !dev_xmit_complete(ret), which means ret == NETDEV_TX_OK. The following
>> condition will always fail and netif_xmit_frozen_or_stopped() is not
>> checked at all.
>>
>>     if (ret && netif_xmit_frozen_or_stopped(txq))
>>          return false;
>>
>> In this patch, this condition is fixed as:
>>
>>     if (netif_xmit_frozen_or_stopped(txq))
>>          return false;
>>
>> and further simplifies the code as:
>>
>>     return !netif_xmit_frozen_or_stopped(txq);
>>
>> Fixes: 29b86cdac00a ("net: sched: remove remaining uses for qdisc_qlen in xmit path")
>> Cc: John Fastabend <john.fastabend@...il.com>
>> Cc: David S. Miller <davem@...emloft.net>
>> Signed-off-by: Song Liu <songliubraving@...com>
>> ---
>>  net/sched/sch_generic.c | 5 +----
>>  1 file changed, 1 insertion(+), 4 deletions(-)
>>
>> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
>> index 39c144b..8261d48 100644
>> --- a/net/sched/sch_generic.c
>> +++ b/net/sched/sch_generic.c
>> @@ -346,10 +346,7 @@ bool sch_direct_xmit(struct sk_buff *skb, struct Qdisc *q,
>>                 return false;
>>         }
>>
>> -       if (ret && netif_xmit_frozen_or_stopped(txq))
>> -               return false;
>> -
>> -       return true;
>> +       return !netif_xmit_frozen_or_stopped(txq);
>>  }
>>
>>  /*
>> --
>> 2.9.5
>>
> 
> Alexei and I discussed about this offline. We would like to share our
> discussion here to
> clarify the motivation.
> 
> Before 29b86cdac00a, ret in condition "if (ret &&
> netif_xmit_frozen_or_stopped()" is not
> the value from dev_hard_start_xmit(), because ret is overwritten by
> either qdisc_qlen()
> or dev_requeue_skb(). Therefore, 29b86cdac00a changed the behavior of
> this condition.
> 
> For ret from dev_hard_start_xmit(), I dig into the function and found
> it is from return value
> of ndo_start_xmit(). Per netdevice.h, ndo_start_xmit() should only
> return NETDEV_TX_OK
> or NETDEV_TX_BUSY. I survey many drivers, and they all follow the rule. The only
> exception is vlan.
> 
> Given ret could only be NETDEV_TX_OK or NETDEV_TX_BUSY (ignore vlan for now),
> if it fails condition "if (!dev_xmit_complete(ret))", ret must be
> NETDEV_TX_OK == 0. So
> netif_xmit_frozen_or_stopped() will always be bypassed.
> 
> It is probably OK to ignore netif_xmit_frozen_or_stopped(), and return true from
> sch_direct_xmit(), as I didn't see that break any functionality. But
> it is more like "correct
> by accident" to me. This is the motivation of my original patch.
> 
> Alexei pointed out that, the following condition is more like original logic:
> 
>       if (qdisc_qlen(q) && netif_xmit_frozen_or_stopped(txq))
>             return false;
> 
> However, I think John would like to remove qdisc_qlen() from the tx
> path. I didn't see

Yep qdisc_qlen() is not very friendly for lockless users. At
some point we will get around to writing a distributed rate
limiter qdisc and it will be nice to not have to work-around
qdisc_qlen().

> any issue without the extra qdisc_qlen() check, so the patch is
> probably good AS-IS.
> 
> Please share your comments and feedback on this.
> 

Thanks for the detailed analysis. The above patch looks OK
to me. Actually I'm debating if we should just drop the check.
But, there looks to be a case where drivers return NETDEV_TX_OK
and then stop the queue because it is nearly overrun. By putting
the check there we stop early instead of doing some extra work
before realizing the driver ring is full.

Still this overrun case should be rare so removing the check
should be OK. Plus as you note its not been running anyways. My
current recommendation is just remove the check altogether.

Thanks,
John 

> Thanks,
> Song
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ