netdev - Re: [PATCH net] net/sched: taprio: fix duration_to_length()

Message-ID: <20240527114314.jqqw7sqwayjsgoby@skbuf>
Date: Mon, 27 May 2024 14:43:14 +0300
From: Vladimir Oltean <vladimir.oltean@....com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: "David S . Miller" <davem@...emloft.net>,
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
	Jamal Hadi Salim <jhs@...atatu.com>,
	Cong Wang <xiyou.wangcong@...il.com>, Jiri Pirko <jiri@...nulli.us>,
	netdev@...r.kernel.org, eric.dumazet@...il.com,
	syzbot <syzkaller@...glegroups.com>,
	Vinicius Costa Gomes <vinicius.gomes@...el.com>
Subject: Re: [PATCH net] net/sched: taprio: fix duration_to_length()

On Mon, May 27, 2024 at 10:07:31AM +0200, Eric Dumazet wrote:
> On Fri, May 24, 2024 at 6:07 PM Vladimir Oltean <vladimir.oltean@....com> wrote:
> >
> > On Fri, May 24, 2024 at 05:52:17PM +0200, Eric Dumazet wrote:
> > > On Fri, May 24, 2024 at 5:50 PM Eric Dumazet <edumazet@...gle.com> wrote:
> > > >
> > > > On Fri, May 24, 2024 at 5:39 PM Vladimir Oltean <vladimir.oltean@....com> wrote:
> > > > >
> > > > > On Thu, May 23, 2024 at 01:45:49PM +0000, Eric Dumazet wrote:
> > > > > > duration_to_length() is incorrectly using div_u64()
> > > > > > instead of div64_u64().
> > > > > > ---
> > > > > >  net/sched/sch_taprio.c | 3 ++-
> > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
> > > > > > index 1ab17e8a72605385280fad9b7f656a6771236acc..827fb81fc63a098304bad198fadd4aed55d1fec4 100644
> > > > > > --- a/net/sched/sch_taprio.c
> > > > > > +++ b/net/sched/sch_taprio.c
> > > > > > @@ -256,7 +256,8 @@ static int length_to_duration(struct taprio_sched *q, int len)
> > > > > >
> > > > > >  static int duration_to_length(struct taprio_sched *q, u64 duration)
> > > > > >  {
> > > > > > -     return div_u64(duration * PSEC_PER_NSEC, atomic64_read(&q->picos_per_byte));
> > > > > > +     return div64_u64(duration * PSEC_PER_NSEC,
> > > > > > +                      atomic64_read(&q->picos_per_byte));
> > > > > >  }
> > > > >
> > > > > There's a netdev_dbg() in taprio_set_picos_per_byte(). Could you turn
> > > > > that on? I'm curious what was the q->picos_per_byte value that triggered
> > > > > the 64-bit division fault. There are a few weird things about
> > > > > q->picos_per_byte's representation and use as an atomic64_t (s64) type.
> > > >
> > > >
> > > > No repro yet.
> > > >
> > > > Anything with 32 low order bits cleared would trigger a divide by 0.
> > > >
> > > > (1ULL << 32) picoseconds is only 4.294 ms
> > >
> > > BTW, just a reminder, div_u64() is a divide by a 32bit value...
> > >
> > > static inline u64 div_u64(u64 dividend, u32 divisor)
> > > ...
> >
> > The thing is that I don't see how q->picos_per_byte could take any sane
> > value of either 0 or a multiple of 2^32. Its formula is "(USEC_PER_SEC * 8) / speed"
> > where "speed" is the link speed: 10, 100, 1000 etc. The special cases
> > of speed=0 and speed=SPEED_UNKNOWN are handled by falling back to SPEED_10
> > in the picos_per_byte calculation.
> >
> > For q->picos_per_byte to be larger than 2^32, "speed" would have to be
> > smaller than 8000000 / U32_MAX (0.001862645).
> >
> > For q->picos_per_byte to be exactly 0, "speed" would have to be larger
> > than 8000000. But the largest defined speed in include/uapi/linux/ethtool.h
> > is precisely SPEED_800000, leading to an expected q->picos_per_byte of 1.
> 
> This suggests q->picos_per_byte should be a mere u32, and that
> taprio_set_picos_per_byte() should make sure not to set 0 in
> q->picos_per_byte.

This is what I was hinting at, indeed. But we're getting farther away
from the problem, which is that syzbot _was_ somehow able to trigger a
division by zero, even though zero is not a valid value as far as I can see.
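
For reference, the mechanism Eric described earlier in the thread (a
64-bit divisor passed to a helper that takes a u32 divisor) can be
sketched in userspace C. This is only an illustration: divisor_as_u32()
and would_divide_by_zero() are hypothetical names, not kernel functions,
and the cast merely models the implicit conversion at the call site.

```c
#include <stdint.h>

/* Hypothetical userspace model, not kernel code: the divisor that a helper
 * declared as "u64 div_u64(u64 dividend, u32 divisor)" actually receives
 * when the caller passes a 64-bit value. Only the low 32 bits survive.
 */
static uint32_t divisor_as_u32(uint64_t divisor)
{
	return (uint32_t)divisor;
}

/* Nonzero when the truncated divisor is zero, i.e. when the division
 * would fault even though the 64-bit value itself may be nonzero.
 */
static int would_divide_by_zero(uint64_t divisor)
{
	return divisor_as_u32(divisor) == 0;
}
```

Under this model, 0 and any multiple of 2^32 are the only divisors that
fault; a value such as 800000 passes through unchanged.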

> Presumably some devices must get a speed bigger than SPEED_800000
> 
> team driver could do that, according to team_ethtool_get_link_ksettings()

I misspoke in the earlier email. SPEED_800000 is still one order of
magnitude lower than the maximum representable speed (picos_per_byte
should be 10 for it, not 1). So, we should still be good.
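
To make the arithmetic concrete, here is a minimal userspace sketch of
the formula quoted above, "(USEC_PER_SEC * 8) / speed", with the SPEED_10
fallback for speed=0 and speed=SPEED_UNKNOWN. The helper name
picos_per_byte_for_speed() is hypothetical; the macros are redefined
locally with the values I believe the kernel headers use, so treat them
as assumptions.

```c
#include <stdint.h>

/* Local copies of kernel constants (assumed values, for illustration). */
#define USEC_PER_SEC	1000000L
#define SPEED_10	10
#define SPEED_UNKNOWN	(-1)

/* Hypothetical userspace helper, not the kernel function.
 * "speed" is the link speed in Mbps, as reported through ethtool.
 */
static int64_t picos_per_byte_for_speed(int speed)
{
	/* Speed 0 and SPEED_UNKNOWN fall back to 10 Mbps. */
	if (speed == 0 || speed == SPEED_UNKNOWN)
		speed = SPEED_10;

	/* Integer division: the result only reaches 0 above 8000000 Mbps. */
	return (USEC_PER_SEC * 8) / speed;
}
```

With this sketch, 10 Mbps gives 800000, SPEED_800000 gives 10, a speed of
8000000 gives 1, and only speeds above 8000000 round down to 0.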

> diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
> index 1ab17e8a72605385280fad9b7f656a6771236acc..71087a53630362863cc6c5e462b29dbef8cd5d74 100644
> --- a/net/sched/sch_taprio.c
> +++ b/net/sched/sch_taprio.c
> @@ -89,9 +89,9 @@ struct taprio_sched {
>         bool offloaded;
>         bool detected_mqprio;
>         bool broken_mqprio;
> -       atomic64_t picos_per_byte; /* Using picoseconds because for 10Gbps+
> -                                   * speeds it's sub-nanoseconds per byte
> -                                   */
> +       atomic_t picos_per_byte; /* Using picoseconds because for 10Gbps+
> +                                 * speeds it's sub-nanoseconds per byte
> +                                 */
> 
>         /* Protects the update side of the RCU protected current_entry */
>         spinlock_t current_entry_lock;
> @@ -251,12 +251,12 @@ static ktime_t get_interval_end_time(struct sched_gate_list *sched,
> 
>  static int length_to_duration(struct taprio_sched *q, int len)
>  {
> -       return div_u64(len * atomic64_read(&q->picos_per_byte), PSEC_PER_NSEC);
> +       return div_u64((u64)len * atomic_read(&q->picos_per_byte), PSEC_PER_NSEC);
>  }
> 
>  static int duration_to_length(struct taprio_sched *q, u64 duration)
>  {
> -       return div_u64(duration * PSEC_PER_NSEC, atomic64_read(&q->picos_per_byte));
> +       return div_u64(duration * PSEC_PER_NSEC, atomic_read(&q->picos_per_byte));
>  }
> 
>  /* Sets sched->max_sdu[] and sched->max_frm_len[] to the minimum between the
> @@ -666,8 +666,8 @@ static void taprio_set_budgets(struct taprio_sched *q,
>                 if (entry->gate_duration[tc] == sched->cycle_time)
>                         budget = INT_MAX;
>                 else
> -                       budget = div64_u64((u64)entry->gate_duration[tc] * PSEC_PER_NSEC,
> -                                          atomic64_read(&q->picos_per_byte));
> +                       budget = div_u64((u64)entry->gate_duration[tc] * PSEC_PER_NSEC,
> +                                        atomic_read(&q->picos_per_byte));
> 
>                 atomic_set(&entry->budget[tc], budget);
>         }
> @@ -1291,7 +1291,7 @@ static void taprio_set_picos_per_byte(struct net_device *dev,
>  {
>         struct ethtool_link_ksettings ecmd;
>         int speed = SPEED_10;
> -       int picos_per_byte;
> +       u32 picos_per_byte;
>         int err;
> 
>         err = __ethtool_get_link_ksettings(dev, &ecmd);
> @@ -1303,11 +1303,11 @@ static void taprio_set_picos_per_byte(struct net_device *dev,
> 
>  skip:
>         picos_per_byte = (USEC_PER_SEC * 8) / speed;
> -
> -       atomic64_set(&q->picos_per_byte, picos_per_byte);
> -       netdev_dbg(dev, "taprio: set %s's picos_per_byte to: %lld, linkspeed: %d\n",
> -                  dev->name, (long long)atomic64_read(&q->picos_per_byte),
> -                  ecmd.base.speed);
> +       if (!picos_per_byte)
> +               picos_per_byte = 1U;
> +       atomic_set(&q->picos_per_byte, picos_per_byte);
> +       netdev_dbg(dev, "taprio: set %s's picos_per_byte to: %u, linkspeed: %d\n",
> +                  dev->name, picos_per_byte, ecmd.base.speed);
>  }

I would be cautious about making this change without knowing for certain
which picos_per_byte value (and associated speed) triggered the fault.
I'm hoping we're not masking some larger issue in how the speed is
retrieved or processed.