lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87lfgpeeas.fsf@nanos.tec.linutronix.de>
Date:   Fri, 02 Oct 2020 01:07:07 +0200
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Erez Geva <erez.geva.ext@...mens.com>,
        linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
        Cong Wang <xiyou.wangcong@...il.com>,
        "David S . Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Jamal Hadi Salim <jhs@...atatu.com>,
        Jiri Pirko <jiri@...nulli.us>, Andrei Vagin <avagin@...il.com>,
        Dmitry Safonov <0x7f454c46@...il.com>,
        "Eric W . Biederman" <ebiederm@...ssion.com>,
        Ingo Molnar <mingo@...nel.org>,
        John Stultz <john.stultz@...aro.org>,
        Michal Kubecek <mkubecek@...e.cz>,
        Oleg Nesterov <oleg@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Richard Cochran <richardcochran@...il.com>,
        Stephen Boyd <sboyd@...nel.org>,
        Vladis Dronov <vdronov@...hat.com>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Frederic Weisbecker <frederic@...nel.org>,
        Eric Dumazet <edumazet@...gle.com>
Cc:     Jesus Sanchez-Palencia <jesus.sanchez-palencia@...el.com>,
        Vinicius Costa Gomes <vinicius.gomes@...el.com>,
        Vedang Patel <vedang.patel@...el.com>,
        Simon Sudler <simon.sudler@...mens.com>,
        Andreas Meisinger <andreas.meisinger@...mens.com>,
        Andreas Bucher <andreas.bucher@...mens.com>,
        Henning Schild <henning.schild@...mens.com>,
        Jan Kiszka <jan.kiszka@...mens.com>,
        Andreas Zirkler <andreas.zirkler@...mens.com>,
        Ermin Sakic <ermin.sakic@...mens.com>,
        An Ninh Nguyen <anninh.nguyen@...mens.com>,
        Michael Saenger <michael.saenger@...mens.com>,
        Bernd Maehringer <bernd.maehringer@...mens.com>,
        Gisela Greinert <gisela.greinert@...mens.com>,
        Erez Geva <erez.geva.ext@...mens.com>,
        Erez Geva <ErezGeva2@...il.com>
Subject: Re: [PATCH 5/7] Traffic control using high-resolution timer issue

On Thu, Oct 01 2020 at 22:51, Erez Geva wrote:

Issue? You again fail to decribe the problem.

>   - Add new schedule function for Qdisc watchdog.
>     The function avoids reprogram if watchdog already expire
>     before new expire.

Why can't the existing function not be changed to do that?

>   - Use new schedule function in ETF.
>
>   - Add ETF range value to kernel configuration.
>     as the value is characteristic to Hardware.

No. That's completely wrong. Hardware properties need to be established
at boot/runtime otherwise you can't build a kernel which runs on
different platforms.

> +void qdisc_watchdog_schedule_soon_ns(struct qdisc_watchdog *wd, u64 expires,
> +				     u64 delta_ns)

schedule soon? That sounds like schedule it sooner than later, but I
don't care.

> +{
> +	if (test_bit(__QDISC_STATE_DEACTIVATED,
> +		     &qdisc_root_sleeping(wd->qdisc)->state))
> +		return;
> +
> +	if (wd->last_expires == expires)
> +		return;

How is this supposed to be valid without checking whether the timer is
queued in the first place?

Maybe the function name should be schedule_soon_or_not()

> +	/**

Can you please use proper comment style? This is neither network comment
style nor the regular sane kernel comment style. It's kerneldoc comment
style which is reserved for function and struct documentation.

> +	 * If expires is in [0, now + delta_ns],
> +	 * do not program it.

This range info is just confusing. Just use plain english.

> +	 */
> +	if (expires <= ktime_to_ns(hrtimer_cb_get_time(&wd->timer)) + delta_ns)
> +		return;

So if the watchdog is NOT queued and expiry < now + delta_ns then this
returns and nothing starts the timer ever.

That might all be correct, but without any useful explanation it looks
completely bogus.

> +	/**
> +	 * If timer is already set in [0, expires + delta_ns],
> +	 * do not reprogram it.
> +	 */
> +	if (hrtimer_is_queued(&wd->timer) &&
> +	    wd->last_expires <= expires + delta_ns)
> +		return;
> +
> +	wd->last_expires = expires;
> +	hrtimer_start_range_ns(&wd->timer,
> +			       ns_to_ktime(expires),
> +			       delta_ns,
> +			       HRTIMER_MODE_ABS_PINNED);
> +}
> +EXPORT_SYMBOL(qdisc_watchdog_schedule_soon_ns);
> +
>  void qdisc_watchdog_cancel(struct qdisc_watchdog *wd)
>  {
>  	hrtimer_cancel(&wd->timer);
> diff --git a/net/sched/sch_etf.c b/net/sched/sch_etf.c
> index c48f91075b5c..48b2868c4672 100644
> --- a/net/sched/sch_etf.c
> +++ b/net/sched/sch_etf.c
> @@ -20,6 +20,11 @@
>  #include <net/pkt_sched.h>
>  #include <net/sock.h>
>  
> +#ifdef CONFIG_NET_SCH_ETF_TIMER_RANGE
> +#define NET_SCH_ETF_TIMER_RANGE CONFIG_NET_SCH_ETF_TIMER_RANGE
> +#else
> +#define NET_SCH_ETF_TIMER_RANGE (5 * NSEC_PER_USEC)
> +#endif
>  #define DEADLINE_MODE_IS_ON(x) ((x)->flags & TC_ETF_DEADLINE_MODE_ON)
>  #define OFFLOAD_IS_ON(x) ((x)->flags & TC_ETF_OFFLOAD_ON)
>  #define SKIP_SOCK_CHECK_IS_SET(x) ((x)->flags & TC_ETF_SKIP_SOCK_CHECK)
> @@ -128,8 +133,9 @@ static void reset_watchdog(struct Qdisc *sch)
>  		return;
>  	}
>  
> -	next = ktime_sub_ns(skb->tstamp, q->delta);
> -	qdisc_watchdog_schedule_ns(&q->watchdog, ktime_to_ns(next));
> +	next = ktime_sub_ns(skb->tstamp, q->delta + NET_SCH_ETF_TIMER_RANGE);
> +	qdisc_watchdog_schedule_soon_ns(&q->watchdog, ktime_to_ns(next),
> +					NET_SCH_ETF_TIMER_RANGE);

This is changing 5 things at once. That's just wrong.

patch 1: Add the new function and explain why it's required

patch 2: Make reset_watchdog() use it

patch 3: Add a mechanism to retrieve the magic hardware range from the
         underlying hardware driver and add that value to q->delta or
         set q->delta to it. Whatever makes sense. The current solution
         makes no sense at all

Thanks,

        tglx

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ