Date:   Wed, 9 Dec 2020 20:18:42 +0000
From:   "Geva, Erez" <erez.geva.ext@...mens.com>
To:     Willem de Bruijn <willemdebruijn.kernel@...il.com>
CC:     Network Development <netdev@...r.kernel.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        "linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>,
        Alexey Kuznetsov <kuznet@....inr.ac.ru>,
        Arnd Bergmann <arnd@...db.de>,
        Cong Wang <xiyou.wangcong@...il.com>,
        "David S . Miller" <davem@...emloft.net>,
        Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
        Jakub Kicinski <kuba@...nel.org>,
        Jamal Hadi Salim <jhs@...atatu.com>,
        Jiri Pirko <jiri@...nulli.us>,
        Alexei Starovoitov <ast@...nel.org>,
        Colin Ian King <colin.king@...onical.com>,
        Daniel Borkmann <daniel@...earbox.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Eyal Birger <eyal.birger@...il.com>,
        "Gustavo A . R . Silva" <gustavoars@...nel.org>,
        Jakub Sitnicki <jakub@...udflare.com>,
        John Ogness <john.ogness@...utronix.de>,
        Jon Rosen <jrosen@...co.com>,
        Kees Cook <keescook@...omium.org>,
        Marc Kleine-Budde <mkl@...gutronix.de>,
        Martin KaFai Lau <kafai@...com>,
        Matthieu Baerts <matthieu.baerts@...sares.net>,
        Andrei Vagin <avagin@...il.com>,
        Dmitry Safonov <0x7f454c46@...il.com>,
        "Eric W . Biederman" <ebiederm@...ssion.com>,
        Ingo Molnar <mingo@...nel.org>,
        John Stultz <john.stultz@...aro.org>,
        Miaohe Lin <linmiaohe@...wei.com>,
        Michal Kubecek <mkubecek@...e.cz>,
        Or Cohen <orcohen@...oaltonetworks.com>,
        Oleg Nesterov <oleg@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Richard Cochran <richardcochran@...il.com>,
        Stefan Schmidt <stefan@...enfreihafen.org>,
        Xie He <xie.he.0141@...il.com>,
        Stephen Boyd <sboyd@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Vladis Dronov <vdronov@...hat.com>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Frederic Weisbecker <frederic@...nel.org>,
        Vinicius Costa Gomes <vinicius.gomes@...el.com>,
        Vedang Patel <vedang.patel@...el.com>,
        "Molzahn, Ines" <ines.molzahn@...mens.com>,
        "Sudler, Simon" <simon.sudler@...mens.com>,
        "Meisinger, Andreas" <andreas.meisinger@...mens.com>,
        "Bucher, Andreas" <andreas.bucher@...mens.com>,
        "henning.schild@...mens.com" <henning.schild@...mens.com>,
        "jan.kiszka@...mens.com" <jan.kiszka@...mens.com>,
        "Zirkler, Andreas" <andreas.zirkler@...mens.com>,
        "Sakic, Ermin" <ermin.sakic@...mens.com>,
        "anninh.nguyen@...mens.com" <anninh.nguyen@...mens.com>,
        "Saenger, Michael" <michael.saenger@...mens.com>,
        "Maehringer, Bernd" <bernd.maehringer@...mens.com>,
        "gisela.greinert@...mens.com" <gisela.greinert@...mens.com>,
        Erez Geva <ErezGeva2@...il.com>
Subject: Re: [PATCH 1/3] Add TX sending hardware timestamp.


On 09/12/2020 18:37, Willem de Bruijn wrote:
> On Wed, Dec 9, 2020 at 10:25 AM Geva, Erez <erez.geva.ext@...mens.com> wrote:
>>
>>
>> On 09/12/2020 15:48, Willem de Bruijn wrote:
>>> On Wed, Dec 9, 2020 at 9:37 AM Erez Geva <erez.geva.ext@...mens.com> wrote:
>>>>
>>>> Configure and send TX sending hardware timestamp from
>>>>    user space application to the socket layer,
>>>>    to provide to the TC ETC Qdisc, and pass it to
>>>>    the interface network driver.
>>>>
>>>>    - New flag for the SO_TXTIME socket option.
>>>>    - New access auxiliary data header to pass the
>>>>      TX sending hardware timestamp.
>>>>    - Add the hardware timestamp to the socket cookie.
>>>>    - Copy the TX sending hardware timestamp to the socket cookie.
>>>>
>>>> Signed-off-by: Erez Geva <erez.geva.ext@...mens.com>
>>>
>>> Hardware offload of pacing is definitely useful.
>>>
>> Thanks for your comment.
>> I agree, it is not limited of use.
>>
>>> I don't think this needs a new separate h/w variant of SO_TXTIME.
>>>
>> I only extend SO_TXTIME.
> 
> The patchset passes a separate timestamp from skb->tstamp along
> through the ip cookie, cork (transmit_hw_time) and with the skb in
> shinfo.
> 
> I don't see the need for two timestamps, one tied to software and one
> to hardware. When would we want to pace twice?

Two time-stamps are needed because Net-Link uses the system clock, while the network interface hardware uses its own PHC.
The current ETF depends on the system clock and the PHC being synchronized.
 
>>> Indeed, we want pacing offload to work for existing applications.
>>>
>> The conversion between the PHC and the system clock changes dynamically over time.
>> How do you propose to achieve it?
> 
> Can you elaborate on this concern?

Using a single time-stamp leaves 3 possible solutions:

1. Current solution: synchronize the system clock and the PHC.
    The application uses the system clock.
    The ETF can use the system clock for ordering and pass the packet to the driver on time.
    The network interface hardware compares the time-stamp to the PHC.

2. The application converts the PHC time-stamp to a system clock based time-stamp.
     The ETF works as in solution 1.
     The network driver converts the system clock time-stamp back to a PHC time-stamp.
     This solution needs a new Net-Link flag and modifications to the relevant network drivers.
     Yet this solution has 2 problems:
     * Applications today are not aware that the system clock and the PHC are not synchronized, and
        therefore do not perform any conversion; most of them use only the system clock.
     * The conversion in the network driver happens ~300 - 600 microseconds after
        the application sends the packet.
        The PHC and system clock frequencies and offset can change during this period,
        so the conversion will produce a PHC time-stamp different from the application's original time-stamp.
        We require a precision of 1 nanosecond for the PHC time-stamp.

3. The application uses a PHC time-stamp for skb->tstamp.
    The ETF converts the PHC time-stamp to a system clock time-stamp.
    This solution requires support for reading PHC clocks
    from IRQ/kernel thread context in kernel space.
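
To illustrate the second problem of solution 2, here is a minimal sketch of the round trip. It assumes a simple offset-only model in which the PHC and the system clock differ by an offset that drifts at a constant rate. The function names, the 37 s offset, and the 10 ppm rate difference are hypothetical values chosen for illustration, not measurements.

```python
NSEC_PER_SEC = 1_000_000_000


def phc_to_sys(phc_ns: int, offset_ns: int) -> int:
    """Application side: convert a PHC time-stamp to system clock time."""
    return phc_ns - offset_ns


def sys_to_phc(sys_ns: int, offset_ns: int) -> int:
    """Driver side: convert a system clock time-stamp back to PHC time."""
    return sys_ns + offset_ns


def offset_after(offset_ns: int, rate_diff_ppb: int, elapsed_ns: int) -> int:
    """Offset seen later, assuming the clocks drift apart at rate_diff_ppb."""
    return offset_ns + elapsed_ns * rate_diff_ppb // NSEC_PER_SEC


def round_trip_error_ns(launch_phc_ns: int, offset_ns: int,
                        rate_diff_ppb: int, delay_ns: int) -> int:
    """Difference between the PHC time-stamp the driver reconstructs
    and the one the application originally asked for."""
    sys_ts = phc_to_sys(launch_phc_ns, offset_ns)
    later_offset = offset_after(offset_ns, rate_diff_ppb, delay_ns)
    return sys_to_phc(sys_ts, later_offset) - launch_phc_ns


# Hypothetical numbers: 10 ppm rate difference, driver converts 500 us later.
error = round_trip_error_ns(
    launch_phc_ns=1_000_000_000_000,
    offset_ns=37_000_000_000,
    rate_diff_ppb=10_000,
    delay_ns=500_000,
)
print(error)
```

Under these assumed values the reconstructed time-stamp is 5 ns away from the requested one, which already exceeds the 1 nanosecond requirement. The error disappears in this model only when the offset does not change between the two conversions.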

Just for clarification:
ETF, like all of Net-Link, uses only the system clock (TAI).
The network interface hardware uses only the PHC.
Neither Net-Link nor the driver performs any conversion.
The kernel does not provide any clock conversion besides the system clock.
The Linux kernel is a single-clock system.
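
For reference, user space can already read both clocks, which is what an application would need before doing any conversion itself. The sketch below relies on the kernel's dynamic POSIX clock convention for turning an open PTP device file descriptor into a clock id. The device path /dev/ptp0 and the helper names are assumptions, and the two reads are not taken at the same instant, so the pair is only approximate.

```python
import os
import time

CLOCKFD = 3  # low bits marking a file-descriptor-based dynamic clock
CLOCK_TAI = getattr(time, "CLOCK_TAI", 11)  # 11 is the Linux value


def fd_to_clockid(fd: int) -> int:
    """Same arithmetic as the FD_TO_CLOCKID() macro used with /dev/ptpN."""
    return ((~fd) << 3) | CLOCKFD


def read_phc_and_tai_ns(ptp_dev: str = "/dev/ptp0") -> tuple:
    """Read the PHC and the system TAI clock back to back (needs a PHC)."""
    fd = os.open(ptp_dev, os.O_RDONLY)
    try:
        phc_ns = time.clock_gettime_ns(fd_to_clockid(fd))
        tai_ns = time.clock_gettime_ns(CLOCK_TAI)
    finally:
        os.close(fd)
    return phc_ns, tai_ns
```

Calling read_phc_and_tai_ns() requires a machine with a PTP hardware clock and permission to open the device, so it is left uncalled here.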

> 
> The simplest solution for offloading pacing would be to interpret
> skb->tstamp either for software pacing, or skip software pacing if the
> device advertises a NETIF_F hardware pacing feature.

That would defeat the purpose of ETF.
ETF exists for ordering packets.
Why should the device driver defer it?
Simply do not use the Qdisc for this interface.

> 
> Clockbase is an issue. The device driver may have to convert to
> whatever format the device expects when copying skb->tstamp in the
> device tx descriptor.

We hope our definition is clear.
In the current kernel, skb->tstamp uses the system clock.
The hardware time-stamp is PHC based, as it is used today for PTP two-step.
We only propose to use the same hardware time-stamp.

Passing the hardware time-stamp in skb->tstamp might seem a bit tricky.
The goal is to leave the driver unaware of whether
* the PHC and the system clock are synchronized, or
* the ETF passes the hardware time-stamp in skb->tstamp.
Only the applications and the ETF are aware.
The application can detect which mode is used by checking the ETF flag.
The ETF flags are part of the network administration,
which also configures PTP and the system clock synchronization.

> 
>>
>>> It only requires that pacing qdiscs, both sch_etf and sch_fq,
>>> optionally skip queuing in their .enqueue callback and instead allow
>>> the skb to pass to the device driver as is, with skb->tstamp set. Only
>>> to devices that advertise support for h/w pacing offload.
>>>
>> I did not use "Fair Queue traffic policing".
>> As for ETF, it is all about ordering packets from different applications.
>> How can we achieve it while skipping queuing?
>> Could you elaborate on this point?
> 
> The qdisc can only defer pacing to hardware if hardware can ensure the
> same invariants on ordering, of course.

Yes, this is why we suggest that ETF orders packets using the hardware time-stamp
and passes each packet on based on system time.
So ETF queries only the system clock and not the PHC.
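
The idea can be sketched in user-space terms as follows. This is only a model of the proposed behaviour, not the Qdisc code: the class and field names are hypothetical, and delta_ns is modelled after the existing ETF delta parameter. Packets are ordered by their hardware (PHC) time-stamp, while the decision to release a packet uses only the system clock.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Packet:
    hw_txtime_ns: int                            # PHC based, used for ordering
    sys_txtime_ns: int = field(compare=False)    # system clock, used for release
    name: str = field(compare=False, default="")


class DualStampQueue:
    """Orders by hardware time-stamp, releases by system time."""

    def __init__(self, delta_ns: int):
        self._heap = []
        self._delta_ns = delta_ns

    def enqueue(self, pkt: Packet) -> None:
        heapq.heappush(self._heap, pkt)

    def dequeue(self, sys_now_ns: int):
        """Return the earliest packet if its release time was reached, else None."""
        if not self._heap:
            return None
        head = self._heap[0]
        if sys_now_ns < head.sys_txtime_ns - self._delta_ns:
            return None  # too early according to the system clock
        return heapq.heappop(self._heap)


q = DualStampQueue(delta_ns=500)
q.enqueue(Packet(hw_txtime_ns=2_000, sys_txtime_ns=1_002_000, name="b"))
q.enqueue(Packet(hw_txtime_ns=1_000, sys_txtime_ns=1_001_000, name="a"))
```

In this model the queue never reads the PHC. It only compares hardware time-stamps against each other for ordering, and compares the system time-stamp against the system clock for release.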

> 
> Btw: this is quite a long list of CC:s
> 
I need to update my company colleagues as well as the Linux group.
