[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANn89iLAEYBaoYajy0Y9UmGFff5GPxDUoG-ErVB2jDdRNQ5Tug@mail.gmail.com>
Date: Sat, 1 Oct 2022 10:58:46 -0700
From: Eric Dumazet <edumazet@...gle.com>
To: Christophe Leroy <christophe.leroy@...roup.eu>,
"Jason A. Donenfeld" <Jason@...c4.com>
Cc: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
David Dworken <ddworken@...gle.com>,
Willem de Bruijn <willemb@...gle.com>,
"David S. Miller" <davem@...emloft.net>
Subject: Re: 126 ms irqsoff Latency - Possibly due to commit 190cc82489f4
("tcp: change source port randomizarion at connect() time")
On Sat, Oct 1, 2022 at 10:43 AM Eric Dumazet <edumazet@...gle.com> wrote:
>
> On Sat, Oct 1, 2022 at 10:16 AM Christophe Leroy
> <christophe.leroy@...roup.eu> wrote:
> >
> > Hi,
> >
> > With recent kernels I have a huge irqsoff latency in my boards, shortly
> > after startup, from the call to net_get_random_once() in
> > __inet_hash_connect().
> >
> > On a non instrumented kernel, IRQs are disabled during approximately 80
> > milliseconds. With the traces in goes to 126 milliseconds.
> >
> > Was apparently introduced by commit 190cc82489f4 ("tcp: change source
> > port randomizarion at connect() time")
> >
> > Trace below.
> >
> > Would there be a way to perform the call to get_random_bytes() without
> > disabling IRQ ?
>
> This looks a question for drivers/char/random.c maintainer, because we
> do not block interrupts at this point in __inet_hash_connect()
Oh well, this is probably coming from __do_once_start() / __do_once_done()
We need something better for process contexts...
>
>
>
>
> >
> > Thanks
> > Christophe
> >
> > # tracer: irqsoff
> > #
> > # irqsoff latency trace v1.1.5 on 6.0.0-rc5-s3k-dev-02351-gebc95f69a7d4
> > # --------------------------------------------------------------------
> > # latency: 126337 us, #8207/8207, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0)
> > # -----------------
> > # | task: CORSurv-352 (uid:0 nice:0 policy:0 rt_prio:0)
> > # -----------------
> > # => started at: _raw_spin_lock_irqsave
> > # => ended at: _raw_spin_unlock_irqrestore
> > #
> > #
> > # _------=> CPU#
> > # / _-----=> irqs-off/BH-disabled
> > # | / _----=> need-resched
> > # || / _---=> hardirq/softirq
> > # ||| / _--=> preempt-depth
> > # |||| / _-=> migrate-disable
> > # ||||| / delay
> > # cmd pid |||||| time | caller
> > # \ / |||||| \ | /
> > CORSurv-352 0d.... 4us : _raw_spin_lock_irqsave
> > CORSurv-352 0d.... 13us+: preempt_count_add
> > <-_raw_spin_lock_irqsave
> > CORSurv-352 0d..1. 25us+: do_raw_spin_lock
> > <-_raw_spin_lock_irqsave
> > CORSurv-352 0d..1. 36us : get_random_bytes <-__inet_hash_connect
> > CORSurv-352 0d..1. 45us : _get_random_bytes.part.0
> > <-__inet_hash_connect
> > CORSurv-352 0d..1. 55us : crng_make_state
> > <-_get_random_bytes.part.0
> > CORSurv-352 0d..1. 65us+: ktime_get_seconds <-crng_make_state
> > CORSurv-352 0d..1. 77us+: crng_fast_key_erasure <-crng_make_state
> > CORSurv-352 0d..1. 89us+: chacha_block_generic
> > <-crng_fast_key_erasure
> > CORSurv-352 0d..1. 101us+: chacha_permute <-chacha_block_generic
> > CORSurv-352 0d..1. 129us : chacha_block_generic
> > <-_get_random_bytes.part.0
> > CORSurv-352 0d..1. 139us+: chacha_permute <-chacha_block_generic
> > CORSurv-352 0d..1. 160us : chacha_block_generic
> > <-_get_random_bytes.part.0
> > CORSurv-352 0d..1. 170us+: chacha_permute <-chacha_block_generic
> > CORSurv-352 0d..1. 191us : chacha_block_generic
> > <-_get_random_bytes.part.0
> > CORSurv-352 0d..1. 200us+: chacha_permute <-chacha_block_generic
> > CORSurv-352 0d..1. 221us : chacha_block_generic
> > <-_get_random_bytes.part.0
> > CORSurv-352 0d..1. 231us+: chacha_permute <-chacha_block_generic
> >
> > 8182 x the above two line
> >
>
> It seems hard irqs are blocked for short periods, no worries here.
>
> But perhaps your problem is a lack of cond_resched() in a long loop
> (_get_random_bytes() I guess)
>
> Problem is : I do not think _get_random_bytes() can always schedule,
> we probably would need to add
> extra parameters.
>
> > CORSurv-352 0d..1. 126275us : chacha_block_generic
> > <-_get_random_bytes.part.0
> > CORSurv-352 0d..1. 126285us+: chacha_permute <-chacha_block_generic
> > CORSurv-352 0d..1. 126309us : _raw_spin_unlock_irqrestore
> > <-__do_once_done
> > CORSurv-352 0d..1. 126318us+: do_raw_spin_unlock
> > <-_raw_spin_unlock_irqrestore
> > CORSurv-352 0d..1. 126330us+: _raw_spin_unlock_irqrestore
> > CORSurv-352 0d..1. 126346us+: trace_hardirqs_on
> > <-_raw_spin_unlock_irqrestore
> > CORSurv-352 0d..1. 126387us : <stack trace>
> > => tcp_v4_connect
> > => __inet_stream_connect
> > => inet_stream_connect
> > => __sys_connect
> > => system_call_exception
> > => ret_from_syscall
Powered by blists - more mailing lists