[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANn89iKzfzOUPc+g0Brfzyi2efnXE0jLUebBz5fQMWVt9UCtfA@mail.gmail.com>
Date: Sat, 1 Oct 2022 10:43:34 -0700
From: Eric Dumazet <edumazet@...gle.com>
To: Christophe Leroy <christophe.leroy@...roup.eu>,
"Jason A. Donenfeld" <Jason@...c4.com>
Cc: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
David Dworken <ddworken@...gle.com>,
Willem de Bruijn <willemb@...gle.com>,
"David S. Miller" <davem@...emloft.net>
Subject: Re: 126 ms irqsoff Latency - Possibly due to commit 190cc82489f4
("tcp: change source port randomizarion at connect() time")
On Sat, Oct 1, 2022 at 10:16 AM Christophe Leroy
<christophe.leroy@...roup.eu> wrote:
>
> Hi,
>
> With recent kernels I have a huge irqsoff latency in my boards, shortly
> after startup, from the call to net_get_random_once() in
> __inet_hash_connect().
>
> On a non instrumented kernel, IRQs are disabled during approximately 80
> milliseconds. With the traces in goes to 126 milliseconds.
>
> Was apparently introduced by commit 190cc82489f4 ("tcp: change source
> port randomizarion at connect() time")
>
> Trace below.
>
> Would there be a way to perform the call to get_random_bytes() without
> disabling IRQ ?
This looks a question for drivers/char/random.c maintainer, because we
do not block interrupts at this point in __inet_hash_connect()
>
> Thanks
> Christophe
>
> # tracer: irqsoff
> #
> # irqsoff latency trace v1.1.5 on 6.0.0-rc5-s3k-dev-02351-gebc95f69a7d4
> # --------------------------------------------------------------------
> # latency: 126337 us, #8207/8207, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0)
> # -----------------
> # | task: CORSurv-352 (uid:0 nice:0 policy:0 rt_prio:0)
> # -----------------
> # => started at: _raw_spin_lock_irqsave
> # => ended at: _raw_spin_unlock_irqrestore
> #
> #
> # _------=> CPU#
> # / _-----=> irqs-off/BH-disabled
> # | / _----=> need-resched
> # || / _---=> hardirq/softirq
> # ||| / _--=> preempt-depth
> # |||| / _-=> migrate-disable
> # ||||| / delay
> # cmd pid |||||| time | caller
> # \ / |||||| \ | /
> CORSurv-352 0d.... 4us : _raw_spin_lock_irqsave
> CORSurv-352 0d.... 13us+: preempt_count_add
> <-_raw_spin_lock_irqsave
> CORSurv-352 0d..1. 25us+: do_raw_spin_lock
> <-_raw_spin_lock_irqsave
> CORSurv-352 0d..1. 36us : get_random_bytes <-__inet_hash_connect
> CORSurv-352 0d..1. 45us : _get_random_bytes.part.0
> <-__inet_hash_connect
> CORSurv-352 0d..1. 55us : crng_make_state
> <-_get_random_bytes.part.0
> CORSurv-352 0d..1. 65us+: ktime_get_seconds <-crng_make_state
> CORSurv-352 0d..1. 77us+: crng_fast_key_erasure <-crng_make_state
> CORSurv-352 0d..1. 89us+: chacha_block_generic
> <-crng_fast_key_erasure
> CORSurv-352 0d..1. 101us+: chacha_permute <-chacha_block_generic
> CORSurv-352 0d..1. 129us : chacha_block_generic
> <-_get_random_bytes.part.0
> CORSurv-352 0d..1. 139us+: chacha_permute <-chacha_block_generic
> CORSurv-352 0d..1. 160us : chacha_block_generic
> <-_get_random_bytes.part.0
> CORSurv-352 0d..1. 170us+: chacha_permute <-chacha_block_generic
> CORSurv-352 0d..1. 191us : chacha_block_generic
> <-_get_random_bytes.part.0
> CORSurv-352 0d..1. 200us+: chacha_permute <-chacha_block_generic
> CORSurv-352 0d..1. 221us : chacha_block_generic
> <-_get_random_bytes.part.0
> CORSurv-352 0d..1. 231us+: chacha_permute <-chacha_block_generic
>
> 8182 x the above two line
>
It seems hard irqs are blocked for short periods, no worries here.
But perhaps your problem is a lack of cond_resched() in a long loop
(_get_random_bytes() I guess)
Problem is : I do not think _get_random_bytes() can always schedule,
we probably would need to add
extra parameters.
> CORSurv-352 0d..1. 126275us : chacha_block_generic
> <-_get_random_bytes.part.0
> CORSurv-352 0d..1. 126285us+: chacha_permute <-chacha_block_generic
> CORSurv-352 0d..1. 126309us : _raw_spin_unlock_irqrestore
> <-__do_once_done
> CORSurv-352 0d..1. 126318us+: do_raw_spin_unlock
> <-_raw_spin_unlock_irqrestore
> CORSurv-352 0d..1. 126330us+: _raw_spin_unlock_irqrestore
> CORSurv-352 0d..1. 126346us+: trace_hardirqs_on
> <-_raw_spin_unlock_irqrestore
> CORSurv-352 0d..1. 126387us : <stack trace>
> => tcp_v4_connect
> => __inet_stream_connect
> => inet_stream_connect
> => __sys_connect
> => system_call_exception
> => ret_from_syscall
Powered by blists - more mailing lists