Message-ID: <CAOi1vP9fvjfu-WsfL4RUw8mAK01rcbU94VvVxfacnskW6JRMNA@mail.gmail.com>
Date: Mon, 9 Feb 2026 12:03:59 +0100
From: Ilya Dryomov <idryomov@...il.com>
To: "Ionut Nechita (Wind River)" <ionut.nechita@...driver.com>
Cc: Alex Markuze <amarkuze@...hat.com>, Viacheslav Dubeyko <slava@...eyko.com>, 
	Sebastian Andrzej Siewior <bigeasy@...utronix.de>, Clark Williams <clrkwllms@...nel.org>, 
	Steven Rostedt <rostedt@...dmis.org>, ceph-devel@...r.kernel.org, 
	linux-kernel@...r.kernel.org, linux-rt-devel@...ts.linux.dev, 
	Ionut Nechita <ionut_n2001@...oo.com>, Xiubo Li <xiubli@...hat.com>, 
	Jeff Layton <jlayton@...nel.org>, Sage Weil <sage@...dream.net>, superm1@...nel.org, 
	jkosina@...e.com
Subject: Re: [PATCH] libceph: handle EADDRNOTAVAIL more gracefully

On Sun, Feb 8, 2026 at 5:42 PM Ionut Nechita (Wind River)
<ionut.nechita@...driver.com> wrote:
>
> From: Ionut Nechita <ionut.nechita@...driver.com>
>
> When connecting to Ceph monitors/OSDs, kernel_connect() may return
> -EADDRNOTAVAIL if the source address is temporarily unavailable.
> This commonly occurs during:
> - IPv6 Duplicate Address Detection (DAD), which takes 1-2 seconds
> - IPv4/IPv6 interface state changes (link up/down events)
> - Address removal or reconfiguration on the interface
> - Network namespace transitions in containerized environments
> - CNI reconfigurations in Kubernetes
>
> Currently, libceph treats EADDRNOTAVAIL like any other connection error
> and enters exponential backoff
> (250ms, 500ms, 1s, 2s, 4s, ...), causing delays of 15+ seconds
> before successful reconnection even after the address becomes
> available.
>
> This is particularly problematic in Kubernetes environments running Ceph
> on real-time kernels, where:
> - Storage pods undergo frequent rolling updates
> - Network policies and CNI configurations change dynamically
> - Low I/O latency is critical for RT workloads
> - sync() calls can block for 120+ seconds waiting for reconnection
>
> This patch improves the situation by:
> 1. Detecting EADDRNOTAVAIL on both IPv4 and IPv6 connections
> 2. Using a shorter retry interval (100ms) instead of exponential backoff
> 3. Logging a more informative rate-limited warning message
> 4. Supporting both msgr1 and msgr2 protocol versions
> 5. Clearing the flag on successful connection and when reopening
>
> The fast retry approach is appropriate because:
> - EADDRNOTAVAIL is typically transient (address becomes valid in 1-2s)

Hi Ionut,

I'm missing how an error that is typically transient and goes away in
1-2s can cause a delay of 15+ seconds against a 250ms, 500ms, 1s, 2s,
4s, 8s, 15s backoff loop.  If the address becomes valid in 1-2s, I'd
expect the third or the fourth attempt to succeed, with a total wait of
1.75s or 3.75s.  Can you please elaborate?
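
For illustration, the arithmetic behind those totals can be sketched as
follows. This is only a model of the delay sequence quoted in this message,
not libceph code: the function name and the assumption that a retry succeeds
as soon as the address is valid are hypothetical.

```python
from itertools import accumulate

# Delay sequence in seconds, copied from the list quoted in this message.
# Assumption: each retry fires exactly when its delay expires.
BACKOFF_DELAYS = [0.25, 0.5, 1.0, 2.0, 4.0, 8.0, 15.0]


def first_successful_retry(valid_after, delays=BACKOFF_DELAYS):
    """Return (retry_number, total_wait_seconds) for the first retry that
    happens at or after the moment the address becomes valid, or None if
    the modelled sequence ends first (hypothetical helper)."""
    for retry_number, elapsed in enumerate(accumulate(delays), start=1):
        if elapsed >= valid_after:
            return retry_number, elapsed
    return None


if __name__ == "__main__":
    # Address becomes valid after 1s and after 2s, the range given in the patch.
    for valid_after in (1.0, 2.0):
        print(valid_after, first_successful_retry(valid_after))
```

Under this model an address that becomes valid after 1s is picked up by the
third retry at 1.75s, and one that becomes valid after 2s by the fourth retry
at 3.75s, so the backoff alone does not account for 15+ seconds.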

> - Each retry attempt is inexpensive (kernel_connect fails immediately)
> - Quick recovery is critical for maintaining storage availability
> - The connection succeeds as soon as the address becomes valid
>
> Real-world impact: In production logs showing 'task sync blocked for
> more than 122 seconds' with error -99 (EADDRNOTAVAIL), this patch
> reduces reconnection time from 120+ seconds to 2-3 seconds.

How many attempts do you see per session (i.e. to a particular monitor,
OSD or MDS) and in total for the event (e.g. rolling upgrade) before and
after this patch?  Perhaps the effect gets multiplied in an unexpected
way or the backoff mechanism isn't working as expected in general.

Thanks,

                Ilya
