Message-ID: <ZenacXkUAh4I1gkK@kbusch-mbp>
Date: Thu, 7 Mar 2024 08:17:05 -0700
From: Keith Busch <kbusch@...nel.org>
To: Len Brown <lenb@...nel.org>
Cc: Max Gurtovoy <mgurtovoy@...dia.com>, linux-nvme@...ts.infradead.org,
	maxg@...lanox.com, axboe@...nel.dk, hch@....de, sagi@...mberg.me,
	linux-kernel@...r.kernel.org, Len Brown <len.brown@...el.com>
Subject: Re: [PATCH 1/1] nvme: Use pr_dbg, not pr_info, when setting shutdown
 timeout

On Thu, Mar 07, 2024 at 09:27:21AM -0500, Len Brown wrote:
> On Thu, Mar 7, 2024 at 4:29 AM Max Gurtovoy <mgurtovoy@...dia.com> wrote:
> 
> > > Some words are alarming in routine kernel messages.
> > > "timeout" is one of them...
> > >
> > > Here NVME is routinely setting a timeout value,
> > > rather than reporting that a timeout has occurred.
> >
> > No.
> > see the original commit message
> >
> > "When an NVMe controller reports RTD3 Entry Latency larger than the
> > value of shutdown_timeout module parameter, we update the
> > shutdown_timeout accordingly to honor RTD3 Entry Latency. Use an
> > informational debug level instead of a warning level for it."
> >
> > So this is not a routine flow. This informs users about using a
> > different value than the module param they set.
> 
> I have machines in automated testing.
> Those machines have zero module params.
> This message appears in their dmesg 100% of the time,
> and our dmesg scanner complains about them 100% of the time.
> 
> Is this a bug in the NVME hardware or software?
> 
> If yes, I'll be happy to help debug it.
> 
> If no, then exactly what action is the informed user supposed to take
> upon seeing this message?
> 
> If none, then the message serves no purpose and should be deleted entirely.

It lets you know that your device takes longer to safely power off than
the module's default tolerance. System low power transitions may take a
long time, and at one point, people wanted to know about that since it
may affect their power management decisions.

This print dates partly from when the NVMe protocol did not provide a way
to advertise an appropriate shutdown time, and we had no idea what devices
in the wild actually needed. We often get just a dmesg with bug reports,
and knowing a device's shutdown timings was helpful at one point with
suspend and power off issues.

You can make the print go away by adding the module parameter

  nvme_core.shutdown_timeout=<Largest Observed Value>

But personally, I don't find this print very useful anymore, so I don't
care if it gets removed.
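
For reference, the behaviour discussed in this thread can be sketched
roughly as below. This is a userspace illustration, not the driver code:
the helper name effective_shutdown_timeout() is hypothetical, and both the
60 second upper bound and the microsecond unit of the reported latency are
assumptions not stated in this thread.

```c
#define USEC_PER_SEC 1000000u
/* Assumed upper bound on the shutdown timeout, in seconds. */
#define MAX_SHUTDOWN_TIMEOUT_SEC 60u

/*
 * Hypothetical helper: pick the shutdown timeout (in seconds) for a
 * controller, given the RTD3 Entry Latency it reports (assumed to be in
 * microseconds, 0 meaning "not reported") and the value of the
 * shutdown_timeout module parameter.
 */
static unsigned int effective_shutdown_timeout(unsigned int rtd3e_us,
                                               unsigned int module_timeout_sec)
{
    unsigned int latency_sec;

    /* Device advertises nothing: keep the module parameter as-is. */
    if (rtd3e_us == 0)
        return module_timeout_sec;

    latency_sec = rtd3e_us / USEC_PER_SEC;

    /* Never go below what the module parameter asks for. */
    if (latency_sec < module_timeout_sec)
        return module_timeout_sec;

    /* Do not honor arbitrarily large device-reported latencies. */
    if (latency_sec > MAX_SHUTDOWN_TIMEOUT_SEC)
        return MAX_SHUTDOWN_TIMEOUT_SEC;

    return latency_sec;
}
```

The message is printed only when the returned value differs from the
module parameter. With a module value of 5 seconds (assumed here for
illustration), a device reporting 8000000 us yields 8 and triggers the
print; booting with nvme_core.shutdown_timeout=8 makes the two values
equal, so nothing is logged.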
