linux-kernel - Re: [BUG report] kernel warnings with Samsung 970 EVO 2TB SSD

Message-Id: <DC5UKZ9F6CQZ.2NDFY4S322T2G@cknow.org>
Date: Mon, 18 Aug 2025 22:48:48 +0200
From: "Diederik de Haas" <didi.debian@...ow.org>
To: "Keith Busch" <kbusch@...nel.org>
Cc: <linux-nvme@...ts.infradead.org>, <linux-kernel@...r.kernel.org>,
 "Diederik de Haas" <didi.debian@...ow.org>
Subject: Re: [BUG report] kernel warnings with Samsung 970 EVO 2TB SSD

Hi,

First of all: thanks for taking the time to answer my questions :)

On Mon Aug 18, 2025 at 8:58 PM CEST, Keith Busch wrote:
> On Sat, Aug 16, 2025 at 04:11:00PM +0200, Diederik de Haas wrote:
>> On Sat Aug 16, 2025 at 3:20 PM CEST, Keith Busch wrote:
>> 
>> > If you want to see what the driver is reacting to, you can check the
>> > subnqn from command line:
>> >
>> >   # nvme id-ctrl /dev/nvme0 | grep subnqn
>> >
>> > It'll probably be all zeros. The field has been required by spec, but
>> > the driver tolerates ones that don't implement it.
>> 
>> root@...opi-r5s:~# nvme id-ctrl /dev/nvme0 | grep subnqn
>> subnqn    :
>> 
>> So it seems to be just empty?
>
> Yes, it's interpreted as a string. All 0's would be an empty string.

Ah yes, makes sense.
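
For illustration, here is a minimal Python sketch of why an all-zero
field prints as nothing. It assumes the 256-byte SUBNQN field is read as
a NUL-terminated string; the helper name decode_subnqn and the sample
NQN are made up, and this is not the actual driver or nvme-cli code.

```python
SUBNQN_LEN = 256  # size of the SUBNQN field in the Identify Controller data


def decode_subnqn(field: bytes) -> str:
    """Return the text before the first NUL byte (hypothetical helper)."""
    if len(field) != SUBNQN_LEN:
        raise ValueError(f"expected {SUBNQN_LEN} bytes, got {len(field)}")
    return field.split(b"\x00", 1)[0].decode("utf-8", errors="replace")


# A controller that does not implement the field reports all zeros.
unimplemented = bytes(SUBNQN_LEN)

# A made-up example of a populated field, zero-padded to 256 bytes.
populated = b"nqn.2014-08.org.example:subsys1".ljust(SUBNQN_LEN, b"\x00")

print(repr(decode_subnqn(unimplemented)))
print(repr(decode_subnqn(populated)))
```

The first call yields an empty string, which matches the bare
"subnqn    :" line in the output above.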

>> The other kernel warning is this:
>> 
>>   nvme nvme0: using unchecked data buffer
>> 
>> The SUBNQN message appears every time, this one appears often, but not
>> always.
>
> That one means you've sent a user space passthrough command to a device
> that doesn't support SGL DMA. Without that, the nvme protocol uses
> implicitly sized DMA that the driver can't be sure is accurate. The user
> could theoretically provide a short buffer that can corrupt memory if
> done by accident, or be used as an attack vector if done by malicious
> software.
>
> This is also not something to worry about unless you run malicious or
> buggy software.
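
Whether a controller advertises SGL support can be read from the SGLS
field of its Identify Controller data (the `nvme id-ctrl` output should
include an "sgls" line). A minimal Python sketch, assuming that bits 1:0
of SGLS being zero means "SGLs not supported"; the helper name and the
sample values are illustrative, not taken from any of the drives
mentioned here.

```python
def sgl_supported(sgls: int) -> bool:
    """True if bits 1:0 of the SGLS field are non-zero (hypothetical helper)."""
    return (sgls & 0x3) != 0


# Illustrative SGLS values, not read from real hardware.
for value in (0x0, 0x1, 0x2):
    state = "supported" if sgl_supported(value) else "not supported"
    print(f"sgls={value:#x}: SGL {state}")
```

If the check comes back "not supported", passthrough commands from user
space would be expected to trigger the "unchecked data buffer" message
described above.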

I would be surprised if I were running malicious software, but pretty
much all software has bugs, so that is of course possible.
(I run Debian Testing or Unstable on pretty much all my devices.)

I thought it was a HW problem, as the problem seemed to disappear from
my PC when I removed the NVMe drive from it, and it appeared again when
I put the drive in my NanoPi R5S.
I say "seemed" because I just found out that it happened on my PC as
well (with a Samsung 960 PRO 1TB) on this boot, but not on the 20 boots
before it.

I uninstalled the 3 programs on the R5S that showed up the most around
the warning message, and the warning is still there.
Would 'dyndbg' be helpful to determine which program is buggy?
 
>> When researching this/these issues, I discovered the nvme-cli package
>> (with the nvme command) and via its manpage I found this command:
>> 
>>   nvme get-feature /dev/nvme0 -f 3
>> 
>> I didn't even know NVMe's had namespaces, but this didn't look good:
>> 
>>   The namespace or the format of that namespace is invalid(0x200b)
>> 
>> ... without actually understanding what it means and/or what its
>> consequences are. It could be harmless and/or normal though.
>
> The feature you're requesting is the LBA range, which is namespace
> scoped. You need to specify a namespace id, either by opening the
> namespace's block device (/dev/nvme0n1) instead of the admin handle
> (/dev/nvme0), or you can manually specify the namespace with parameters
> "--namespace-id=1" or just "-n1".

Adding "-n1" does show normal (AFAICT) output. It's all zeros though.
And now the error message makes sense too :-)
The nvme-cli man page could/should have a better (i.e. working) example,
but that's not a kernel problem.
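
The 0x200b in that error message can be unpacked into its parts. A
minimal Python sketch, assuming the number follows the layout used by
the Linux kernel's NVMe status definitions (status code in bits 7:0,
status code type in bits 10:8, "more" flag 0x2000, "do not retry" flag
0x4000); the function name is hypothetical.

```python
def decode_status(status: int) -> dict:
    """Split an NVMe status value into its fields (assumed kernel layout)."""
    return {
        "sc": status & 0xFF,            # status code
        "sct": (status >> 8) & 0x7,     # status code type (0 = generic)
        "more": bool(status & 0x2000),  # more details in the error log page
        "dnr": bool(status & 0x4000),   # do not retry
    }


fields = decode_status(0x200B)
print(fields)
```

Under that assumption, 0x200b is generic status code 0x0b (invalid
namespace or format) with the "more" flag set, which is consistent with
the admin handle not supplying a namespace id.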

Thanks for your help and reassurances :-)

Cheers,
  Diederik
