Message-ID: <4270c0e3-161e-42d5-a6d3-f16b7fbcdc00@simg.de>
Date: Mon, 3 Feb 2025 19:48:11 +0100
From: Stefan <linux-kernel@...g.de>
To: "Dr. David Alan Gilbert" <linux@...blig.org>, bugzilla-daemon@...nel.org
Cc: Christoph Hellwig <hch@....de>, Thorsten Leemhuis <linux@...mhuis.info>,
Mario Limonciello <mario.limonciello@....com>,
Bruno Gravato <bgravato@...il.com>, Keith Busch <kbusch@...nel.org>,
Adrian Huang <ahuang12@...ovo.com>,
Linux kernel regressions list <regressions@...ts.linux.dev>,
linux-nvme@...ts.infradead.org, Jens Axboe <axboe@...com>,
"iommu@...ts.linux.dev" <iommu@...ts.linux.dev>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [Bug 219609] File corruptions on SSD in 1st M.2 socket of AsRock
X600M-STX + Ryzen 8700G
Hi,
I just got feedback from ASRock. They asked me to make a video of the
corruptions occurring on my remotely (and headless) running system.
Maybe I should make a video of printing out the logs that can be found on
the Linux and Debian bug trackers ...
Seems that ASRock is unwilling to solve the problem.
Regards Stefan
On 28.01.25 at 15:24, Stefan wrote:
> Hi,
>
> On 28.01.25 at 13:52, Dr. David Alan Gilbert wrote:
>> Is there any characterisation of the corrupted data; last time I
>> looked at the bz there wasn't.
>
> Yes, there is. (And I already reported it at least on the Debian bug
> tracker, see links in the initial message.)
>
> f3 reports overwritten sectors, i.e. it looks like the pseudo-random
> test pattern is written to the wrong position. These corruptions occur in
> clusters whose size is an integer multiple of 2^17 bytes in most cases
> (about 80%) and of 2^15 bytes in all cases.
>
> The frequency of these corruptions is roughly 1 cluster per 50 GB written.
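
For anyone who wants to compare their own results against the characterisation above, here is a minimal sketch of how the cluster sizes and the frequency could be tallied. It assumes you already have a list of corrupted sector numbers and the total number of bytes written. The 512-byte sector size, the input format and all function names are assumptions for illustration only; this is not f3 output and not the method used for the numbers above.

```python
# Hypothetical helper: input format and names are assumptions, not f3 output.
SECTOR = 512  # assumed sector size in bytes


def clusters_from_sectors(bad_sectors):
    """Merge bad sector numbers into contiguous (start_byte, length_bytes) runs."""
    runs = []
    for s in sorted(set(bad_sectors)):
        if runs and s == runs[-1][0] + runs[-1][1]:
            runs[-1][1] += 1          # sector extends the current run
        else:
            runs.append([s, 1])       # sector starts a new run
    return [(start * SECTOR, count * SECTOR) for start, count in runs]


def classify(length):
    """Return 17 or 15 if 2^17 or 2^15 divides the length, else None."""
    for exp in (17, 15):
        if length % (1 << exp) == 0:
            return exp
    return None


def summarize(bad_sectors, bytes_written):
    """Count clusters per size class and compute GB written per cluster."""
    clusters = clusters_from_sectors(bad_sectors)
    counts = {17: 0, 15: 0, None: 0}
    for _, length in clusters:
        counts[classify(length)] += 1
    gb_per_cluster = bytes_written / 1e9 / len(clusters) if clusters else None
    return clusters, counts, gb_per_cluster
```

A cluster counted under 17 is also a multiple of 2^15 bytes, so "2^15 in all cases" would correspond to the None bucket staying at zero.
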
>
> Can others confirm this or do they observe a different characteristic?
>
> Regards Stefan
>
>
>> I mean, is it reliably any of:
>> a) What's the size of the corruption?
>> block, cache line, word, bit???
>> b) Position?
>> e.g. last word in a block or something?
>> c) Data?
>> pile of zeros/ffs, junk, etc.?
>>
>> d) Is it a missed write, old data, or partially written block?
>>
>> Dave
>>
>>>> Puh. I'm kinda lost on what we could do about this on the Linux
>>>> side.
>>>
>>> Because it also depends on the CPU series, a firmware or hardware issue
>>> seems to be more likely than a Linux bug.
>>>
>>> ATM ASRock is still trying to reproduce the issue. (I'm in contact with
>>> them too. But they have Chinese New Year holidays in Taiwan this week.)
>>>
>>> If they can't reproduce it, they have to provide an explanation why the
>>> issues are seen by so many users.
>>>
>>> Regards Stefan
>>>
>>>
>