Message-ID: <dc61408d-8fc7-ca29-d284-0c92c2e1828c@huawei.com>
Date:   Mon, 24 Oct 2022 13:44:56 +0100
From:   John Garry <john.garry@...wei.com>
To:     Niklas Cassel <Niklas.Cassel@....com>
CC:     Damien Le Moal <damien.lemoal@...nsource.wdc.com>,
        "jejb@...ux.ibm.com" <jejb@...ux.ibm.com>,
        "martin.petersen@...cle.com" <martin.petersen@...cle.com>,
        "jinpu.wang@...ud.ionos.com" <jinpu.wang@...ud.ionos.com>,
        "linux-scsi@...r.kernel.org" <linux-scsi@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Linuxarm <linuxarm@...wei.com>,
        yangxingui <yangxingui@...wei.com>,
        yanaijie <yanaijie@...wei.com>
Subject: Re: [PATCH v5 0/7] libsas and drivers: NCQ error handling

Hi Niklas,

> 
> For the record, I tested the pm80xx driver on a HoneyComb LX2 board
> (an arm64 board using ACPI).
> 
> I tried v6.1-rc1 both with and without your series in $subject.
> 
> I couldn't see any issues.

ok, thanks for the effort.

> 
> 
> What I tried:
> -Running fio:
> fio --name=test --filename=/dev/sdc --ioengine=io_uring --rw=randrw --direct=1 --iodepth=32 --bs=1M
> on three different HDDs simultaneously for 15+ minutes,
> without any errors in fio or dmesg.
> 
> -Creating and mounting a btrfs volume, doing a huge dd to the filesystem
> without issues.
> 
> -sg_sat_read_gplog -d --log=0x10 /dev/sda
> which successfully returned the log.
> 
> 
> It is worth mentioning that this arm64 board has reserved memory regions,
> but does not yet have a firmware that supplies a IORT RMR (reserved memory
> regions) revision E.d node, which means that in order to get this board to
> boot successfully, we need to supply:
> "arm-smmu.disable_bypass=0 iommu.passthrough=1"
> on the kernel command line.

Hmm, that's interesting. I can try again with the IOMMU turned off,
but, as I recall, it did not make a difference before. I think that a
device requiring reserved memory regions would be totally broken with
the IOMMU enabled if those regions are not present. As I recall, the
SAS 3008 card would not work without RMR for us.

It's also interesting that this LX2 board has A72 cores. My system has
newer custom Armv8 cores with a quite weakly ordered memory
implementation. On that same system, I have detected a couple of other
driver memory ordering bugs which we did not see on our A72-based
platforms.

I always suspected that this was a memory ordering issue, but since the
hang occurs so reliably, I ruled it out. Maybe it is...
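
To illustrate the class of bug meant here (not the actual pm80xx
problem, which is not identified in this thread), below is a minimal
userspace sketch using C11 atomics and pthreads. The names payload,
ready and run_handoff() are hypothetical. One thread writes data and
then sets a flag; the other waits for the flag and then reads the data.
If the flag accesses used memory_order_relaxed instead, a weakly ordered
CPU would be allowed to make the flag visible before the data, while a
more strongly ordered implementation might never show the problem. In
kernel code the corresponding tools would be things like
smp_store_release()/smp_load_acquire(), or dma_wmb()/dma_rmb() for
memory shared with a device.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for an entry handed from a producer to a
 * consumer; nothing here is taken from the pm80xx driver. */
static uint32_t payload;   /* plain data written by the producer */
static atomic_int ready;   /* flag: payload is valid once this is 1 */

static void *producer(void *arg)
{
    (void)arg;
    payload = 42;          /* step 1: write the data */
    /* step 2: publish; release keeps the payload store before the flag store */
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

static void *consumer(void *arg)
{
    uint32_t *seen = arg;

    /* acquire keeps the payload load after the flag load */
    while (!atomic_load_explicit(&ready, memory_order_acquire))
        ;
    *seen = payload;
    return NULL;
}

/* Run one hand-off and return the value the consumer observed. */
uint32_t run_handoff(void)
{
    pthread_t p, c;
    uint32_t seen = 0;

    payload = 0;
    atomic_store(&ready, 0);
    pthread_create(&c, NULL, consumer, &seen);
    pthread_create(&p, NULL, producer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return seen;
}
```

Built with something like "cc -pthread", run_handoff() should always
return the value written by the producer. A missing barrier of this kind
typically shows up only intermittently, which is why a hang that
reproduces reliably does not look like an ordering bug at first glance.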

thanks,
John
