Message-ID: <3e4040cf-bb54-7652-72cc-0ad2d1288cb8@bell.net>
Date: Wed, 13 Sep 2023 20:29:12 -0400
From: John David Anglin <dave.anglin@...l.net>
To: Damien Le Moal <dlemoal@...nel.org>, Helge Deller <deller@....de>,
James Bottomley <James.Bottomley@...senPartnership.com>
Cc: linux-parisc@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-kbuild@...r.kernel.org,
Nick Desaulniers <ndesaulniers@...gle.com>
Subject: Re: [PATCH] linux/export: fix reference to exported functions for
parisc64
On 2023-09-13 7:45 p.m., Damien Le Moal wrote:
> On 9/14/23 06:22, John David Anglin wrote:
>> On 2023-09-13 1:58 p.m., John David Anglin wrote:
>>> On 2023-09-12 5:53 p.m., John David Anglin wrote:
>>>> On 2023-09-10 5:30 p.m., John David Anglin wrote:
>>>>> Hi Masahiro,
>>>>>
>>>>> The attached change fixed boot at ddb5cdbafaaa 😁
>>>>>
>>>>> However, v6.5.x boot is still broken:
>>>>>
>>>>> Run /init as init process
>>>>> process '/usr/bin/sh' started with executable stack
>>>>> Loading, please wait...
>>>>> Starting systemd-udevd version 254.1-3
>>>>> e1000 alternatives: applied 0 out of 569 patches
>>>>> e1000: Intel(R) PRO/1000 Network Driver
>>>>> e1000: Copyright (c) 1999-2006 Intel Corporation.
>>>>> scsi_mod alternatives: applied 0 out of 7 patches
>>>>> SCSI subsystem initialized
>>>>> usbcore alternatives: applied 0 out of 18 patches
>>>>> usbcore: registered new interface driver usbfs
>>>>> libata alternatives: applied 0 out of 3 patches
>>>>> usbcore: registered new interface driver hub
>>>>> usbcore: registered new device driver usb
>>>>> mptbase alternatives: applied 0 out of 73 patches
>>>>> ehci_hcd alternatives: applied 0 out of 114 patches
>>>>> sata_sil24 alternatives: applied 0 out of 56 patches
>>>>> Fusion MPT base driver 3.04.20
>>>>> Copyright (c) 1999-2008 LSI Corporation
>>>>> sata_sil24 0000:00:01.0: Applying completion IRQ loss on PCI-X errata fix
>>>>> scsi host0: sata_sil24
>>>>> scsi host1: sata_sil24
>>>>> pata_sil680 0000:60:02.0: sil680: 133MHz clock.
>>>>> scsi host2: sata_sil24
>>>>> ehci_pci alternatives: applied 0 out of 2 patches
>>>>> ohci_hcd alternatives: applied 0 out of 144 patches
>>>>> ehci-pci 0000:60:01.2: EHCI Host Controller
>>>>> scsi host3: pata_sil680
>>>>> ehci-pci 0000:60:01.2: new USB bus registered, assigned bus number 1
>>>>> scsi host4: sata_sil24
>>>>> ata1: SATA max UDMA/100 host m128@...fffffff80088000 port 0xffffffff80080000 ir6
>>>>> ata2: SATA max UDMA/100 host m128@...fffffff80088000 port 0xffffffff80082000 ir6
>>>>> ata3: SATA max UDMA/100 host m128@...fffffff80088000 port 0xffffffff80084000 ir6
>>>>> ata4: SATA max UDMA/100 host m128@...fffffff80088000 port 0xffffffff80086000 ir6
>>>>> e1000 0000:60:03.0 eth0: (PCI:33MHz:32-bit) 00:11:0a:31:8a:77
>>>>> ehci-pci 0000:60:01.2: irq 71, io mem 0xffffffffb00a1000
>>>>> scsi host5: pata_sil680
>>>>> ata5: PATA max UDMA/133 cmd 0x26058 ctl 0x26064 bmdma 0x26040 irq 72
>>>>> ata6: PATA max UDMA/133 cmd 0x26050 ctl 0x26060 bmdma 0x26048 irq 72
>>>>> e1000 0000:60:03.0 eth0: Intel(R) PRO/1000 Network Connection
>>>>> ehci-pci 0000:60:01.2: USB 2.0 started, EHCI 0.95
>>>>> usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 6.05
>>>>> usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
>>>>> usb usb1: Product: EHCI Host Controller
>>>>> usb usb1: Manufacturer: Linux 6.5.2-dirty ehci_hcd
>>>>> usb usb1: SerialNumber: 0000:60:01.2
>>>>> hub 1-0:1.0: USB hub found
>>>>> hub 1-0:1.0: 5 ports detected
>>>>> ata1: SATA link down (SStatus 0 SControl 0)
>>>>> ata2: SATA link down (SStatus 0 SControl 0)
>>>>> ata3: SATA link down (SStatus 0 SControl 0)
>>>>> ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
>>>>> ata4.00: ATA-10: ST4000VN008-2DR166, SC60, max UDMA/133
>>>>> ata4.00: 7814037168 sectors, multi 0: LBA48 NCQ (depth 31/32)
>>>>> ata4.00: configured for UDMA/100
>>>>> scsi 4:0:0:0: Direct-Access ATA ST4000VN008-2DR1 SC60 PQ: 0 ANSI: 5
>>>>> ata6.00: ATAPI: HL-DT-STDVD+-RW GSA-H21L, 1.04, max UDMA/44
>>>>> scsi 5:0:0:0: CD-ROM HL-DT-ST DVD+-RW GSA-H21L 1.04 PQ: 0 ANSI: 5
>>>>> random: crng init done
>>>>> Timed out for waiting the udev queue being empty.
>>>>> Begin: Loading essential drivers ... done.
>>>>> Begin: Running /scripts/init-premount ... done.
>>>>> Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
>>>>> Begin: Running /scripts/local-premount ... done.
>>>>> Timed out for waiting the udev queue being empty.
>>>>> Begin: Waiting for root file system ... Begin: Running /scripts/local-block ....
>>>>> Begin: Running /scripts/local-block ... done.
>>>>> Begin: Running /scripts/local-block ... done.
>>>>> Begin: Running /scripts/local-block ... done.
>>>>> Begin: Running /scripts/local-block ... done.
>>>>> Begin: Running /scripts/local-block ... done.
>>>>> Begin: Running /scripts/local-block ... done.
>>>>> Begin: Running /scripts/local-block ... done.
>>>>> Begin: Running /scripts/local-block ... done.
>>>>> Begin: Running /scripts/local-block ... done.
>>>>> Begin: Running /scripts/local-block ... done.
>>>>> Begin: Running /scripts/local-block ... done.
>>>>> Begin: Running /scripts/local-block ... done.
>>>>> Begin: Running /scripts/local-block ... done.
>>>>> Begin: Running /scripts/local-block ... done.
>>>>> Begin: Running /scripts/local-block ... done.
>>>>> Begin: Running /scripts/local-block ... done.
>>>>> Begin: Running /scripts/local-block ... done.
>>>>> Begin: Running /scripts/local-block ... done.
>>>>> done.
>>>>> Gave up waiting for root file system device. Common problems:
>>>>> - Boot args (cat /proc/cmdline)
>>>>> - Check rootdelay= (did the system wait long enough?)
>>>>> - Missing modules (cat /proc/modules; ls /dev)
>>>>> ALERT! LABEL=ROOT does not exist. Dropping to a shell!
>>>>> Rebooting automatically due to panic= boot argument
>>>>>
>>>>> I'll see if I can find the commit that breaks 6.5.
>>>> I've traced this to the following merge commit:
>>>>
>>>> dave@...as:~/linux/linux$ git bisect good
>>>> ca7ce08d6a063e0ccb91dc57f9bc213120d0d1a7 is the first bad commit
>>>> commit ca7ce08d6a063e0ccb91dc57f9bc213120d0d1a7
>>>> Merge: 1546cd4bfda4 af92c02fb209
>>>> Author: Linus Torvalds <torvalds@...ux-foundation.org>
>>>> Date: Fri Jun 30 11:57:07 2023 -0700
>>>>
>>>> Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
>>>>
>>>> Pull SCSI updates from James Bottomley:
>>>> "Updates to the usual drivers (ufs, pm80xx, libata-scsi, smartpqi,
>>>> lpfc, qla2xxx).
>>>>
>>>> We have a couple of major core changes impacting other systems:
>>>>
>>>> - Command Duration Limits, which spills into block and ATA
>>>>
>>>> - block level Persistent Reservation Operations, which touches block,
>>>> nvme, target and dm
>>>>
>>>> Both of these are added with merge commits containing a cover letter
>>>> explaining what's going on"
>>>>
>>>> * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (187 commits)
>>>> scsi: core: Improve warning message in scsi_device_block()
>>>> scsi: core: Replace scsi_target_block() with scsi_block_targets()
>>>> scsi: core: Don't wait for quiesce in scsi_device_block()
>>>> scsi: core: Don't wait for quiesce in scsi_stop_queue()
>>>> scsi: core: Merge scsi_internal_device_block() and device_block()
>>>> scsi: sg: Increase number of devices
>>>> scsi: bsg: Increase number of devices
>>>> scsi: qla2xxx: Remove unused nvme_ls_waitq wait queue
>>>> scsi: ufs: ufs-pci: Add support for Intel Arrow Lake
>>>> scsi: sd: sd_zbc: Use PAGE_SECTORS_SHIFT
>>>> scsi: ufs: wb: Add explicit flush_threshold sysfs attribute
>>>> scsi: ufs: ufs-qcom: Switch to the new ICE API
>>>> scsi: ufs: dt-bindings: qcom: Add ICE phandle
>>>> scsi: ufs: ufs-mediatek: Set UFSHCD_QUIRK_MCQ_BROKEN_RTC quirk
>>>> scsi: ufs: ufs-mediatek: Set UFSHCD_QUIRK_MCQ_BROKEN_INTR quirk
>>>> scsi: ufs: core: Add host quirk UFSHCD_QUIRK_MCQ_BROKEN_RTC
>>>> scsi: ufs: core: Add host quirk UFSHCD_QUIRK_MCQ_BROKEN_INTR
>>>> scsi: ufs: core: Remove dedicated hwq for dev command
>>>> scsi: ufs: core: mcq: Fix the incorrect OCS value for the device command
>>>> scsi: ufs: dt-bindings: samsung,exynos: Drop unneeded quotes
>>>> ...
>>>>
>>>> dave@...as:~/linux/linux$ lspci
>>>> 00:01.0 RAID bus controller: Silicon Image, Inc. SiI 3124 PCI-X Serial ATA Controller (rev 02)
>>>> 40:01.0 SCSI storage controller: Broadcom / LSI 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 07)
>>>> 40:01.1 SCSI storage controller: Broadcom / LSI 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 07)
>>>> 60:01.0 USB controller: NEC Corporation OHCI USB Controller (rev 41)
>>>> 60:01.1 USB controller: NEC Corporation OHCI USB Controller (rev 41)
>>>> 60:01.2 USB controller: NEC Corporation uPD72010x USB 2.0 Controller (rev 02)
>>>> 60:02.0 IDE interface: Silicon Image, Inc. PCI0680 Ultra ATA-133 Host Controller (rev 02)
>>>> 60:03.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet Controller (rev 02)
>>> This was introduced by the following commit:
>>>
>>> dave@...as:~/linux/linux$ git bisect good
>>> 624885209f31eb9985bf51abe204ecbffe2fdeea is the first bad commit
>>> commit 624885209f31eb9985bf51abe204ecbffe2fdeea
>>> Author: Damien Le Moal <dlemoal@...nel.org>
>>> Date: Thu May 11 03:13:41 2023 +0200
>>>
>>> scsi: core: Detect support for command duration limits
>>>
>>> Introduce the function scsi_cdl_check() to detect if a device supports
>>> command duration limits (CDL). Support for the READ 16, WRITE 16, READ 32
>>> and WRITE 32 commands are checked using the function scsi_report_opcode()
>>> to probe the rwcdlp and cdlp bits as they indicate the mode page defining
>>> the command duration limits descriptors that apply to the command being
>>> tested.
>>>
>>> If any of these commands support CDL, the field cdl_supported of struct
>>> scsi_device is set to 1 to indicate that the device supports CDL.
>>>
>>> Support for CDL for a device is advertizes through sysfs using the new
>>> cdl_supported device attribute. This attribute value is 1 for a device
>>> supporting CDL and 0 otherwise.
>>>
>>> Signed-off-by: Damien Le Moal <dlemoal@...nel.org>
>>> Reviewed-by: Hannes Reinecke <hare@...e.de>
>>> Co-developed-by: Niklas Cassel <niklas.cassel@....com>
>>> Signed-off-by: Niklas Cassel <niklas.cassel@....com>
>>> Link: https://lore.kernel.org/r/20230511011356.227789-9-nks@flawful.org
>>> Signed-off-by: Martin K. Petersen <martin.petersen@...cle.com>
>>>
>>> Documentation/ABI/testing/sysfs-block-device | 9 ++++
>>> drivers/scsi/scsi.c | 81 ++++++++++++++++++++++++++++
>>> drivers/scsi/scsi_scan.c | 3 ++
>>> drivers/scsi/scsi_sysfs.c | 2 +
>>> include/scsi/scsi_device.h | 3 ++
>>> 5 files changed, 98 insertions(+)
>>>
>>> Sometimes I see when booting a bad commit:
>>> [...]
>>> Begin: Running /scripts/local-block ... done.
>>> Begin: Running /scripts/local-block ... done.
>>> Begin: Running /scripts/local-block ... done.
>>> done.
>>> Gave up waiting for root file system device. Common problems:
>>> - Boot args (cat /proc/cmdline)
>>> - Check rootdelay= (did the system wait long enough?)
>>> - Missing modules (cat /proc/modules; ls /dev)
>>> ALERT! LABEL=ROOT does not exist. Dropping to a shell!
>>> Rebooting automatically due to panic= boot argument
>>> ata4: SATA link down (SStatus 0 SControl 0)
>>> ata5: SATA link down (SStatus 0 SControl 0)
>>> ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
>>> ata6.00: ATA-10: ST4000VN008-2DR166, SC60, max UDMA/133
>>> ata6.00: 7814037168 sectors, multi 0: LBA48 NCQ (depth 31/32)
>>> ata6.00: configured for UDMA/100
>>> scsi 5:0:0:0: Direct-Access ATA ST4000VN008-2DR1 SC60 PQ: 0 ANSI: 5
>> System boots master at e56b2b605799 if I disable CDL:
>>
>> dave@...as:~/linux/linux$ git diff drivers/scsi/scsi.c
>> diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
>> index d0911bc28663..dc3a283ebd75 100644
>> --- a/drivers/scsi/scsi.c
>> +++ b/drivers/scsi/scsi.c
>> @@ -578,6 +578,8 @@ static bool scsi_cdl_check_cmd(struct scsi_device *sdev, u8 opcode, u16 sa,
>> int ret;
>> u8 cdlp;
>>
>> + return false;
>> +
>> /* Check operation code */
>> ret = scsi_report_opcode(sdev, buf, SCSI_CDL_CHECK_BUF_LEN, opcode, sa);
>> if (ret <= 0)
> It is weird that this solves anything... the MAINTENANCE_IN command issued by
> scsi_report_opcode() ends up being emulated in libata with
> ata_scsiop_maint_in(). There are no actual commands issued to the drive, so
> nothing that could actually fail/cause issues. By the time this is issued, the
> ATA drive is also fully probed...
>
> Or is the drive connected to the Broadcom HBA you have ? In that case, libata is
> not used and the HBA FW SAT (scsi-ata-translation) is likely to blame.
The /boot, / and swap partitions reside on an ST373207LW drive connected to a Broadcom HBA.
An ST4000VN008-2DR1 drive is connected to the Silicon Image, Inc. SiI 3124 PCI-X Serial
ATA Controller and is mounted on /home. There is also a CD-ROM drive connected to the
Silicon Image, Inc. PCI0680 Ultra ATA-133 Host Controller, and another ST4000VN008-2DR1
drive connected to a Broadcom HBA. There are two Broadcom HBAs.

I think the issue is with the root ST373207LW drive. The console output shows that
LABEL=ROOT does not exist when the boot fails.

Your change appeared to affect only actual SCSI drives. That's why I tried disabling CDL.
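
For reference, the check that the early "return false" skips decodes the reply to
REPORT SUPPORTED OPERATION CODES. Below is a minimal userspace sketch of that decoding,
not the kernel code. The helper name rsoc_reports_cdl() is hypothetical, and the bit
positions (SUPPORT in bits 2:0 and CDLP in bits 4:3 of byte 1) are my reading of the SPC
one_command parameter data format, so treat them as assumptions to verify against the
standard and drivers/scsi/scsi.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: decide whether a one_command REPORT SUPPORTED
 * OPERATION CODES reply advertises command duration limits (CDL).
 * Assumed layout: byte 1 bits 2:0 = SUPPORT, byte 1 bits 4:3 = CDLP.
 */
static bool rsoc_reports_cdl(const uint8_t *buf, size_t len)
{
	uint8_t support, cdlp;

	assert(buf != NULL);

	/* Need at least bytes 0 and 1 of the parameter data. */
	if (len < 2)
		return false;

	/* SUPPORT == 011b: command supported as defined by the standard. */
	support = buf[1] & 0x07;
	if (support != 0x03)
		return false;

	/*
	 * CDLP == 1 or 2 selects one of the CDL mode pages (A/B, or
	 * T2A/T2B when RWCDLP in byte 0 is set); 0 means no CDL.
	 */
	cdlp = (buf[1] & 0x18) >> 3;

	return cdlp == 0x01 || cdlp == 0x02;
}
```

A device that mishandles the command never reaches this decoding, which is why
returning false before scsi_report_opcode() is called is a useful experiment: it keeps
the command from being sent to the drive at all.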
>
> Could you send a full dmesg output for a clean boot and for a failed one so that
> I can compare ?
I'll try to get this together tomorrow.
Dave
--
John David Anglin dave.anglin@...l.net