Message-ID: <0655cea93e52928d3e4a12b4fe2d2a4375492ed3.camel@linux.ibm.com>
Date: Wed, 10 Apr 2024 15:36:18 -0400
From: James Bottomley <jejb@...ux.ibm.com>
To: Cyril Brulebois <kibi@...ian.org>, regressions@...ts.linux.dev,
        stable@...r.kernel.org
Cc: Mike Christie <michael.christie@...cle.com>,
        "Martin K. Petersen" <martin.petersen@...cle.com>,
        Sasha Levin <sashal@...nel.org>, gregkh@...uxfoundation.org,
        Bart Van Assche <bvanassche@....org>,
        Christoph Hellwig <hch@....de>,
        John Garry <john.g.garry@...cle.com>, linux-scsi@...r.kernel.org,
        linux-kernel@...r.kernel.org,
        Diederik de Haas <didi.debian@...ow.org>,
        Salvatore Bonaccorso <carnil@...ian.org>
Subject: Re: [REGRESSION] Loss of some SMART information in v6.1.81

On Wed, 2024-04-10 at 21:32 +0200, Cyril Brulebois wrote:
> Hi,
> 
> Munin uses the following command to get sensor-type information out
> of SMART-aware disks (e.g. temperature):
> 
>     /usr/sbin/smartctl -A --nocheck=standby -d ata /dev/sda
> 
> This broke following an upgrade from v6.1.76 (as found in Debian 12)
> to v6.1.82 (as currently found in the proposed-updates repository for
> the next point release of Debian 12), with smartctl's now reporting:
> 
>     smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.1.0-19-amd64] (local build)
>     Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org
>
>     Device is in SLEEP mode, exit(2)
> 
> This happens on baremetal with 2 pairs of disks:
>  - 2×ST4000VN008-2DR1 (sda, sdb)
>  - 2×ST8000VN004-2M21 (sdc, sdd)
> 
> and that's an obvious lie with one pair doing system stuff and the
> other one doing media stuff.
> 
> This also happens within a Debian 12 QEMU VM running on a Debian 12
> libvirt host, when using a SATA disk, which is what I've used to test
> various builds from the stable/linux-6.1.y branch and associated
> tags.
> 
> Building stable releases, I pinpointed it as a regression between
> v6.1.80 and v6.1.81, then pinpointed it to commit cf33e6ca12d8.
> 
> #regzbot introduced: v6.1.80..v6.1.81
> #regzbot introduced: cf33e6ca12d8
> 
> This is also affecting v6.1.84 and v6.1.85 (released during my git
> bisect session).
> 
> Reported in Debian via: https://bugs.debian.org/1068675 (which
> included a trace with the distribution-provided v6.1.82 package).
> 
> Most recent trace, with v6.1.85 (mainline, using the distribution's
> config but without any patches):
> 
>     [   30.547027] ------------[ cut here ]------------
>     [   30.547034] WARNING: CPU: 0 PID: 697 at drivers/scsi/scsi_lib.c:214 scsi_execute_cmd+0x42/0x2c0 [scsi_mod]
>     [   30.547082] Modules linked in: tls tun intel_rapl_msr intel_rapl_common kvm_intel kvm irqbypass ghash_clmulni_intel sha512_ssse3 sha512_generic sha256_ssse3 sha1_ssse3 snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi aesni_intel snd_hda_codec crypto_simd cryptd rapl snd_hda_core snd_hwdep bochs drm_vram_helper pcspkr drm_ttm_helper snd_pcm iTCO_wdt snd_timer intel_pmc_bxt ttm iTCO_vendor_support snd watchdog soundcore virtio_console virtio_balloon drm_kms_helper button joydev evdev serio_raw sg binfmt_misc fuse loop drm efi_pstore dm_mod configfs qemu_fw_cfg virtio_rng ip_tables x_tables autofs4 ext4 crc32c_generic crc16 mbcache jbd2 hid_generic usbhid hid sd_mod t10_pi crc64_rocksoft crc64 crc_t10dif crct10dif_generic ahci libahci virtio_scsi virtio_blk virtio_net net_failover failover xhci_pci crct10dif_pclmul crct10dif_common crc32_pclmul libata crc32c_intel xhci_hcd psmouse i2c_i801 i2c_smbus scsi_mod scsi_common lpc_ich virtio_pci
>     [   30.547194]  virtio_pci_legacy_dev virtio_pci_modern_dev usbcore usb_common virtio virtio_ring
>     [   30.547205] CPU: 0 PID: 697 Comm: smartctl Not tainted 6.1.85 #1
>     [   30.547210] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
>     [   30.547217] RIP: 0010:scsi_execute_cmd+0x42/0x2c0 [scsi_mod]

This is a different manifestation of the same bug in stable that was
introduced by a backport of scsi_execute_cmd.  The proposed fix for the
domain validation problem here will also sort out this problem:

https://lore.kernel.org/linux-scsi/yq1frvvpymp.fsf@ca-mkp.ca.oracle.com/

James

