lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <SN4PR0401MB35988DE053A515756DBB123C9B250@SN4PR0401MB3598.namprd04.prod.outlook.com>
Date:   Sat, 12 Sep 2020 09:09:37 +0000
From:   Johannes Thumshirn <Johannes.Thumshirn@....com>
To:     Damien Le Moal <Damien.LeMoal@....com>,
        Borislav Petkov <bp@...en8.de>
CC:     Christoph Hellwig <hch@....de>,
        "Martin K. Petersen" <martin.petersen@...cle.com>,
        Hannes Reinecke <hare@...e.de>, Jens Axboe <axboe@...nel.dk>,
        lkml <linux-kernel@...r.kernel.org>
Subject: Re: first bad commit: [5795eb443060148796beeba106e4366d7f1458a6]
 scsi: sd_zbc: emulate ZONE_APPEND commands

On 12/09/2020 04:31, Damien Le Moal wrote:
> On 2020/09/12 8:07, Borislav Petkov wrote:
>> On Sat, Sep 12, 2020 at 12:17:59AM +0200, Borislav Petkov wrote:
>>> Enabling it, fixes the issue.
>>
>> Btw, I just hit the below warn with 5.8, while booting with the above
>> config option enabled. Looks familiar and I didn't trigger it with
>> 5.9-rc4+ so you guys either fixed it or something changed in-between:
>>
>> [    5.124321] ata4.00: NCQ Send/Recv Log not supported
>> [    5.131484] ata4.00: configured for UDMA/133
>> [    5.135847] scsi 3:0:0:0: Direct-Access     ATA      ST8000AS0022-1WL SN01 PQ: 0 ANSI: 5
>> [    5.143972] sd 3:0:0:0: Attached scsi generic sg1 type 0
>> [    5.144033] sd 3:0:0:0: [sdb] Host-aware zoned block device
>> [    5.177105] sd 3:0:0:0: [sdb] 15628053168 512-byte logical blocks: (8.00 TB/7.28 TiB)
>> [    5.184880] sd 3:0:0:0: [sdb] 4096-byte physical blocks
>> [    5.190084] sd 3:0:0:0: [sdb] 29808 zones of 524288 logical blocks + 1 runt zone
>> [    5.197439] sd 3:0:0:0: [sdb] Write Protect is off
>> [    5.202220] sd 3:0:0:0: [sdb] Mode Sense: 00 3a 00 00
>> [    5.207260] sd 3:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>> [    5.356631]  sdb: sdb1
>> [    5.359014] sdb: disabling host aware zoned block device support due to partitions
>> [    5.389941] ------------[ cut here ]------------
>> [    5.394557] WARNING: CPU: 8 PID: 164 at block/blk-settings.c:236 blk_queue_max_zone_append_sectors+0x12/0x40
>> [    5.404300] Modules linked in:
>> [    5.407365] CPU: 8 PID: 164 Comm: kworker/u32:6 Not tainted 5.8.0 #7
>> [    5.413682] Hardware name: Micro-Star International Co., Ltd. MS-7B79/X470 GAMING PRO (MS-7B79), BIOS 1.70 01/23/2019
>> [    5.424191] Workqueue: events_unbound async_run_entry_fn
>> [    5.429482] RIP: 0010:blk_queue_max_zone_append_sectors+0x12/0x40
>> [    5.435543] Code: fe 0f 00 00 53 48 89 fb 0f 86 3d 07 00 00 48 89 b3 e0 03 00 00 5b c3 90 0f 1f 44 00 00 8b 87 40 04 00 00 ff c8 83 f8 01 76 03 <0f> 0b c3 8b 87 f8 03 00 00 39 87 f0 03 00 00 0f 46 87 f0 03 00 00
>> [    5.454099] RSP: 0018:ffffc90000697c60 EFLAGS: 00010282
>> [    5.459306] RAX: 00000000ffffffff RBX: ffff8887fa0a9400 RCX: 0000000000000000
>> [    5.466390] RDX: ffff8887faf0d400 RSI: 0000000000000540 RDI: ffff8887f0dde6c8
>> [    5.473474] RBP: 0000000000007471 R08: 00000000001d1c40 R09: ffff8887fee29ad0
>> [    5.480559] R10: 00000001434bac00 R11: 0000000000358275 R12: 0000000000080000
>> [    5.487643] R13: ffff8887f0dde6c8 R14: ffff8887fa0a9738 R15: 0000000000000000
>> [    5.494726] FS:  0000000000000000(0000) GS:ffff8887fee00000(0000) knlGS:0000000000000000
>> [    5.502757] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [    5.508474] CR2: 0000000000000000 CR3: 0000000002209000 CR4: 00000000003406e0
>> [    5.515558] Call Trace:
>> [    5.518026]  sd_zbc_read_zones+0x323/0x480
>> [    5.522122]  sd_revalidate_disk+0x122b/0x2000
>> [    5.526472]  ? __device_add_disk+0x2f7/0x4e0
>> [    5.530738]  sd_probe+0x347/0x44b
>> [    5.534058]  really_probe+0x2c4/0x3f0
>> [    5.537720]  driver_probe_device+0xe1/0x150
>> [    5.541902]  ? driver_allows_async_probing+0x50/0x50
>> [    5.546852]  bus_for_each_drv+0x6a/0xa0
>> [    5.550683]  __device_attach_async_helper+0x8c/0xd0
>> [    5.555547]  async_run_entry_fn+0x4a/0x180
>> [    5.559636]  process_one_work+0x1a5/0x3a0
>> [    5.563637]  worker_thread+0x50/0x3a0
>> [    5.567300]  ? process_one_work+0x3a0/0x3a0
>> [    5.571480]  kthread+0x117/0x160
>> [    5.574715]  ? kthread_park+0x90/0x90
>> [    5.578377]  ret_from_fork+0x22/0x30
>> [    5.581960] ---[ end trace 94141003236730cf ]---
>> [    5.586578] sd 3:0:0:0: [sdb] Attached SCSI disk
>> [    6.186783] ata5: failed to resume link (SControl 0)
>> [    6.191818] ata5: SATA link down (SStatus 0 SControl 0)
>>


This looks like we're trying to configure zone append max sectors 
on a device that doesn't have the zoned flag set.

> 
> Can you try this:
> 
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index 95018e650f2d..620539162ef1 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -2968,8 +2968,13 @@ static void sd_read_block_characteristics(struct
> scsi_disk *sdkp)
>         } else {
>                 sdkp->zoned = (buffer[8] >> 4) & 3;
>                 if (sdkp->zoned == 1 && !disk_has_partitions(sdkp->disk)) {
> +#ifdef CONFIG_BLK_DEV_ZONED
>                         /* Host-aware */
>                         q->limits.zoned = BLK_ZONED_HA;
> +#else
> +                       /* Host-aware drive is treated as a regular disk */
> +                       q->limits.zoned = BLK_ZONED_NONE;
> +#endif
>                 } else {
>                         /*
>                          * Treat drive-managed devices and host-aware devices
> @@ -3404,12 +3409,12 @@ static int sd_probe(struct device *dev)
>         sdkp->first_scan = 1;
>         sdkp->max_medium_access_timeouts = SD_MAX_MEDIUM_TIMEOUTS;
> 
> +       sd_revalidate_disk(gd);
> +
>         error = sd_zbc_init_disk(sdkp);
>         if (error)
>                 goto out_free_index;
> 
> -       sd_revalidate_disk(gd);
> -


I don't get how my patch may have broken this. If we have 
CONFIG_BLK_DEV_ZONED=n, sd_zbc_init_disk() is stubbed out and return 0
unconditionally. So the call path will remain exactly the same.

>         gd->flags = GENHD_FL_EXT_DEVT;
>         if (sdp->removable) {
>                 gd->flags |= GENHD_FL_REMOVABLE;
> diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
> index 4933e7daf17d..f4dc81d48a01 100644
> --- a/drivers/scsi/sd.h
> +++ b/drivers/scsi/sd.h
> @@ -241,6 +241,8 @@ static inline void sd_zbc_release_disk(struct scsi_disk
> *sdkp) {}
>  static inline int sd_zbc_read_zones(struct scsi_disk *sdkp,
>                                     unsigned char *buf)
>  {
> +       if (sd_is_zoned(sdkp))
> +               sdkp->capacity = 0;
>         return 0;
>  }
> 
> That should fix the above as well as the hang on boot with CONFIG_BLK_DEV_ZONED
> disabled (for that one I do not totally understand what is going on...).
> 
> We do not have any host-aware disk for testing (as far as I know, nobody is
> selling these anymore), so our test setup is a bit lame in this area. We'll rig
> something up with tcmu-runner emulation to add tests for these devices to avoid
> a repeat of such problem. And we'll make sure to add a test for
> host-aware+partitions, since we at least know for sure there is one user :)
> 
> Johannes: The "goto out_free_index;" on sd_zbc_init_disk() failure is wrong I
> think: the disk is already added and a ref taken on the dev, but out_free_index
> does not seem to do cleanup for that. Need to revisit this.

Yes just seen it as well, will be cooking a fix for that.

I'll build a test env to nail this down.

Byte,
	Johannes

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ