Message-ID: <94815265-6db3-35b3-b027-47819b963d4a@interlog.com>
Date:   Fri, 3 Aug 2018 13:44:27 -0400
From:   Douglas Gilbert <dgilbert@...erlog.com>
To:     gaowanlong <gaowanlong@...wei.com>,
        "Martin K. Petersen" <martin.petersen@...cle.com>,
        "linux-scsi@...r.kernel.org" <linux-scsi@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Cc:     "Wencongyang (UVP)" <wencongyang2@...wei.com>,
        "Wanghui (John)" <john.wanghui@...wei.com>,
        guijianfeng <guijianfeng@...wei.com>,
        "lipengfei (Y)" <lipengfei58@...wei.com>,
        qiaonuohan <qiaonuohan@...wei.com>
Subject: Re: [bug report] memory corruption panic caused by SG_IO ioctl()

On 2018-08-03 12:17 PM, Douglas Gilbert wrote:
> On 2018-08-03 11:47 AM, gaowanlong wrote:
>> Doug,
>>
>> On 2018-08-03 04:46 AM, Wanlong Gao wrote:
>>> Hi Martin and all folks,
>>>
>>>
>>>> Recently we found a kernel panic with memory corruption caused by the
>>>> SG_IO ioctl(). It can be easily reproduced by running the following
>>>> reproducer, within minutes. Any ideas?
>>
>>> Which kernel?
>>
>> We've tested with 4.17.11 and 4.18-rc7, and it reproduced on both.
>>
>>> And what are the underlying devices (e.g. does /dev/sg0 refer to a SATA disk,
>>> a real SCSI disk (SAS for example), USB mass storage, etc)?
>>
>> We tested in a qemu-kvm guest, and sg0 refers to a virtual SATA disk.
> 
> Thanks for the prompt reply.
> 
> The first test I am doing, and you can also do, is to replace the virtual
> SATA disk with one or more scsi_debug pseudo SCSI disks. This will tell us
> whether libata has a hand in this (as that was the case in a previous
> syzkaller report on the SG_IO ioctl()).
> 
>>> Also can you get a copy of the kernel panic?
>>
>> Since the call traces are different every time it reproduces, I didn't paste
>> the call trace or the vmcore, but this reproducer is very useful and I believe
>> you can reproduce it easily using the following code.
> 
> Okay.
> 
> As I write I'm running your reproducer with lk 4.18.0-rc6 against pseudo
> scsi_debug "disks". So far no problems (5 minutes) with no noise in syslog.

Ran for an hour before I stopped it. Before that I did a
   echo 1 > /sys/bus/pseudo/drivers/scsi_debug/opts

which causes a lot of noise in syslog. Then I could see every command was
being rejected with "LBA out of range". So I restarted scsi_debug with this:

   modprobe scsi_debug max_luns=8 sector_size=4096 virtual_gb=2000 ndelay=5000

That gives 8 pseudo SCSI disks of 2 TB each. Then it worked; this is from syslog:
   sd 0:0:0:0: scsi_debug: tag=0x7e, cmd 08 f0 a8 77 d3 be 87 5d da 65 79 3f c7

That is certainly strange, a READ(6) [deprecated] with 13 bytes in the command!
But it doesn't seem to hurt scsi_debug. Still running 15 minutes later ...
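
For anyone who wants to check the syslog line above by hand, here is a
rough Python sketch (not part of the reproducer) that decodes those 13
bytes as a READ(6) and compares the LBA against the two disk sizes. The
helper names are made up for illustration. The "default" size of 8 MiB
with 512-byte sectors is an assumption about scsi_debug's defaults, and
the 2000 GiB figure assumes virtual_gb counts in units of 2^30 bytes.

```python
# Fixed CDB length by opcode group (top 3 bits of the opcode byte).
# Groups 3, 6 and 7 are variable-length or vendor specific, so they are omitted.
GROUP_CDB_LEN = {0: 6, 1: 10, 2: 10, 4: 16, 5: 12}


def expected_cdb_len(opcode):
    """Return the fixed CDB length for this opcode's group, or None."""
    return GROUP_CDB_LEN.get(opcode >> 5)


def decode_read6(cdb):
    """Return (lba, blocks) for a READ(6) CDB; only the first 6 bytes are used."""
    if cdb[0] != 0x08:
        raise ValueError("not a READ(6) opcode")
    # READ(6) carries a 21-bit LBA: low 5 bits of byte 1, then bytes 2 and 3.
    lba = ((cdb[1] & 0x1F) << 16) | (cdb[2] << 8) | cdb[3]
    # A transfer length of 0 means 256 blocks in READ(6).
    blocks = cdb[4] or 256
    return lba, blocks


def in_range(lba, blocks, capacity_sectors):
    """True if the whole transfer fits inside the device."""
    return lba + blocks <= capacity_sectors


# Bytes copied from the syslog line in this message.
cdb = bytes.fromhex("08 f0 a8 77 d3 be 87 5d da 65 79 3f c7")
lba, blocks = decode_read6(cdb)

# Assumed scsi_debug default: 8 MiB device, 512-byte sectors.
small_disk = 8 * 2**20 // 512
# virtual_gb=2000 with sector_size=4096, assuming GiB units.
big_disk = 2000 * 2**30 // 4096

print("sent", len(cdb), "bytes; opcode group expects", expected_cdb_len(cdb[0]))
print("lba", lba, "blocks", blocks)
print("fits small disk:", in_range(lba, blocks, small_disk))
print("fits big disk:", in_range(lba, blocks, big_disk))
```

If those assumptions hold, the random LBA lands well past the end of a
small default disk but inside a 2 TB one, which would be consistent with
"LBA out of range" going away after the restart. The 7 bytes after the
control byte are simply ignored by this decode.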

Doug Gilbert