linux-kernel - Re: [PATCH] scsi: fix crash in scsi_remove


Message-ID: <833d4fff-7fb9-31c5-3ed8-ed9a7753a5db@oracle.com>
Date:   Tue, 18 Oct 2022 11:47:50 -0500
From:   Mike Christie <michael.christie@...cle.com>
To:     Khazhismel Kumykov <khazhy@...omium.org>,
        "James E.J. Bottomley" <jejb@...ux.ibm.com>,
        "Martin K. Petersen" <martin.petersen@...cle.com>
Cc:     Gabriel Krisman Bertazi <krisman@...labora.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        linux-scsi@...r.kernel.org, linux-kernel@...r.kernel.org,
        Khazhismel Kumykov <khazhy@...gle.com>
Subject: Re: [PATCH] scsi: fix crash in scsi_remove_host after alloc failure

On 10/17/22 12:11 PM, Khazhismel Kumykov wrote:
> If transport_register_device returns error, shost_gendev has already
> been cleaned up - however since we ignore the error device setup
> continues happily. We will eventually call transport_unregister_device,
> attempting to delete shost_gendev again, resulting in a crash.
> 
> It looks like when this cleanup behavior was added, iscsi was updated,
> but scsi was missed.
> 
> Fixes: cd7ea70bb00a ("scsi: drivers: base: Propagate errors through the transport component")
> 

Where do you crash? Do we also need to handle the cases where
transport_add_device is called directly, the failure is not handled,
and transport_remove_device is then called directly later?

The thing is that transport device addition was supposed to be
optional: even if it failed, we were supposed to still be able to
at least set up the device, boot, and use it. Tools might be missing
some attrs, but we were still supposed to run. I think that's why we
didn't propagate errors originally.

We were also supposed to be able to call transport_configure_device
even if the transport_add_device call failed (see the comment in
that function for info). Does that still work?
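
To make the question concrete, here is a minimal userspace sketch of the
"optional transport attrs" behavior described above. It is only a model,
not kernel code and not a proposed fix: every name in it (model_host,
model_transport_register, and so on) is hypothetical, and it assumes, as
the patch description says, that a failed register has already cleaned up
after itself. The add path treats the failure as non-fatal but remembers
it, and the remove path only unregisters what was actually registered.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_host {
    bool transport_registered; /* did the transport register step succeed? */
    int  teardown_count;       /* times the transport state was torn down */
};

/* Models a register step that cleans up after itself on failure. */
static int model_transport_register(struct model_host *h, bool inject_failure)
{
    if (inject_failure) {
        h->teardown_count++;   /* failure path already tore things down */
        return -1;
    }
    h->transport_registered = true;
    return 0;
}

/* Models the unregister step; calling it after a failed register
 * would tear the same state down a second time. */
static void model_transport_unregister(struct model_host *h)
{
    h->transport_registered = false;
    h->teardown_count++;
}

/* Add path: a transport failure is non-fatal, so the host is still
 * usable, but the outcome is recorded in transport_registered. */
static int model_add_host(struct model_host *h, bool inject_failure)
{
    h->transport_registered = false;
    h->teardown_count = 0;
    (void)model_transport_register(h, inject_failure);
    return 0;
}

/* Remove path: only unregister what was successfully registered. */
static void model_remove_host(struct model_host *h)
{
    if (h->transport_registered)
        model_transport_unregister(h);
}
```

In this model the host still comes up when the transport step fails, and
teardown happens exactly once on either path. Whether the real code can
track the outcome this way is exactly the open question.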