Message-ID: <cd3b6e5c-6963-40e4-b040-2fc523873937@rowland.harvard.edu>
Date: Fri, 12 Apr 2024 14:32:00 -0400
From: Alan Stern <stern@...land.harvard.edu>
To: Sam Sun <samsun1006219@...il.com>
Cc: linux-kernel@...r.kernel.org, linux-usb@...r.kernel.org,
  Greg KH <gregkh@...uxfoundation.org>, swboyd@...omium.org,
  ricardo@...liere.net, hkallweit1@...il.com, heikki.krogerus@...ux.intel.com,
  mathias.nyman@...ux.intel.com, royluo@...gle.com,
  syzkaller-bugs@...glegroups.com, xrivendell7@...il.com
Subject: Re: [Linux kernel bug] general protection fault in disable_store

On Fri, Apr 12, 2024 at 02:11:49PM -0400, Alan Stern wrote:
> On Sat, Apr 13, 2024 at 12:26:07AM +0800, Sam Sun wrote:
> > On Fri, Apr 12, 2024 at 10:40 PM Alan Stern <stern@...land.harvard.edu> wrote:
> > > I suspect the usb_hub_to_struct_hub() call is racing with the
> > > spinlock-protected region in hub_disconnect() (in hub.c).
> > >
> > > > If there is any other thing I could help, please let me know.
> > >
> > > Try the patch below.  It should eliminate that race, which hopefully
> > > will fix the problem.
> 
> > I applied this patch and tried to execute several times, no more
> > kernel core dump in my environment. I think this bug is fixed by the
> > patch. But I do have one more question about it. Since it is a data
> > race bug, it has reproducibility issues originally. How can I confirm
> > if a racy bug is fixed by test? This kind of bug might still have a
> > race window but is harder to trigger. Just curious, not for this
> > patch. I think this patch eliminates the racy window.
> 
> If you don't know what is racing, then testing cannot prove that a race 
> is eliminated.  However, if you do know where a race occurs then it's 
> easy to see how mutual exclusion can prevent the race from happening.
> 
> In this case the bug might have had a different cause, something other 
> than a race between usb_hub_to_struct_hub() and hub_disconnect().  If 
> that's so then testing this patch would not be a definite proof that the 
> bug is gone.  But if that race _is_ the cause of the bug then this patch 
> will fix it -- you can see that just by reading the code with no need 
> for testing.

In fact, there still might be a remaining bug, because even with this 
patch it is possible for usb_hub_to_struct_hub() to return NULL.  You 
can test for this possibility by editing the disable_store() routine: 
Move the initializations of the local variables out of the declaration 
lines and into the main part of the routine, and add a delay (like 
msleep(1000)) before the call to usb_hub_to_struct_hub() -- this will 
make the bug a lot easier to trigger (you could even do it by hand).

I think to fully fix this problem it will be necessary to test whether 
hub is NULL before using it.  The same problem ought to exist in 
disable_show(), even though your testing didn't trigger it.
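
As a rough illustration of the NULL test described above, here is a minimal
user-space sketch in C.  It is not the kernel code: the model_* names and
both structures are hypothetical stand-ins, the lookup only imitates the way
usb_hub_to_struct_hub() can return NULL once the hub has been disconnected,
and the choice of -ENODEV as the error value is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures, not the real ones. */
struct model_hub {
	int port_disabled;
};

struct model_hdev {
	struct model_hub *hub;	/* NULL once the hub has been disconnected */
};

/* Imitates usb_hub_to_struct_hub(): may return NULL after a disconnect. */
struct model_hub *model_hub_lookup(struct model_hdev *hdev)
{
	if (!hdev)
		return NULL;
	return hdev->hub;
}

/* Imitates the effect of hub_disconnect() on later lookups. */
void model_hub_disconnect(struct model_hdev *hdev)
{
	hdev->hub = NULL;
}

/*
 * Imitates the store path.  The lookup is done in the body rather than
 * in the declaration, and the result is tested before it is used.
 * Returning -ENODEV here is an assumption of this sketch.
 */
long model_disable_store(struct model_hdev *hdev, int disabled, long count)
{
	struct model_hub *hub;

	hub = model_hub_lookup(hdev);
	if (!hub)
		return -ENODEV;

	hub->port_disabled = disabled;
	return count;
}

/* Small usage example: a store before and after the disconnect. */
void model_selftest(void)
{
	struct model_hub hub = { 0 };
	struct model_hdev hdev = { &hub };

	assert(model_disable_store(&hdev, 1, 2) == 2);
	model_hub_disconnect(&hdev);
	assert(model_disable_store(&hdev, 0, 2) == -ENODEV);
}
```

Without the NULL test, the second call in model_selftest() would dereference
a NULL pointer, which is the user-space analogue of the fault in the report.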

Alan Stern
