lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 5 Dec 2006 12:57:47 -0600
From:	Andy Fleming <afleming@...escale.com>
To:	"Maciej W. Rozycki" <macro@...ux-mips.org>
Cc:	Andrew Morton <akpm@...l.org>,
	Ben Collins <ben.collins@...ntu.com>,
	linux-kernel@...r.kernel.org, Linus Torvalds <torvalds@...l.org>,
	Jeff Garzik <jeff@...zik.org>
Subject: Re: [PATCH] Export current_is_keventd() for libphy


On Dec 5, 2006, at 11:48, Maciej W. Rozycki wrote:

>
>  Essentially there is a race when disconnecting from a PHY, because
> interrupt delivery uses the event queue for processing.  The  
> function to
> handle interrupts that is called from the event queue is phy_change().
> It takes a pointer to a structure that is associated with the PHY.   
> At the
> time phy_stop_interrupts() is called there may be one or more calls to
> phy_change() still pending on the event queue.  They may not be  
> able to be
> processed until the structure passed to phy_change() have been  
> freed, at
> which point calling the function is wrong.
>
>  One way of avoiding it is calling flush_scheduled_work() from
> phy_stop_interrupts().  This is fine as long as a caller of
> phy_stop_interrupts() (not necessarily the immediate one calling into
> libphy) does not hold the netlink lock.
>
>  If a caller indeed holds the netlink lock, then a driver effectively
> calling phy_stop_interrupts() may arrange for the function to be  
> itself
> scheduled through the event queue.  This has the effect of avoiding  
> the
> race as well, as the queue is processed in order, except it causes  
> more
> hassle for the driver.  Hence the choice was left to the driver's  
> author
> -- if a driver "knows" the netlink lock is not going to be held at  
> that 
> point, it may call phy_stop_interrupts() directly, otherwise it  
> shall use
> the event queue.


We need to make sure there are no more pending phy_change()  
invocations in the work queue, this is true, however I'm not  
convinced that this avoids the problem.  And now that I come back to  
this email after Linus's response, let me add that I agree with his  
suggestion.  I still don't think it solves the original problem,  
though.  Unless I'm missing something, Maciej's suggested fix (having  
the driver invoke phy_stop_interrupts() from a work queue) doesn't  
stop the race:

* Schedule stop_interrupts and freeing of memory.
* interrupt occurs, and schedules phy_change
* work_queue triggers, and stop_interrupts is invoked.  It doesn't  
call flush_scheduled_work, because it's being called from keventd.
* The invoker of stop_interrupts (presumably some function in the  
driver) frees up memory, including the phy_device.
* phy_change is invoked() from the work queue, and starts accessing  
freed memory

Am I misunderstanding how work queues operate?

If I'm right, an ugly solution would have to disable the PHY  
interrupts before scheduling the cleanup.  But is there really no way  
to tell the kernel to squash all pending work that came from *this*  
work_queue?



>
>  With such an assumption in place the function has to check somehow
> whether it has been scheduled through the queue or not and act
> accordingly, which is why that "if" clause is there.
>
>  Now I gather the conclusion was the whole mess was going to be  
> included
> within libphy and not exposed to Ethernet MAC drivers.  This way the
> library would schedule both phy_stop_interrupts() and  
> mdiobus_unregister()
> (which is actually the function freeing the PHY device structure)  
> through
> the event queue as needed without a MAC driver having to know.


I suggested this, mostly so that drivers wouldn't have to be aware of  
this.  But I'm not quite sure what happens when you unload a module.   
Does some stuff stay behind if needed?  If you unload the ethernet  
driver, that will usually remove the bus controller for the PHY,  
which would prevent any scheduled code from accessing the PHY.


Andy


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ