lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Sat, 13 Dec 2014 00:03:31 +0200
From:	Imre Deak <imre.deak@...el.com>
To:	Peter Hurley <peter@...leysoftware.com>
Cc:	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Jiri Slaby <jslaby@...e.cz>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/3] vt: fix console lock vs. kernfs s_active lock order

On Fri, 2014-12-12 at 16:03 -0500, Peter Hurley wrote:
> On 12/12/2014 03:29 PM, Imre Deak wrote:
> > Hi Peter,
> > 
> > thanks for your review.
> > 
> > On Fri, 2014-12-12 at 13:32 -0500, Peter Hurley wrote:
> >> Hi Imre,
> >>
> >> On 12/12/2014 11:38 AM, Imre Deak wrote:
> >>> Currently there is a lock order problem between the console lock and the
> >>> kernfs s_active lock of the console driver's bind sysfs entry. When
> >>> writing to the sysfs entry the lock order is first s_active then console
> >>> lock, when unregistering the console driver via
> >>> do_unregister_con_driver() the order is the opposite. See the below
> >>> bugzilla reference for one instance of a lockdep backtrace.
> >>
> >> This description didn't make sense to me because the driver core doesn't
> >> try to take the console_lock. So I had to go pull the lockdep report
> >> and I'm not sure I agree with your analysis.
> >>
> >> I see a three-way dependency which includes the fb atomic notifier call
> >> chain?
> > 
> > From the lockdep report in the bugzilla ticket I referenced, you can see
> > the following two paths:
> > 
> > i915_driver_load()
> >   console_lock()  -> takes console_sem
> >   do_unregister_con_driver()
> >     vtconsole_deinit_device()
> >       device_remove_file()
> >         ...
> >         __kernfs_remove()
> >           kernfs_drain() ->
> >             takes s_active rwsem for the console's bind sysfs entry
> >             (tracked via kn->dep_map)
> 
> I don't see this call chain.
> 
> I see: (some obvious intermediate calls redacted)
> 
>   i915_driver_unload
>     do_unregister_framebuffer
>       fb_notifier_call_chain
>         fbcon_event_notify
>           do_unregister_con_driver
>             device_remove_file
>               sysfs_addrm_finish
> 
> which has console_lock => fb notifier lock => kernfs lock dependency.

Ah, right, there are two dmesg logs in the bug report, your sequence is
the first one [1] and I talked about the second [2].

[1] https://bugs.freedesktop.org/attachment.cgi?id=87716
[2] https://bugs.freedesktop.org/attachment.cgi?id=107602

> 
> My point being that this occurs only because the fb notifier call chain
> requires console_lock => fb notifier lock dependency ordering; that and
> fbcon assumes that it can do whatever in the notifier.

Yes this is correct and I don't see any place where we would have the
opposite order.

The lockdep report in [1] is really just the same problem I described
which is the s_active->console_lock dependency:

CPU0                                   CPU1
----                                   ----
lock((fb_notifier_list).rwsem);
                                       lock(console_lock);
                                       lock((fb_notifier_list).rwsem);
lock(s_active#114);
*** DEADLOCK ***

This can lead to a deadlock only because s_active already depends on
console_lock. After removing this dependency I can't see any other way
these locks could deadlock.

> I would like to see more thorough justification for why this change
> belongs in vt, and not in fbcon. That was my point about the 3-way
> dependency.

Since it's really about the dependency between console_lock and s_active
and fbcon is not involved in all code paths where we take these locks
(like in the [2] case) but vt is.

--Imre

> 
> Regards,
> Peter Hurley
> 
> 
> > vfs_write()   for the above console bind sysfs entry
> >   kernfs_fop_write()
> >     kernfs_get_active() ->
> >       takes s_active rwsem for the above sysfs entry
> >     ...
> >       store_bind() -> takes console_sem
> > 
> > So you have console_sem->s_active ordering on one path and
> > s_active->console_sem ordering on the other.
> > 
> > This patch gets rid of the ordering problem and the related lockdep
> > warning.
> > 
> > --Imre
> > 
> >>
> >> Regards,
> >> Peter Hurley
> >>
> >>> Fix this by unregistering the console driver from a deferred work, where
> >>> we can safely drop the console lock while unregistering the device and
> >>> corresponding sysfs entries (which in turn acquire s_active). Note that
> >>> we have to keep the console driver slot in the registered_con_driver
> >>> array reserved for the driver that's being unregistered until it's fully
> >>> removed. Otherwise a concurrent call to do_register_con_driver could
> >>> try to reuse the same slot and fail when registering the corresponding
> >>> device with a minor index that's still in use.
> >>>
> >>> Reference: https://bugs.freedesktop.org/show_bug.cgi?id=70523
> >>> Signed-off-by: Imre Deak <imre.deak@...el.com>
> >>> ---
> >>>  drivers/tty/vt/vt.c | 51 +++++++++++++++++++++++++++++++++++++++++----------
> >>>  1 file changed, 41 insertions(+), 10 deletions(-)
> >>>
> >>> diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
> >>> index 5dd1880..b9edc77 100644
> >>> --- a/drivers/tty/vt/vt.c
> >>> +++ b/drivers/tty/vt/vt.c
> >>> @@ -108,6 +108,7 @@
> >>>  #define CON_DRIVER_FLAG_MODULE 1
> >>>  #define CON_DRIVER_FLAG_INIT   2
> >>>  #define CON_DRIVER_FLAG_ATTR   4
> >>> +#define CON_DRIVER_FLAG_ZOMBIE 8
> >>>  
> >>>  struct con_driver {
> >>>  	const struct consw *con;
> >>> @@ -153,6 +154,7 @@ static int set_vesa_blanking(char __user *p);
> >>>  static void set_cursor(struct vc_data *vc);
> >>>  static void hide_cursor(struct vc_data *vc);
> >>>  static void console_callback(struct work_struct *ignored);
> >>> +static void con_driver_unregister_callback(struct work_struct *ignored);
> >>>  static void blank_screen_t(unsigned long dummy);
> >>>  static void set_palette(struct vc_data *vc);
> >>>  
> >>> @@ -180,6 +182,7 @@ static int blankinterval = 10*60;
> >>>  core_param(consoleblank, blankinterval, int, 0444);
> >>>  
> >>>  static DECLARE_WORK(console_work, console_callback);
> >>> +static DECLARE_WORK(con_driver_unregister_work, con_driver_unregister_callback);
> >>>  
> >>>  /*
> >>>   * fg_console is the current virtual console,
> >>> @@ -3597,7 +3600,8 @@ static int do_register_con_driver(const struct consw *csw, int first, int last)
> >>>  	for (i = 0; i < MAX_NR_CON_DRIVER; i++) {
> >>>  		con_driver = &registered_con_driver[i];
> >>>  
> >>> -		if (con_driver->con == NULL) {
> >>> +		if (con_driver->con == NULL &&
> >>> +		    !(con_driver->flag & CON_DRIVER_FLAG_ZOMBIE)) {
> >>>  			con_driver->con = csw;
> >>>  			con_driver->desc = desc;
> >>>  			con_driver->node = i;
> >>> @@ -3660,16 +3664,10 @@ int do_unregister_con_driver(const struct consw *csw)
> >>>  
> >>>  		if (con_driver->con == csw &&
> >>>  		    con_driver->flag & CON_DRIVER_FLAG_MODULE) {
> >>> -			vtconsole_deinit_device(con_driver);
> >>> -			device_destroy(vtconsole_class,
> >>> -				       MKDEV(0, con_driver->node));
> >>>  			con_driver->con = NULL;
> >>> -			con_driver->desc = NULL;
> >>> -			con_driver->dev = NULL;
> >>> -			con_driver->node = 0;
> >>> -			con_driver->flag = 0;
> >>> -			con_driver->first = 0;
> >>> -			con_driver->last = 0;
> >>> +			con_driver->flag = CON_DRIVER_FLAG_ZOMBIE;
> >>> +			schedule_work(&con_driver_unregister_work);
> >>> +
> >>>  			return 0;
> >>>  		}
> >>>  	}
> >>> @@ -3678,6 +3676,39 @@ int do_unregister_con_driver(const struct consw *csw)
> >>>  }
> >>>  EXPORT_SYMBOL_GPL(do_unregister_con_driver);
> >>>  
> >>> +static void con_driver_unregister_callback(struct work_struct *ignored)
> >>> +{
> >>> +	int i;
> >>> +
> >>> +	console_lock();
> >>> +
> >>> +	for (i = 0; i < MAX_NR_CON_DRIVER; i++) {
> >>> +		struct con_driver *con_driver = &registered_con_driver[i];
> >>> +
> >>> +		if (!(con_driver->flag & CON_DRIVER_FLAG_ZOMBIE))
> >>> +			continue;
> >>> +
> >>> +		console_unlock();
> >>> +
> >>> +		vtconsole_deinit_device(con_driver);
> >>> +		device_destroy(vtconsole_class, MKDEV(0, con_driver->node));
> >>> +
> >>> +		if (WARN_ON_ONCE(con_driver->con))
> >>> +			con_driver->con = NULL;
> >>> +		con_driver->desc = NULL;
> >>> +		con_driver->dev = NULL;
> >>> +		con_driver->node = 0;
> >>> +		WARN_ON_ONCE(con_driver->flag != CON_DRIVER_FLAG_ZOMBIE);
> >>> +		con_driver->flag = 0;
> >>> +		con_driver->first = 0;
> >>> +		con_driver->last = 0;
> >>> +
> >>> +		console_lock();
> >>> +	}
> >>> +
> >>> +	console_unlock();
> >>> +}
> >>> +
> >>>  /*
> >>>   *	If we support more console drivers, this function is used
> >>>   *	when a driver wants to take over some existing consoles
> >>>
> >>
> > 
> > 
> 


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ