Message-ID: <1307263197.28359.42.camel@t41.thuisdomein>
Date:	Sun, 05 Jun 2011 10:39:55 +0200
From:	Paul Bolle <pebolle@...cali.nl>
To:	Jens Axboe <jaxboe@...ionio.com>
Cc:	"paulmck@...ux.vnet.ibm.com" <paulmck@...ux.vnet.ibm.com>,
	Vivek Goyal <vgoyal@...hat.com>,
	linux kernel mailing list <linux-kernel@...r.kernel.org>
Subject: Re: Mysterious CFQ crash and RCU

On Sun, 2011-06-05 at 08:56 +0200, Jens Axboe wrote:
> Does this fix it? It will introduce a hierarchy that is queue -> ioc
> lock, but as far as I can remember (and tell from a quick look), we
> don't have any dependencies on that order of locking at this moment. So
> should be OK.
> 
> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> index 3c7b537..fa7ef54 100644
> --- a/block/cfq-iosched.c
> +++ b/block/cfq-iosched.c
> @@ -2772,8 +2772,11 @@ static void __cfq_exit_single_io_context(struct cfq_data *cfqd,
>  	smp_wmb();
>  	cic->key = cfqd_dead_key(cfqd);
>  
> -	if (ioc->ioc_data == cic)
> +	if (ioc->ioc_data == cic) {
> +		spin_lock(&ioc->lock);
>  		rcu_assign_pointer(ioc->ioc_data, NULL);
> +		spin_unlock(&ioc->lock);
> +	}
>  
>  	if (cic->cfqq[BLK_RW_ASYNC]) {
>  		cfq_exit_cfqq(cfqd, cic->cfqq[BLK_RW_ASYNC]);
> 

0) I'd guess not: the last thing I tried before simply ripping
io_context.ioc_data out was:

        spin_lock_irqsave(&ioc->lock, flags);
        rcu_read_lock();
        ioc_data = rcu_dereference(ioc->ioc_data);
        rcu_read_unlock();
        if (ioc_data == cic)
                rcu_assign_pointer(ioc->ioc_data, NULL);
        spin_unlock_irqrestore(&ioc->lock, flags);

(By this time I had already wrapped all access to io_context.ioc_data in
rcu_read_lock(), rcu_dereference(), and rcu_read_unlock() voodoo. I had
also wrapped all accesses to the other io_context members - other than
refcount and nr_tasks - in spin_lock_irqsave()/spin_unlock_irqrestore()
on io_context.lock. This produced no warnings and no lockups, but the
code just kept crashing in the exact same location it always did!)
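(For anyone following along outside the kernel: a hypothetical userspace
sketch - not kernel code, all names made up - of what the
rcu_assign_pointer()/rcu_dereference() pair guarantees for the ioc_data
hint, modeled with C11 atomics. rcu_assign_pointer() is a store-release;
rcu_dereference() is shown here, conservatively, as a load-acquire:)

```c
#include <stdatomic.h>
#include <stddef.h>

struct cic {                          /* stand-in for cfq_io_context */
	int key;
};

/* stand-in for io_context.ioc_data, the cached "last used" hint */
static _Atomic(struct cic *) ioc_data;

static void publish(struct cic *c)
{
	/* like rcu_assign_pointer(ioc->ioc_data, c): the release store
	 * orders all prior initialization of *c before the pointer
	 * becomes visible to readers */
	atomic_store_explicit(&ioc_data, c, memory_order_release);
}

static struct cic *read_hint(void)
{
	/* like rcu_dereference(ioc->ioc_data) inside rcu_read_lock():
	 * the acquire load pairs with the release store above, so a
	 * reader that sees the pointer also sees *c fully initialized */
	return atomic_load_explicit(&ioc_data, memory_order_acquire);
}
```

(What this pattern does *not* give you is atomicity of a compare
followed by a clear - two tasks can both see the hint pointing at the
same cic and race on NULLing it, which is why the lock matters.)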

1) Of course, by now I have forgotten what I had in mind when I stopped
working on this last night. My current best bet is to change the core of
cic_free_func() into something like:

        spin_lock_irqsave(&ioc->lock, flags);
        radix_tree_delete(&ioc->radix_root,
                          dead_key >> CIC_DEAD_INDEX_SHIFT);

        rcu_read_lock();
        ioc_data = rcu_dereference(ioc->ioc_data);
        rcu_read_unlock();
        if (ioc_data == cic)
                rcu_assign_pointer(ioc->ioc_data, NULL);

        hlist_del_rcu(&cic->cic_list);
        spin_unlock_irqrestore(&ioc->lock, flags);

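(The common idea in your patch and in the snippet above is that the hint
must be tested and cleared under the same lock that guards its
publication. A minimal userspace analogue - hypothetical names, pthreads
standing in for the kernel spinlock:)

```c
#include <pthread.h>
#include <stddef.h>

struct cic {
	int key;
};

struct io_context {
	pthread_mutex_t lock;     /* stands in for ioc->lock */
	struct cic *ioc_data;     /* cached "most recently used" hint */
};

/* Clear the cached hint iff it still points at @cic. Doing the compare
 * and the clear under one lock closes the window in which another path
 * could re-publish a different pointer between our test and our store. */
static void clear_hint(struct io_context *ioc, struct cic *cic)
{
	pthread_mutex_lock(&ioc->lock);
	if (ioc->ioc_data == cic)
		ioc->ioc_data = NULL;
	pthread_mutex_unlock(&ioc->lock);
}
```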
2) But I must admit I'm not yet at full speed today.


Paul Bolle

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
