linux-kernel - Re: [PATCH] oprofile: fix crash when accessing freed task structs

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <1281910924.2811.0.camel@pasglop>
Date:	Mon, 16 Aug 2010 08:22:04 +1000
From:	Benjamin Herrenschmidt <benh@...nel.crashing.org>
To:	Robert Richter <robert.richter@....com>
Cc:	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"Carl E. Love" <cel@...ibm.com>,
	Michael Ellerman <michaele@....ibm.com>,
	oprofile-list <oprofile-list@...ts.sourceforge.net>
Subject: Re: [PATCH] oprofile: fix crash when accessing freed task structs

On Fri, 2010-08-13 at 17:39 +0200, Robert Richter wrote:
> On 02.08.10 21:39:33, Benjamin Herrenschmidt wrote:
> 
> > I can't tell that much about the workload, I don't have access to it
> > either, let's say that from my point of view it's a "customer" binary
> > blob.
> > 
> > I can re-trigger it though.
> 
> Benjamin,
> 
> can you try the patch below?

Thanks. I'll see if the folks who have a repro-case can give it a spin
for me.

Cheers,
Ben.

> Thanks,
> 
> -Robert
> 
> >From 4435322debc38097e9e863e14597ab3f78814d14 Mon Sep 17 00:00:00 2001
> From: Robert Richter <robert.richter@....com>
> Date: Fri, 13 Aug 2010 16:29:04 +0200
> Subject: [PATCH] oprofile: fix crash when accessing freed task structs
> 
> This patch fixes a crash during shutdown reported below. The crash is
> caused be accessing already freed task structs. The fix changes the
> order for registering and unregistering notifier callbacks.
> 
> All notifiers must be initialized before buffers start working. To
> stop buffer synchronization we cancel all workqueues, unregister the
> notifier callback and then flush all buffers. After all of this we
> finally can free all tasks listed.
> 
> This should avoid accessing freed tasks.
> 
> On 22.07.10 01:14:40, Benjamin Herrenschmidt wrote:
> 
> > So the initial observation is a spinlock bad magic followed by a crash
> > in the spinlock debug code:
> >
> > [ 1541.586531] BUG: spinlock bad magic on CPU#5, events/5/136
> > [ 1541.597564] Unable to handle kernel paging request for data at address 0x6b6b6b6b6b6b6d03
> >
> > Backtrace looks like:
> >
> >       spin_bug+0x74/0xd4
> >       ._raw_spin_lock+0x48/0x184
> >       ._spin_lock+0x10/0x24
> >       .get_task_mm+0x28/0x8c
> >       .sync_buffer+0x1b4/0x598
> >       .wq_sync_buffer+0xa0/0xdc
> >       .worker_thread+0x1d8/0x2a8
> >       .kthread+0xa8/0xb4
> >       .kernel_thread+0x54/0x70
> >
> > So we are accessing a freed task struct in the work queue when
> > processing the samples.
> 
> Reported-by: Benjamin Herrenschmidt <benh@...nel.crashing.org>
> Signed-off-by: Robert Richter <robert.richter@....com>
> ---
>  drivers/oprofile/buffer_sync.c |   27 ++++++++++++++-------------
>  drivers/oprofile/cpu_buffer.c  |    2 --
>  2 files changed, 14 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/oprofile/buffer_sync.c b/drivers/oprofile/buffer_sync.c
> index a9352b2..b7e755f 100644
> --- a/drivers/oprofile/buffer_sync.c
> +++ b/drivers/oprofile/buffer_sync.c
> @@ -141,16 +141,6 @@ static struct notifier_block module_load_nb = {
>  	.notifier_call = module_load_notify,
>  };
>  
> -
> -static void end_sync(void)
> -{
> -	end_cpu_work();
> -	/* make sure we don't leak task structs */
> -	process_task_mortuary();
> -	process_task_mortuary();
> -}
> -
> -
>  int sync_start(void)
>  {
>  	int err;
> @@ -158,7 +148,7 @@ int sync_start(void)
>  	if (!zalloc_cpumask_var(&marked_cpus, GFP_KERNEL))
>  		return -ENOMEM;
>  
> -	start_cpu_work();
> +	mutex_lock(&buffer_mutex);
>  
>  	err = task_handoff_register(&task_free_nb);
>  	if (err)
> @@ -173,7 +163,10 @@ int sync_start(void)
>  	if (err)
>  		goto out4;
>  
> +	start_cpu_work();
> +
>  out:
> +	mutex_unlock(&buffer_mutex);
>  	return err;
>  out4:
>  	profile_event_unregister(PROFILE_MUNMAP, &munmap_nb);
> @@ -182,7 +175,6 @@ out3:
>  out2:
>  	task_handoff_unregister(&task_free_nb);
>  out1:
> -	end_sync();
>  	free_cpumask_var(marked_cpus);
>  	goto out;
>  }
> @@ -190,11 +182,20 @@ out1:
>  
>  void sync_stop(void)
>  {
> +	/* flush buffers */
> +	mutex_lock(&buffer_mutex);
> +	end_cpu_work();
>  	unregister_module_notifier(&module_load_nb);
>  	profile_event_unregister(PROFILE_MUNMAP, &munmap_nb);
>  	profile_event_unregister(PROFILE_TASK_EXIT, &task_exit_nb);
>  	task_handoff_unregister(&task_free_nb);
> -	end_sync();
> +	mutex_unlock(&buffer_mutex);
> +	flush_scheduled_work();
> +
> +	/* make sure we don't leak task structs */
> +	process_task_mortuary();
> +	process_task_mortuary();
> +
>  	free_cpumask_var(marked_cpus);
>  }
>  
> diff --git a/drivers/oprofile/cpu_buffer.c b/drivers/oprofile/cpu_buffer.c
> index 219f79e..f179ac2 100644
> --- a/drivers/oprofile/cpu_buffer.c
> +++ b/drivers/oprofile/cpu_buffer.c
> @@ -120,8 +120,6 @@ void end_cpu_work(void)
>  
>  		cancel_delayed_work(&b->work);
>  	}
> -
> -	flush_scheduled_work();
>  }
>  
>  /*
> -- 
> 1.7.1.1
> 
> 
> 


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/