Message-ID: <20111018154542.GB3869@osiris.boeblingen.de.ibm.com>
Date: Tue, 18 Oct 2011 17:45:42 +0200
From: Heiko Carstens <heiko.carstens@...ibm.com>
To: "Jun'ichi Nomura" <j-nomura@...jp.nec.com>
Cc: James Bottomley <James.Bottomley@...senPartnership.com>,
Steffen Maier <maier@...ux.vnet.ibm.com>,
"linux-scsi@...r.kernel.org" <linux-scsi@...r.kernel.org>,
Jens Axboe <axboe@...nel.dk>, Hannes Reinecke <hare@...e.de>,
Linux Kernel <linux-kernel@...r.kernel.org>,
Alan Stern <stern@...land.harvard.edu>,
Thadeu Lima de Souza Cascardo <cascardo@...ux.vnet.ibm.com>,
"Taraka R. Bodireddy" <tarak.reddy@...ibm.com>,
"Seshagiri N. Ippili" <seshagiri.ippili@...ibm.com>,
"Manvanthara B. Puttashankar" <mputtash@...ibm.com>
Subject: Re: [GIT PULL] Queue free fix (was Re: [PATCH] block: Free queue
resources at blk_release_queue())
On Tue, Oct 18, 2011 at 10:31:20PM +0900, Jun'ichi Nomura wrote:
> On 10/17/11 23:06, James Bottomley wrote:
> > On Mon, 2011-10-17 at 17:46 +0900, Jun'ichi Nomura wrote:
> >> On 10/15/11 01:03, James Bottomley wrote:
> >>> On Thu, 2011-10-13 at 15:09 +0200, Steffen Maier wrote:
> >>>> Initially, we encountered use-after-free bugs in
> >>>> scsi_print_command / scsi_dispatch_cmd
> >>>> http://marc.info/?l=linux-scsi&m=130824013229933&w=2
> >>
> >> It is interesting that both this and the older report
> >> got oopsed in scsi_log_send(), while there are other
> >> dereferences of 'cmd' around scsi_dispatch_cmd().
> >> Is there any reason they are special, or is it just by accident?
> >
> > Right, that's why it looks like the command area got freed rather than
> > the command pointer being bogus (0x6b is the poison pattern for freed
> > memory). Perhaps if
> > the reporter could pin down the failing source line, we'd know better
> > what was going on?
>
> Yeah, that might be useful.
The struct scsi_cmnd that was passed to scsi_log_send() had already been freed
(its contents were entirely 0x6b6b6b...).
Since SLUB debugging was turned on, we can see that it was freed from
__scsi_put_command(). Not too much of a surprise.
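
For reference, 0x6b is POISON_FREE from include/linux/poison.h: with slub_debug
poisoning enabled, SLUB fills an object's payload with that byte when it is
freed, so a cmd whose fields all read back as 0x6b6b6b... was dereferenced after
it had gone back to the slab. A minimal sketch of that check (illustrative only,
not code from the kernel; the helper name is made up):

#include <linux/poison.h>	/* POISON_FREE == 0x6b, POISON_END == 0xa5 */
#include <linux/types.h>

/*
 * Illustration only: does an object look like a SLUB-poisoned, freed
 * object?  With poisoning enabled, SLUB fills the payload with
 * POISON_FREE on free and writes POISON_END into the last byte.
 */
static bool looks_like_freed_object(const void *obj, size_t size)
{
	const u8 *p = obj;
	size_t i;

	for (i = 0; i < size - 1; i++)
		if (p[i] != POISON_FREE)
			return false;
	return p[size - 1] == POISON_END;
}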
> One remote possibility I imagined: if the submitting process
> takes very long between blk_start_request and scsi_dispatch_cmd
> and the timeout handler kicks in, can cmd be freed?
>
> Also Tejun's report here could be related to possible data corruption:
> [PATCH] block: make gendisk hold a reference to its queue
> https://lkml.org/lkml/2011/10/16/148
>
> Though I haven't hit the reported oops myself,
> a process closing a removed device may modify freed memory.
> And his patch will fix it.
>
> So if the problem is easily reproducible, I think it's worth
> trying his patch to see if the problem disappears.
In addition, it might be worthwhile to apply this small slub patch as well,
so that the freed object's tracking data contains a full stack trace of the
caller that freed it:
---
mm/slub.c | 11 +++++++++++
1 file changed, 11 insertions(+)
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -27,6 +27,7 @@
 #include <linux/memory.h>
 #include <linux/math64.h>
 #include <linux/fault-inject.h>
+#include <linux/stacktrace.h>
 
 #include <trace/events/kmem.h>
 
@@ -191,11 +192,14 @@ static LIST_HEAD(slab_caches);
 /*
  * Tracking user of a slab.
  */
+#define SAVE_TRACE_NUM 30
 struct track {
 	unsigned long addr;	/* Called from address */
 	int cpu;		/* Was running on cpu */
 	int pid;		/* Pid context */
 	unsigned long when;	/* When did the operation occur */
+	unsigned long trace[SAVE_TRACE_NUM];
+	unsigned long long stamp;
 };
 
 enum track_item { TRACK_ALLOC, TRACK_FREE };
 
@@ -397,10 +401,17 @@ static void set_track(struct kmem_cache
 	struct track *p = get_track(s, object, alloc);
 
 	if (addr) {
+		struct stack_trace trace = {
+			.max_entries = SAVE_TRACE_NUM,
+			.entries = p->trace,
+		};
+
		p->addr = addr;
 		p->cpu = smp_processor_id();
 		p->pid = current->pid;
 		p->when = jiffies;
+		p->stamp = get_clock();
+		save_stack_trace(&trace);
 	} else
 		memset(p, 0, sizeof(struct track));
 }
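
Not part of the patch above, but for completeness: the saved entries could then
be dumped next to the existing addr/cpu/pid/when output, for example from
print_track() in mm/slub.c. A sketch only, assuming the trace[] and stamp
fields added by the patch (the helper name is made up):

/*
 * Sketch only, not part of the patch: dump the stack trace saved by the
 * modified set_track().  Would sit next to print_track() in mm/slub.c.
 */
static void print_track_trace(const char *what, struct track *t)
{
	unsigned int i;

	if (!t->addr)
		return;

	printk(KERN_ERR "INFO: %s in %pS age=%lu cpu=%u pid=%d stamp=%llu\n",
	       what, (void *)t->addr, jiffies - t->when, t->cpu, t->pid,
	       t->stamp);

	for (i = 0; i < SAVE_TRACE_NUM; i++) {
		/* save_stack_trace() terminates short traces with ULONG_MAX */
		if (!t->trace[i] || t->trace[i] == ULONG_MAX)
			break;
		printk(KERN_ERR "\t[<%p>] %pS\n",
		       (void *)t->trace[i], (void *)t->trace[i]);
	}
}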