lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20090322142409.GA3279@elte.hu>
Date:	Sun, 22 Mar 2009 15:24:09 +0100
From:	Ingo Molnar <mingo@...e.hu>
To:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc:	Steven Rostedt <rostedt@...dmis.org>,
	Frederic Weisbecker <fweisbec@...il.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH 0/5] [GIT PULL] updates for tip/tracing/ftrace


* Paul E. McKenney <paulmck@...ux.vnet.ibm.com> wrote:

> > for one of the hung boxes in the past i waited 24 hours but it 
> > never unwedged itself. The box that hung today is still hanging 
> > and the RCU stall detector is still busy printing out those 
> > backtraces.
> 
> And on the last trace you emailed, the first and the last stall 
> warning are identical according to "diff".  In fact, they are all 
> identical. That is a bit unusual, one would normally expect to see 
> slight differences in the stack based on the scheduling clock 
> interrupt hitting the "longer than average loop" in different 
> places each time.
> 
> That would indicate either a very tight loop or a loop that has 
> interrupts enabled only in one spot.

With the help of Frederic's watchdog patch, i now have an ftrace 
dump of the hang:

 [<ffffffff8028f999>] ? test_ti_thread_flag+0x9/0x14
 [<ffffffff80a04dce>] ? _spin_unlock+0x4b/0x63
 [<ffffffff8036f186>] ? proc_register+0x1b5/0x1c0
 [<ffffffff8029da92>] ? register_irq_proc+0xc4/0xf5
 [<ffffffff8128f140>] ? early_idt_handler+0x0/0x6a
 [<ffffffff8128f700>] ? do_initcalls+0x1e/0x30
 [<ffffffff8128f72f>] ? do_basic_setup+0x1d/0x1f
 [<ffffffff8128fc04>] ? kernel_init+0x69/0xce
 [<ffffffff8020da1a>] ? child_rip+0xa/0x20
 [<ffffffff8020d3ad>] ? restore_args+0x0/0x30
 [<ffffffff8128fb9b>] ? kernel_init+0x0/0xce
 [<ffffffff8020da10>] ? child_rip+0x0/0x20
BUG: Function graph tracer hang!
Dumping ftrace buffer:
---------------------------------
 0)               |                          ack_apic_edge() {
 0)   0.000 us    |                            }
 0)   0.000 us    |                            irq_complete_move();
 0)               |                            move_native_irq() {
 0)   0.000 us    |                              }
 0)   0.000 us    |                            }
 0)               |                            ack_APIC_irq() {
 0)   0.000 us    |                                native_apic_mem_write();
 0)   0.000 us    |                              }
 0)   0.000 us    |                            }
 0)   0.000 us    |                          }
 0)               |                          _spin_unlock() {
 0)   0.000 us    |                            }
 0)   0.000 us    |                            preempt_schedule();
 0)   0.000 us    |                          }
 0)               |                          handle_IRQ_event() {
 0)   0.000 us    |                            }
 0)               |                            timer_interrupt() {
 0)               |                                tick_do_periodic_broadcast() {
 0)   0.000 us    |                                  }
 0)               |                                  cpumask_and() {
 0)   0.000 us    |                                    }
 0)   0.000 us    |                                  }
 0)               |                                  tick_do_broadcast() {
 0)   0.000 us    |                                    }
 0)               |                                    tick_handle_periodic() {
 0)               |                                        write_seqlock() {
 0)   0.000 us    |                                          }
 0)   0.000 us    |                                        }
 0)               |                                        do_timer() {
 1)   ==========> |
 1)               |      smp_apic_timer_interrupt() {
 1)               |          apic_write() {
 1)   0.000 us    |            }
 1)   0.000 us    |          }
 1)   0.000 us    |        }
 1)               |        exit_idle() {
 1)               |            mce_idle_callback() {
 1)   0.000 us    |              }
 1)   0.000 us    |            }
 1)   0.000 us    |            test_ti_thread_flag();
 1)   0.000 us    |          }
 1)   0.000 us    |        }
 1)               |        irq_enter() {
 1)   0.000 us    |          }
 1)               |          tick_check_idle() {
 1)   0.000 us    |            }
 1)               |            tick_nohz_stop_idle() {
 1)               |                ktime_get_ts() {
 1)               |                    clocksource_read() {
 1)   0.000 us    |                      }
 1)   0.000 us    |                    }
 1)   0.000 us    |                  }
 1)   0.000 us    |                  set_normalized_timespec();
 1)   0.000 us    |                }
 1)               |                timespec_to_ktime() {
 1)   0.000 us    |                  }
 1)   0.000 us    |                }
 1)   0.000 us    |              }
 1)               |              ktime_get() {
 1)               |                  getnstimeofday() {
 1)   0.000 us    |                      jiffies_read();
 1)   0.000 us    |                    }
 1)   0.000 us    |                  }
 1)   0.000 us    |                  set_normalized_timespec();
 1)   0.000 us    |                }
 1)               |                timespec_to_ktime() {
 1)   0.000 us    |                  }
 1)   0.000 us    |                }
 1)   0.000 us    |              }
 1)   0.000 us    |            }
 1)   0.000 us    |            tick_nohz_update_jiffies();
 1)   0.000 us    |          }
 1)   0.000 us    |        }
 1)               |        local_apic_timer_interrupt() {
 1)               |            tick_periodic() {
 1)               |                account_process_tick() {
 1)   0.000 us    |                  }
 1)   0.000 us    |                }
 1)               |                run_local_timers() {
 1)   0.000 us    |                  }
 1)               |                  raise_softirq() {
 1)   0.000 us    |                    }
 1)   0.000 us    |                  }
 1)   0.000 us    |                }
 1)               |                rcu_pending() {
 1)   0.000 us    |                    check_cpu_stall();
 0)               |                                            update_wall_time() {
 0)   0.000 us    |                                                jiffies_read();
 0)   0.000 us    |                                              }
 0)   0.000 us    |                                              clocksource_adjust();
 0)   0.000 us    |                                              update_xtime_cache();
 0)               |                                              change_clocksource() {
 0)   0.000 us    |                                                  _spin_lock_irqsave();
 0)               |                                                  _spin_unlock_irqrestore() {
 0)   0.000 us    |                                                    }
 0)   0.000 us    |                                                    preempt_schedule();
 0)   0.000 us    |                                                  }
 0)   0.000 us    |                                                }
 0)   0.000 us    |                                              }
 0)               |                                              update_vsyscall() {
 0)   0.000 us    |                                                  _spin_lock();
 0)   0.000 us    |                                                }
 0)               |                                                write_sequnlock() {
 0)   0.000 us    |                                                    test_ti_thread_flag();
 0)   0.000 us    |                                                    preempt_schedule();
 0)   0.000 us    |                                                  }
 0)   0.000 us    |                                                }
 0)   0.000 us    |                                              }
 0)   0.000 us    |                                            }
 0)   0.000 us    |                                            calc_load();
 0)   0.000 us    |                                          }
 0) ! 1000.000 us |                                        }
 0)               |                                        write_sequnlock() {
 0)   0.000 us    |                                            test_ti_thread_flag();
 0)   0.000 us    |                                            preempt_schedule();
 0)   0.000 us    |                                          }
 0)   0.000 us    |                                        }
 0)               |                                        update_process_times() {
 0)               |                                            account_system_time() {
 0)   0.000 us    |                                              }
 0)   0.000 us    |                                            }
 0)   0.000 us    |                                          }
 0)               |                                          run_local_timers() {
 0)   0.000 us    |                                            }
 0)               |                                            raise_softirq() {
 0)   0.000 us    |                                              }
 0)   0.000 us    |                                            }
 0)   0.000 us    |                                          }
 0)               |                                          rcu_pending() {
 0)   0.000 us    |                                              check_cpu_stall();
 0)   0.000 us    |                                            }
 0)   0.000 us    |                                          }
 0)               |                                          rcu_check_callbacks() {
 0)   0.000 us    |                                            }
 1) ! 1000.000 us |                  }
 1)               |                  __rcu_pending() {
 1)   0.000 us    |                    }
 1)   0.000 us    |                  }
 1) ! 1000.000 us |                }
 1)   0.000 us    |                printk_tick();
 1)               |                scheduler_tick() {
 1)               |                    ktime_get_ts() {
 1)               |                        clocksource_read() {
 1)   0.000 us    |                          }
 1)   0.000 us    |                        }
 1)   0.000 us    |                      }
 1)   0.000 us    |                      set_normalized_timespec();
 1)   0.000 us    |                    }
 1)               |                    timespec_to_ktime() {
 1)   0.000 us    |                      }
 1)   0.000 us    |                    }
 1)   0.000 us    |                  }
 1)   0.000 us    |                  _spin_lock();
 1)   0.000 us    |                  update_rq_clock();
 1)   0.000 us    |                  update_cpu_load();
 1)   0.000 us    |                  task_tick_idle();
 1)               |                  perf_counter_task_tick() {
 1)               |                      perf_swcounter_event() {
 1)   0.000 us    |                          perf_swcounter_ctx_event();
 1)   0.000 us    |                          perf_swcounter_ctx_event();
 1)   0.000 us    |                          test_ti_thread_flag();
 1)   0.000 us    |                        }
 1)   0.000 us    |                      }
 1)   0.000 us    |                    }
 1)   0.000 us    |                  }
 1)   0.000 us    |                }
 1) ! 1000.000 us |              }
 1) ! 1000.000 us |            }
 1) ! 1000.000 us |          }
 1) ! 1000.000 us |        }
 1) ! 1000.000 us |      }
 1)   <========== |
 1) ! 1000.000 us |    }
 1) ! 1000.000 us |  }
---------------------------------
FAILED!
initcall init_graph_trace+0x0/0x12 returned -1 after 324385776 usecs
initcall init_graph_trace+0x0/0x12 returned with error code -1 
calling  init_bts_trace+0x0/0x12 @ 1

Unfortunately it's rather short, due to the recent shrinking of the 
default ftrace buffer sizes.

Which makes me wonder ... maybe there's a connection?

Also, the time deltas look very weird. Either 0 or 1 msec. That's 
characteristic of a sched_clock() that has fallen back to a jiffies 
clocksource.

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ