lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAA-nRyrW8ehQ4CphjSQGmZi0XJepHO6O2Z7O0j=4-XtxmQR4WQ@mail.gmail.com>
Date:	Wed, 6 May 2015 22:57:27 +0530
From:	pawandeep oza <oza.contri.linux.kernel@...il.com>
To:	linux-kernel@...r.kernel.org,
	malayasen rout <malayasen.rout@...il.com>, oza@...adcom.com
Subject: [KERNEL BUG] do_timer/tick_handover_do_timer 3.10.17

Hi,

Linux version 3.10.17

Problem Statement: The timekeeping/do_timer seems to be stopped and
the core (in this case it is core0) which is aborting is stuck in the
loop which relies on jiffies.


The root cause/Reason:

we have tickless kernel, so cpu goes to deep idle state, and stop
sched tick. tick_nohz_stop_sched_tick

tick_sched_do_timer should then take the job and whichever cpu is
running transfer jiffies incrementing job to itself. which is
tick_sched_do_timer


but when say core0 has raised BUG, ipi_cpu_stop will amek other cpu to
go to stop. and clcokevents_notify/tick_notify/hrtimer_notifiy
eventually seem to be conencted through cpu_chain.

but this code belong to hotplug where cpu_down happen and then it can
successfully call tick_handover_do_timer which will take over the duty
from dying cpu and assign it to the one which is online.

static void tick_handover_do_timer(int *cpup) { if (*cpup ==
tick_do_timer_cpu) { int cpu = cpumask_first(cpu_online_mask);
tick_do_timer_cpu = (cpu < nr_cpu_ids) ? cpu : TICK_DO_TIMER_NONE; } }


but since cpu_down is not getting called, this handover is not happening.
and the last status of the variable tick_do_timer_cpu is always
pointing to DEAD cpu (1,2 or 3).

and core0 waits forever (where if the code relies on the increment of jiffies).


what is the right way to approach this problem, at first it looks like
kernel should take care of handing over the jiffies job to other
online core indepedent of hotplug.

Regards,
Oza.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ