Message-Id: <200811031828.07285.bzolnier@gmail.com>
Date: Mon, 3 Nov 2008 18:28:07 +0100
From: Bartlomiej Zolnierkiewicz <bzolnier@...il.com>
To: Ingo Molnar <mingo@...e.hu>
Cc: linux-kernel@...r.kernel.org, Alok N Kataria <akataria@...are.com>,
	Robert Hancock <hancockr@...w.ca>, Arjan van de Ven <arjan@...radead.org>,
	Pavel Machek <pavel@...e.cz>
Subject: Re: upstream regression (IO-APIC?)

On Sunday 02 November 2008, Bartlomiej Zolnierkiewicz wrote:
> On Sunday 02 November 2008, Bartlomiej Zolnierkiewicz wrote:
> > On Thursday 30 October 2008, Robert Hancock wrote:
> > > Bartlomiej Zolnierkiewicz wrote:
> > > > The current Linus tree as of commit e946217e4fdaa67681bbabfa8e6b18641921f750
> > > > is broken for me. I get either the following panic (see log from qemu below)
> > > > or lost IRQs on ATA init... Is this a known issue?
> > > >
> > > > PS The tree that I used before and was supposedly good (sorry, I'm too tired
> > > > to verify it now) had commit 57f8f7b60db6f1ed2c6918ab9230c4623a9dbe37 at head.
> >
> > Unfortunately 57f8f7b60db6f1ed2c6918ab9230c4623a9dbe37 (v2.6.28-rc1)
> > is also bad. Bisecting it further was a real pain (i.e. I hit broken
> > build with x86 irqbalance changes, broken build with netfilter nat
> > changes and jbd journal problem). In the end it turned out that 2.6.27
> > is bad too! However with 2.6.27 the panic occurs only once per several
> > attempts and if there is no panic kernel boots normally (no lost IRQs).
> >
> > [...]
> >
> > I finally managed to narrow it down to change making x86 use tsc_khz
> > for loops_per_jiffy -- commit 3da757daf86e498872855f0b5e101f763ba79499
> > ("x86: use cpu_khz for loops_per_jiffy calculation"). This approach
> > seems too simplistic (as I see now Arjan & Pavel expressed concerns
> > about it back when the patch was posted initially [1][2]). Also it
> > would probably be preferred to re-use existing preset_lpj variable
> > (just like KVM does it for similar purpose [3]) instead of adding a
> > lpj_tsc one and increasing complexity.
>
> It turned out that I can boot a kernel with different config with
> HZ == 250 just fine and switching to HZ == 1000 makes it fail.
>
>
> Looking into it some more:
>
> HZ == 250 kernel (good):
>
> Calibrating delay loop (skipped), value calculated using timer frequency.. 2986.79 BogoMIPS (lpj=5973580)
>
> HZ == 1000 kernel (bad):
>
> Calibrating delay loop (skipped), using tsc calculated value.. 2990.35 BogoMIPS (lpj=1495176)
>
> HZ == 1000 kernel with hackyfix (good):
>
> Calibrating delay using timer specific routine.. 3016.68 BogoMIPS (lpj=6033376)
>
>
> Argggh... lpj is used for udelay() & friends so this bug is quite
> dangerous (since udelay() & friends are used for hardware delays)...

It may be not as severe as I initially thought, (obviously) the real
hardware works fine:

calibrate_delay_direct(): lpj=1495884
Calibrating delay loop (skipped), value calculated using timer frequency.. 2990.36 BogoMIPS (lpj=1495183)

So the issue only affects qemu ATM.

Thanks,
Bart
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
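
For readers comparing the calibration lines quoted in the message, the sketch below shows how an lpj value and HZ relate to the printed BogoMIPS figure and to the number of delay-loop iterations per microsecond that udelay() depends on. It is an illustration only, not kernel code: the function names bogomips and loops_per_usec are made up for this example, and the integer formula is assumed to mirror the one the kernel uses when it prints the calibration message.

```python
def bogomips(lpj: int, hz: int) -> str:
    """Format BogoMIPS from loops_per_jiffy using integer math.

    Assumption: this mirrors the kernel's calibration printout, i.e.
    lpj / (500000 / HZ) for the whole part and
    (lpj / (5000 / HZ)) % 100 for the two decimals.
    """
    whole = lpj // (500000 // hz)
    frac = (lpj // (5000 // hz)) % 100
    return f"{whole}.{frac:02d}"


def loops_per_usec(lpj: int, hz: int) -> float:
    """Delay-loop iterations per microsecond implied by lpj and HZ."""
    return lpj * hz / 1_000_000


# (label, HZ, lpj) taken from the log lines quoted in the message.
samples = [
    ("HZ == 250 kernel (good)", 250, 5973580),
    ("HZ == 1000 kernel (bad)", 1000, 1495176),
    ("HZ == 1000 real hardware", 1000, 1495183),
]

for label, hz, lpj in samples:
    print(f"{label}: {bogomips(lpj, hz)} BogoMIPS, "
          f"{loops_per_usec(lpj, hz):.3f} loops/us")
```

With these inputs the formatted values come out as 2986.79, 2990.35 and 2990.36, matching the quoted log lines. Both the HZ == 250 and HZ == 1000 values imply roughly 1.5 thousand loops per microsecond, so the numbers themselves agree with each other; what the arithmetic cannot show is whether the delay loop really executes at that rate under qemu.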