Message-Id: <20090519.150156.115978100.davem@davemloft.net>
Date:	Tue, 19 May 2009 15:01:56 -0700 (PDT)
From:	David Miller <davem@...emloft.net>
To:	hong.pham@...driver.com
Cc:	netdev@...r.kernel.org, matheos.worku@....com
Subject: Re: [PATCH 0/1] NIU: fix spurious interrupts

From: "Hong H. Pham" <hong.pham@...driver.com>
Date: Tue, 19 May 2009 17:52:15 -0400

> Unfortunately I don't have a PCIe NIU card to test in an x86 box.
> If the hang does not happen on x86 (which is my suspicion), that
> would rule out a problem with the NIU chip.  That would mean there's
> some interaction between the NIU and sun4v hypervisor that's causing
> the spurious interrupts.

I am still leaning towards the NIU chip, or our programming of
it, as causing this behavior.

Although it's possible that the interrupt logic inside of
Niagara-T2, or how it's hooked up to the internal NIU ASIC
inside of the CPU, might be to blame, I don't consider it likely
given the basic gist of the behavior you see.

To quote section 17.3.2 of the UltraSPARC-T2 manual:

	An interrupt will only be issued if the timer is zero,
	the arm bit is set, and one or more LD's in the LDG have
	their flags set and not masked.

which confirms our understanding of how this should work.
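
As a rough illustration of that rule, here is a minimal software model
of the condition. The struct and function names are hypothetical and
are not taken from the niu driver or the manual; the mask polarity (a
set bit suppresses the LD) is also an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one logical device group (LDG); not driver code. */
struct ldg_model {
	uint32_t timer;  /* countdown timer; must be zero for an interrupt */
	bool     armed;  /* arm bit */
	uint64_t flags;  /* one bit per logical device (LD) with a pending event */
	uint64_t mask;   /* assumed polarity: a set bit masks that LD's flag */
};

/* True only when all three conditions from the quoted rule hold. */
static bool ldg_should_interrupt(const struct ldg_model *ldg)
{
	assert(ldg != NULL);
	return ldg->timer == 0 &&
	       ldg->armed &&
	       (ldg->flags & ~ldg->mask) != 0;
}
```

Under this model, an LDG whose arm bit is clear can never raise an
interrupt, no matter which flags are pending.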

Can you test something, Hong?  Simply trigger the hung case
and, when it happens, read the LDG registers to see if the ARM
bit is set and what the LDG mask bits say.

There might be a bug somewhere that causes us to call
niu_ldg_rearm() improperly.  In particular I'm looking
at that test done in niu_interrupt():

	if (likely(v0 & ~((u64)1 << LDN_MIF)))
		niu_schedule_napi(np, lp, v0, v1, v2);
	else
		niu_ldg_rearm(np, lp, 1);

If we call niu_ldg_rearm() on an LDG being serviced by NAPI
before that poll sequence calls napi_complete(), we could
definitely see this weird behavior.  And whatever causes
that would be the bug to fix.
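
To make the suspected sequence concrete, here is a small userspace
model of it. This is only a sketch: every name in it is hypothetical,
MODEL_LDN_MIF is a placeholder bit index rather than the driver's
LDN_MIF value, and the model assumes the arm bit is cleared when an
interrupt is delivered. It just counts rearms that happen while a
poll is still in flight.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_LDN_MIF 5	/* placeholder bit index, NOT the driver's value */

/* Hypothetical per-LDG bookkeeping for the model. */
struct ldg_state {
	bool armed;			/* models the LDG arm bit */
	bool napi_active;		/* poll scheduled, not yet completed */
	unsigned int premature_rearms;	/* rearms seen while napi_active */
};

/* Models a rearm; records the case where a poll is still running. */
static void model_rearm(struct ldg_state *s)
{
	assert(s != NULL);
	if (s->napi_active)
		s->premature_rearms++;
	s->armed = true;
}

/* Models the quoted test in the interrupt handler. */
static void model_interrupt(struct ldg_state *s, uint64_t v0)
{
	assert(s != NULL);
	s->armed = false;	/* assumption: delivery disarms the LDG */
	if (v0 & ~((uint64_t)1 << MODEL_LDN_MIF))
		s->napi_active = true;	/* stands in for scheduling NAPI */
	else
		model_rearm(s);
}

/* Models the end of the poll sequence: complete, then rearm. */
static void model_napi_complete(struct ldg_state *s)
{
	assert(s != NULL);
	s->napi_active = false;
	model_rearm(s);
}
```

In this model, an interrupt with work bits set followed by a second
interrupt with v0 == 0 before model_napi_complete() bumps
premature_rearms, which is the ordering described above.  A similar
counter or warning in the real handler would be one way to confirm
or rule it out.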

Thanks!

 