Message-ID: <1329160187.8108.9.camel@HP1>
Date: Mon, 13 Feb 2012 11:09:47 -0800
From: "Michael Chan" <mchan@...adcom.com>
To: "Vasily Averin" <vvs@...allels.com>
cc: netdev@...r.kernel.org, "Dmitry Kravkov" <dmitry@...adcom.com>
Subject: Re: [PATCH net-next] bnx2x: Disable LRO on FCoE or iSCSI boot device
On Mon, 2012-02-13 at 15:20 +0400, Vasily Averin wrote:
> Michael, Dmitry,
>
> Could you please clarify how you have fixed this issue?
> I've noticed very similar problem in CentOS6.2 environment,
> could you please clarify how it's possible to fix or workaround it?
>
We made a number of fixes:
1. The iscsiuio user daemon no longer logs to the log file in the root fs
during reset. The longer-term fix is to use a different thread for
logging that can block when the fs is not available.
2. iscsiuio is locked into memory and won't be swapped out.
3. bnx2x now caches the firmware after the initial open, because
request_firmware() may not be able to get the firmware when the
root fs is not available:
http://git.kernel.org/?p=linux/kernel/git/davem/net.git;a=commit;h=eb2afd4a622985eaccfa8c7fc83e890b8930e0ab
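For illustration only, the idea behind fix 3 can be sketched in userspace C as follows. This is not the bnx2x code from the commit above: fw_load_fn, fw_cache_get(), fw_cache_release(), demo_load() and demo_fs_available are hypothetical names, and demo_load() merely stands in for a loader such as request_firmware() that fails while the root fs is unavailable. The point is that the image is fetched once and later opens reuse the cached copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical loader callback: returns 0 and fills buf/len on success,
 * nonzero when the backing store (e.g. the root fs) is unavailable. */
typedef int (*fw_load_fn)(const char *name, void **buf, size_t *len);

struct fw_cache {
    void  *data;   /* cached firmware image, NULL until the first load */
    size_t len;
};

/* Return the firmware image, calling the loader only if nothing is cached. */
int fw_cache_get(struct fw_cache *c, const char *name, fw_load_fn load,
                 const void **data, size_t *len)
{
    assert(c != NULL && load != NULL);
    if (c->data == NULL) {
        int rc = load(name, &c->data, &c->len);
        if (rc != 0) {
            /* Leave the cache empty so a later call can retry. */
            c->data = NULL;
            c->len = 0;
            return rc;
        }
    }
    *data = c->data;
    *len = c->len;
    return 0;
}

/* Drop the cached image (e.g. when the device is removed). */
void fw_cache_release(struct fw_cache *c)
{
    free(c->data);
    c->data = NULL;
    c->len = 0;
}

/* Demo stand-in for a real loader: fails while the "fs" is marked down. */
int demo_fs_available = 1;

int demo_load(const char *name, void **buf, size_t *len)
{
    (void)name;
    if (!demo_fs_available)
        return -1;
    *buf = malloc(4);
    if (*buf == NULL)
        return -1;
    memcpy(*buf, "FW01", 4);
    *len = 4;
    return 0;
}
```

With this shape, a reset that happens while demo_fs_available is 0 still gets the image as long as an earlier open succeeded, which is the property fix 3 relies on.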
These fixes should address all issues involving bnx2x reset during
iSCSI/FCoE/network boot. RHEL6.2 has fixes #1 and #2, but not #3.
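The daemon-side fixes (#1 and #2) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the iscsiuio source: lock_daemon_memory(), struct log_ring, log_enqueue() and log_drain() are hypothetical names, and the ring sizes are arbitrary. mlockall() is the standard call for pinning a process in memory. The ring is a single-producer, single-consumer queue, so the reset path only ever does a non-blocking enqueue, while the blocking file write happens in log_drain(), which is meant to run on a separate logging thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdio.h>
#include <sys/mman.h>

/* Pin current and future pages so the daemon cannot be swapped out.
 * Returns 0 on success or a negative errno value (e.g. -EPERM, -ENOMEM). */
int lock_daemon_memory(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
        return -errno;
    return 0;
}

#define LOG_SLOTS   64    /* illustrative sizes, not taken from iscsiuio; */
#define LOG_MSG_MAX 128   /* LOG_SLOTS must be a power of two             */

struct log_ring {
    char msgs[LOG_SLOTS][LOG_MSG_MAX];
    atomic_uint head;   /* next slot to fill, advanced by the producer */
    atomic_uint tail;   /* next slot to read, advanced by the consumer */
};

void log_ring_init(struct log_ring *r)
{
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
}

/* Producer side (reset path): never blocks. Returns 0 if the message was
 * queued, -1 if the ring is full and the message was dropped. */
int log_enqueue(struct log_ring *r, const char *msg)
{
    unsigned head = atomic_load_explicit(&r->head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head - tail == LOG_SLOTS)
        return -1;
    snprintf(r->msgs[head % LOG_SLOTS], LOG_MSG_MAX, "%s", msg);
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return 0;
}

/* Consumer side (logging thread): writes queued messages to out. The
 * stdio calls may block when the fs is unavailable; only the caller of
 * this function stalls. Returns the number of messages written. */
int log_drain(struct log_ring *r, FILE *out)
{
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&r->head, memory_order_acquire);
    int n = 0;

    while (tail != head) {
        fputs(r->msgs[tail % LOG_SLOTS], out);
        fputc('\n', out);
        tail++;
        atomic_store_explicit(&r->tail, tail, memory_order_release);
        n++;
    }
    return n;
}
```

In a daemon, lock_daemon_memory() would be called once at startup, and a thread created with pthread_create() would loop over log_drain(). Dropping messages when the ring is full is a design choice made here so that the reset path can never wait on the filesystem.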
> On 10/27/2011 at 16:30 -0700, Michael Chan wrote:
> > On Wed, 2011-10-19 at 13:53 -0700, John Fastabend wrote:
> >> As a reference point this works fine in both FCoE and iSCSI stacks
> >> today. The device is reset or link is lost for whatever reason
> >> when the link comes back up the stack logs back in, enumerates
> >> the luns and the scsi stack recovers as expected.
> >>
> >> Firmware should do the equivalent login, lun enumeration, etc as
> >> needed.
> >
> > Just a quick follow-up on this issue. Our firmware actually performs
> > the same logout before the reset and login after the reset. For iSCSI,
> > the problem on our device was actually caused by our userspace daemon
> > logging events to a log file in the root fs. The file I/O was blocked
> > and the daemon could not proceed to do the important operations during
> > the reset, and this caused filesystem I/O errors. We have now fixed the
> > problem in the userspace daemon.
> >
> > For FCoE, there is no logging issue and the root fs failure seems to
> > happen only in a multipath configuration with all paths going down for a
> > short time (caused by reset in this case). We believe this also affects
> > other devices and not just ours. We are now working with the multipath
> > maintainer to understand this issue.
> >
> > So this confirms that the original patch for bnx2x is not needed.
>
>