Message-ID: <CAM_iQpW=VXDRbOkEuK+mr+4G2FgfmT11yYKt4DbhGB2QGqeeYA@mail.gmail.com>
Date:   Mon, 28 Oct 2019 17:49:35 -0700
From:   Cong Wang <xiyou.wangcong@...il.com>
To:     Eric Dumazet <edumazet@...gle.com>
Cc:     David Miller <davem@...emloft.net>,
        netdev <netdev@...r.kernel.org>,
        Yuchung Cheng <ycheng@...gle.com>,
        Neal Cardwell <ncardwell@...gle.com>
Subject: Re: [Patch net-next 0/3] tcp: decouple TLP timer from RTO timer

On Mon, Oct 28, 2019 at 1:31 PM Eric Dumazet <edumazet@...gle.com> wrote:
>
> On Mon, Oct 28, 2019 at 1:13 PM Cong Wang <xiyou.wangcong@...il.com> wrote:
> >
> > On Mon, Oct 28, 2019 at 11:34 AM Eric Dumazet <edumazet@...gle.com> wrote:
> > >
> > > On Mon, Oct 28, 2019 at 11:29 AM David Miller <davem@...emloft.net> wrote:
> > > >
> > > > From: Cong Wang <xiyou.wangcong@...il.com>
> > > > Date: Tue, 22 Oct 2019 16:10:48 -0700
> > > >
> > > > > This patchset contains 3 patches: patch 1 is a cleanup,
> > > > > patch 2 is a small change preparing for patch 3, patch 3 is the
> > > > > one does the actual change. Please find details in each of them.
> > > >
> > > > Eric, have you had a chance to test this on a system with
> > > > suitable CPU arity?
> > >
> > > Yes, and I confirm I could not repro the issues at all.
> > >
> > > I got a 100Gbit NIC, trying to increase the pressure a bit, and
> > > driving this NIC at line rate was only using 2% of my 96 cpus host,
> > > no spinlock contention of any sort.
> >
> > Please let me know if there is anything else I can provide to help
> > you to make the decision.
> >
> > All I can say so far is this only happens on our hosts with 128
> > AMD CPU's. I don't see anything here related to AMD, so I think
> > only the number of CPU's (vs. number of TX queues?) matters.
> >
>
> I also have AMD hosts with 256 cpus, I can try them later (not today,
> I am too busy)
>
> But I feel you are trying to work around a more fundamental issue if
> this problem only shows up on AMD hosts.

I wish I had Intel hosts with the same number of CPUs, but I don't:
all of our Intel hosts have fewer, probably 80 at most. This is why I
think it is related to the number of CPUs.

Also, the IOMMU is explicitly turned off, so I don't see anything else
that could be AMD-specific along the TCP path.

Thanks.
