Message-ID: <ef574666-26f4-299e-65c8-2348948651f9@oracle.com>
Date: Wed, 24 Feb 2021 10:47:09 -0500
From: chris hyser <chris.hyser@...cle.com>
To: Josh Don <joshdon@...gle.com>
Cc: Peter Zijlstra <peterz@...radead.org>,
"Joel Fernandes (Google)" <joel@...lfernandes.org>,
Nishanth Aravamudan <naravamudan@...italocean.com>,
Julien Desfossez <jdesfossez@...italocean.com>,
Tim Chen <tim.c.chen@...ux.intel.com>,
Vineeth Pillai <viremana@...ux.microsoft.com>,
Aaron Lu <aaron.lwe@...il.com>,
Aubrey Li <aubrey.intel@...il.com>,
Thomas Gleixner <tglx@...utronix.de>,
linux-kernel <linux-kernel@...r.kernel.org>, mingo@...nel.org,
torvalds@...ux-foundation.org, fweisbec@...il.com,
Kees Cook <keescook@...omium.org>,
Greg Kerr <kerrnel@...gle.com>, Phil Auld <pauld@...hat.com>,
Valentin Schneider <valentin.schneider@....com>,
Mel Gorman <mgorman@...hsingularity.net>,
Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
Paolo Bonzini <pbonzini@...hat.com>, vineeth@...byteword.org,
Chen Yu <yu.c.chen@...el.com>,
Christian Brauner <christian.brauner@...ntu.com>,
Agata Gruza <agata.gruza@...el.com>,
Antonio Gomez Iglesias <antonio.gomez.iglesias@...el.com>,
graf@...zon.com, konrad.wilk@...cle.com, dfaggioli@...e.com,
Paul Turner <pjt@...gle.com>,
Steven Rostedt <rostedt@...dmis.org>,
Patrick Bellasi <derkling@...gle.com>, benbjiang@...cent.com,
Alexandre Chartre <alexandre.chartre@...cle.com>,
James.Bottomley@...senpartnership.com, OWeisse@...ch.edu,
Dhaval Giani <dhaval.giani@...cle.com>,
Junaid Shahid <junaids@...gle.com>,
Jesse Barnes <jsbarnes@...gle.com>,
Ben Segall <bsegall@...gle.com>, Hao Luo <haoluo@...gle.com>,
Tom Lendacky <thomas.lendacky@....com>
Subject: Re: [PATCH v10 2/5] sched: CGroup tagging interface for core
scheduling
On 2/24/21 8:52 AM, chris hyser wrote:
> On 2/24/21 8:02 AM, Chris Hyser wrote:
>
>>> However, it means that overall throughput of your binary is cut in
>>> ~half, since none of the threads can share a core. Note that I never
>>> saw an indefinite deadlock, just ~2x runtime for your binary vs the
>>> control. I've verified that both a) manually hardcoding all threads to
>>> be able to share regardless of cookie, and b) using a machine with 6
>>> cores instead of 2, both allow your binary to complete in the same
>>> amount of time as without the new API.
>>
>> This was on a 24 core box. When I run the test, it definitely hangs. I'll answer your other email as well.
>
>
> I just want to clarify. The test completes in secs normally. When I run this on the 24 core box from the console, other
> ssh connections immediately freeze. The console is frozen. You can't ping the box and it has stayed that way for up to
> 1/2 hour before I reset it. I'm trying to get some kind of stack trace to see what it is doing. To the extent that I've
> been able to trace it or print it, the "next code" always seems to be __sched_core_update_cookie(p);
I cannot reproduce this on a 4 core box, even with thousands of processes and threads. The 24 core box does not even
create the full 400 processes/threads in that test before it hangs, and that test reliably fails almost instantly. The
working theory is that the 24 core box is doing far more of the clone syscalls in parallel.
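
For anyone who wants to generate that kind of load without the core scheduling patches, here is a minimal sketch of a
parallel-clone storm. It is not the test discussed in this thread: the name spawn_storm and the 20 x 20 split of the
400 tasks are assumptions made for illustration, and it does no cookie tagging at all, so it only mimics the
concurrent clone pattern and should not be expected to hang a stock kernel.

```python
import os
import threading


def spawn_storm(n_procs: int, n_threads: int) -> int:
    """Fork n_procs children that each start n_threads threads.

    Hypothetical helper, not part of the original test.
    Returns the number of children that exited with status 0.
    """
    pids = []
    for _ in range(n_procs):
        pid = os.fork()
        if pid == 0:
            # Child: create threads back to back so that several children
            # are issuing clone() calls at the same time on a many-core box.
            status = 1
            try:
                threads = [threading.Thread(target=lambda: None)
                           for _ in range(n_threads)]
                for t in threads:
                    t.start()
                for t in threads:
                    t.join()
                status = 0
            finally:
                # Leave immediately; never fall back into the parent's loop.
                os._exit(status)
        pids.append(pid)

    # Parent: reap every child and count the clean exits.
    ok = 0
    for pid in pids:
        _, wait_status = os.waitpid(pid, 0)
        if os.WIFEXITED(wait_status) and os.WEXITSTATUS(wait_status) == 0:
            ok += 1
    return ok


if __name__ == "__main__":
    # 20 processes x 20 threads = 400 tasks; the split is an assumption.
    print(spawn_storm(20, 20), "children exited cleanly")
```

On a box with more cores, more of the children run their thread-creation loops at the same instant, which is the
difference the working theory points at.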
-chrish