Message-ID: <20230327140425.GA1090@ziqianlu-desk2>
Date:   Mon, 27 Mar 2023 22:04:25 +0800
From:   Aaron Lu <aaron.lu@...el.com>
To:     Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
CC:     Peter Zijlstra <peterz@...radead.org>,
        <linux-kernel@...r.kernel.org>
Subject: Re: rq lock contention due to commit af7f588d8f73

On Mon, Mar 27, 2023 at 09:20:44AM -0400, Mathieu Desnoyers wrote:
> On 2023-03-27 04:05, Aaron Lu wrote:
> > Hi Mathieu,
> > 
> > I was doing some optimization work[1] for the kernel scheduler using a
> > database workload (sysbench + postgres). Before submitting it, I rebased
> > my patch on top of the latest v6.3-rc kernel to check that everything
> > still works as expected, and found that the rq lock had become much more
> > heavily contended than on v6.2-based kernels.
> > 
> > Using the above mentioned workload, before commit af7f588d8f73("sched:
> > Introduce per-memory-map concurrency ID"), the profile looked like:
> > 
> >       7.30%     0.71%  [kernel.vmlinux]            [k] __schedule
> >       0.03%     0.03%  [kernel.vmlinux]            [k] native_queued_spin_lock_slowpath
> > 
> > After that commit:
> > 
> >      49.01%     0.87%  [kernel.vmlinux]            [k] __schedule
> >      43.20%    43.18%  [kernel.vmlinux]            [k] native_queued_spin_lock_slowpath
> > 
> > The above profile was captured with sysbench's nr_threads set to 56;
> > with more threads, the contention becomes more severe on that
> > 2-socket/112-core/224-CPU Intel Sapphire Rapids server.
> > 
> > The docker image I used for the optimization work is not available
> > outside, but I managed to reproduce this problem using only publicly
> > available components. Here are the steps:
> > 1 docker pull postgres
> > 2 sudo docker run --rm --name postgres-instance -e POSTGRES_PASSWORD=mypass -e POSTGRES_USER=sbtest -d postgres -c shared_buffers=80MB -c max_connections=250
> > 3 go inside the container
> >    sudo docker exec -it $the_just_started_container_id bash
> > 4 install sysbench inside container
> >    sudo apt update && sudo apt install sysbench
> > 5 prepare
> >    root@...tainer:/# sysbench --db-driver=pgsql --pgsql-user=sbtest --pgsql_password=mypass --pgsql-db=sbtest --pgsql-port=5432 --tables=16 --table-size=10000 --threads=56 --time=60 --report-interval=2 /usr/share/sysbench/oltp_read_only.lua prepare
> > 6 run
> >    root@...tainer:/# sysbench --db-driver=pgsql --pgsql-user=sbtest --pgsql_password=mypass --pgsql-db=sbtest --pgsql-port=5432 --tables=16 --table-size=10000 --threads=56 --time=60 --report-interval=2 /usr/share/sysbench/oltp_read_only.lua run
> > 
> > Let it warm up a little; after 10-20s you can take a profile and see
> > the increased rq lock contention. You may need a machine with at
> > least 56 CPUs to see this; I didn't try other machines.
> > 
> > Feel free to let me know if you need any other info.
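
For anyone following the steps above, here is a minimal sketch of how
such a profile could be captured while the sysbench run is in progress.
It assumes perf is installed on the host and that it is run as root; the
function names, the perf.data path and the 10-second default are only
illustrative and are not taken from the original report.

```shell
# Hypothetical helper: record a system-wide profile with call graphs
# for the given number of seconds (default 10).
capture_profile() {
    seconds=${1:-10}
    perf record -a -g -o perf.data -- sleep "$seconds"
}

# Hypothetical helper: print only the two symbols discussed in this thread.
# With call graphs recorded, perf report shows both an accumulated
# (Children) and a Self percentage for each symbol.
show_lock_symbols() {
    perf report -i perf.data --stdio --sort dso,sym |
        grep -E '__schedule|native_queued_spin_lock_slowpath'
}
```

Running capture_profile 10 some 10-20s into the "run" step, followed by
show_lock_symbols, should show whether native_queued_spin_lock_slowpath
dominates on a given kernel.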
> 
> While I set up my dev machine with this reproducer, here are a few
> questions to help figure out the context:
> 
> I understand that pgsql is a multi-process database. Is it strictly
> single-threaded per-process, or does each process have more than
> one thread ?

I do not know the details of Postgres, but according to this:
https://wiki.postgresql.org/wiki/FAQ#How_does_PostgreSQL_use_CPU_resources.3F
I think it is single-threaded per process.

The client, sysbench, is a single multi-threaded process, IIUC.
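
The threading model can also be checked directly on a running system
rather than inferred from documentation. A minimal sketch, assuming a
Linux /proc filesystem and that pgrep is available; the helper names are
made up for illustration:

```shell
# Print the number of threads in the process with the given PID,
# read from the "Threads:" field of /proc/<pid>/status.
threads_of() {
    awk '/^Threads:/ { print $2 }' "/proc/$1/status"
}

# Print "PID THREADS" for every process whose name matches exactly.
threads_by_name() {
    for pid in $(pgrep -x "$1"); do
        echo "$pid $(threads_of "$pid")"
    done
}
```

During the benchmark, threads_by_name postgres would be expected to
report 1 for each server process if the server is single-threaded per
process, while threads_by_name sysbench should report a count related to
--threads.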

> 
> I understand that your workload is scheduling between threads which
> belong to different processes. Are there more heavily active threads
> than there are scheduler runqueues (CPUs) on your machine ?

In the reproducer I described above, 56 threads are started on the
client side and if each client thread is served by a server process,
there would be about 112 tasks. I don't think the client thread and
the server process are active at the same time but even if they are,
112 is still smaller than the machine's CPU count of 224.
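
The estimate above can be written down as a small sketch. It assumes
exactly one server process per client thread, which is my reading of the
setup and not something I verified; the function names are hypothetical.

```shell
# Estimated task count: one client thread plus one server process
# per sysbench thread (an assumption, see above).
estimate_tasks() {
    threads=$1
    echo $((threads * 2))
}

# Succeeds when the estimated task count does not exceed the CPU count.
fits_on_cpus() {
    [ "$(estimate_tasks "$1")" -le "$2" ]
}

estimate_tasks 56                                    # prints 112
fits_on_cpus 56 224 && echo "fewer tasks than CPUs"
```

With 56 threads and 224 CPUs the check succeeds, so even in the worst
case the machine is not oversubscribed.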

> 
> When I developed the mm_cid feature, I originally implemented two additional
> optimizations:
> 
>     Additional optimizations can be done if the spin locks added when
>     context switching between threads belonging to different memory maps end
>     up being a performance bottleneck. Those are left out of this patch
>     though. A performance impact would have to be clearly demonstrated to
>     justify the added complexity.
> 
> I suspect that your workload demonstrates the need for at least one of those
> optimizations. I just wonder if we are in a purely single-threaded scenario
> for each process, or if each process has many threads.

My understanding is that the server side is single-threaded (one thread
per process) and the client side is multi-threaded.

Thanks,
Aaron
