Message-ID: <d7d6cbc8-1f5f-d0f4-6656-f0cc8ab3a118@intel.com>
Date:   Tue, 18 Apr 2023 16:36:11 +0300
From:   Adrian Hunter <adrian.hunter@...el.com>
To:     Peter Zijlstra <peterz@...radead.org>,
        Arnaldo Carvalho de Melo <acme@...nel.org>
Cc:     Ingo Molnar <mingo@...hat.com>,
        Mark Rutland <mark.rutland@....com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Jiri Olsa <jolsa@...nel.org>,
        Namhyung Kim <namhyung@...nel.org>,
        Ian Rogers <irogers@...gle.com>,
        linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC 0/5] perf: Add ioctl to emit sideband events

On 18/04/23 09:18, Adrian Hunter wrote:
> On 17/04/23 14:02, Peter Zijlstra wrote:
>> On Fri, Apr 14, 2023 at 11:22:55AM +0300, Adrian Hunter wrote:
>>> Hi
>>>
>>> Here is a stab at adding an ioctl for sideband events.
>>>
>>> This is to overcome races when reading the same information
>>> from /proc.
>>
>> What races? Are you talking about reading old state in /proc, the kernel
>> delivering a sideband event for the new state, and then you writing the
>> old state out?
>>
>> Surely that's something perf tool can fix without kernel changes?
> 
> Yes, and it was a bit of a brain fart not to realise that.
> 
> There may still be corner cases, where different kinds of events are
> interdependent, perhaps NAMESPACES events vs MMAP events could
> have ordering issues.
> 
> Putting that aside, the ioctl may be quicker than reading from
> /proc.  I could get some numbers and see what people think.
> 

Here's a result from a quick hack that uses the ioctl but does not
handle the buffer becoming full (hence the -m4M):

# ps -e | wc -l
1171
# perf.old stat -- perf.old record -o old.data --namespaces -a true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.095 MB old.data (100 samples) ]

 Performance counter stats for 'perf.old record -o old.data --namespaces -a true':

            498.15 msec task-clock                       #    0.987 CPUs utilized             
               126      context-switches                 #  252.935 /sec                      
                64      cpu-migrations                   #  128.475 /sec                      
              4396      page-faults                      #    8.825 K/sec                     
        1927096347      cycles                           #    3.868 GHz                       
        4563059399      instructions                     #    2.37  insn per cycle            
         914232559      branches                         #    1.835 G/sec                     
           6618052      branch-misses                    #    0.72% of all branches           
        9633787105      slots                            #   19.339 G/sec                     
        4394300990      topdown-retiring                 #     38.8% Retiring                 
        3693815286      topdown-bad-spec                 #     32.6% Bad Speculation          
        1692356927      topdown-fe-bound                 #     14.9% Frontend Bound           
        1544151518      topdown-be-bound                 #     13.6% Backend Bound            

       0.504636742 seconds time elapsed

       0.158237000 seconds user
       0.340625000 seconds sys

# perf.old stat -- perf.new record -o new.data -m4M --namespaces -a true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.095 MB new.data (103 samples) ]

 Performance counter stats for 'perf.new record -o new.data -m4M --namespaces -a true':

            386.61 msec task-clock                       #    0.988 CPUs utilized             
               100      context-switches                 #  258.658 /sec                      
                65      cpu-migrations                   #  168.128 /sec                      
              4935      page-faults                      #   12.765 K/sec                     
        1495905137      cycles                           #    3.869 GHz                       
        3647660473      instructions                     #    2.44  insn per cycle            
         735822370      branches                         #    1.903 G/sec                     
           5765668      branch-misses                    #    0.78% of all branches           
        7477722620      slots                            #   19.342 G/sec                     
        3415835954      topdown-retiring                 #     39.5% Retiring                 
        2748625759      topdown-bad-spec                 #     31.8% Bad Speculation          
        1221594670      topdown-fe-bound                 #     14.1% Frontend Bound           
        1256150733      topdown-be-bound                 #     14.5% Backend Bound            

       0.391472763 seconds time elapsed

       0.141207000 seconds user
       0.246277000 seconds sys

# ls -lh old.data
-rw------- 1 root root 1.2M Apr 18 13:19 old.data
# ls -lh new.data
-rw------- 1 root root 1.2M Apr 18 13:19 new.data
# 
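
To put the two runs side by side, here is a minimal sketch that computes
the relative reduction for a few of the counters. The figures are copied
from the perf stat output above; the helper name reduction_pct and the
choice of counters are only for illustration, and this is a single run
of each binary, so treat the percentages as rough.

```python
# Figures copied from the two perf stat runs above: (perf.old, perf.new).
runs = {
    "task-clock (msec)": (498.15, 386.61),
    "elapsed (s)": (0.504636742, 0.391472763),
    "user (s)": (0.158237000, 0.141207000),
    "sys (s)": (0.340625000, 0.246277000),
    "cycles": (1927096347, 1495905137),
    "instructions": (4563059399, 3647660473),
}


def reduction_pct(old, new):
    """Relative reduction of new versus old, in percent (illustrative helper)."""
    return (old - new) / old * 100.0


# Percentage saved per counter when going from perf.old to perf.new.
results = {name: reduction_pct(old, new) for name, (old, new) in runs.items()}

for name, pct in results.items():
    print(f"{name:20s} {pct:5.1f}% lower with the ioctl hack")
```

On these numbers task-clock and elapsed time both come out a little over
22% lower, with most of the saving in sys time (a little under 28%).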
