Tuesday, August 25, 2015

perf on fedora 21


sudo yum install perf



[steve@steve-GA-880GMA-UD2H perf]$ pgrep iperf
2807

[steve@steve-GA-880GMA-UD2H perf]$ perf record -p 2807 -g
^C[ perf record: Woken up 26 times to write data ]
[nf_nat_ipv4] with build id 08e53c9cb720761011036c0f3f3d7db2feb62bc3 not found, continuing without symbols
[nf_conntrack] with build id 6b2d193426fa4e8a93181f77d0622c1425d23af9 not found, continuing without symbols
[iptable_mangle] with build id 3f487227e3da17d0386b8a8b6976ade8311bc0fd not found, continuing without symbols
[nf_conntrack_ipv4] with build id d872b229572d3106b3b415d24a83679191cfd58c not found, continuing without symbols
[iptable_nat] with build id 4a354a31abf978aacb65d3e7c53e13f421c6bdb9 not found, continuing without symbols
[nf_defrag_ipv4] with build id ed79015881b828ed986f78ee7fa06bb62ecf52f3 not found, continuing without symbols
[nf_nat] with build id 9135bac40d6819086edde4816d82c495bbc5640f not found, continuing without symbols
[radeon] with build id 7417d27e130d2e4eecb5d45cc08fb74b4eb003a5 not found, continuing without symbols
[drm] with build id b840649b8536d349cd1525af9dac7892f4b4f5d7 not found, continuing without symbols
[ perf record: Captured and wrote 6.551 MB perf.data (37928 samples) ]


[steve@steve-GA-880GMA-UD2H perf]$ perf report --stdio -g


# Children      Self  Command  Shared Object        Symbol                                         
# ........  ........  .......  ...................  ...............................................
#
    98.75%     0.00%  iperf    libpthread-2.20.so   [.] 0xffff80c11eca95fd                         
              |
              ---0xf5fd
                 |          
                 |--77.16%-- malloc
                 |          0x1d5
                 |          
                 |--15.29%-- Server::Run
                 |          
                 |--6.44%-- 0
                  --1.11%-- [...]

    98.56%     0.08%  iperf    [kernel.kallsyms]    [k] system_call_fastpath                       
              |
              ---system_call_fastpath
                 |          
                 |--99.89%-- 0xf5fd
                 |          |          
                 |          |--78.04%-- malloc
                 |          |          0x1d5



can't get report/annotate to show file/line
maybe related to ?

OK...
I could get it to work by building iperf with
CXXFLAGS='-g3 -gdwarf-2' CFLAGS='-g3 -gdwarf-2'

then some work to understand how threads are represented


operf / oprofile on fedora 21



Using iperf as code to profile
sudo yum install iperf
sudo debuginfo-install iperf

Install oprofile
sudo yum install oprofile

Profile particular pid

[steve@steve-GA-880GMA-UD2H operf]$ pgrep iperf
2807
[steve@steve-GA-880GMA-UD2H operf]$ operf --pid 2807
operf: Press Ctl-c or 'kill -SIGINT 3924' to stop profiling
operf: Profiler started
^C
Profiling done.


[steve@steve-GA-880GMA-UD2H operf]$ opreport -l -dg  |less
CPU: AMD64 family10, speed 3200 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask of 0x00 (No unit mask) count 100000
vma      samples  %        linenr info                 image name               symbol name
00408700 261      30.9242  Reporter.c:710              iperf                    reporter_handle_packet
...
  004089af 121      46.3602  Reporter.c:814

[steve@steve-GA-880GMA-UD2H operf]$ opannotate --source |less
 * Total samples for file : "/usr/src/debug/iperf-2.0.8/src/Reporter.c"
 * 
 *    402 47.6303
...
    809             } else if (reporthdr->report.mThreadMode == kMode_Server && (packet->packetLen > 0)) {
    810                 int bin;
    811                 // mean min max tests
    812                 stats->tcp.read.cntRead++;
    813                 stats->tcp.read.totcntRead++;
    814                 bin = (int)floor((packet->packetLen -1)/stats->tcp.read.binsize);
    815                 stats->tcp.read.bins[bin]++;
    816                 stats->tcp.read.totbins[bin]++;
    817             } else if (reporthdr->report.mThreadMode == kMode_Client) {


Profile system wide

[steve@steve-GA-880GMA-UD2H operf]$ sudo operf -s
operf: Press Ctl-c or 'kill -SIGINT 4003' to stop profiling
operf: Profiler started
^C* * * * WARNING: Profiling rate was throttled back by the kernel * * * *
The number of samples actually recorded is less than expected, but is
probably still statistically valid.  Decreasing the sampling rate is the
best option if you want to avoid throttling.

Profiling done.


[steve@steve-GA-880GMA-UD2H operf]$ opreport  |less

CPU: AMD64 family10, speed 3200 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask of 0x00 (No unit mask) count 100000
CPU_CLK_UNHALT...|
  samples|      %|
------------------
   298879 83.0667 iperf
        CPU_CLK_UNHALT...|
          samples|      %|
        ------------------
           286580 95.8850 kallsyms
             6738  2.2544 nf_conntrack
             1279  0.4279 nf_nat_ipv4
              742  0.2483 libpthread-2.20.so
              740  0.2476 nf_conntrack_ipv4
              667  0.2232 iptable_mangle
              634  0.2121 iperf
              533  0.1783 [vdso] (tgid:2807 range:0x7fffd53fc000-0x7fffd53fdfff)

[steve@steve-GA-880GMA-UD2H operf]$ opreport -l -d -g /usr/bin/iperf -f |less
...
004092e0 293       0.0980  /usr/src/debug/iperf-2.0.8/src/Server.cpp:104 /usr/bin/iperf           Server::Run()
  004093d2 2         0.6826  /usr/src/debug/iperf-2.0.8/src/Server.cpp:182
  004093d9 5         1.7065  /usr/src/debug/iperf-2.0.8/src/Server.cpp:179
  004093ec 121      41.2969  /usr/src/debug/iperf-2.0.8/src/Server.cpp:182
  004093f5 2         0.6826  /usr/src/debug/iperf-2.0.8/src/Server.cpp:201
  00409416 12        4.0956  /usr/src/debug/iperf-2.0.8/src/Server.cpp:243
  0040941c 7         2.3891  /usr/src/debug/iperf-2.0.8/src/Server.cpp:244
  00409427 2         0.6826  /usr/src/debug/iperf-2.0.8/src/Server.cpp:248
  00409443 19        6.4846  /usr/src/debug/iperf-2.0.8/src/Server.cpp:178
  00409446 11        3.7543  /usr/src/debug/iperf-2.0.8/src/Server.cpp:178
  00409510 3         1.0239  /usr/src/debug/iperf-2.0.8/src/Server.cpp:259
  00409512 12        4.0956  /usr/src/debug/iperf-2.0.8/src/Server.cpp:259
  00409521 4         1.3652  /usr/src/debug/iperf-2.0.8/src/Server.cpp:260
  0040952d 1         0.3413  /usr/src/debug/iperf-2.0.8/src/Server.cpp:260
  00409538 4         1.3652  /usr/src/debug/iperf-2.0.8/src/Server.cpp:260
  0040953b 14        4.7782  /usr/src/debug/iperf-2.0.8/src/Server.cpp:201
  00409541 19        6.4846  /usr/src/debug/iperf-2.0.8/src/Server.cpp:201
  0040954a 4         1.3652  /usr/src/debug/iperf-2.0.8/src/Server.cpp:201
  004095bb 27        9.2150  /usr/src/debug/iperf-2.0.8/src/Server.cpp:242
  004095c4 24        8.1911  /usr/src/debug/iperf-2.0.8/src/Server.cpp:242

vi /usr/src/debug/iperf-2.0.8/src/Server.cpp
    177         gettimeofday( &watchdog, NULL );
    178         do {
    179             reportstruct->emptyreport=0;
    180 #if HAVE_DECL_SO_TIMESTAMP
    181             // perform read
    182             currLen = recvmsg( mSettings->mSock, &message, 0 );
    183             if (currLen <= 0) {
    184                 // Socket read timeout or read error
    185                 reportstruct->emptyreport=1;
    186                 gettimeofday( &(reportstruct->packetTime), NULL );


[steve@steve-GA-880GMA-UD2H operf]$ opannotate --source |less
...
/* 
 * Total samples for file : "/usr/src/debug/iperf-2.0.8/src/Server.cpp"
 * 
 *    293  0.0814
 */
...
    30  0.0083 :        do {
     5  0.0014 :            reportstruct->emptyreport=0;
               :#if HAVE_DECL_SO_TIMESTAMP
               :            // perform read 
   123  0.0342 :            currLen = recvmsg( mSettings->mSock, &message, 0 );
               :            if (currLen <= 0) {
               :                // Socket read timeout or read error
               :                reportstruct->emptyreport=1;
               :                gettimeofday( &(reportstruct->packetTime), NULL );