Documentation Center

  • Trial Software
  • Product Updates

Profile Your Code

Profiling is a way to measure where a program spends time. MuPAD® provides the prog::profile function to help you identify performance bottlenecks in your code. Use this function to analyze performance of complicated nested procedure calls. For simpler code snippets, measuring execution times is more convenient and can give you enough information to identify performance bottlenecks.

The prog::profile function evaluates the code that you want to profile, and returns the profiling report. The report contains:

  • A table showing the time your code spent on each function (total and averaged per one call), number of calls to that function, children of that function, and so on. Information for the most heavily used functions appear on the top of the table. The first row in this table represents the total execution time.

  • A dependence graph showing a function itself, the functions it calls (children), and the functions that call this function. The graph also shows the timing and number of calls for each function.

    Note:   By default, prog::profile does not measure performance of single calls to kernel functions.

However, when prog::profile measures the performance of library functions, it also prints the accumulated time the system spends in kernel functions. To measure the performance of a single call to a kernel function, use prog::trace to trace that kernel function.

Suppose you want to write a procedure that checks whether each integer from 1 to 1000 appears in a 1000×1000 matrix of random integers. The direct approach is to write a procedure that checks every element of a matrix proceeding row by row and column by column:

f := proc(M, n, x)
begin
  for j from 1 to n do
    for k from 1 to n do
      if M[j, k] = x then
        return(TRUE)
      end_if
    end_for
  end_for;
  return(FALSE)
end_proc:

Use the linalg::randomMatrix to create a 1000×1000 matrix of random integers:

matrixSize := 1000:
M := linalg::randomMatrix(matrixSize, matrixSize, Dom::Integer):

Then call the procedure f 1000 times to check if each number from 1 to 1000 appears in that matrix:

g := proc()
begin
  f(M, matrixSize, i) $ i = 1..1000
end_proc:

Measuring the time needed to run the procedure g shows that the procedure requires optimization. Although the performance bottleneck in this procedure is obvious, it is not always easy to identify performance bottlenecks in more complicated procedures. The time function does not indicate where the procedure spends most of its execution time:

time(g())
 62023.876 

To obtain the complete profiling report that shows timings for all inner function calls, use prog::profile:

prog::profile(g()):
  percent usage of all
    |      time self per single call
    |          |      time self
    |          |          |      time children per single
call
    |          |          |
         |      time children
  
 |          |          |          |          |     calls/normal exit
    |          |          |          |    
     |        |   calls/remember exit
 
  |          |          |          |          |        |   |  calls/errors
    |          |          |          |    
     |        |   |  |  [index] function name
--------------------------------------------------------------------------------------------------------------------
 100.0   109982.9   109982.9         .    
     .         1  .  .  [0]  procedure entry point               
       
--------------------------------------------------------------------------------------------------------------------
  45.9         .     50467.2         .    
27149.6  3019825  .  .  [1]  (Dom::Matrix(Dom::Integer))::_index_intern
 
  19.7         .     21689.3  
      .      5460.3  3019825  .  .  [2]  Dom::Integer::coerce    
                   
  16.7      
  .     18373.1         .     77616.9  3019825  .  .  [3]  (Dom::Matrix(Dom::Integer))::_index
        
  12.7       14.0    13984.8
      96.0    95990.0     1000  .  .  [4]  f                     
                     
   5.0    
    .      5460.3         .          .   3019825  .  .  [5]  Dom::Integer::convert
                      
    .    
    8.0        8.0   109974.9   109974.9        1  .  .  [6]  g  
                                        
--------------------------------------------------------------------------------------------------------------------
 
index  
  %time      self  children     called  [index] name
---------------------------------------------------------------------------------------------
  [0]     100.0  109982.8         0       
  1  procedure entry point
     
                8.0  109974.8          1      [6]  g
---------------------------------------------------------------------------------------------
                 50467.24  27149.65    3019825
     [3]  (Dom::Matrix(Dom::Integer))::_index
  [1]      45.9  50467.24  27149.65    3019825  (Dom::Matrix(Dom::Integer))::_index_intern
                 21689.30  5460.346    3019825
     [2]  Dom::Integer::coerce
---------------------------------------------------------------------------------------------
                 21689.30  5460.346    3019825
     [1]  (Dom::Matrix(Dom::Integer))::_index_intern
  [2]      19.7  21689.30  5460.346    3019825
 Dom::Integer::coerce
          
      5460.346         0    3019825      [5]  Dom::Integer::convert
---------------------------------------------------------------------------------------------
                 18373.13  77616.89    3019825
     [4]  f
  [3]      16.7  18373.13
 77616.89    3019825  (Dom::Matrix(Dom::Integer))::_index
                 50467.24  27149.65    3019825
     [1]  (Dom::Matrix(Dom::Integer))::_index_intern
---------------------------------------------------------------------------------------------
                 13984.84  95990.02       1000
     [6]  g
  [4]      12.7  13984.84
 95990.02       1000  f
        
        18373.13  77616.89    3019825      [3]  (Dom::Matrix(Dom::Integer))::_index
---------------------------------------------------------------------------------------------
                 5460.346         0    3019825
     [2]  Dom::Integer::coerce
 
[5]       5.0  5460.346         0    3019825  Dom::Integer::convert
---------------------------------------------------------------------------------------------
  [6]       0.0       8.0  109974.8       
  1  g
                 13984.84
 95990.02       1000      [4]  f
---------------------------------------------------------------------------------------------
 Time sum: 109982.873 ms

Top rows of the profiling report indicate that the procedure spends most of its time accessing each matrix element. To improve performance, rewrite the procedure so that it can access fewer elements of a matrix in each call to the procedure f. For example, use the algorithm based on the bisection method:

f := proc(M, n, x)
begin
  if (M[1] - x)*(M[n] - x) > 0 then
    return(FALSE)
  elif (M[1] - x)*(M[n] - x) = 0 then
    return(TRUE);
  else
    a := 1: b := n:
    while (b - a > 1) do
      if is(b - a, Type::Odd) then
        c := a + (b - a + 1)/2
      else
        c := a + (b - a)/2
      end_if;
      if M[c] - x = 0 then
        return(TRUE)
      elif (M[a] - x)*(M[c] - x) < 0 then
        b := c:  
      else
        a := c:
      end_if;
    end_while;
  end_if;
  return(FALSE)
end_proc:

Before calling the procedure f, you must convert the matrix M to a list and sort that list. Sorting the list that contains 104 entries is an expensive operation. Depending on the number of calls to the procedure f, this operation can potentially eliminate the increase in performance that you gain by improving the procedure f itself:

g := proc()
local M1;
begin
  M1 := sort([op(M)]):
  f(M1, matrixSize^2, i) $ i = 1..1000
end_proc:

For these particular matrix size and number of calls to f, implementing the bisectional algorithm is still efficient despite the time required to sort the list:

time(g())
 3840.24 

The profiling report shows that the procedure spends most of the time executing the op and g function calls. This is because implementation of the bisection algorithm added new expensive operations in g (conversion of a matrix to a list and then sorting the list). The profiling report generated for the procedure call g() is very long. This example shows only the top of the report:

prog::profile(g()):
  percent usage of all
    |     time self per single call
    |        |     time self
    |        |        |     time children per single
call
    |        |        |    
   |     time children
    |    
   |        |        |        |    calls/normal exit
    |        |        |        |        | 
   |    calls/remember exit
    |
       |        |        |        |     |     |   calls/errors
    |        |        |        |        | 
   |     |   |  [index] function name
-------------------------------------------------------------------------------------------------
 100.0   3884.2   3884.2       .        . 
    1     .  .  [0]  procedure entry point            
-------------------------------------------------------------------------------------------------
  56.1   2180.1   2180.1   1704.1   1704.1
    1     .  .  [1]  g                                
  33.0   1280.1   1280.1    188.0    188.0
    1     .  .  [2]  (Dom::Matrix(Dom::Integer))::op  
   6.1      0.2    236.0       .        . 
 1000     .  .  [3]  f                                
   3.2      0.1    124.0       .        . 
 1000     .  .  [4]  `p -> [coeff(p, All)][2..-1]`    
   1.5      0.1     60.0       .        . 
 1000     .  .  [5]  `l -> l.[Rzero $ r - nops(l)]`   
   0.1      2.0      4.0       .        . 
    2     .  .  [6]  Dom::Integer::hasProp            
    .        .        .        .        . 
    2     .  .  [7]  DomainConstructor::hasProp       
    .        .        .        .        . 
    .  9981  .  [8]  is                               
-------------------------------------------------------------------------------------------------
 

The recommended approach for improving performance of your code is to use the MuPAD functions when possible. For example, MuPAD provides the has function for checking whether one MuPAD object contains another MuPAD object. Rewrite your code using the has function and combining the procedures f and g:

g := proc()
local M1;
begin
  M1 := {op(M)}:
  has(M1, i) $ i = 1..1000
end_proc:

This procedure also converts the matrix M to a set. Converting a matrix to a set can reduce the number of elements. (MuPAD removes duplicate elements of a set.)

The execution time for the procedure call g() is the shortest among the three implementations:

time(g())
 1520.095 

The profiling report shows that the procedure spends most of its execution time accessing the 1000×1000 matrix of random integers and converting it to a set. This example shows only the top of the profiling report:

prog::profile(g()):
  percent usage of all
    |     time self per single call
    |        |     time self
    |        |        |     time children per single
call
    |        |        |    
   |     time children
    |    
   |        |        |        |    calls/normal exit
    |        |        |        |        | 
   |   calls/remember exit
    |
       |        |        |        |     |   |  calls/errors
    |        |        |        |        | 
   |   |  |  [index] function name
----------------------------------------------------------------------------------------------
 100.0   1556.1   1556.1       .        . 
    1  .  .  [0]  procedure entry point            
----------------------------------------------------------------------------------------------
  78.9   1228.1   1228.1    188.0    188.0
    1  .  .  [1]  (Dom::Matrix(Dom::Integer))::op  
   9.0    140.0    140.0   1416.1   1416.1     1  . 
.  [2]  g                                
   8.2      0.1    128.0       .        .   1000  . 
.  [3]  `p -> [coeff(p, All)][2..-1]`    
   3.9      0.1     60.0       .        .   1000  . 
.  [4]  `l -> l.[Rzero $ r - nops(l)]`   
    .        .        .        .        .      2  . 
.  [5]  Dom::Integer::hasProp            
    .        .        .        .        .      2  . 
.  [6]  DomainConstructor::hasProp       
----------------------------------------------------------------------------------------------
 
Was this topic helpful?