An empty do loop giving seg fault with OpenMP while mexing with fortran

Question

Ugur on 25 Aug 2016

0
Link

Direct link to this question

https://www.mathworks.com/matlabcentral/answers/300875-an-empty-do-loop-giving-seg-fault-with-openmp-while-mexing-with-fortran

Commented: Ugur on 4 Oct 2016

Hello,

I have a code which uses a 'parallel do' construct. When I run the code in serial, it works w/o any problem. But when I parallelize it, it gives seg fault. I tried omitting some parts of the code out for testing purposes. At the end, I endded up with the below which still gave the seg fault error.

!$OMP PARALLEL DO DEFAULT(SHARED) PRIVATE(i,adjDist,j,k)
do i = 1,nNods
  adjDist  = zero
  do j = 1,nDim
    do k = 1,nAdj
      if (nodAdj(i,k).eq.0) cycle
      adjDist(k,j)  = XYZ(nodAdj(i,k),j) - XYZ(nods(i),j)
    enddo
  enddo
enddo
!OMP END PARALLEL DO

Here 'adjDist' is a (nAdj x nDim) array and all the other variables are properly defined shared variables. Do you have any idea about what causes this to give seg fault? Before I was using the following statement;

adjDist(:,j) = XYZ(nodAdj(i,:),j) - XYZ(nods(i),j)

not to have the third do-loop with index 'k', but then I even changed it not to allocate/deallocate any memory, although I am not sure if Fortran allocates any extra memory in such a case. But in any case, apparently that is not the reason.

Thanks!

Ugur

0 Comments
Show -2 older commentsHide -2 older comments

Sign in to comment.

Sign in to answer this question.

Answer 1

James Tursa on 25 Aug 2016

1
Link

Direct link to this answer

https://www.mathworks.com/matlabcentral/answers/300875-an-empty-do-loop-giving-seg-fault-with-openmp-while-mexing-with-fortran#answer_232777

Why is adjDist private? This means each thread will work with its own individual copy of adjDist. Is this what you are really trying to do for this test case? How large is adjDist?

Some combinations of Fortran/OpenMP/MATLAB are not compatible. What versions are you running? Can you code up a very simple test case with one simple parallel loop and see if that runs OK?

18 Comments
Show 16 older commentsHide 16 older comments

Ugur on 25 Aug 2016

Open in MATLAB Online

Crashed again. Here is the gateway function, I am using. I do not know if I do a mistake here, but it seems that it is not what is inside the do loop what causes the problem. Somehow the threads occupy illegal memory, if such a thing is possible...

#include "fintrf.h"
!
SUBROUTINE MEXFUNCTION(NLHS, PLHS, NRHS, PRHS)
use mexf90
implicit none
integer*8, intent(in)  :: PRHS(*)    ! Pointers carrying the input data
integer*8, intent(out)  :: PLHS(*)    ! Pointers carrying the output data
integer*4, intent(in)  :: NLHS,NRHS  ! REMAINS THE SAME FOR 64BIT system
!-----------------------------------------------------------------------
integer*8     :: err, nNods, ...
integer*8, pointer  :: XYZ, elmConn, ...
real*8      :: dvType, zeroLim
integer*4    :: classId
! ERROR CHECKING FOR INPUT
IF (NRHS .NE. 8) THEN
  CALL MEXERRMSGTXT('MultMexError: 8 INPUT ARGUMENT IS REQUIRED')
ENDIF
IF (NLHS .NE. 1) THEN
  CALL MEXERRMSGTXT('MultMexError: 1 OUTPUT ARGUMENT IS REQUIRED')
ENDIF
! ASSIGN POINTERS TO THE VARIOUS PARAMETERS
! Input matrix
XYZ  =>MXGETPR(PRHS(1))
nPts   = MXGETM(PRHS(1))
nDim   = MXGETN(PRHS(1))
...
classId    = mxClassIDFromClassName('double')
plhs(1)    = mxCreateNumericArray(int(3,8),[nNods, nAdj, nDim],classId,0)
returnVar  =>mxGetPr(plhs(1))
CALL myFunction(XYZ, nPts, nDim, ...)
END SUBROUTINE MEXFUNCTION

James Tursa on 25 Aug 2016

Edited: James Tursa on 25 Aug 2016

Open in MATLAB Online

What is in the mexf90 module that you are using? The exact same stuff from this thread?

https://www.mathworks.com/matlabcentral/newsreader/view_thread/307087

Or something different?

Also, I am confused as to what data types you are using. The XYZ declaration implies that the data is INTEGER*8, but then the returnVar points to plhs(1) data area, which you created as 'double'. What exactly are the data types of your inputs and outputs? And what does the signature of myFunction look like?

Also, I advise to never use literal arguments to the mex API functions in Fortran. Always use variables to ensure that the data types match up. (In C you can get away with type mismatches since everything is passed by value and argument types automatically get promoted ... not so in Fortran). So in your mxCreateNumericArray routine call, create variables with the exact types you want and pass those in instead of the literals you are currently using. E.g., the signature is this:

mwPointer mxCreateNumericArray(ndim, dims, classid, ComplexFlag)
mwSize ndim
mwSize dims(ndim)
integer*4 classid, ComplexFlag

So create the appropriate mwSize ndim and dims variables, and the integer*4 classid and ComplexFlag variables, and pass those in as arguments. That way you don't have to worry about whether mwSize is integer*4 or integer*8 for your mex compile settings, and you don't have to worry about whether literal integers are integer*4 or integer*8, etc.

Ugur on 25 Aug 2016

Open in MATLAB Online

It is the module where mex function interfaces are defined. The one related to mxCreateNumericArray is as follows;

function mxCreateNumericArray(ndim, dims, classid, ComplexFlag)
integer*4      :: classid, ComplexFlag
integer*8      :: ndim, dims(ndim), mxCreateNumericArray
end function mxCreateNumericArray

What you ask is a trick I use to avoid copying input and output variables using mxCopy functions. I define all the inputs and outputs as an integer pointer in the gateway function pointing the memory location where the input/output array is stored. No matter of what type an array is, as far as I know, Fortran uses an int*4 pointer in 32 bit systems and int*8 pointer in 64 bit systems to pass the array to a subroutine. This trick I used in many other programs with or without OpenMP, so either I was very lucky untill today or what I do wrong is sth else. The part of myFunction that I define the inputs is as follows;

subroutine myFunction(XYZ, nPts, nDim, ...
use omp_lib
use mexf90
implicit none
integer*4, intent(in)  :: nPts, nDim, ...
real*8, intent(in)  :: XYZ(nPts,nDim), ...
real*8, intent(out)  :: returnVar(nNods,nAdj,nDim)
real*8      :: zero, one, adjDist(nAdj,nDim),  ...
integer*4    :: i, j, k, ...

James Tursa on 8 Sep 2016

Edited: James Tursa on 9 Sep 2016

Open in MATLAB Online

Sorry for the delay. Finally getting back to this. First, some comments about the interface for mxGetPr (applies to mxGetPi also).

Fortran pointers, in general, are not like C pointers. In C, a pointer is basically just an address where the compiler knows what type is being pointed to (so it knows how to dereference and do pointer arithmetic). In Fortran, pointers have all of that, plus they contain information on the shape and stepping size of what is being pointed to. How that information is stored in this pointer is up to the compiler vendor, but it is some sort of descriptor (i.e., structure). So Fortran pointers and C pointers are not the same. But you are using a Fortran pointer like a C pointer in your interface code above. How is this expected to work? Well, as it happens, Fortran pointers to scalars are a special case. Since they can't change shape (always 1 element exactly) and there is nothing to "step" over, this information is meaningless for a pointer to scalar. So compiler vendors typically just implement a Fortran pointer-to-scalar just like a simple C pointer ... simply an address with nothing else attached. That is why the above interface works, because the compiler vendor chose to implement a Fortran pointer-to-scalar this way (they weren't required to) and because you are passing this pointer through an implicit interface to myFunction.

That being said, your interface above is for an integer*8 data type, not a real*8 double type. So your interface should probably be this since you are using it with real*8 data:

    real*8,pointer  :: mxGetPr  ! <-- real*8 is the data type being pointed to, not the pointer itself
    mwPointer    :: pm

However, I don't think this is the cause of the problem. Since the interface to myFunction looks like it is implicit, even though you have declared XYZ to point to integer*8 (even though it is really pointing to real*8), what will likely happen is that the address will simply get put on the argument stack for the function call and myFunction will end up using this memory properly as real*8 data.

If you get rid of the OpenMP stuff and just do your mxGetPr stuff, do you still have a crash problem?

James Tursa on 4 Oct 2016

Off hand, the first thing I would look at is the argument list for each of those calls. Check to see that you are passing in the same stuff (same number of inputs, same sizes, same classes, etc). It could be that there is a difference that the mex routine doesn't check for.

Ugur on 4 Oct 2016

Well, I think I checked this, and indeed, what I did is to split the m-file from the line it calls the mex function, and write the rest into the main program. So the arguments remain the same but it crashes in the former while it works in the latter case.

Sign in to comment.

An empty do loop giving seg fault with OpenMP while mexing with fortran

0 Comments
Show -2 older commentsHide -2 older comments

Answers (1)

18 Comments
Show 16 older commentsHide 16 older comments

See Also

Categories

Tags

Community Treasure Hunt

An empty do loop giving seg fault with OpenMP while mexing with fortran

0 Comments Show -2 older commentsHide -2 older comments

Answers (1)

18 Comments Show 16 older commentsHide 16 older comments

See Also

Categories

Tags

Community Treasure Hunt

0 Comments
Show -2 older commentsHide -2 older comments

18 Comments
Show 16 older commentsHide 16 older comments