There are several confounding factors here that go at least some way towards explaining your timings.
In your first example (the for loop running in the client), the call to MATLAB's pinv is intrinsically multithreaded - i.e. MATLAB automatically runs it on multiple threads. That means that all your computational resources are already being fully utilised in running the loop. There's nothing more to give from your computer, and the only way you're going to get a faster time is by getting more hardware.
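One way to check the implicit multithreading on your own machine is to time pinv with the default thread limit and again with MATLAB restricted to a single computational thread. This is only a rough sketch: the matrix A and its size are made up for illustration, so substitute the matrix from your own loop.

```matlab
% Hypothetical test matrix; substitute the matrix from your own loop.
A = rand(1500);

% Time pinv with MATLAB's default thread limit.
nDefault = maxNumCompThreads;
tDefault = timeit(@() pinv(A));

% Restrict MATLAB to one computational thread and time it again.
% maxNumCompThreads(N) returns the previous setting, so it can be restored.
nPrevious = maxNumCompThreads(1);
tSingle = timeit(@() pinv(A));
maxNumCompThreads(nPrevious);

fprintf('%d thread(s): %.3f s\n', nDefault, tDefault);
fprintf('1 thread:     %.3f s\n', tSingle);
```

If the single-threaded time is noticeably longer, the client was already using several cores for each pinv call in your plain for loop.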
This (somewhat) explains why your parfor loop is never going to be faster. The fact that it is slower is a little more confusing, and there are several possibilities.
The first thing to consider is how many hardware threads your CPU supports. MATLAB's maxNumCompThreads can tell you that. If the answer is not 4, then you are probably not seeing the best performance from the pool. This is because (by default) parallel pool workers operate in "single computational thread" mode - so they will not intrinsically multi-thread the calls to pinv, and the pool as a whole uses only as many computational threads as it has workers.
There's also overhead in getting your large-ish matrix to the workers, and that will slow things down.
Walter's comment about the parfor scheduling is correct in general, but not quite right in this specific case - when the number of loop iterations happens to be the same as the number of workers in the pool, exactly one iteration is sent to each worker. So that aspect isn't slowing you down here.
All that said, I am surprised at how poorly the parfor loop is performing in this case. I would have expected it to be probably only slightly slower than the for loop.
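To see how these factors play out on your machine, something along these lines may help. It is a sketch under assumptions: the matrix A, its size, and the choice of one iteration per worker are made up for illustration, and ticBytes/tocBytes need a reasonably recent release of Parallel Computing Toolbox.

```matlab
% Get the current pool (one is started if none exists and auto-create is enabled).
pool = gcp;
fprintf('Workers in pool:        %d\n', pool.NumWorkers);
fprintf('Client thread limit:    %d\n', maxNumCompThreads);

% Ask each worker how many computational threads it will use.
spmd
    workerThreads = maxNumCompThreads;
end
fprintf('Threads on first worker: %d\n', workerThreads{1});

% Hypothetical workload: one pinv call per worker on a made-up matrix.
NTasks = pool.NumWorkers;
A = rand(1000);
results = cell(1, NTasks);

% Measure how much data is transferred to and from the workers.
ticBytes(pool);
parfor i = 1:NTasks
    results{i} = pinv(A);   % A is broadcast to every worker
end
tocBytes(pool)
```

The byte counts give a feel for the cost of sending the matrix to each worker, and the thread counts show whether the workers together can use as many threads as the client does on its own.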
Finally, your for-drange example isn't right - you're running it on the client there, which is why you're seeing the same time as the plain for loop. You need to put the loop inside an spmd block, i.e.
spmd
    for i = drange(1:NTasks)
        % ... the body of your original loop goes here ...
    end
end