It depends on the problem. Parallelization is not trivial. If you use, e.g., 16 cores and write the results to neighboring elements of a UINT8 vector, the threads compete for the same cache line (an effect known as false sharing). As a result, the total computing time can exceed that of the serial code, because the threads are waiting for each other. Such collisions can also occur for other resources, e.g. if the memory is exhausted and expensive swapping to disk is used, or if data are requested over a network.
Many Matlab functions are multi-threaded, e.g. sum: For large inputs Matlab computes the sum in several parts using different threads. For a 1e5 x 1e5 matrix all cores are used (most likely). Computing this in a parfor loop is less efficient, because there is some overhead for starting the workers and transferring the data to them. The multi-threaded functions are written such that resource collisions are avoided, at least in most cases. In some cases, e.g. logical indexing or cell2mat, there is some potential for improvement.
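To see this overhead on your own machine, you can time the built-in sum against a hand-written parfor reduction. This is only a rough sketch: the vector length and the number of chunks (nChunk) are arbitrary assumptions, and the parfor part needs the Parallel Computing Toolbox to actually run in parallel (without it, the loop runs serially).

```matlab
% Compare the built-in, multi-threaded sum with a manual parfor reduction.
x      = rand(1, 1e7);     % assumed test size, adjust to your memory
nChunk = 16;               % hypothetical number of chunks
edges  = round(linspace(0, numel(x), nChunk + 1));

% Built-in sum: Matlab decides internally how many threads to use
tic;
s1 = sum(x);
tBuiltin = toc;

% Split the data into chunks before timing the loop
chunks = cell(1, nChunk);
for k = 1:nChunk
    chunks{k} = x(edges(k) + 1:edges(k + 1));
end

% Manual reduction: each iteration sums one chunk
% (the first parfor call may also include the time to start the pool)
tic;
s2 = 0;
parfor k = 1:nChunk
    s2 = s2 + sum(chunks{k});   % s2 is a parfor reduction variable
end
tParfor = toc;

fprintf('sum: %.4f s, parfor: %.4f s, rel. difference: %.2e\n', ...
    tBuiltin, tParfor, abs(s1 - s2) / abs(s1));
```

The two results can differ by rounding, because the order of the additions is not the same. Run the parfor part twice to separate the cost of starting the pool from the cost of the loop itself.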
So before you start to parallelize a function, check whether its sequential version already uses many cores. If it does, starting multiple workers of a parpool on a single machine will not improve the efficiency.
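One way to check this, as a sketch: time the function with the default number of computational threads and again with a single thread, using maxNumCompThreads. The function sum and the input size here are only placeholders for your own code. If the single-threaded run is clearly slower, the function is most likely multi-threaded already.

```matlab
% Does the function profit from multiple computational threads?
x = rand(1, 2e7);                   % assumed test input, adjust as needed
f = @() sum(x);                     % placeholder for the function to check

nOrig  = maxNumCompThreads;         % current number of computational threads
tMulti = timeit(f);                 % timing with the default setting

maxNumCompThreads(1);               % restrict Matlab to a single thread
tSingle = timeit(f);
maxNumCompThreads(nOrig);           % restore the original setting

fprintf('%d threads: %.4f s, 1 thread: %.4f s\n', nOrig, tMulti, tSingle);
```

Similar timings do not prove that the function is single-threaded: it might be limited by the memory bandwidth, or the input might be too small for Matlab to start several threads.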