In all 64-bit versions of MATLAB, the built-in limits are:
- no array dimension may exceed 2^48-1 elements, even if the array is empty. This is a MATLAB limitation.
- no array may occupy more than 2^48-1 bytes. This is a MATLAB limitation that builds upon a hardware limitation -- namely that no publicly known x64 architecture has more than 48 address lines connected.
- the total amount of memory allocated to the process may not exceed 2^48-1 bytes. At the moment I am not sure whether this includes virtual memory allocated to shared DLLs, but I suspect it does. This is a MATLAB limitation that builds upon a hardware limitation -- namely that no publicly known x64 architecture has more than 48 address lines connected. However, the specific question of whether the allocated memory includes shared DLLs is an operating-system question (possibly combined with hardware), as operating systems have access to memory with the top address bit set and can use hardware virtual-memory mapping to map it to other locations.
- no more than 8192 gigabytes of memory may be reserved for Java use. This is a MATLAB limitation if I understand correctly -- though it is rooted in historical limitations on the JVM that have since been removed for 64-bit JVMs.
- Student licenses and Home licenses are restricted to 1000 virtual blocks in Simulink. This is purely a MATLAB marketing limitation.
- There is a Preference (that you already found) that can be set to limit any one array to a fraction of the maximum memory. Besides helping to "play nice" on a shared machine, this can help prevent MATLAB from swapping to disk when large arrays are used. If this limit is exceeded, an error message is generated.
- Only one GPU may be used at a time on any one process. I do not know if this is a hardware limitation, a MATLAB limitation, or a driver limitation.
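You can query some of these limits from within MATLAB itself. A small sketch (note that the `memory` function is only available on Windows, so it is guarded here):

```matlab
% Query the maximum number of array elements permitted on this platform.
% The second output of computer() is the maximum array size in elements.
[arch, maxElements] = computer;
fprintf('Architecture: %s, max elements per array: %d (= 2^%g - 1)\n', ...
        arch, maxElements, log2(maxElements + 1));

% On Windows, memory() reports how much the process can still allocate.
if ispc
    userview = memory;
    fprintf('Largest possible array: %d bytes\n', userview.MaxPossibleArrayBytes);
end
```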
Other than that, I cannot think of any other limitations.
In particular, there is no limitation to 75 gigabytes.
However... MATLAB only allocates as much memory as is required by the calculations. Allocating more memory would not speed up the calculation.
For some kinds of matrix calculations, MATLAB automatically splits the calculation between all available cores. Unless very large arrays are being used, it is common to see little performance improvement beyond roughly 8 cores, due to the time required to distribute the task and collect the results. If MATLAB is automatically splitting tasks between cores much of the time, then you would expect to see a number of cores busy for MATLAB in your task manager.
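You can observe and experiment with this implicit multithreading using `maxNumCompThreads` -- treat this as a rough experiment rather than production code, since MathWorks has long documented setting it as subject to removal:

```matlab
% How many computational threads MATLAB currently uses for
% implicit parallelism in matrix operations.
n = maxNumCompThreads;
A = rand(4000);
tic; B = A * A; tMulti = toc;

% Temporarily restrict MATLAB to a single computational thread
% and repeat the same multiplication for comparison.
oldN = maxNumCompThreads(1);
tic; B = A * A; tSingle = toc;
maxNumCompThreads(oldN);   % restore the previous setting

fprintf('1 thread: %.3f s, %d threads: %.3f s\n', tSingle, n, tMulti);
```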
MathWorks offers a Parallel Computing Toolbox, which allows you to split a task between cores. The most common experience when first using the toolbox is that it makes programs take longer -- starting the processes, transferring data to them, and retrieving the results takes a fair bit of overhead. To get any value from this approach, you need well-separated large tasks, especially ones where the amount of work to be done is high compared to the amount of data to be transferred.
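A quick way to see that overhead is to time the same deliberately tiny loop as `for` and as `parfor`; on short tasks the `parfor` version frequently loses. This sketch assumes Parallel Computing Toolbox is installed:

```matlab
% Make sure a pool exists; starting one is itself a large, one-time cost.
if isempty(gcp('nocreate'))
    parpool;
end

N = 1e4;
x = zeros(1, N);

tic;
for k = 1:N
    x(k) = sum(rand(1, 100));   % a deliberately tiny unit of work
end
tFor = toc;

tic;
parfor k = 1:N
    x(k) = sum(rand(1, 100));   % same work, plus scheduling overhead
end
tParfor = toc;

fprintf('for: %.3f s, parfor: %.3f s\n', tFor, tParfor);
```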
The automatic splitting of array calculations between available cores also factors into using Parallel Computing Toolbox: the default number of cores allocated per worker is 1, so matrix calculations that used to be split between available cores will instead run on one core, making them take longer.
I mentioned above that the performance improvement for automatic splitting is often not much beyond 8 cores. That suggests that for some situations, an effective strategy is to split the work between 4 workers (each allocated 8 cores), or 5 workers (each allocated 6 cores), or 6 workers (each allocated 5 cores), or 8 workers (each allocated 4 cores). The more workers you allocate, the more simultaneous progress you can make on tasks that cannot easily be done in parallel. There are some tasks that are highly iterative and cannot be vectorized, so sometimes allocating many workers each with only one core is worth doing... but more often, splitting between workers that each have access to multiple cores turns out to be more productive.
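In releases that support the NumThreads cluster property (roughly R2020b and later, if I recall correctly), you can request multiple computational threads per worker. A sketch asking for 4 workers with 8 threads each -- adjust both numbers to your actual core count, and note that the profile is named 'local' rather than 'Processes' in older releases:

```matlab
% Configure a process-based pool where each worker may use
% multiple computational threads for implicit matrix parallelism.
c = parcluster('Processes');   % 'local' in releases before R2022b
c.NumThreads = 8;              % computational threads per worker
pool = parpool(c, 4);          % 4 workers x 8 threads = 32 cores total

s = zeros(1, 4);
parfor k = 1:4
    % Each worker can now multithread its own matrix operations.
    A = rand(3000);
    s(k) = sum(sum(A * A));
end

delete(pool);
```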
But all of this parallel work depends upon there being little communication between workers. If you have goals that can be met by processing large blocks independently and then stitching the results together at the seams, then SPMD might be the best approach.
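As a minimal sketch of the SPMD pattern: each worker carves out and processes its own block with no communication, and the pieces are gathered and stitched together afterwards. (`spmdIndex` and `spmdSize` are the newer names; releases before R2022b call them `labindex` and `numlabs`. The `cumsum` here is just a stand-in for real per-block work.)

```matlab
data = rand(1, 1e6);

spmd
    % Each worker computes which slice of the data belongs to it.
    n = numel(data);
    edges = round(linspace(0, n, spmdSize + 1));   % spmdSize = #workers
    myBlock = data(edges(spmdIndex) + 1 : edges(spmdIndex + 1));
    myResult = cumsum(myBlock);   % stand-in for the real per-block work
end

% myResult is a Composite: one entry per worker. Stitch at the seams.
stitched = [myResult{:}];
```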