You could fwrite() RightIrix_Training224 to a binary file, and fread() it back when you need it.
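A minimal sketch of that round trip, assuming the array is a double 3D stack. The stand-in data, its 224x224x10 size, and the file name are all hypothetical; in your code you would write RightIrix_Training224 itself. Note that fwrite() stores only raw values, so the array size has to be kept separately and restored with reshape().

```matlab
% Stand-in for RightIrix_Training224 (hypothetical size and contents).
data = rand(224, 224, 10);
sz = size(data);

fname = fullfile(tempdir, 'training_stack.bin');   % hypothetical file name

% Write the raw values; the precision must match the class of the array.
fid = fopen(fname, 'w');
fwrite(fid, data, 'double');
fclose(fid);

% Read the values back as a column vector, then restore the original shape.
fid = fopen(fname, 'r');
restored = fread(fid, prod(sz), 'double');
fclose(fid);
restored = reshape(restored, sz);
```

If the array were stored as uint8 instead, the same pattern should work with 'uint8' for fwrite() and 'uint8=>uint8' for fread().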
Note that if your bmp files are uint8 images (not uncommon), then mat2gray() makes the storage requirements 8 times higher, because it returns double (8 bytes per pixel instead of 1). That is part of what makes the reading slow. I would not be surprised if it were faster to save() a uint8 3D array, load() that, and convert to the 0 to 1 range only when you need to.
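As a rough illustration of the size difference and of the save()/load() alternative, here is a sketch under assumed names. The variable imgs, its size, and the file name are hypothetical stand-ins, and the division by 255 assumes the images really span the full uint8 range.

```matlab
% Stand-in uint8 image stack (hypothetical size and contents).
imgs = randi([0 255], [224 224 10], 'uint8');

% The same data as double takes 8 bytes per element instead of 1.
asDouble = double(imgs) / 255;
infoU8 = whos('imgs');
infoDbl = whos('asDouble');
ratio = infoDbl.bytes / infoU8.bytes;

% Store the compact uint8 version and convert only after loading.
fname = fullfile(tempdir, 'training_uint8.mat');   % hypothetical file name
save(fname, 'imgs');
loaded = load(fname);
scaled = double(loaded.imgs) / 255;   % values in the 0 to 1 range
```

Whether this is actually faster than your current approach is something to time on your own data.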
A related question is whether you are relying on the fact that mat2gray(A) with no range provided scales so that max(A(:)) becomes 1 and min(A(:)) becomes 0. If so, redoing that calculation each time you load() might not be efficient enough. But if you know that your images range from 0 to 255 anyway (or should), then the scaling calculation becomes trivial after loading the uint8 3D array.
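To make the difference between the two scalings concrete, here is a small sketch that writes out both by hand rather than calling mat2gray(), so it runs without the toolbox. The 2x2 array A is made up for illustration.

```matlab
% Tiny made-up image whose values do not span the full 0 to 255 range.
A = uint8([50 100; 150 200]);
Ad = double(A);

% Data-dependent scaling, which is what mat2gray(A) does with no range:
% the minimum maps to 0 and the maximum maps to 1.
lo = min(Ad(:));
hi = max(Ad(:));
perImage = (Ad - lo) / (hi - lo);   % assumes hi > lo (non-constant image)

% Fixed-range scaling, the trivial case when 0 to 255 is the known range.
fixed = Ad / 255;
```

The two results differ whenever the data does not actually reach both 0 and 255, so which one you want depends on whether your later code relies on the per-array stretch.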