The parfor implementation uses a number of different strategies to try to keep the workers busy. However, sometimes these might turn out not to work well for a particular problem. There is no way to explicitly request that the workers start the next iteration; that is all handled automatically by the parfor implementation. I'm a bit surprised that you're seeing workers sit idle for a noticeable amount of time and then later become busy again. (Of course, at the end of the loop there's always going to be a "tail" where some workers have finished their work while others are still running.) I wouldn't normally expect that to happen unless you've got a huge amount of data to transfer, and even then the parfor implementation tries pretty hard to hide data transfers by running them concurrently with other work.
One alternative you might explore is to use parfeval rather than parfor. This uses a simpler scheduling mechanism than parfor, but it might work better for your situation. Briefly, each call to parfeval schedules a single remote function evaluation on a worker and returns a future. You then call fetchNext to collect results one at a time as they complete, or fetchOutputs to wait for all of them. A bit like this:
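% fcn is a handle to your function (placeholder name); f holds the futures
for k = 1:size(xx,1)
    f(k) = parfeval(fcn, 1, xx(k,:), k);
end
% Collect results as they finish; k is the index of the completed future
for idx = 1:size(xx,1)
    [k, data] = fetchNext(f);
    out(k,:) = data;
end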
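
If you don't need to process results as they arrive, fetchOutputs is probably the simpler way to collect them. Here is a minimal sketch of that variant. Note that fcn and the contents of xx are made-up stand-ins for your own function and data, and I'm assuming each call returns a single row of the same width:

```matlab
% Hypothetical stand-ins for your real function and data
fcn = @(row, k) row * k;
xx  = rand(8, 3);
N   = size(xx, 1);

% Preallocate the array of futures
f(1:N) = parallel.FevalFuture;

% Schedule one remote evaluation per row; parfeval returns immediately
for k = 1:N
    f(k) = parfeval(fcn, 1, xx(k,:), k);
end

% Block until every future has finished. The outputs are concatenated
% vertically in the order of the futures, so row k of out belongs to f(k).
out = fetchOutputs(f);
```

The trade-off is that fetchOutputs waits for everything before returning, whereas the fetchNext loop lets you start using each result as soon as it is ready.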