I tried your function on a large array (e.g. 5*5*1000000), and it uses too much memory: at least 5 times as much as the input data itself. Most of that memory goes to storing the sparse indices. Is there any way to reduce the memory consumption? Thanks!