Poormina: This code does (or at least should) produce a single undirected edge between any pair of nodes.
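As a note: if you don't care about multiple edges and self loops, there is a much simpler algorithm: pick d/2 permutations at random and connect every node i to its image under each permutation. Each permutation adds 2 to every node's degree, so this needs d to be even.
This code is complicated because it avoids multiple edges and self loops.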
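
For reference, here is a minimal sketch of the simpler permutation approach mentioned above, written in Python rather than the language of the original code. The function name `random_regular_multigraph` and its parameters are made up for illustration. It assumes d is even and returns a plain edge list that may contain self loops and repeated edges.

```python
import random

def random_regular_multigraph(n, d, seed=None):
    """Union of d/2 random permutations on n nodes (hypothetical helper).

    The result is d-regular as a multigraph: it may contain self loops
    and multiple edges, and a self loop counts twice toward the degree.
    """
    if d % 2 != 0:
        raise ValueError("d must be even for the permutation construction")
    rng = random.Random(seed)
    edges = []
    for _ in range(d // 2):
        perm = list(range(n))
        rng.shuffle(perm)
        # Node i gets one edge to perm[i] and one edge from the node
        # that maps to i, so each permutation adds 2 to every degree.
        edges.extend((i, perm[i]) for i in range(n))
    return edges
```

Calling `random_regular_multigraph(10, 4)` gives 20 edges in which every node has degree 4, counting a self loop twice.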
So it seems these are all problems of dynamically allocating memory and copying data. Here are 2 optimizations that can solve this:
1. Replace the sparse n*n matrix A by a preallocated n*d adjacency list, filled initially with -1.
2. Instead of recreating U (half edges), keep its current size in a separate variable and update it in place: to remove an element, swap it with the last active element and decrement the size.
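
The two optimizations can be sketched as follows. This is a Python illustration under stated assumptions, not the original code: the pairing loop is a simplified stand-in, and the names `random_regular_graph`, `max_restarts` and `max_fails` are hypothetical. The give-up rule (restart after `max_fails` rejected picks in a row) is a heuristic chosen for brevity, and the sketch makes no claim about the resulting distribution being uniform.

```python
import random

def random_regular_graph(n, d, seed=None, max_restarts=1000, max_fails=200):
    """Simple d-regular graph as an n x d adjacency list (illustrative sketch)."""
    if (n * d) % 2 != 0 or not 0 <= d < n:
        raise ValueError("need n*d even and 0 <= d < n")
    rng = random.Random(seed)
    for _ in range(max_restarts):
        # Optimization 1: preallocated n x d adjacency list, -1 means empty.
        adj = [[-1] * d for _ in range(n)]
        deg = [0] * n
        # Optimization 2: U is allocated once; only U[:size] is active.
        U = [v for v in range(n) for _ in range(d)]
        size = len(U)
        fails = 0
        while size > 0 and fails < max_fails:
            i = rng.randrange(size)
            j = rng.randrange(size)
            u, v = U[i], U[j]
            # Reject self loops (this also covers i == j) and repeated edges.
            # Empty slots hold -1, so the membership test scans at most d entries.
            if u == v or v in adj[u]:
                fails += 1
                continue
            fails = 0
            adj[u][deg[u]] = v
            deg[u] += 1
            adj[v][deg[v]] = u
            deg[v] += 1
            # Swap-remove both half edges, larger index first, so the
            # smaller index is still inside the active range afterwards.
            for k in (max(i, j), min(i, j)):
                size -= 1
                U[k], U[size] = U[size], U[k]
        if size == 0:
            return adj
    raise RuntimeError("no simple graph found within max_restarts attempts")
```

Nothing is allocated inside the inner loop: adding an edge writes into a free slot of `adj`, and removing a half edge is one swap plus a decrement of `size`.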