When you decompose a function f in an orthonormal basis, Parseval's identity relates the sum of the squared coefficients to the squared L2-norm (read: energy) of the function. To get the best approximation in a lossy compression, you want to kill the coefficients that are smallest when squared, i.e. smallest in magnitude, since those carry the least energy.
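To spell out the reasoning, here is the identity written out. The notation is my own and not part of the question: \phi_k are the orthonormal basis functions, c_k the coefficients, and S the set of indices you keep.

```latex
\[
\|f\|_2^2 = \sum_k |c_k|^2, \qquad c_k = \langle f, \phi_k \rangle
\]
\[
\|f - f_S\|_2^2 = \sum_{k \notin S} |c_k|^2, \qquad f_S = \sum_{k \in S} c_k \phi_k
\]
```

The second line says the squared reconstruction error is exactly the energy of the discarded coefficients, so for a fixed number of kept terms the error is smallest when S holds the largest |c_k|.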
The number of coefficients you decide to threshold depends on the amount of compression you're looking for and the quality of reconstruction you can accept: as you threshold more coefficients, you degrade the quality of the reconstruction.
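A minimal sketch of that trade-off in Python, assuming a 1-D signal and an orthonormal DCT-II basis built by hand with NumPy. The basis, the test signal, and the helper names `dct_matrix`, `keep_largest` and `relative_error` are illustrative choices of mine, not anything specific to your setup.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix; each row is one basis vector."""
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :]
    basis = np.sqrt(2.0 / n) * np.cos(np.pi * (t + 0.5) * k / n)
    basis[0, :] /= np.sqrt(2.0)  # rescale the constant row to unit norm
    return basis

def keep_largest(coeffs, keep):
    """Zero all but the `keep` largest-magnitude coefficients."""
    out = np.zeros_like(coeffs)
    if keep > 0:
        idx = np.argsort(np.abs(coeffs))[-keep:]
        out[idx] = coeffs[idx]
    return out

def relative_error(f, basis, keep):
    """Relative L2 error after keeping `keep` coefficients."""
    c = basis @ f                               # analysis
    f_hat = basis.T @ keep_largest(c, keep)     # synthesis
    return np.linalg.norm(f - f_hat) / np.linalg.norm(f)

n = 256
x = np.linspace(0.0, 1.0, n)
f = np.sin(2 * np.pi * 3 * x) + 0.5 * x  # hypothetical test signal
B = dct_matrix(n)
for keep in (8, 32, 128):
    print(keep, relative_error(f, B, keep))
```

Because the basis is orthonormal, the error printed for each `keep` is exactly the norm of the discarded coefficients divided by the norm of the signal, so it can only go down as you keep more terms.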
Now, using curvelets is another story, since a curvelet transform is highly redundant (not orthogonal), so the simple energy argument above no longer applies exactly. With that said, you could still proceed by killing those coefficients that are "small enough," and just monitor the quality of the reconstruction as you do.
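As a rough illustration of "threshold and monitor" with a redundant transform, here is a sketch in Python. It does not use a curvelet transform: as a stand-in I assume a toy 2x redundant tight frame (spikes plus an orthonormal DCT), and the names `tight_frame` and `threshold_and_monitor` are hypothetical. With a real curvelet library you would swap in its forward and inverse transforms.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix; each row is one basis vector."""
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :]
    basis = np.sqrt(2.0 / n) * np.cos(np.pi * (t + 0.5) * k / n)
    basis[0, :] /= np.sqrt(2.0)
    return basis

def tight_frame(n):
    """Analysis operator of a toy 2x redundant frame (assumption, not curvelets)."""
    return np.vstack([np.eye(n), dct_matrix(n)]) / np.sqrt(2.0)

def threshold_and_monitor(f, A, thresholds):
    """Hard-threshold the frame coefficients and record the reconstruction quality."""
    c = A @ f
    report = []
    for t in thresholds:
        mask = np.abs(c) >= t
        f_hat = A.T @ np.where(mask, c, 0.0)  # synthesis with the adjoint
        err = np.linalg.norm(f - f_hat) / np.linalg.norm(f)
        report.append((t, mask.mean(), err))  # (threshold, kept fraction, rel. error)
    return report

n = 128
x = np.linspace(0.0, 1.0, n)
f = np.sin(2 * np.pi * 3 * x)
f[40] += 2.0  # add a spike so both halves of the frame matter
A = tight_frame(n)
for t, kept, err in threshold_and_monitor(f, A, [0.0, 0.01, 0.05, 0.2]):
    print(f"threshold={t:.2f} kept={kept:.2%} rel_err={err:.3e}")
```

Here the error is no longer just the energy of what you threw away, which is why you watch the reconstruction error directly and stop raising the threshold once it exceeds what you can tolerate.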