Fixed CUDA random generation entirely.

The generator's state machine is now called piecewise, in blocks of
size at most 2^24. 2^24 is the limit up to which single-precision
floats represent every integer exactly (24-bit significand). Since the
randomizer's state is encoded in the vector itself (as floats),
splitting the calls into blocks was the easiest fix.
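
For illustration only, here is a minimal host-side sketch of the two
ideas above: the float32 precision limit at 2^24, and splitting a large
request into blocks of at most 2^24. It is not the code from this
commit; the names MAX_BLOCK, to_float32 and split_into_blocks are
hypothetical, and the real fix lives in the CUDA code.

```python
import struct

# Assumption: the block limit is exactly 2^24, as stated in the message.
MAX_BLOCK = 1 << 24


def to_float32(x):
    """Round a number to the nearest single-precision float."""
    return struct.unpack("f", struct.pack("f", float(x)))[0]


def split_into_blocks(total, max_block=MAX_BLOCK):
    """Return (offset, length) pairs covering [0, total),
    each block holding at most max_block elements."""
    return [
        (start, min(max_block, total - start))
        for start in range(0, total, max_block)
    ]


# 2^24 survives a round trip through float32 ...
print(to_float32(MAX_BLOCK) == MAX_BLOCK)          # True
# ... but 2^24 + 1 does not; it rounds back down to 2^24.
print(to_float32(MAX_BLOCK + 1) == MAX_BLOCK + 1)  # False

# A request for 40,000,000 values would be served in three calls.
for offset, length in split_into_blocks(40_000_000):
    print(offset, length)
```

Each (offset, length) pair would correspond to one call into the
generator, so any index or counter held in a float stays within the
range where it is exact.
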
1 file changed