The top answer is an excellent demonstration of the central limit theorem and why the question's assumption that rand() and rand() * rand() are "equally random" is wrong.
The central limit theorem is one of the most initially surprising and fascinating things in all of math. Please support the central limit theorem by learning about it on Wikipedia, or wherever fine theorems are discussed.
Edit to add: If you haven't learned about this enough to know that implementing systems that depend on randomness, like cryptographic systems, is really, really difficult and error-prone, please just use a standard library instead.
It's the most counter-intuitive thing in the world that rand() * rand() is less secure than rand(); shouldn't it be twice as unpredictable? But it's not. Cryptography and related fields are rife with counter-intuitive gotchas.
Here's some intuition about why rand() * rand() is different than rand().
Let's say you are doing rand(2) which produces either 0 or 1, each equally often, then rand(2) * rand(2) will produce 0 much more often than 1.
0 * 0 = 0
0 * 1 = 0
1 * 0 = 0
1 * 1 = 1
Addition has a similar problem.
0 + 0 = 0
0 + 1 = 1
1 + 0 = 1
1 + 1 = 2
If you use the right range and wrap the result. e.g. if you do mod(rand(2^n) + rand(2^n), 2^n) then addition will not affect the distribution of your numbers.
Or you could argue formally by pointing out that the maximum entropy distribution on the interval [0,1] is the uniform distribution, so modifying that distribution will only decrease its entropy, which means that it's "more predictable".
I remember when I first learned about the central limit theorem, having this sudden realization:
So that's where all those normal distributions come from!
In the statistics classes that most people take, they just dump this weird equation on you, and say that the normal distribution often arises in the real world. But they don't explain why, or in what kinds of situations you're likely to see things being normally distributed. It's like suddenly finding a walrus in your living room: everybody assures you that it's supposed to be there, but where the hell did it come from?
The central limit theorem de-mystifies the normal distribution and makes the world easier to understand.
It's the most counter-intuitive thing in the world that rand() rand() is less secure than rand(); shouldn't it be twice as unpredictable?*
Under some reasonable assumptions, rand() + rand() mod 1 is twice as random! If both calls return independent samples, and at least one of the calls to rand() returns a uniform sample in [0,1], then the sum will be uniform on [0,1].
So if you assume you get at least one good shot and that the other is not specifically correlated to it, you'll be improving your randomness.
Oh man, I would totally upvote you in thanks, but I'm too lazy. Not to mention the moralist in me wanting to instead berate you for keeping the lazy that way. Luckily the karma whore in me has trumped all that and convince the rest to reply with snark :)
Anyone interested in understanding randomness should endeavor to understand what probability actually is. As a Bayesian I think that probability is not a property of any object but rather a subjective measure of your own confidence, your own knowledge. To say something is random is just to say that you lack enough information to predict any one outcome over another. But everything could be predicted perfectly (barring quantum events) with enough knowledge. If I knew that a coin was weighted to favor heads my degree of belief in the outcome of the toss would become stronger. The probability for me would become, say, 75% heads, while for the observer without such knowledge it would be 50%.
>@Thilo No, it doesn't... ? If a random variable is uniformly distributed in the range (0,1), and you sample the variable n times, and take the sum, it will just be uniformly distributed in the range (0,n). – user359996 12 hours ago
It's so easy to sound certain and be wildly wrong.
How is that related to the SO question? The post is asking about how to tell if one distribution is "more random" than another, which is what entropy is all about.
Consider a random number generator that produces integers between 0 and 15. Here's a really crappy algorithm:
1. Start with a truly random seed between 0 and 15.
2. Increment it each time to generate a random number, modulo 16.
Suppose you start at 11. Your sequence of random numbers will be 11, 12, 13, 14, 15, 1, 2, 3 ....
This is obviously not very random. But look at the entropy of it. All values between 0 and 15 are equally likely, so this will have the maximum possible entropy: 4 bits.
The problem here is that, for entropy to be an accurate measurement of information content, you have to assume that you're measuring independent identically distributed random variables. The outputs from a pseudo-random number generator are not independent.
If we are interested in a _sequence_ of pseudorandom numbers, then we should be talking about the _joint_ entropy of the sequence. But information theory is plenty capable of describing "how random" a sequence is.
The generator you are talking about is far from producing a uniform distribution (which would have maximum entropy) jointly over the n-dimensional hypercube that would represent a sequence of n draws from your generator, so it has less entropy and is thus "less random" than the distribution you'd get from n independent draws from a perfect rand().
More interesting to grasp is that even
rand() * rand() / rand()
is less evenly distributed than rand().
It does seem like it should be possible to remap the numbers onto an even distribution if you knew how uneven to expect it to be. That would probably lead to some loss of fidelity at the limits of accuracy of the variable, however.
I like the comments about dice. With one die, an outcome of 4 or 2 is equally likely. With two dice, an outcome of 4 is three times more likely than 2.
So while one and two dice games are both random (as in unpredictable), only the one-die game has uniformly distributed outcomes.
The central limit theorem is one of the most initially surprising and fascinating things in all of math. Please support the central limit theorem by learning about it on Wikipedia, or wherever fine theorems are discussed.
Edit to add: If you haven't learned about this enough to know that implementing systems that depend on randomness, like cryptographic systems, is really, really difficult and error-prone, please just use a standard library instead.
It's the most counter-intuitive thing in the world that rand() * rand() is less secure than rand(); shouldn't it be twice as unpredictable? But it's not. Cryptography and related fields are rife with counter-intuitive gotchas.