28 comments on “The bell has tolled for rand()”

  1. Great article!

    Although mt19937 is a great default and unbeatable for its quality (among “insecure” generators, of course), I think the taus88 engine deserves a bit more love than it typically receives, especially in the often-resource-sensitive world of C++. Here’s why:

    sizeof(boost::mt19937) == 2504
    sizeof(boost::taus88) == 12

    It fares pretty darn well on DIEHARD, and sports a still-unearthly period of 2^88, but gets away with two orders of magnitude less state. That in turn makes it much more practical to associate internal RNG seeds on a fine-grained basis (think agents in a large simulation). It’s valuable when determinism and reproducibility are important, which in my experience is most of the time even when (especially when!) dealing with randomness. Thread-local generators are better than rand(), but they still sacrifice the modularity of your program’s determinism.
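A minimal sketch of the per-agent idea, under stated assumptions: `Agent` and `make_agents` are hypothetical names, and `std::minstd_rand` stands in for a small-state engine such as `boost::taus88` only so the example needs nothing outside the standard library. The seeding scheme (base seed plus index) is illustrative, not a recommendation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// A small-state engine is what makes one-engine-per-object affordable.
static_assert(sizeof(std::minstd_rand) < sizeof(std::mt19937),
              "the stand-in engine is expected to be smaller than mt19937");

// Hypothetical agent type: each agent owns its own engine, so its random
// stream depends only on its own seed, not on the order agents are updated.
struct Agent {
    std::minstd_rand rng;  // stand-in for a small engine such as boost::taus88

    explicit Agent(std::uint32_t seed) : rng(seed) {}

    // Roll a six-sided die from this agent's private stream.
    int roll() {
        std::uniform_int_distribution<int> die(1, 6);
        return die(rng);
    }
};

// Build n agents, deriving each seed from a base seed and the agent's index.
std::vector<Agent> make_agents(std::size_t n, std::uint32_t base_seed) {
    std::vector<Agent> agents;
    agents.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        agents.emplace_back(base_seed + static_cast<std::uint32_t>(i));
    assert(agents.size() == n);
    return agents;
}
```

Two populations built with the same base seed give each agent the same stream, regardless of which agent is asked to roll first.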

    • All true, and good points. I actually considered writing something pointing out that mt19937 is kinda big – so you shouldn’t use one as an internal PRNG for objects you’re going to have a lot of, but it wouldn’t matter if you used one as a “global” (thread-local) generator – but it wasn’t really relevant to the point at hand and I thought that would be distracting, so I left it out.

      I was thinking that one day I’d do a post just on the standard generators, adaptors, and distributions, talking about the pros and cons of each, and where you might use them. Now that you mention it, I might also mention some of the other stuff that’s in Boost.Random, too – they have that sweet table already all laid out. Or if you’ve written something on Boost.Random or the standard random library, feel free to drop a link to it.

  2. The only problem with the Mersenne Twister is the huge (for a PRNG) amount of memory it eats.
    The good thing is that the C++ RNG library is very flexible, and you can plug in any implementation you like.
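To illustrate the "plug in any implementation" point, here is a rough sketch of a tiny engine that the standard distributions accept. The class name `xorshift32` and the helper `roll_with_xorshift` are hypothetical; the algorithm is Marsaglia's 32-bit xorshift with shifts 13, 17, 5, chosen only because it is short, not because it is a quality recommendation.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <random>

// Hypothetical minimal engine: 4 bytes of state, usable with <random> distributions.
class xorshift32 {
public:
    using result_type = std::uint32_t;

    // A zero state would stay zero forever, so fall back to a nonzero constant.
    explicit xorshift32(result_type seed = 2463534242u)
        : state_(seed != 0 ? seed : 2463534242u) {}

    // With a nonzero state the output is never zero, so the minimum is 1.
    static constexpr result_type min() { return 1; }
    static constexpr result_type max() { return std::numeric_limits<result_type>::max(); }

    result_type operator()() {
        assert(state_ != 0);
        state_ ^= state_ << 13;
        state_ ^= state_ >> 17;
        state_ ^= state_ << 5;
        return state_;
    }

private:
    result_type state_;
};

// The distribution only needs result_type, min(), max(), and operator().
int roll_with_xorshift(xorshift32& rng) {
    std::uniform_int_distribution<int> die(1, 6);
    return die(rng);
}
```

Swapping engines then only means changing the type passed to the distribution; the calling code stays the same.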

  3. Personally, I think we should teach new programmers to use std::default_random_engine. Yes, odds are pretty good it will be the Mersenne Twister engine, but it was included in the standard as a way for an implementation to give something of a recommendation without first needing to discuss number theory.

    And are you sure that it’s possible for a std::srand(std::time(nullptr)) to cause a compile error on some platform? I can’t think of any arithmetic type that doesn’t implicitly convert to an unsigned int, but it’s possible that I’m overlooking something with the usual conversions.

    • Why leave it up to the implementation? You know mt19937 is good as a general purpose engine, so use it. I don’t see the logic in pretending you don’t know what you want and letting the library choose. You also don’t need to discuss number theory to tell beginners “use mt19937 by default”. You can do it in four words – the four words I just used in fact.

      I always compile with full warnings on, and warnings to errors, so implicit narrowing/truncating conversions kill the build for me. I forgot that’s due to my environment; it’s not universally true, though if you’re not turning warnings on and heeding them you’re just asking for trouble.
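For readers who want to see what "use mt19937 by default" looks like in practice, a minimal sketch follows. The names `roll_die` and `make_engine` are hypothetical, and seeding from a single `std::random_device` value is a simplification: one 32-bit seed cannot cover all of the engine's state.

```cpp
#include <cassert>
#include <random>

// Roll one six-sided die using the supplied engine.
int roll_die(std::mt19937& engine) {
    std::uniform_int_distribution<int> die(1, 6);
    const int result = die(engine);
    assert(result >= 1 && result <= 6);
    return result;
}

// Create an engine seeded from the system's random device.
// (Simplified: a single seed value, not a full seed sequence.)
std::mt19937 make_engine() {
    std::random_device rd;
    return std::mt19937(rd());
}
```

Nothing here needs a narrowing conversion, so it builds cleanly with conversion warnings promoted to errors.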

      • There are three reasons I can see for recommending std::default_random_engine: (1) the name makes its use obvious (“if you want a good all-around engine, try default_random_engine”); (2) the Standard doesn’t require it, but it is certainly legitimate to resolve to different implementations based on any number of features, such as word size or whether NDEBUG is defined; and (3) default_random_engine isn’t required to resolve to any of the engines in the Standard, it could resolve to an engine that calls the Intel assembly code instruction rdrand, for instance.

        I can understand forgetting that the compiler settings you use are, in practice, more strict than the Standard. And I know that many projects and programmers use the same settings you do. I simply wanted to be sure that I wasn’t overlooking something.

        • Well, (1) all that the name implies is that it is the library writer’s choice for the *default* engine, so i’d be careful leaping to the assumption that it’s a “good all-round” engine… for all you know it could just be a thin wrapper around the system’s rand (i can even see somebody thinking that’s a “good idea”) – and the standard itself suggests that it just needs to be good enough for “casual, inexpert, and/or lightweight use”, which doesn’t fill me with confidence as to its quality; (2) that would mean your program would behave radically differently whether NDEBUG is defined or not… that’s not a feature, that’s a bug; and (3) no, that’s not possible… it must be a deterministic engine.

          The bottom line is that you can’t expect that you’ll get a good engine with the default. You might, you might not. But i’m not a gambler, i’m a programmer… and for 14 fewer characters i can be assured of getting a high quality, fast engine on every platform. I don’t see the logic of pretending not to know what’s good and falling back on faith that the library writer knows better.

          • Looking at the working draft for C++11 ( http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3337.pdf , section 26.5.5, page 906), I see the explanatory note “The implementation may select this type [default_random_engine] on the basis of performance, size, quality, or any combination of such factors, so as to provide at least acceptable engine behavior for relatively casual, inexpert, and/or lightweight use. Because different implementations may select different underlying engine types, code that uses this typedef need not generate identical sequences across implementations.” So, it does appear that the choice has to give deterministic results, even if the determinism is unspecified.
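By contrast with default_random_engine, the standard pins down mt19937 exactly: it requires that the 10000th consecutive invocation of a default-constructed `std::mt19937` produce 4123659995. A small sketch that checks this (the helper name `nth_mt19937_value` is hypothetical):

```cpp
#include <cassert>
#include <random>

// Returns the n-th value (1-based) produced by a default-constructed std::mt19937.
std::mt19937::result_type nth_mt19937_value(unsigned long long n) {
    assert(n >= 1);
    std::mt19937 engine;    // default seed is fixed by the standard (5489u)
    engine.discard(n - 1);  // skip the first n-1 values
    return engine();
}
```

On a conforming implementation, `nth_mt19937_value(10000)` returns 4123659995 on every platform; no such guarantee exists for `std::default_random_engine`.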

          • Another way to tell is that in standardese a random *ENGINE* is deterministic – it has to have a seed() function and a way to serialize its state and so on – while a random *GENERATOR* can be non-deterministic (and the only generator in the standard library is random_device).
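A short sketch of what "a way to serialize its state" means for an engine, using the stream operators the standard requires engines to support. The helper names `save_state` and `load_state` are hypothetical. `std::random_device` offers no equivalent: it has neither `seed()` nor stream operators.

```cpp
#include <cassert>
#include <random>
#include <sstream>
#include <string>

// Save an engine's full state as text via operator<<.
std::string save_state(const std::mt19937& engine) {
    std::ostringstream out;
    out << engine;
    return out.str();
}

// Rebuild an engine from text previously produced by save_state.
std::mt19937 load_state(const std::string& text) {
    std::mt19937 engine;
    std::istringstream in(text);
    in >> engine;
    assert(!in.fail());
    return engine;
}
```

An engine restored this way compares equal to the original and continues the same sequence from the same point.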

  4. Pingback: Przyszłość std::rand() - Security News

  5. Very nice article. Quick note, the first code sample fails to compile. There are missing parentheses after ‘roll_die’.

  6. Good article, enjoyed the read. One question:

    What is the point of:

    auto main () -> int { }

    You know that main always returns an int and are specifying it that way, so why not just go with the traditional form?

    I am asking not about preferences, but if that is actually different.

    • It is actually different in general, but it doesn’t matter for main.

      I tell C++ beginners to *always* use trailing returns (if you even bother to use returns at all – you often don’t need them in C++14). In some situations they are necessary, in others they are not necessary but superior.

      In cases like main where it makes no difference (because int is a keyword so it can never be a dependent type, because it never depends on the arguments, and because you can’t avoid specifying that main returns int), I still encourage using trailing return style; mixing styles can lead to problems.
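As a small illustration of the "necessary in some situations" case, here is a sketch in C++11 terms. The function names `add`, `greet`, and `trailing_return_demo` are hypothetical. In `add`, the return type is written in terms of the parameters, which are only in scope after the parameter list; the leading form would need a workaround such as `std::declval`.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// The return type depends on the parameters, so it comes after them.
template <typename T, typename U>
auto add(T a, U b) -> decltype(a + b) {
    return a + b;
}

// Not required here, but the trailing form keeps the style uniform.
auto greet(const std::string& name) -> std::string {
    return "Hello, " + name;
}

static_assert(std::is_same<decltype(add(1, 2.5)), double>::value,
              "int + double yields double");

// Usage: both functions are called like any other.
void trailing_return_demo() {
    assert(add(1, 2.5) == 3.5);
    assert(greet("world") == "Hello, world");
}
```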

  7. Pingback: 1p – The bell has tolled for rand() – Exploding Ads

  8. Pingback: 1 – The bell has tolled for rand() – blog.offeryour.com

  9. Seeding with a constant by default is a great feature. It gives you deterministic results unless you take care of it, which is a great debugging feature – the ability to reproduce bugs.

    • I have never come across a new programmer who wasn’t surprised and confused by the fact that rand is seeded with 1 by default. And I have never heard an experienced programmer say: “Gee, I’m so happy I have to go out of my way to seed this! It makes my program so much easier to write and understand!” That’s a sign of bad default behaviour.

      If you *REALLY* want rand to be seeded to some specific value for debugging, you’d be wise to do it manually *anyway*, just in case the implementation itself is broken. And to make matters worse, manually seeding it is so hard that just about everyone screws it up (or half-asses it). So no, there is no good argument for rand’s default seeding.
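For comparison, reproducibility with the newer library can be made explicit rather than a side effect of a default. A rough sketch, with hypothetical names `rolls_from_seed` and `fresh_seed`: normal runs pick a fresh seed and record it, and a debugging run passes the recorded seed back in.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Produce `count` die rolls from an explicitly chosen seed.
std::vector<int> rolls_from_seed(std::uint32_t seed, std::size_t count) {
    std::mt19937 engine(seed);
    std::uniform_int_distribution<int> die(1, 6);
    std::vector<int> rolls;
    rolls.reserve(count);
    for (std::size_t i = 0; i < count; ++i)
        rolls.push_back(die(engine));
    assert(rolls.size() == count);
    return rolls;
}

// Pick a seed for a normal run; the caller should log it so that a
// failing run can be replayed later with rolls_from_seed.
std::uint32_t fresh_seed() {
    std::random_device rd;
    return static_cast<std::uint32_t>(rd());
}
```

Replaying a logged seed gives the same rolls on the same standard library; the distribution's output is not guaranteed to match across different implementations.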

  10. If you are using the latest OpenBSD libc, your rand() replacement code *will* make it worse. This is because OpenBSD has decided that rand() is so terribly broken, for all the reasons you mentioned and more, that they decided to break the standard and provide a proper random generator instead. They even made srand() a no-op, because seeding with time(NULL) is so terrible. (Of course, they provided an alternative for if you really want a deterministic sequence, but they found no code that actually needs that.)

    See for details: http://marc.info/?l=openbsd-tech&m=141807224826859&w=2

    • That’s not so much an argument against what I’ve recommended as it is an argument against using OpenBSD. Honestly, what kind of bozo would break standard compliance rather than simply providing another function that does the job better… as just about every other platform has done. No one who knows what they’re doing expects rand to be any good by default anyway – by going out of your way to make it so, and breaking compatibility in the process, you’re only screwing the experts in favour of the clueless. (Not to mention that the reasoning given is bizarre – like the wacky theory that rand might be deterministic because of the NSA.)

  11. Seems to me that std::rand() is likely heading for termination in the next C standard, and C++ doesn’t want to be stuck with it until the next iteration (which I’m guessing will be C++20 at the current rate of progress).

    On another note, the std::tm{} followed by all that initialisation up there made me wonder why initialisers like { .tm_mday = 1, .tm_mon = 1, .tm_year = 1970, .tm_wday = 4 } don’t appear to be coming our way anytime soon? Didn’t C get this several years ago?

  12. Pingback: Security News #0x82 | CyberOperations

  13. Pingback: Don’t blink, or you’ll mess up the Mersenne Twister | Backworlds

  14. Pingback: Why is a raven like a writing desk? » Blog Archive » Lameness Explained
