Bug 101

Summary: random variable initialization
Product: ns-3
Reporter: Mathieu Lacage <mathieu.lacage>
Component: core
Assignee: Michele Weigle <mweigle>
Status: RESOLVED FIXED
Severity: normal
CC: mathieu.lacage, ns-bugs, tomh, vedran, watrous
Priority: P1
Version: pre-release
Hardware: All
OS: All

Description Mathieu Lacage 2007-11-05 03:30:27 EST
utils/run-tests.cc now has a call to ns3::RandomVariable::UseGlobalSeed to ensure that the tests use a fixed global seed for all RandomVariable instances. This reduces the efficiency of the Buffer tests in src/common/buffer.cc which expect to get a new seed at every run.

We need a way to force a specific seed locally in RandomVariableTest::RunTests.

Here is an excerpt from email exchanges about this issue:


------------------- Mathieu  ------------------------
So, these are 2 use-cases where you need to be able to force a specific
seed for a specific instance of a pseudo random number generator stream,
which cannot be achieved with the current API. I think that it would
have made sense to separate these two features:
  - the ability to create a random stream with a specific initial seed
  - the ability to generate a stream of initial seeds for a set of
random streams

i.e., 
class RandomSeedGenerator 
{
public:
  // create a new seed generator
  RandomSeedGenerator ();
  // return a new seed.
  Seed GetSeed (void);
  // return the default random seed generator, the one which should be
  // used all the time.
  static RandomSeedGenerator *GetDefault (void);
};
class RandomVariable
{
// force a specific seed generator to seed this stream
RandomVariable (RandomSeedGenerator *);
// use the default seed generator
RandomVariable ();
};

A simpler alternative which would save you from having to add a set of
new extra constructors to each RandomVariable subclass would be to use a
stack of seed generators and call RandomSeedGenerator::Front from
RandomVariable's constructor.

class RandomSeedGenerator 
{
  static RandomSeedGenerator *Front (void);
  static void Push (RandomSeedGenerator *);
  static void Pop (void);
};
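
The stack-based alternative above could be sketched roughly as follows. This is only an illustration under stated assumptions: Seed is taken to be a plain integer, GetSeed hands out consecutive integers, and GetSeedUsed is a hypothetical accessor added for inspection. None of this is existing ns-3 API.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical seed type; the proposal does not define it.
using Seed = std::uint64_t;

class RandomSeedGenerator
{
public:
  explicit RandomSeedGenerator (Seed first) : m_next (first) {}

  // Return a new seed on each call (toy rule: consecutive integers).
  Seed GetSeed () { return m_next++; }

  // The generator that RandomVariable constructors consult.
  static RandomSeedGenerator *Front ()
  {
    assert (!Stack ().empty ());
    return Stack ().back ();
  }
  static void Push (RandomSeedGenerator *g) { Stack ().push_back (g); }
  static void Pop ()
  {
    assert (!Stack ().empty ());
    Stack ().pop_back ();
  }

private:
  static std::vector<RandomSeedGenerator *> &Stack ()
  {
    static std::vector<RandomSeedGenerator *> s;
    return s;
  }
  Seed m_next;
};

class RandomVariable
{
public:
  // No extra constructor argument: the seed comes from the current front.
  RandomVariable () : m_seed (RandomSeedGenerator::Front ()->GetSeed ()) {}
  Seed GetSeedUsed () const { return m_seed; } // hypothetical accessor
private:
  Seed m_seed;
};
```

A test could Push a local generator, construct the variables it wants pinned, and Pop again; variables created afterwards continue from the outer generator's sequence.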
----------------------- Raj --------------------------
This choice of API was based on Michele Weigle's advice:
http://mailman.isi.edu/pipermail/ns-developers/2007-March/002922.html

Note that we had the functionality you wanted in "SetSeed" sometime in February or March, but I deferred to her judgment and her invocation of the rng-stream's creator's opinion on the matter.  That said, I believe bringing back the per-RandomVariable SetSeed method should satisfy this requirement?  That would be a simpler solution than the RandomSeedGenerator you mentioned, which would complicate the API and the understanding of how our RNGs work.  Right now we have one "line" of seeds used by the entire system, generated by the RngStream class...you are suggesting that different RandomVariables could be seeded from different "lines".  Is this correct?
Comment 1 Mathieu Lacage 2007-11-09 02:27:17 EST
adding raj to CC list.
Comment 2 Rajib Bhattacharjea 2009-03-13 10:39:31 EDT
Closed by RNG API changes merged for ns-3.4
Comment 3 Tom Henderson 2011-05-20 15:20:42 EDT
I have been dealing with this issue as I try to debug AODV; I do not think that Mathieu's concerns were adequately addressed when this bug was closed, and I now have similar concerns.

I believe that we should provide a way to deterministically set the stream number (not the substream, or run number) for random variables.  The problem is that as random variables are created, they are allocated the next stream from the generator.  However, it is difficult to control the order in which random variables are created in the simulation.  The problem from a user perspective is that when you perturb your simulation program configuration, you may change the streams assigned to random variables even if you hold the Seed and RunNumber constant.

For instance, I am testing a mobile network simulation.  I would like to vary certain parameters (routing protocol, number of traffic senders) without varying the mobility pattern of the network.  However, every time I change configuration (such as adding/removing a data sender), my mobility pattern changes.  So, I can't really compare two routing protocols exactly against the identical randomly-generated scenario.

One possible solution is to try to ensure that mobility related random variables are created first, but I think that it would be a fragile solution due to initialization order issues, and it is not general.

Another may be to modify the stream generator to reserve some N initial streams that may be deterministically assigned, and allow the random variable to be assigned to a specific stream number.  We would need to plumb this API up through our helpers such as MobilityHelper::SetDeterministicStream ().
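
The ordering problem described in this comment can be reproduced with a toy model. The sketch below assumes a first-come, first-served allocator that hands out consecutive stream indices; ToyStreamAllocator, ToyRandomVariable and MobilityStreamFor are hypothetical names used only for illustration, not ns-3 classes.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for first-come, first-served stream allocation.
class ToyStreamAllocator
{
public:
  std::uint64_t Next () { return m_next++; }
private:
  std::uint64_t m_next = 0;
};

// Each variable grabs the next stream when it is constructed.
class ToyRandomVariable
{
public:
  explicit ToyRandomVariable (ToyStreamAllocator &alloc)
    : m_stream (alloc.Next ()) {}
  std::uint64_t GetStream () const { return m_stream; }
private:
  std::uint64_t m_stream;
};

// Build a scenario in which numSenders traffic sources are created before
// the mobility model, and report which stream the mobility variable got.
std::uint64_t MobilityStreamFor (unsigned numSenders)
{
  ToyStreamAllocator alloc;
  for (unsigned i = 0; i < numSenders; ++i)
    {
      ToyRandomVariable sender (alloc);
      assert (sender.GetStream () == i);
    }
  ToyRandomVariable mobility (alloc);
  return mobility.GetStream ();
}
```

In this model, adding one sender moves the mobility variable from stream 1 to stream 2, so the mobility pattern changes even though the seed and run number are untouched.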
Comment 4 Michele Weigle 2011-05-20 15:43:48 EDT
Yes, I see the problem.  

How about just allowing the user to define a particular Stream for each RV, as they do with Substream (using the run number)?  By default, it would operate as it does now (creating a new RandomVariable grabs the next Stream).  But, we'd add an option to allow the user to choose the Stream number when the RandomVariable is created.

If this approach sounds ok, I can start looking into what we'd need to do to implement it.
Comment 5 Tom Henderson 2011-05-20 16:06:26 EDT
(In reply to comment #4)
> Yes, I see the problem.  
> 
> How about just allowing the user to define a particular Stream for each RV, as
> they do with Substream (using the run number)?  By default, it would operate as
> it does now (creating a new RandomVariable grabs the next Stream).  But, we'd
> add an option to allow the user to choose the Stream number when the
> RandomVariable is created.
> 
> If this approach sounds ok, I can start looking into what we'd need to do to
> implement it.

Yes, should we reserve some large number of streams (how many?) before automatically allocating new ones to users who don't specify a particular stream?
Comment 6 Michele Weigle 2011-05-31 12:06:58 EDT
Here's the approach I propose:

Reserve x number of streams for explicit use.  In order to not invalidate current simulations, these reserved streams should be well into the generator rather than at the beginning (for example, starting at stream y).  

To reserve these streams, we just need to generate the appropriate seeds for them.  I have some test code that does this (saves x seeds starting at seed y into a static variable, much like nextSeed).  Then, when you want to use a specific seed, specify the number (between 1-x) and the appropriate seed will be set.  This will not affect nextSeed, so random variables created after this will not be affected.

Here's my question.  Can we make setting the stream be a function in RandomVariable, or does it need to have a separate constructor, or both?  This depends upon how these will be used most often.

Then, we'll need to decide the proper values of x and y:
x - number of reserved streams, we'll need a double[x][6] static array
y - number of unreserved streams before running into reserved ones; generating lots of these may affect performance, but I haven't tested it yet
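
A rough sketch of the reserved-seed table described in this comment follows. It is a toy model, not the test code mentioned above: AdvanceSeed is an arbitrary stand-in for stepping from one stream's seed to the next, a single integer replaces the six-element seed, and the values chosen for x (kReserved) and y (kOffset) are placeholders.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kReserved = 4; // "x": number of reserved streams
constexpr std::size_t kOffset = 10;  // "y": unreserved streams skipped first

// Toy stand-in for advancing from one stream's seed to the next.
inline std::uint64_t AdvanceSeed (std::uint64_t s)
{
  return s * 6364136223846793005ULL + 1442695040888963407ULL;
}

class ToySeedTable
{
public:
  explicit ToySeedTable (std::uint64_t base) : m_next (base)
  {
    std::uint64_t s = base;
    for (std::size_t i = 0; i < kOffset; ++i)
      {
        s = AdvanceSeed (s); // skip the first y seeds
      }
    for (std::size_t i = 0; i < kReserved; ++i)
      {
        m_reserved[i] = s;   // record the next x seeds
        s = AdvanceSeed (s);
      }
  }

  // Default path: each new variable takes the next seed in line.
  std::uint64_t NextSeed ()
  {
    std::uint64_t s = m_next;
    m_next = AdvanceSeed (m_next);
    return s;
  }

  // Explicit path: n is between 1 and x; does not touch the default line.
  std::uint64_t ReservedSeed (std::size_t n) const
  {
    assert (n >= 1 && n <= kReserved);
    return m_reserved[n - 1];
  }

private:
  std::uint64_t m_next;
  std::array<std::uint64_t, kReserved> m_reserved{};
};
```

Note that in this model a simulation that creates more than y default variables would run into the reserved seeds, which is why the choice of y matters.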
Comment 7 Mathieu Lacage 2011-05-31 12:19:01 EDT
(In reply to comment #6)
> Here's the approach I propose:
> 
> Reserve x number of streams for explicit use.  In order to not invalidate
> current simulations, these reserved streams should be well into the generator
> rather than at the beginning (for example, starting at stream y).  
> 
> To reserve these streams, we just need to generate the appropriate seeds for
> them.  I have some test code that does this (saves x seeds starting at seed y
> into a static variable, much like nextSeed).  Then, when you want to use a
> specific seed, specify the number (between 1-x) and the appropriate seed will
> be set.  This will not affect nextSeed, so random variables created after this
> will not be affected.
> 
> Here's my question.  Can we make setting the stream be a function in
> RandomVariable, or does it need to have a separate constructor, or both?  This
> depends upon how these will be used most often.
> 
> Then, we'll need to decide the proper values of x and y:
> x - number of reserved streams, we'll need a double[x][6] static array
> y - number of unreserved streams before running into reserved ones, may affect
> performance to generate lots of these, but I haven't tested it yet

Before we go and hack stuff in, someone should take time to review what omnetpp does.
Comment 8 Michele Weigle 2011-05-31 16:27:26 EDT
Maybe someone who is more familiar with OMNeT++ can comment as well, but I took a look at the source.  It seems that it just uses a set of RNGs (like our streams) for a whole set of random variables -- instead of a separate RNG for each RV, like ns3 does.  

Also, I didn't see code that guaranteed independence between the different streams, but I could have just missed that.
Comment 9 Tom Henderson 2011-06-07 14:55:28 EDT
(In reply to comment #6)
> Here's the approach I propose:
> 
> Reserve x number of streams for explicit use.  In order to not invalidate
> current simulations, these reserved streams should be well into the generator
> rather than at the beginning (for example, starting at stream y). 

Why would choosing x reserved streams to start at stream 0 invalidate current simulations?  It will likely change (not invalidate) simulation output after this change is made; however, the current system doesn't guarantee that output is unchanged across ns-3-dev changes.

It seems to be clearer to me to not have these buried somewhere at some offset y to stream 0.  How to pick y so as to be large enough to avoid most (how to define most?) or all current simulations yet not incur a performance issue seems tricky.  
 
> 
> To reserve these streams, we just need to generate the appropriate seeds for
> them.  I have some test code that does this (saves x seeds starting at seed y
> into a static variable, much like nextSeed).  Then, when you want to use a
> specific seed, specify the number (between 1-x) and the appropriate seed will
> be set.  This will not affect nextSeed, so random variables created after this
> will not be affected.
> 
> Here's my question.  Can we make setting the stream be a function in
> RandomVariable, or does it need to have a separate constructor, or both?  This
> depends upon how these will be used most often.

I would probably lean towards something like a pure virtual method
virtual void RandomVariable::UseReservedStream (uint32_t streamIndex) = 0;
each subclass needs to then deal with this properly when it goes to call 'new RngStream ()'.  By making this pure virtual, we break any out-of-tree RandomVariables and make them update their model.  The SeedManager could perhaps enforce bounds checking on the streamIndex and whether streams are requested more than once (calling NS_FATAL_ERROR for exceptions).

We probably also need to extend the constructors too since some of these variables are instantiated as attributes (see class RandomWaypointMobilityModel).  An alternative is to add an additional attribute for a ReservedStream that could be used to call "UseReservedStream()", but I am not sure what the default of such an attribute is if we want it to default to random stream.

The number of reserved streams could be perhaps set and stored as a GlobalValue in the same way as RngSeed.

The base class copy constructor needs to do something sane if it is copying a RandomVariable with a reserved stream.

Then, I think we will need to work through a practical use case (mobility module) to plumb through the APIs and see if we got the design right.
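
To make the UseReservedStream idea concrete, here is a minimal sketch under several assumptions: ToyReservedStreamRegistry plays the role suggested for the SeedManager, C++ exceptions stand in for NS_FATAL_ERROR, and ToyUniformVariable is a hypothetical subclass. Only the method name UseReservedStream comes from the proposal.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <stdexcept>

// Tracks which reserved stream indices have been handed out.
class ToyReservedStreamRegistry
{
public:
  explicit ToyReservedStreamRegistry (std::uint32_t count) : m_count (count) {}

  void Claim (std::uint32_t streamIndex)
  {
    if (streamIndex >= m_count)
      {
        throw std::out_of_range ("reserved stream index out of range");
      }
    if (!m_claimed.insert (streamIndex).second)
      {
        throw std::logic_error ("reserved stream requested more than once");
      }
  }

private:
  std::uint32_t m_count;
  std::set<std::uint32_t> m_claimed;
};

class ToyRandomVariable
{
public:
  virtual ~ToyRandomVariable () = default;
  // Pure virtual, so every subclass must decide how to honor the request.
  virtual void UseReservedStream (std::uint32_t streamIndex) = 0;
};

class ToyUniformVariable : public ToyRandomVariable
{
public:
  explicit ToyUniformVariable (ToyReservedStreamRegistry &registry)
    : m_registry (registry) {}

  void UseReservedStream (std::uint32_t streamIndex) override
  {
    m_registry.Claim (streamIndex); // throws on bad or repeated index
    m_stream = streamIndex;
    m_hasReserved = true;
  }

  std::uint32_t ReservedStream () const
  {
    assert (m_hasReserved);
    return m_stream;
  }

private:
  ToyReservedStreamRegistry &m_registry;
  std::uint32_t m_stream = 0;
  bool m_hasReserved = false;
};
```

The sketch leaves open the copy-constructor question raised above: copying a ToyUniformVariable here would duplicate the reserved index without claiming it again.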

> 
> Then, we'll need to decide the proper values of x and y:
> x - number of reserved streams, we'll need a double[x][6] static array
> y - number of unreserved streams before running into reserved ones, may affect
> performance to generate lots of these, but I haven't tested it yet
Comment 10 Mathieu Lacage 2011-06-07 16:12:35 EDT
(In reply to comment #3)
> I believe that we should provide a way to deterministically set the stream
> number (not the substream, or run number) for random variables.  The problem 

It is unclear to me what you mean by the above because it has been a while since I looked at the details of the random variable implementation. So, let me rephrase what I would expect the user experience to be here.

For each RandomVariable instance that is uniquely identified in my simulation by an attribute path (say, /NodeList/0/MobilityModel/Boo), I should be able to specify through the attribute system a number (a seed or whatever) that, if combined with the current run number, will generate a deterministic stream of random numbers according to the distribution specified.

Now, of course, if I don't specify one, one will be generated for me out of a global seed generation/stream number variable. 

Now, if I want to be careful with all this, I obviously need to control from a single location all of these random variables. Hey, this is what the ConfigStore is supposed to do, right ? It could easily generate a sample assignment of numbers to instances and then, you could keep this around and feed it to your simulation whenever you need it.

To summarize, the natural way of dealing with all of this (for me) would be:

Config::Set ("NodeList/0/$RandomWaypointMobilityModel/Speed/Stream", StringVariable (156));

Note that the above would require us to make RandomVariable subclasses obtain attribute powers to export the stream attribute. Something that might be problematic here is the pass-by-value semantics we gave to RandomVariable and which stands out from every other object we have in ns-3.
Comment 11 Tom Henderson 2011-06-07 16:39:12 EDT
(In reply to comment #10)
> (In reply to comment #3)
> > I believe that we should provide a way to deterministically set the stream
> > number (not the substream, or run number) for random variables.  The problem 
> 
> It is unclear to me what you mean by the above because it has been a while
> since I looked at the details of the random variable implementation. So, let me
> rephrase what I would expect the user experience to be here.
> 
> For each RandomVariable instance that is uniquely identified in my simulation
> by an attribute path (say, /NodeList/0/MobilityModel/Boo),

I was not proposing to limit it to instances hooked to an attribute path

> I should be able to
> specify through the attribute system a number (a seed or whatever) that, if
> combined with the current run number, will generate a deterministic stream of
> random numbers according to the distribution specified. 

You should be able to (optionally) make the output of a particular instance of a random variable deterministic, independent of the presence of other random variables in the system.  The issue is that the mapping of streams to random variables depends on the order of creation of these random variables; there is no way to fetch a particular stream to deterministically map it to a random variable.  So, it is the tuple (seed, run number, stream number) that would make the output stream deterministic, but presently we have (seed, run number) and "stream number" is hidden beneath the API and is first-come, first-served.
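
The (seed, run number, stream number) tuple can be illustrated with a toy generator whose output depends on nothing else. In the sketch below std::mt19937 stands in for the real generator, and ToyStream, FirstMobilityDraw and the reserved-range size of 1000 are assumptions made for the example, not ns-3 API.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Toy generator fully determined by (seed, run, stream).
class ToyStream
{
public:
  ToyStream (std::uint32_t seed, std::uint32_t run, std::uint32_t stream)
  {
    std::seed_seq seq{seed, run, stream};
    m_engine.seed (seq);
  }
  std::uint32_t Next () { return static_cast<std::uint32_t> (m_engine ()); }
private:
  std::mt19937 m_engine;
};

// numOthers variables take automatically assigned streams, while the
// mobility variable pins an explicit stream number in the reserved range.
std::uint32_t FirstMobilityDraw (std::uint32_t seed, std::uint32_t run,
                                 std::uint32_t mobilityStream,
                                 unsigned numOthers)
{
  const std::uint32_t firstAutoStream = 1000; // assumed reserved-range size
  assert (mobilityStream < firstAutoStream);
  std::uint32_t nextAuto = firstAutoStream;
  for (unsigned i = 0; i < numOthers; ++i)
    {
      ToyStream other (seed, run, nextAuto++);
      (void) other.Next ();
    }
  ToyStream mobility (seed, run, mobilityStream);
  return mobility.Next ();
}
```

With the stream number exposed, the mobility draw is the same whether zero or five other variables are created first, which is the property requested in this comment.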

> 
> Now, of course, if I don't specify one, one will be generated for me out of a
> global seed generation/stream number variable. 

Yes.

> 
> Now, if I want to be careful with all this, I obviously need to control from a
> single location all of these random variables. Hey, this is what the
> ConfigStore is supposed to do, right ? It could easily generate a sample
> assignment of numbers to instances and then, you could keep this around and
> feed it to your simulation whenever you need it.

Yes, if this facility is limited to random variables that are attributes.
> 
> To summarize, the natural way of dealing with all of this (for me) would be:
> 
> Config::Set ("NodeList/0/$RandomWaypointMobilityModel/Speed/Stream",
> StringVariable (156));
> 
> Note that the above would require us to make RandomVariable subclasses obtain
> attribute powers to export the stream attribute. Something that might be
> problematic here is the pass-by-value semantics we gave to RandomVariable and
> which stands out from every other object we have in ns-3.

I would be OK with limiting the use to use cases such as you outlined above; that would scratch my itch.  I see value in trying to enforce that these values end up in the config-store output of an experiment.
Comment 12 Mathieu Lacage 2011-06-08 05:53:29 EDT
(In reply to comment #11)

> You should be able to (optionally) make the output of a particular instance of
> a random variable deterministic, independent of the presence of other random

Would that require this output to be independent of the run number/seed ? 

Another use-case would be to find a way to use antithetic random variables, i.e., make a specific random variable become 1 - randomVariable for a given run. Michele, what would you think about this?

> variables in the system.  The issue is that the mapping of streams to random
> variables depends on the order of creation of these random variables; there is
> no way to fetch a particular stream to deterministically map it to a random
> variable.  So, it is the tuple (seed, run number, stream number) that would
> make the output stream deterministic, but presently we have (seed, run number)
> and "stream number" is hidden beneath the API and is first-come, first-served.

Is there a way in the low layers to say: please, give me a specific stream number ?

> 
> > 
> > Now, of course, if I don't specify one, one will be generated for me out of a
> > global seed generation/stream number variable. 
> 
> Yes.
> 
> > 
> > Now, if I want to be careful with all this, I obviously need to control from a
> > single location all of these random variables. Hey, this is what the
> > ConfigStore is supposed to do, right ? It could easily generate a sample
> > assignment of numbers to instances and then, you could keep this around and
> > feed it to your simulation whenever you need it.
> 
> Yes, if this facility is limited to random variables that are attributes.

It should be trivial to dump the file that contains all attributes for all objects and then filter out on the type of the RandomVariable base class.

> > 
> > To summarize, the natural way of dealing with all of this (for me) would be:
> > 
> > Config::Set ("NodeList/0/$RandomWaypointMobilityModel/Speed/Stream",
> > StringVariable (156));
> > 
> > Note that the above would require us to make RandomVariable subclasses obtain
> > attribute powers to export the stream attribute. Something that might be
> > problematic here is the pass-by-value semantics we gave to RandomVariable and
> > which stands out from every other object we have in ns-3.
> 
> I would be OK with limiting the use to use cases such as you outlined above;
> that would scratch my itch.  I see value in trying to enforce that these values
> end up in the config-store output of an experiment.

agreed.

What I am most worried about is the API impact of such a change. What we need to do is:

1) if an object has a member variable of type RandomVariable, it needs to become Ptr<RandomVariableStream> (we pick a new name to avoid name clashes). This new class RandomVariableStream would probably be based on the RandomVariableImpl base class and subclasses.

2) users who pass RandomVariable subclasses in their scripts to configure stuff would need to be taught to use a RandomVariableHelper that can instantiate the RandomVariableStream classes.

class RandomVariableHelper
{
public:
  void SetType (std::string typeId);
  void SetAttribute (std::string name, std::string value);
};
(presumably, we could make UniformVariable & co be subclasses of RandomVariableHelper to minimize API breakage.)
Comment 13 Tom Henderson 2011-06-08 09:36:07 EDT
(In reply to comment #12)
> (In reply to comment #11)
> 
> > You should be able to (optionally) make the output of a particular instance of
> > a random variable deterministic, independent of the presence of other random
> 
> Would that require this output to be independent of the run number/seed ? 

No, it would remain dependent on both the run number and seed.

> 
> Another use-case would be to find a way to use antithetic random variables.
> i.e., make a specific random variable become 1 - randomVariable for a given
> run. Michele, what would you think about this ?

there is API in the lower layers to set this:
RngStream::SetAntithetic(bool)

> 
> > variables in the system.  The issue is that the mapping of streams to random
> > variables depends on the order of creation of these random variables; there is
> > no way to fetch a particular stream to deterministically map it to a random
> > variable.  So, it is the tuple (seed, run number, stream number) that would
> > make the output stream deterministic, but presently we have (seed, run number)
> > and "stream number" is hidden beneath the API and is first-come, first-served.
> 
> Is there a way in the low layers to say: please, give me a specific stream
> number ?

I could not see this at present.  The way it works is that when you need a stream, RngStream constructor is called, which calls InitializeStream, which uses seed information and state from the previous construction of a stream to create a new stream, and updates the global state in preparation for the next stream construction.  The constructor takes no arguments.

> 
> What I am most worried about is the API impact of such a change. What we need
> to do is:

there is no chance to just make RandomVariable inherit from ObjectBase only to get attribute powers?

> 
> 1) if an object has a member variable of type RandomVariable, it needs to
> become Ptr<RandomVariableStream> (we pick a new name to avoid name clashes).
> This new class RandomVariableStream would probably be based on the
> RandomVariableImpl base class and subclasses.
> 
> 2) users who pass RandomVariable subclasses in their scripts to configure stuff
> would need to be taught to use a RandomVariableHelper that can instantiate the
> RandomVariableStream classes.
> 
> class RandomVariableHelper
> {
> public:
>   void SetType (std::string typeId);
>   void SetAttribute (std::string name, std::string value);
> };
> (presumably, we could make UniformVariable & co be subclasses of
> RandomVariableHelper to minimize API breakage.)

What about just keeping around the existing RVs for backward compatibility and defining a new RV with these added capabilities?
Comment 14 Michele Weigle 2011-06-09 14:00:01 EDT
> Why would choosing x reserved streams to start at stream 0 invalidate current
> simulations?  It will likely change (not invalidate) simulation output after
> this change is made; however, the current system doesn't guarantee that output
> is unchanged across ns-3-dev changes.

Right, it wouldn't invalidate them, but it would change the output.

> Another use-case would be to find a way to use antithetic random variables.
> i.e., make a specific random variable become 1 - randomVariable for a given
> run. Michele, what would you think about this ?

I'd be hesitant about using antithetic variables.  I don't know much about them, but from what I've seen they aren't independent, but rather produce a negative correlation.

> Is there a way in the low layers to say: please, give me a specific stream
> number ?

Yes, this exists in L'Ecuyer's original code, but I took it out in the ns-3 implementation to avoid temptation.  =)  

I ran a quick test to see how long it would take to skip the first 10,000 seeds and record the next 1000 seeds for deterministic use.  It took under 100 ms, so I don't think there's a performance hit (at least for speed's sake) for doing things this way.  (I called this from RngStream::EnsureGlobalInitialized (void) inside the if (!initialized) clause.  initialized is static, so this section of code is run only once.)

One of the nice things about SetSeed is that you can set the seed for that particular RV, but it doesn't affect the seeds coming from the default stream (next_seed). So as long as you've jumped far enough ahead, you won't have any interference issues.

As long as the seeds used for the deterministic RVs are generated by advancing through the streams based on the starting seed and run, everything will still be independent (and so I'd be happy).

-Michele
Comment 15 Tom Henderson 2011-06-23 11:00:59 EDT
(In reply to comment #14)
> 
> I ran a quick test to see how long it would take to skip the first 10,000 seeds
> and record the next 1000 seeds for deterministic use.  It took under 100 ms, so
> I don't think there's a performance hit (at least for speed's sake) for doing
> things this way.  (I called this from RngStream::EnsureGlobalInitialized (void)
> inside the if (!initialized) clause.  initialized is static, so this section of
> code is run only once.)

I am still skeptical that skipping any seeds is needed or worth the delay incurred.  I am not sure we are really preserving much of anything because presently, small perturbations in the scenario can cause lots of differences in the substream assignments.  However, I am willing to concede that point in the interest of getting this capability added; maybe I am not understanding the concern completely.

> 
> One of the nice things about SetSeed is that you can set the seed for that
> particular RV, but it doesn't affect the seeds coming from the default stream
> (next_seed). So as long as you've jumped far enough ahead, you won't have any
> interference issues.

I think you are referring to RngStream::SetSeeds() but I don't see how a user can call that from RandomVariable class.

Also, will use of SetSeeds() (different seeds for different RVs) affect the independence assumption between random variables?  For multiple independent replications, we have been recommending to SetSeed only once and advance run number only.
Comment 16 Michele Weigle 2011-06-24 14:16:57 EDT
(In reply to comment #15)
> I am still skeptical that skipping any seeds is needed or worth the delay
> incurred.  I am not sure we are really preserving much of anything because
> presently, small perturbations in the scenario can cause lots of differences in
> the substream assignments.  However, I am willing to concede that point in the
> interest of getting this capability added; maybe I am not understanding the
> concern completely.
 
OK, if there's no concern about re-running regression/validation tests with a new release, then there's no real need to skip streams.

> > One of the nice things about SetSeed is that you can set the seed for that
> > particular RV, but it doesn't affect the seeds coming from the default stream
> > (next_seed). So as long as you've jumped far enough ahead, you won't have any
> > interference issues.
> 
> I think you are referring to RngStream::SetSeeds() but I don't see how a user
> can call that from RandomVariable class.
> 
> Also, will use of SetSeeds() (different seeds for different RVs) affect the
> independence assumption between random variables?  For multiple independent
> replications, we have been recommending to SetSeed only once and advance run
> number only.

Actually, here's a different way.  Instead of saving the seeds and using SetSeeds(), create and save some number of RngStream objects in a protected streams array.  This would guarantee that the seeds are being generated in an independent manner.  I think this could be done at the end of the if (!initialized) conditional in RngStream::EnsureGlobalInitialized() -- after the global seed and run have been set.

When a RandomVariable needs to use a protected stream, it could pass the RngStream to the RandomVariable constructor.

When one of these protected streams is used by a RandomVariable, it could be marked as 'used' so that another RandomVariable couldn't use it.   Maybe the error message thrown (when a user tries to use an already used stream) could indicate which of the protected streams are available for use in user-level programs (if core code needs to use some of these as well).

This seems simpler than my previous suggestion.  Does it make sense to everyone?
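
The protected-streams idea in this comment might look roughly like the sketch below. It is an illustration only: std::mt19937 stands in for RngStream, the per-stream seeding rule is arbitrary, exceptions replace whatever error mechanism ns-3 would use, and ToyProtectedStreams, Acquire and Available are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <stdexcept>
#include <string>
#include <vector>

// Toy pool of generators built once, in a fixed order.
class ToyProtectedStreams
{
public:
  explicit ToyProtectedStreams (std::size_t count) : m_used (count, false)
  {
    for (std::size_t i = 0; i < count; ++i)
      {
        m_streams.emplace_back (static_cast<std::mt19937::result_type> (i + 1));
      }
  }

  // Hand out stream i exactly once; a second request is an error whose
  // message lists the protected streams that are still free.
  std::mt19937 &Acquire (std::size_t i)
  {
    assert (m_streams.size () == m_used.size ());
    if (i >= m_streams.size ())
      {
        throw std::out_of_range ("no such protected stream");
      }
    if (m_used[i])
      {
        throw std::logic_error ("protected stream " + std::to_string (i)
                                + " already used; available: " + Available ());
      }
    m_used[i] = true;
    return m_streams[i];
  }

  // Space-separated indices of the streams not yet handed out.
  std::string Available () const
  {
    std::string out;
    for (std::size_t i = 0; i < m_used.size (); ++i)
      {
        if (!m_used[i])
          {
            if (!out.empty ())
              {
                out += ' ';
              }
            out += std::to_string (i);
          }
      }
    return out;
  }

private:
  std::vector<std::mt19937> m_streams;
  std::vector<bool> m_used;
};
```

A RandomVariable constructor taking the acquired generator would then be the only way to reach a protected stream, which keeps the default first-come, first-served path unchanged.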
Comment 17 Tom Henderson 2011-07-06 18:19:22 EDT
(In reply to comment #16)
> (In reply to comment #15)
> > I am still skeptical that skipping any seeds is needed or worth the delay
> > incurred.  I am not sure we are really preserving much of anything because
> > presently, small perturbations in the scenario can cause lots of differences in
> > the substream assignments.  However, I am willing to concede that point in the
> > interest of getting this capability added; maybe I am not understanding the
> > concern completely.
> 
> OK, if there's no concern about re-running regression/validation tests with a
> new release, then there's no real need to skip streams.

unlike ns-2, we have very little test output that would be perturbed by such a change.

> 
> > > One of the nice things about SetSeed is that you can set the seed for that
> > > particular RV, but it doesn't affect the seeds coming from the default stream
> > > (next_seed). So as long as you've jumped far enough ahead, you won't have any
> > > interference issues.
> > 
> > I think you are referring to RngStream::SetSeeds() but I don't see how a user
> > can call that from RandomVariable class.
> > 
> > Also, will use of SetSeeds() (different seeds for different RVs) affect the
> > independence assumption between random variables?  For multiple independent
> > replications, we have been recommending to SetSeed only once and advance run
> > number only.
> 
> Actually, here's a different way.  Instead of saving the seeds and using
> SetSeeds(), create and save some number of RngStream objects in a protected
> streams array.  This would guarantee that the seeds are being generated in an
> independent manner.  I think this could be done at the end of the if
> (!initialized) conditional in RngStream::EnsureGlobalInitialized() -- after the
> global seed and run have been set.
> 
> When a RandomVariable needs to use a protected stream, it could pass the
> RngStream to the RandomVariable constructor.

I think that I was suggesting basically this approach (last paragraph of comment 3).

> 
> When one of these protected streams is used by a RandomVariable, it could be
> marked as 'used' so that another RandomVariable couldn't use it.   Maybe the
> error message thrown (when a user tries to use an already used stream) could
> indicate which of the protected streams are available for use in user-level
> programs (if core code needs to use some of these as well).
> 
> This seems simpler than my previous suggestion.  Does it make sense to
> everyone?

I did not think of adding protection to prevent double use, but it seems like a good idea.

Can you suggest some sample API modifications, from user's perspective?  In particular, I am most interested in understanding how our mobility models can be made to use such reserved streams, and whether it can be an option or whether it must be hard-wired and invisible to the user.
Comment 18 Mathieu Lacage 2011-07-06 19:14:46 EDT
(In reply to comment #17)
> (In reply to comment #16)
> > (In reply to comment #15)
> > > I am still skeptical that skipping any seeds is needed or worth the delay
> > > incurred.  I am not sure we are really preserving much of anything because
> > > presently, small perturbations in the scenario can cause lots of differences in
> > > the substream assignments.  However, I am willing to concede that point in the
> > > interest of getting this capability added; maybe I am not understanding the
> > > concern completely.
> > 
> > OK, if there's no concern about re-running regression/validation tests with a
> > new release, then there's no real need to skip streams.
> 
> unlike ns-2, we have very little test output that would be perturbed by such a
> change.
> 
> > 
> > > > One of the nice things about SetSeed is that you can set the seed for that
> > > > particular RV, but it doesn't affect the seeds coming from the default stream
> > > > (next_seed). So as long as you've jumped far enough ahead, you won't have any
> > > > interference issues.
> > > 
> > > I think you are referring to RngStream::SetSeeds() but I don't see how a user
> > > can call that from RandomVariable class.
> > > 
> > > Also, will use of SetSeeds() (different seeds for different RVs) affect the
> > > independence assumption between random variables?  For multiple independent
> > > replications, we have been recommending to SetSeed only once and advance run
> > > number only.
> > 
> > Actually, here's a different way.  Instead of saving the seeds and using
> > SetSeeds(), create and save some number of RngStream objects in a protected
> > streams array.  This would guarantee that the seeds are being generated in an
> > independent manner.  I think this could be done at the end of the if
> > (!initialized) conditional in RngStream::EnsureGlobalInitialized() -- after the
> > global seed and run have been set.
> > 
> > When a RandomVariable needs to use a protected stream, it could pass the
> > RngStream to the RandomVariable constructor.
> 
> I think that I was suggesting basically this approach (last paragraph of
> comment 3).
> 
> > 
> > When one of these protected streams is used by a RandomVariable, it could be
> > marked as 'used' so that another RandomVariable couldn't use it.   Maybe the
> > error message thrown (when a user tries to use an already used stream) could
> > indicate which of the protected streams are available for use in user-level
> > programs (if core code needs to use some of these as well).
> > 
> > This seems simpler than my previous suggestion.  Does it make sense to
> > everyone?
> 
> I did not think of adding protection to prevent double use, but it seems like a
> good idea.
> 
> Can you suggest some sample API modifications, from user's perspective?  In
> particular, I am most interested in understanding how our mobility models can
> be made to use such reserved streams, and whether it can be an option or
> whether it must be hard-wired and invisible to the user.

I feel that introducing such "hidden" streams does nothing to improve the user's understanding of the system. That is, if the user wants to make a stream deterministic, it is the user's job to set the seed/run number of that stream deterministically: we should not be trying to guess what the user wants.
Comment 19 Michele Weigle 2011-07-07 13:00:46 EDT
(In reply to comment #17)
> (In reply to comment #16)
> > Actually, here's a different way.  Instead of saving the seeds and using
> > SetSeeds(), create and save some number of RngStream objects in a protected
> > streams array.  This would guarantee that the seeds are being generated in an
> > independent manner.  I think this could be done at the end of the if
> > (!initialized) conditional in RngStream::EnsureGlobalInitialized() -- after the
> > global seed and run have been set.
> > 
> > When a RandomVariable needs to use a protected stream, it could pass the
> > RngStream to the RandomVariable constructor.
> 
> I think that I was suggesting basically this approach (last paragraph of
> comment 3).

Yep, it just took me a little while to come around.  =)

> Can you suggest some sample API modifications, from user's perspective?  In
> particular, I am most interested in understanding how our mobility models can
> be made to use such reserved streams, and whether it can be an option or
> whether it must be hard-wired and invisible to the user.

To be honest, I haven't been using ns-3 myself much lately, so I can't be very detailed.  Maybe seeing the mobility test you mention in comment 3 (https://www.nsnam.org/bugzilla/show_bug.cgi?id=101#c3) would help me get a feel for what might need to be done.

Also, are there places in the core code that would need these types of streams, or is this mainly going to be something that's done by the user?

I envision there being some user-accessible set of reserved (maybe that's not the right term) streams.  You first say that you want an RNG using one of these, by providing the stream index number.  Then, you provide that RNG when you're creating your RandomVariable.  We'll need to add constructors to the RandomVariable classes to allow the user to use a particular RNG.  Or, I guess we could just add a constructor to the RandomVariable classes that takes the desired stream index number and picks the appropriate RNG stream itself (saving a step for the user).

You can still get different results for each run by changing the run number, but changing just the topology (as you mentioned earlier) won't change the results.

This is something that the user would have to do intentionally, so it needs to be transparent.  As long as we're defining the streams by an index number (that's based on a single seed) rather than allowing users to pick a different seed for each RNG, we're still guaranteeing independence among the various RNGs. 

I can try to cobble something together, but having a good use case example would be helpful.
Comment 20 Tom Henderson 2011-07-31 14:51:01 EDT
I'd like to suggest a solution along the following lines:

1) size of reserved pool of RngStreams

We want to reserve the first 'n' streams for deterministic assignment
to RandomVariable instances.  We do not want this to be too large by
default so it doesn't bog down the start of the simulation; we don't
want it to be too small because large simulations will exhaust the pool.

Proposal:  add this new global value

static ns3::GlobalValue g_rngReserved ("RngReserved",
                                  "The number of initial streams to reserve for specific assignment",
                                  ns3::IntegerValue (1000),
                                  ns3::MakeIntegerChecker<uint32_t> ());

uint32_t is large enough to cover the pool of roughly 2^15 substreams in the generator.  Not sure what magic number should be used for the default.

2) how to request a deterministic stream on a simple declaration:

suggest to overload the constructor such that (e.g.):

  UniformVariable u (5.0, 10.0);

could optionally become 
    UniformVariable u (5.0, 10.0, <stream#>);

and likewise, UniformVariable u; could be UniformVariable u (<stream#>);

It may be useful to add a method to set the reserved stream, after construction, but before its initial use:

+  /**
+   * Must be called before GetValue() is used.
+   *
+   * \param streamIndex reserved index to use
+   */
+   void SetReservedStream (uint32_t streamIndex);

but probably this could be avoided until the need arises.
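
A minimal sketch of the overloaded constructor and the late setter proposed above, written without ns-3 so that it compiles on its own. The names SketchUniformVariable and kAutoStream are hypothetical, and std::mt19937_64 seeded from the stream index is only a stand-in for ns-3's generator; the point is the shape of the API and the "before first GetValue()" rule.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <stdexcept>

// Hypothetical sentinel meaning "no reserved stream requested".
constexpr int64_t kAutoStream = -1;

// Hypothetical stand-in for UniformVariable; not the ns-3 class.
class SketchUniformVariable
{
public:
  SketchUniformVariable (double min, double max, int64_t streamIndex = kAutoStream)
    : m_min (min), m_max (max), m_stream (streamIndex), m_started (false)
  {
    assert (min <= max);
  }

  // Must be called before GetValue() is used.
  void SetReservedStream (uint32_t streamIndex)
  {
    if (m_started)
      {
        throw std::logic_error ("SetReservedStream called after GetValue");
      }
    m_stream = streamIndex;
  }

  double GetValue ()
  {
    if (!m_started)
      {
        // Assumption: a reserved index selects the stream directly, while
        // the automatic case takes the next value of a shared counter.
        uint64_t seed = (m_stream == kAutoStream)
          ? NextAutomaticSeed ()
          : static_cast<uint64_t> (m_stream);
        m_engine.seed (seed);
        m_started = true;
      }
    // Scale a 53-bit integer into the requested range.
    double unit = static_cast<double> (m_engine () >> 11) / 9007199254740992.0;
    return m_min + (m_max - m_min) * unit;
  }

private:
  // Automatic streams live in a range that reserved indices cannot reach.
  static uint64_t NextAutomaticSeed ()
  {
    static uint64_t counter = 0;
    return (1ULL << 63) + counter++;
  }

  double m_min;
  double m_max;
  int64_t m_stream;
  bool m_started;
  std::mt19937_64 m_engine;
};
```

In this sketch, two variables bound to the same reserved index produce the same sequence no matter how many automatic variables were created in between, which is the behaviour the proposal is after.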

There would need to be some kind of object to manage the reserved streams and prevent double or out-of-bound assignment.
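
One possible shape for such a manager object, as a self-contained sketch. ReservedStreamPool and its methods are hypothetical names, not ns-3 API; the error message on double assignment also reports a free index, along the lines suggested earlier in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical manager for the reserved pool; not part of ns-3.
class ReservedStreamPool
{
public:
  explicit ReservedStreamPool (uint32_t size) : m_used (size, false) {}

  // Claim one reserved index; rejects out-of-bound and double assignment.
  void Acquire (uint32_t index)
  {
    if (index >= m_used.size ())
      {
        throw std::out_of_range ("reserved stream " + std::to_string (index)
                                 + " is outside the pool of "
                                 + std::to_string (m_used.size ()));
      }
    if (m_used[index])
      {
        throw std::logic_error ("reserved stream " + std::to_string (index)
                                + " is already in use; first free index is "
                                + FirstFreeAsString ());
      }
    m_used[index] = true;
  }

  bool IsUsed (uint32_t index) const
  {
    return index < m_used.size () && m_used[index];
  }

private:
  // Returns the lowest unclaimed index as text, or "none" if the pool is full.
  std::string FirstFreeAsString () const
  {
    for (std::size_t i = 0; i < m_used.size (); ++i)
      {
        if (!m_used[i])
          {
            return std::to_string (i);
          }
      }
    return "none";
  }

  std::vector<bool> m_used;
};
```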

There may also need to be some kind of "well known reserved assignments" so that 

3) use in attributes

It seems that we will want to avoid the use in attributes; e.g. as in making
a change to this:
    .AddAttribute ("Speed",
                   "A random variable used to pick the speed (m/s).",
                   RandomVariableValue (UniformVariable (2.0, 4.0)),
                   MakeRandomVariableAccessor (&RandomWalk2dMobilityModel::m_speed),
                   MakeRandomVariableChecker ())

since it will lead to double assignments.

4) use of container based helpers

We need to be able to plumb this into helper-oriented code such as:

  MobilityHelper mobility;
  mobility.SetPositionAllocator ("ns3::RandomDiscPositionAllocator",
                                 "X", StringValue ("100.0"),
                                 "Y", StringValue ("100.0"),
                                 "Rho", StringValue ("Uniform:0:30"));
  mobility.SetMobilityModel ("ns3::RandomWalk2dMobilityModel",
                             "Mode", StringValue ("Time"),
                             "Time", StringValue ("2s"),
                             "Speed", StringValue ("Constant:1.0"),
                             "Bounds", StringValue ("0|200|0|200"));
  mobility.InstallAll ();
  Config::Connect ("/NodeList/*/$ns3::MobilityModel/CourseChange",
                   MakeCallback (&CourseChange));

The limitation here is that reserved streams must be unique and
must not collide with other reserved streams that may be allocated
by some other helper.  Also, the node containers may be variably sized,
so it is not possible to anticipate how large of a reserved pool
may be needed.

Proposal is to make it user's responsibility to deconflict the usage
of reserved streams, but make it easy to enable them from the helper
and to tell the mobility helper the offset into the pool of reserved
streams (similar to how the automatic address allocator is given an
offset via SetBase()).

e.g.

  MobilityHelper mobility;
  mobility.SetPositionAllocator ("ns3::RandomDiscPositionAllocator",
                                 "X", StringValue ("100.0"),
                                 "Y", StringValue ("100.0"),
                                 "Rho", StringValue ("Uniform:0:30"));
  mobility.SetMobilityModel ("ns3::RandomWalk2dMobilityModel",
                             "Mode", StringValue ("Time"),
                             "Time", StringValue ("2s"),
                             "Speed", StringValue ("Constant:1.0"),
                             "Bounds", StringValue ("0|200|0|200"));
+  // make all of the underlying RandomVariables get a deterministic offset
+  uint32_t streamOffset = 100;
+  mobility.UseReservedStreams (streamOffset);
  mobility.InstallAll ();
  Config::Connect ("/NodeList/*/$ns3::MobilityModel/CourseChange",
                   MakeCallback (&CourseChange));

In this specific case, however, it has the drawback that MobilityHelper
would need to know about all of the mobility models.  Probably, what
will be needed here in addition is to make mobility model-specific helpers.
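
A rough sketch of the offset idea, assuming each model knows how many random variables it owns and assigns its own consecutive indices, so the helper does not need per-model knowledge. SketchMobilityModel, SketchMobilityHelper, AssignStreams and Install are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model owning two random variables (speed and direction).
struct SketchMobilityModel
{
  uint32_t speedStream = 0;
  uint32_t directionStream = 0;

  // Takes consecutive indices starting at 'first'; returns how many it used.
  uint32_t AssignStreams (uint32_t first)
  {
    speedStream = first;
    directionStream = first + 1;
    return 2;
  }
};

// Hypothetical helper; the offset plays the role SetBase() plays for addresses.
class SketchMobilityHelper
{
public:
  void UseReservedStreams (uint32_t streamOffset) { m_next = streamOffset; }

  // Assigns streams to every model and returns the first index left unused,
  // which the caller can hand to the next helper as its offset.
  uint32_t Install (std::vector<SketchMobilityModel> &models)
  {
    for (auto &model : models)
      {
        m_next += model.AssignStreams (m_next);
      }
    return m_next;
  }

private:
  uint32_t m_next = 0;
};
```

Returning the next free index keeps deconfliction in the user's hands, as proposed, while making it hard to overlap by accident: the value returned by one helper becomes the offset of the next, even when the node containers vary in size.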
Comment 21 Mitch Watrous 2011-10-11 13:07:07 EDT
Rather than reserving the first RngReserved initial substreams in each stream for specific assignment, I propose that we reserve the last RngReserved substreams in each stream for specific assignment.

That way if there were no substreams reserved, the set of random number sequences generated in all of ns-3 would be exactly the same as they are now.  In other words, all of the random number sequences created by ns-3's modules, tests, and examples would be exactly the same as they are presently.

Only when a random variable that currently draws from an automatically chosen substream is assigned a specific substream number would its values differ from what they are now.
Comment 22 Michele Weigle 2011-11-02 12:47:37 EDT
The only issue I see with this is that to access the last substreams, you'll need to step through all of the previous substreams.  Since there are many substreams, it may add some overhead.

(In reply to comment #21)
> Rather than reserving the first RngReserved initial substreams in each stream
> for specific assignment, I propose that we reserve the last RngReserved
> substreams in each stream for specific assignment.
> 
> That way if there were no substreams reserved, the set of random number
> sequences generated in all of ns-3 would be exactly the same as they are now. 
> In other words, all of the random number sequences created by ns-3's modules,
> tests, and examples would be exactly the same as they are presently.
> 
> Only once a particular existing randomly chosen substream number of random
> numbers was assigned to a particular substream number would that substream's
> values be different than they are currently.
Comment 23 Tom Henderson 2011-11-03 00:45:01 EDT
(In reply to comment #22)
> The only issue I see with this is that to access the last substreams, you'll
> need to step through all of the previous substreams.  Since there are many
> substreams, it may add some overhead.
> 

Mathieu has a nice solution to this that seems to avoid stepping through all of the previous streams.  The repository safe/ns-3-rng has this code:

The method RandomVariableStream::SetStream (int64_t stream) has these comments:

  if (stream == -1)
    {
      // The first 2^63 streams are reserved for automatic stream
      // number assignment.
      ...
    }
  else
    {
      // The last 2^63 streams are reserved for deterministic stream
      // number assignment.
      ...
    }

Note that there are actually more than 2^64 streams in the underlying generator, but we are constraining it in ns-3 to 2^64 streams by use of the 64-bit integer.
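
The split described in those comments can be sketched as a plain index mapping. ResolveStreamIndex and the nextAutomatic counter are hypothetical names; this is only a guess at the arithmetic the elided branches perform, not a copy of the repository code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: maps the SetStream() argument to a generator stream
// index. 'nextAutomatic' stands in for a global counter of automatically
// assigned streams.
uint64_t ResolveStreamIndex (int64_t stream, uint64_t &nextAutomatic)
{
  const uint64_t base = 1ULL << 63;
  if (stream == -1)
    {
      // Automatic assignment: lower half, [0, 2^63).
      assert (nextAutomatic < base);
      return nextAutomatic++;
    }
  // Deterministic assignment: upper half, [2^63, 2^64).
  assert (stream >= 0);
  return base + static_cast<uint64_t> (stream);
}
```

Because the two halves never overlap, adding or removing automatically assigned variables cannot change which stream a deterministic index resolves to.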

The new RngStream constructor performs some arithmetic that lets it advance directly to the requested stream, without generating all of the prior ones.
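
To illustrate why no stepping is needed, here is a jump-ahead sketch on a much simpler generator. It uses a scalar Lehmer recurrence as a stand-in; the generator in ns-3 is a combined multiple recursive generator whose jump uses powers of transition matrices instead of a scalar power, so this shows the principle only. All names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Lehmer generator x' = a * x mod m, used here only as a stand-in.
constexpr uint64_t kModulus = 2147483647ULL;   // 2^31 - 1
constexpr uint64_t kMultiplier = 48271ULL;

// One ordinary step of the generator.
uint64_t Step (uint64_t state)
{
  return (kMultiplier * state) % kModulus;
}

// Computes base^exponent mod m by repeated squaring: O(log n) multiplications.
uint64_t PowMod (uint64_t base, uint64_t exponent)
{
  uint64_t result = 1;
  base %= kModulus;
  while (exponent > 0)
    {
      if (exponent & 1)
        {
          result = (result * base) % kModulus;
        }
      base = (base * base) % kModulus;
      exponent >>= 1;
    }
  return result;
}

// State after n steps, without generating the n intermediate states.
uint64_t JumpAhead (uint64_t state, uint64_t n)
{
  return (PowMod (kMultiplier, n) * state) % kModulus;
}
```

The cost grows with the number of bits in n rather than with n itself, which is why reaching a stream in the upper half of the index space need not be slower than reaching one near the start.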

I will try to get a patch out for review soon.
Comment 24 Tom Henderson 2012-07-11 02:16:29 EDT
This is basically fixed as of changeset 60846d2741c0, but I am leaving it open for now until existing variables are cut over.
Comment 25 Tom Henderson 2012-10-09 14:26:20 EDT
Fixed as of the ns-3.15 release.