# Random Number Generation

## Statistical distributions

MATLAB uses four core random number generators to create pseudo-random numbers:

* `rand` - Uniform pseudo-random number generator on the interval (0,1)
* `randn` - Standard Normal pseudo-random number generator
* `randg` - Standard Gamma pseudo-random number generator
* `randi` - Uniform integer pseudo-random number generator

Samples from other distributions can be built from these four generators, e.g.,

```
a + (b-a)*rand % Sample from U(a,b)
m + s*randn % Sample from N(m,s^2)
```
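As a minimal sketch of the transformations above, the following draws samples and compares their sample means with the theoretical ones. The parameter values (`a`, `b`, `m`, `s`, `lambda`) and the sample size `N` are illustrative assumptions. The exponential draw is not in the text above: it uses the standard inverse-CDF method.

```matlab
N = 1e4;                        % number of draws (illustrative choice)
a = 2; b = 5;                   % assumed bounds for U(a,b)
m = 1; s = 3;                   % assumed mean and std. dev. for N(m,s^2)
lambda = 0.5;                   % assumed rate for Exp(lambda)

xu = a + (b-a)*rand(N,1);       % U(a,b) draws
xn = m + s*randn(N,1);          % N(m,s^2) draws
xe = -log(rand(N,1))/lambda;    % Exp(lambda) draws via the inverse CDF
xd = randi([1 6],N,1);          % uniform integers 1,...,6 (a fair die)

% Sample means (left column) next to theoretical means (right column)
means = [mean(xu) (a+b)/2; mean(xn) m; mean(xe) 1/lambda; mean(xd) 3.5]
```

With `N` this large, the two columns of `means` should be close but not identical.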

To create a vector of samples from the standard normal distribution we write

```
n = 10    % number of samples
x = randn(n,1)
```
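The size arguments control the shape of the output. The short sketch below (variable names are illustrative) shows the common cases, including one that often surprises: a single size argument returns a square matrix, not a vector.

```matlab
n = 10;               % number of samples
x_col = randn(n,1);   % n-by-1 column vector
x_row = randn(1,n);   % 1-by-n row vector
X_mat = randn(n,3);   % n-by-3 matrix of independent draws
X_sq  = randn(n);     % n-by-n matrix, NOT a vector of length n

size(X_sq)            % displays the dimensions of the square matrix
```

The same size conventions apply to `rand` and `randi`.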

## Generating simulated data sets

When we write our own estimation routines, we use simulated data to check how well they perform. Say we want to examine the properties of the OLS estimator. We need to create a dataset that obeys

$$Y = X\beta + e$$

For simplicity, let's assume that X contains a constant and that all its other entries are iid standard normal random variables. We then proceed as follows:

{% code title="simulate\_ols.m" %}

```
n = 50; % sample size
beta = [1,2,3]'; % true coefficient vector (first entry is the intercept)
sigma = 1; % true standard deviation of the error term

X = [ones(n,1) randn(n,size(beta,1)-1)];  % the matrix of regressors
ep = sigma*randn(n,1); % the error term
Y = X*beta + ep; % the dependent variable
```

{% endcode %}

We now have a cross-sectional dataset with which we can run experiments!
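One such experiment, sketched here under assumed settings, is a small Monte Carlo study: simulate the dataset many times, estimate the coefficients by OLS each time, and check that the estimates are centred on the true values. The number of replications `R` and the names `beta_hat` and `bias` are illustrative choices. The backslash operator computes the least-squares solution.

```matlab
n = 50;                 % sample size
R = 1000;               % number of Monte Carlo replications (assumed)
beta = [1,2,3]';        % true coefficient vector
sigma = 1;              % true standard deviation of the error term
k = size(beta,1);       % number of regressors, including the constant

beta_hat = zeros(k,R);  % one column of estimates per replication
for r = 1:R
    X = [ones(n,1) randn(n,k-1)];   % regressors with a constant
    Y = X*beta + sigma*randn(n,1);  % dependent variable
    beta_hat(:,r) = X\Y;            % OLS estimate via least squares
end

bias = mean(beta_hat,2) - beta      % average estimation error, near zero
```

Because OLS is unbiased in this design, the entries of `bias` should be small relative to the true coefficients.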

Simulating time series data requires a little more effort due to the dependent nature of the data. Say we want to simulate data from an AR(1) process, that is

$$
Y_t = \rho \, Y_{t-1} + u_t
$$

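We use a "burn-in" period to remove the effect of the starting value, so that the retained draws are approximately draws from the stationary distribution of Y(t). This distribution exists only if the absolute value of rho is less than one. If we assume that u(t) is iid N(0,1), we can proceed as follows: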

{% code title="simulate\_ar1.m" %}

```
n = 500; % number of observations
rho = 0.5; % persistence parameter
burn_in = 100; % number of burn-in periods

Ytemp = zeros(n+burn_in,1); % initialise temp. Y vector

u = randn(n+burn_in,1); % draw error vector
for i = 2:n+burn_in
    Ytemp(i) = rho*Ytemp(i-1) + u(i); % simulate AR(1) model
end

Y = Ytemp(burn_in+1:end,1); % discard burn-in draws
```

{% endcode %}
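As a hedged alternative to a burn-in, and assuming normal errors with unit variance and a persistence parameter below one in absolute value, the first observation can be drawn directly from the stationary distribution N(0, 1/(1-rho^2)). The sketch below does this and then recovers rho by regressing Y(t) on Y(t-1) as a sanity check. The name `rho_hat` is an illustrative choice.

```matlab
n = 500;      % number of observations
rho = 0.5;    % persistence parameter, must satisfy abs(rho) < 1

Y = zeros(n,1);                   % initialise Y vector
Y(1) = randn/sqrt(1 - rho^2);     % draw Y(1) from the stationary distribution
u = randn(n,1);                   % draw error vector (u(1) is not used)
for t = 2:n
    Y(t) = rho*Y(t-1) + u(t);     % simulate AR(1) model
end

rho_hat = Y(1:end-1)\Y(2:end)     % OLS estimate of rho (no intercept)
```

With 500 observations, `rho_hat` should be close to 0.5, although it will differ from run to run.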

## Replicability of simulated data

A problem with working on simulated data arises when you need to replicate your results. Due to the "randomness" of the data-generating process, different results may appear each time the code is run.

However, pseudo-random numbers, such as those generated by MATLAB, are created by an entirely deterministic algorithm that starts from an initial value. To see this, close MATLAB if you have it open, start it again, type `randn` and write down the number you get. Now restart MATLAB once more and type `randn`. You should get the same number again! We refer to this initial value of the algorithm as the **seed**. By specifying the seed at the beginning of our code, we can ensure that the results we report are the same as those anyone gets from running our code.

One way of specifying the seed in MATLAB is the function `rng()`. The following example uses `rng()` to replicate a random vector and compares the two draws graphically.

```
rng(42); % Set the seed
X1 = randn(20,1); % Draw sample data
rng(42); % Reset the seed
X2 = randn(20,1); % Draw sample data
plot(X1)
hold on
plot(X2)
```

{% hint style="info" %}
In the example above, try changing the seed to different numbers and explore how the random samples change.
{% endhint %}
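The replication can also be checked numerically rather than by eye. The sketch below compares two draws with `isequal` and then shows a second pattern: calling `rng` with no input returns the current generator settings, which can be passed back to `rng` later to repeat a draw. The variable names are illustrative.

```matlab
rng(42);                 % set the seed
X1 = randn(20,1);        % draw sample data
rng(42);                 % reset the seed
X2 = randn(20,1);        % draw the same sample again
same_seed = isequal(X1,X2)     % true: the two vectors are identical

s = rng;                 % save the current generator settings
Z1 = randn(20,1);        % draw from the saved state
rng(s);                  % restore the saved settings
Z2 = randn(20,1);        % repeat the draw
same_state = isequal(Z1,Z2)    % true: restoring the state repeats the draw
```

Saving and restoring the state is convenient when only part of a script needs to be repeated.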

