A random variable takes on numerical values, and to each value we associate how likely it is to occur.

In other words, the variable has a probability of taking one value or another.

Random Variables

When an experiment is performed, we are mainly interested in some function of the outcome, as opposed to the actual outcome itself.

e.g An experiment of tossing a coin
Map head to 1 and tail to 0,
P(X=1) = P(H) = 1/2

In 10 coin flips, each outcome is a sequence of 10 results.
X = 5 means that I got 5 heads out of these 10 flips.
But I do not know the sequence,

e.g sequences with X = 5
HHHHHTTTTT
HTHHHHTTTT

Total number of sequences: 2^10 (each toss has 2 possible outcomes)
We just count the sequences with exactly 5 heads.

P(X=5) = (number of sequences with 5 heads) / (total number of sequences) = C(10,5) / 2^10
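The counting argument above can be sketched in Python (a quick check, not part of the original notes):

```python
from math import comb

# Number of length-10 head/tail sequences: each toss has 2 outcomes.
total = 2 ** 10            # 1024
# Sequences containing exactly 5 heads: choose which 5 positions are heads.
favourable = comb(10, 5)   # 252
p = favourable / total
print(p)  # 252/1024 = 0.24609375
```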


This can also be used when testing a large number of components and finding a specific subset among them.
e.g 100 components, find the defective ones



e.g 1
Let S = {HH, HT, TH, TT}
Define the random variable (a function)
X = number of heads obtained

X: S -> R, where R is the set of all real numbers
We map it like:
X(HH) = 2, X(HT) = 1, X(TH) = 1 and X(TT) = 0
X picks an element of the sample space and associates it with a real number


Imagine putting an element of the sample space into a function that outputs a number describing something about that element.

e.g 2
Consider tossing a pair of fair dice.
Let X be the sum of the upturned faces.

S = {(x,y) | x = 1,2,...,6; y = 1,2,...,6}
X((x,y)) = x + y

Make sure that the values of x and y are valid (i.e. x cannot be 7)
Rx = {2,3,4,5,6,...,12}
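A minimal Python sketch of this example: enumerate the sample space, apply the function X, and read off the range space:

```python
from collections import Counter

# Sample space: all ordered pairs of upturned faces.
S = [(x, y) for x in range(1, 7) for y in range(1, 7)]
# Random variable X maps each outcome to the sum of the faces.
X = {s: s[0] + s[1] for s in S}
Rx = sorted(set(X.values()))
print(Rx)  # [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

counts = Counter(X.values())
print(counts[3] / len(S))  # P(X = 3) = 2/36
```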

E.g 3
A coin is tossed until a head occurs.
Let X be the number of trials.
S = {H, TH, TTH, ..., TT...TH, ...}
X : S -> R

Since X represents the number of trials,
X(TH) = 2


Definition 2.1
Let S be a sample space associated with experiment E.
A function X that assigns a number to every element s in S
is called a random variable.


Notes:
- X is a real-valued function
- The range of X, Rx, is a set of real numbers
- Each possible value x of X represents an event that is a subset of the sample space S
- If S has elements that are themselves real numbers, we take X(s) = s;
  in that case Rx = S

Definition 2.2
Let E be an experiment and S its sample space
X be a random variable defined on S and Rx be its range space.

We can talk about events like (X = 1), where HH, HT, TH and TT are the sample points.
X is the number of heads.
(X = 1) corresponds to {HT, TH}:
I define an event (X = 1),
and this is linked through the function to give {HT, TH}.

We can also consider the events (X = 2) or (X = 0),
which map to {HH} and {TT} respectively.

An event defined by the random variable cannot leave out matching sample points.
e.g
(X = 0 or X = 1) => {TT, HT}
This is wrong because we forgot TH; the event is {TT, HT, TH}.

Suppose A consists of all sample points s in S for which X(s) is in a set B.
Then A and B are equivalent events, and
P(A) = P(B)

e.g 1
A1 = {HH} is equivalent to B1 = {2}
A2 = {HT,TH} is equivalent to B2 ={1}

Using this we can get the probability

Summary:

Number of heads x    0      1      2
P(X = x)             1/4    1/2    1/4

The 3 probabilities add up to 1.
If X has 100 possible values, we have to list every value and its corresponding probability.
If the random variable takes infinitely many values,

Number of trials until first head    1      2      3      ...
P(X = x)                             1/2    1/4    1/8    ...

we can write it compactly as
P(X = x) = f(x) = { 1/2^x   for x = 1,2,3,...
                  { 0       otherwise
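As a quick numerical sanity check (not part of the original notes), the probabilities 1/2^x do sum to 1; a truncated partial sum gets arbitrarily close:

```python
# f(x) = (1/2)**x for x = 1, 2, 3, ...; the partial sums approach 1.
total = sum((1 / 2) ** x for x in range(1, 60))
print(total)  # differs from 1 only by 2**-59
```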

e.g 2
A pair of fair dice is tossed; find the probability that a sum of 3 is obtained.
P(X = 3) = 2/36 = 1/18

Discrete Probability Distributions

Let X be a random variable.
We need to identify whether it is continuous or discrete.
Discrete = X takes a finite or countably infinite range of possible values.

Think of the range between 0 and 1:
Discrete: just the values 0 and 1
Continuous: every real number between 0 and 1


Probability functions

For a discrete random variable, each value x has a probability f(x).
This f is called the probability function.
The collection of pairs (xi, f(xi)) is called the probability distribution of X.

Conditions:
1) f(xi) >= 0 for all xi
2) The sum of all f(xi) must be 1

e.g 2 throwing a pair of fair dice
Let X be the sum of the two dice.

The sum of f(x) over all x must equal 1.

e.g 3
Six lots of components are ready to be shipped by a certain supplier. The number of defective components in each lot is as follows:

Lot           1    2    3    4    5    6
Defectives    0    2    0    1    2    0

One of these lots is randomly selected for the next shipment.
Let X be the number of defectives in the selected lot.

There are only 3 possible values of X: 0, 1, 2
f(0) = P(X = 0) = P(lot 1, 3 or 6) = 3/6 = 1/2
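A small Python sketch of this example, tabulating the whole distribution from the list of defective counts:

```python
from collections import Counter
from fractions import Fraction

defectives = [0, 2, 0, 1, 2, 0]   # defectives in lots 1..6
counts = Counter(defectives)
# Each lot is equally likely, so f(x) = (lots with x defectives) / 6.
f = {x: Fraction(n, len(defectives)) for x, n in counts.items()}
print(f[0], f[1], f[2])  # 1/2 1/6 1/3
```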

e.g 4
Find the constant c such that
f(x) = cx for x = 1,2,3,4
and 0 otherwise, is a probability function of a random variable X.
-> If x is not 1,2,3,4, the probability is 0.

We have the condition
f(1) + f(2) + f(3) + f(4) = 1
c + 2c + 3c + 4c = 1
therefore c = 1/10

Hence
P(X >= 3) = f(3) + f(4) = 3/10 + 4/10 = 7/10

e.g 5
Among 5 people, 2 have blood type O+; we type them one at a time in random order.
Let Y be the number of typings necessary to identify an O+ individual.

Y can be 1, 2, 3 or 4 (if the first three typed are all non-O+, a fourth typing is needed).
f(1) = P(A or B typed first) = 2/5 = 0.4
f(2) = P(C, D or E first, then A or B)
     = (3 * 2) / (5 * 4) = 0.3
f(3) = (3 * 2 * 2) / (5 * 4 * 3) = 0.2
f(4) = 1 - 0.4 - 0.3 - 0.2 = 0.1
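These probabilities can be verified by brute force, enumerating all orderings of the five people (the labels A..E follow the notes, with A and B as the O+ individuals):

```python
from itertools import permutations
from fractions import Fraction
from collections import Counter

people = ['A', 'B', 'C', 'D', 'E']
opos = {'A', 'B'}   # the two O+ individuals

counts = Counter()
for order in permutations(people):
    # Y = position of the first O+ person in the typing order.
    y = next(i for i, person in enumerate(order, start=1) if person in opos)
    counts[y] += 1

total = sum(counts.values())  # 5! = 120 orderings
f = {y: Fraction(n, total) for y, n in counts.items()}
print(f)  # {1: 2/5, 2: 3/10, 3: 1/5, 4: 1/10}
```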


Another view of the probability function

It is often useful to think of a probability function as specifying a mathematical model for a finite population.

e.g Consider selecting at random a student from among the 3000000 registered for the current semester at NUS.
Let X = number of modules for which the selected student is registered, and suppose X has the given probability function.

We can separate the population by proportion:
consider the population grouped by the value of x.

Continuous Probability Distributions

Definition 2.4

Suppose that Rx, the range space of a random variable X, is an interval or a collection of intervals.
Then we say that X is a continuous random variable.

The probability density function f(x) is a function such that
1. f(x) >= 0 for all x in Rx
2. The integral of f(x) dx over Rx equals 1
   (since f(x) = 0 for x not in Rx, this is the same as integrating over the whole real line)

If f(x) does not depend on x (it is constant on Rx), the distribution is uniform.

3. For any c and d such that c < d,
P(c <= X <= d) = integral of f(x) from c to d
We are finding the area under the density curve between the two points;
this area is the exact probability.


Remarks:

- For a continuous random variable, the probability at any fixed point is 0:
  P(X = c) = 0 for every single value c.
  This holds only for continuous, not discrete, random variables.

- P(A) = 0 does not imply that A is an empty set.
  e.g Imagine tossing a die; the chance of getting a 7 is 0, but that does not mean the set is empty.
  We have a set {7},
  but P({7}) = 0.

- If X assumes values only in the interval [a,b], we may set f(x) = 0 for all x outside [a,b].
  It does not mean that those values do not exist, only that the chance of them occurring is 0.

e.g 1
Random variable X is continuous.
f(x) is given by
cx    for 0 < x < 1
0     otherwise

We need to check the conditions.
a) find the value of c
integrate cx dx from 0 to 1
= c[x^2/2] evaluated from 0 to 1
= c/2
Since the integral must be 1,
c/2 = 1
c = 2

f(x) =
2x    for 0 < x < 1
0     otherwise

To find a probability we can just integrate 2x over the required range, since f(x) = 2x for 0 < x < 1.
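A quick numerical check of this example (not in the original notes): the closed-form CDF here is F(x) = x^2 on (0,1), and a crude Riemann sum confirms the total area under 2x is 1:

```python
# Density f(x) = 2x on (0, 1); F(x) = x**2, so probabilities are areas.
def F(x):
    if x <= 0:
        return 0.0
    return x ** 2 if x < 1 else 1.0

# Crude left-endpoint Riemann sum checking the total area is 1.
n = 100_000
area = sum(2 * (i / n) * (1 / n) for i in range(n))
print(area)                 # close to 1.0
print(F(0.75) - F(0.25))    # P(0.25 <= X <= 0.75) = 0.5
```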

e.g 2
Time headway (TH) in traffic flow is the time between the instant one car finishes passing a fixed point and the instant the next car begins to pass that point.

The p.d.f. of X is
f(x) =
0.15e^(-0.15(x - 0.5))    for x >= 0.5
0                          otherwise

x cannot be less than 0.5 here, because that would mean the next car is essentially touching the one in front.

If we integrate f(x) over the range 0.5 to infinity,
the answer is 1, so this is a legitimate density.

What is P(X <= 5)?
This is the probability of a headway of at most 5 sec (it can be exactly 5).
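A short Python check (my addition): integrating the density from 0.5 to x gives the closed-form CDF F(x) = 1 - e^(-0.15(x - 0.5)), from which P(X <= 5) follows:

```python
import math

# CDF of the headway density f(x) = 0.15 * exp(-0.15 * (x - 0.5)), x >= 0.5.
def F(x):
    return 1 - math.exp(-0.15 * (x - 0.5)) if x >= 0.5 else 0.0

p = F(5)  # P(X <= 5) = 1 - e^(-0.15 * 4.5)
print(round(p, 3))  # about 0.491
```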


Cumulative distribution

Definition

Let X be a random variable, discrete or continuous
We define F(x) to be the cumulative distribution function  (c.d.f) of the random variable X
where
F(x) = P(X<=x)

This means that F(x) accumulates the probability of all values less than or equal to x.

If X is a discrete random variable then

F(x) is the sum of f(t) over all t <= x.

The c.d.f. of a discrete random variable is a step function.

- For any two numbers a and b with a <= b,
P(a <= X <= b) = P(X <= b) - P(X < a)
= F(b) - F(a-)
where F(a-) is the limit of F just below a (not inclusive of a).

- If the only possible values are integers, and a and b are integers,
then
P(a <= X <= b) = P(X = a or a+1 or ... or b) = F(b) - F(a-1)

CDF for continuous random variables

F(x) is the area under the density curve, i.e. the integral of the density from -infinity to x.
Note: do not write F(x) = integral of f(x) dx from -infinity to x;
      use a dummy variable: F(x) = integral of f(t) dt from -infinity to x.

For a continuous random variable,
f(x) = dF(x)/dx  [differentiate with respect to x]
if the derivative exists.


Remarks:
- F(x) is a non-decreasing function,
  i.e. if a <= b then F(a) <= F(b)

- 0 <= F(x) <= 1


e.g 1
The probability function of X is given as

f(x) =
p * (1-p)^(x-1)    if x = 1,2,3,...    [Discrete]
0                   otherwise

Since it is discrete, we sum rather than integrate:

F(x) = sum of f(t) over t <= x
     = sum of p * (1-p)^(t-1) for t = 1,...,x

This is a geometric series, like 1 + r + r^2 + ... = 1/(1-r) for |r| < 1;
here the finite sum 1 + r + ... + r^(x-1) equals (1 - r^x)/(1 - r).

= p * sum of (1-p)^s for s = 0 to x-1    // substituting s = t-1
= p * (1 - (1-p)^x) / (1 - (1-p))
= 1 - (1-p)^x    for x = 1,2,3,...
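The closed form can be verified numerically (my sketch; the success probability p = 0.3 is an arbitrary choice for illustration):

```python
p = 0.3  # assumed success probability, chosen only for this check

def pmf(x):
    # Geometric probability function: p * (1-p)^(x-1).
    return p * (1 - p) ** (x - 1)

for x in range(1, 8):
    partial = sum(pmf(t) for t in range(1, x + 1))   # F(x) by direct summation
    closed = 1 - (1 - p) ** x                         # F(x) by the closed form
    assert abs(partial - closed) < 1e-12
print("closed form matches partial sums")
```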

e.g 2
Let X be the number of days of sick leave taken by a randomly selected employee.
If the max number of allowable sick days per year is 14, the possible values of X are 0, 1, 2, ..., 14.

P(2 <= X <= 5) = F(5) - F(2-) = F(5) - F(1)

e.g 4
Given the pdf:
f(x) = { 2x    for 0 < x < 1
       { 0     otherwise

f(x) itself is not a probability but a density:
f(x) is a curve, and f(x) multiplied by a small width delta-x approximates the probability of falling in that small interval.

We want to find F(x), which is the same as P(X <= x):
F(x) = integral of f(t) dt from -infinity to x
     = x^2 for 0 < x < 1 (0 for x <= 0, and 1 for x >= 1)

For discrete cases,
there is no probability at points outside the possible values, e.g. P(X = 1.5) = 0,
so the c.d.f. is a step function.

Mean and variance of a random variable

Expected values

If X is a discrete random variable taking on values x1, x2, ... with probability function f(x),
then the mean or expected value (or mathematical expectation) of X is denoted by E(X) or by uX.

The mean here is not "sum all the values and divide by their number";
it is a property of the distribution, also known as the expected value.

The expected value is the probability-weighted average of the values of X, for a continuous or discrete distribution.

It is defined by:

Discrete:
E(X) = sum of xi * f(xi) over all i

Continuous:
E(X) = integral of x f(x) dx from -infinity to infinity

The mathematical expectation is an average.

1. The expected value exists if the sum or the integral in the above definitions exists.
2. In the discrete case, if f(xi) = 1/N for each of the N values of x, the mean
   is (1/N) * sum of xi,
   which is just the ordinary average of N items.

e.g 1
A gambling game:
gain 5 dollars if all heads or all tails show when a fair coin is tossed 3 times;
pay out 3 dollars if either 1 or 2 heads show.
What is the expected gain?

HHH, TTT => gain 5 dollars
HTT, THT, TTH, ... => pay 3 dollars

If X is the amount of money he will get, can X be 0?
No: there are only two outcomes, gain 5 or lose 3 (+5, -3).

P(X = 5) = 2 * 1/8 = 1/4
P(X = -3) = 1 - 1/4 = 3/4

Therefore E(X) = 5(1/4) + (-3)(3/4) = -1
You will lose one dollar per game on average.
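The same answer drops out of enumerating all 8 equally likely toss sequences (a quick check, not part of the notes):

```python
from itertools import product

# Enumerate all 8 equally likely sequences of 3 fair coin tosses.
gain = 0.0
for seq in product('HT', repeat=3):
    heads = seq.count('H')
    payoff = 5 if heads in (0, 3) else -3   # all heads/tails wins 5, else pays 3
    gain += payoff / 8
print(gain)  # -1.0
```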

e.g 2
Roll a balanced die.
Pay c to play the game and get i dollars if number i occurs.
How much should we pay if the game is fair?
The game is fair if E(gain) = 0.

Let X be the outcome of the die.
The gain depends on c:
E(X) = sum of x f(x) = (1 + 2 + ... + 6)(1/6) = 3.5

To be a fair game, E(payout) must equal the cost, so c = E(X) = 3.5.

e.g 3
A pilot wants to insure his airplane for 1000000.
The insurance company estimates that a total loss may occur with probability 0.0002,
a 50 percent loss with probability 0.001,
a 25 percent loss with probability 0.01,
and a 10 percent loss with probability 0.01.

Ignoring all other losses, what premium should the insurance company charge each year to realise an average profit of 5000?

Because the given probabilities do not add up to 1, there must be some other event;
we take the remaining probability as the case of no loss.

The expected loss is given by:
1000000(0.0002) + 500000(0.001) + 250000(0.01) + 100000(0.01) = 4200
The insurer should charge 4200 + 5000 = 9200 to make an average profit of 5000.
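The arithmetic above as a one-liner check in Python (my addition):

```python
# (loss amount, probability) pairs from the example; the remaining
# probability mass corresponds to no loss, contributing 0.
losses = [(1_000_000, 0.0002), (500_000, 0.001),
          (250_000, 0.01), (100_000, 0.01)]
expected_loss = sum(amount * prob for amount, prob in losses)
premium = expected_loss + 5000
print(expected_loss, premium)  # 4200 and 9200 (up to float rounding)
```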


e.g 4
The pdf of gravel sales X is given by
f(x) = { (3/2)(1 - x^2)    for 0 < x < 1
       { 0                  otherwise

Find E(X).

We integrate x * (3/2)(1 - x^2) over the range 0 to 1, giving E(X) = 3/8.
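A numerical sketch of that integral (my addition), using a simple midpoint rule:

```python
# E(X) = integral of x * (3/2)(1 - x^2) dx over (0, 1), approximated
# by the midpoint rule on n subintervals.
n = 100_000
e = 0.0
for i in range(n):
    x = (i + 0.5) / n              # midpoint of the i-th subinterval
    e += x * 1.5 * (1 - x ** 2) / n
print(round(e, 4))  # 0.375, i.e. E(X) = 3/8
```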

Some special cases:

g(X) = (X - u)^2
The expectation of this g is the definition of the variance of a random variable X,
where u = E(X).

The bigger the variance, the more spread out the data is.

Let X be a random variable with probability (density) function f(x). The variance of X is defined as V(X) = E[(X - u)^2].

If X is discrete:
V(X) = sum of (xi - u)^2 * f(xi)

If X is continuous:
V(X) = integral of (x - u)^2 * f(x) dx

If the variance is 0, there is no randomness: every value of X must equal u.
This happens only when X is a constant.
This is only when it is a constant.


1. V(X) >= 0
2. V(X) = E(X^2) - [E(X)]^2
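The shortcut formula V(X) = E(X^2) - [E(X)]^2 can be checked exactly for a fair die (my addition, using exact fractions):

```python
from fractions import Fraction

faces = range(1, 7)
w = Fraction(1, 6)                              # fair die: uniform pmf
mu = sum(x * w for x in faces)                  # E(X) = 7/2
ex2 = sum(x ** 2 * w for x in faces)            # E(X^2) = 91/6
var_def = sum((x - mu) ** 2 * w for x in faces) # definition of V(X)
var_short = ex2 - mu ** 2                       # shortcut formula
print(mu, var_def, var_short)  # 7/2 35/12 35/12
```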

Note: E(aX) = aE(X)

The positive square root of the variance is called the standard deviation of X.

- For g(X) = X^k,
E(X^k) is called the kth moment of X (the moment about the origin).
The kth central moment (about the mean u) is
E[(X - u)^k]

Properties of expectation

1) E(aX + b) = aE(X) + b

where a is the gradient and b is the y-intercept;
Y = aX + b is a linear function of X.
Note that E(X^2) != [E(X)]^2 in general.

<Slide 123 for proof>
For the discrete case:
breaking the sum into two pieces, only x varies, so we can bring a and b outside.
By the definition of a probability function, the sum of f(x) over all x is 1.
The continuous case uses the same key step, with integration instead of summation.
E(aX) = aE(X)
E(X + b) = E(X) + b

Note: expectation is linear, e.g. E(a1 sin X + a2 X^10) = a1 E(sin X) + a2 E(X^10).

2) V(X) = E[(X - u)^2] = E(X^2) - u^2

Proof in slide 128

3) V(aX + b) = a^2 V(X)

e.g 4
A jewelry shop purchases 3 necklaces for 500 per piece.
It sells them for 1000 a piece, and the designer repurchases any necklace still unsold after a specified period at 200 a piece.
Let X denote the number of necklaces sold, and suppose X follows a given probability distribution.

The profit is g(X) = revenue - cost = 1000X + 200(3 - X) - 3(500)
= 800X - 900

Chebyshev Inequality

Given a random variable X:
if I know the probability distribution, I can find E(X), V(X) and f(x).
But what if we do not know the distribution?
Can we still say something about probabilities?
Chebyshev gives a bound on the probability.

(We cannot reconstruct the full distribution from E(X) and V(X) alone.)

Let X be a random variable (discrete or continuous) with E(X) = u and V(X) = sigma^2.
Then for any positive number k we have
P(|X - u| >= k*sigma) <= 1/k^2

The probability that the value of X lies at least k standard deviations from its mean is at most 1/k^2.
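The bound can be illustrated on a fair die (my addition; k = 1.2 is an arbitrary choice): the actual tail probability is well below 1/k^2, showing the bound holds but need not be tight.

```python
import math

# Fair die: mu = 7/2, sigma^2 = 35/12.
faces = range(1, 7)
mu = 3.5
sigma = math.sqrt(35 / 12)

k = 1.2  # any positive number works; chosen for illustration
# Actual probability that X lies at least k sigma from the mean.
actual = sum(1 / 6 for x in faces if abs(x - mu) >= k * sigma)
bound = 1 / k ** 2
print(actual, bound)  # actual (1/3, from faces 1 and 6) <= bound (about 0.694)
```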

Remarks:

1. The quantity k can be any positive number.
2. This inequality is true for all distributions with finite mean and variance.
3.