

04/19/23 Operations Research Group

Continental Airlines

Introduction to Operations Research

Judy Pastor

Steven Coy

Statistical Concepts, Optimization, Heuristics, Simulation, and Forecasting

Sum and Product Notation

A sum of a row or column of n numbers: $\sum_{i=1}^{n} x_i$

Ex. X = (1, 3, 4, 9)

$\sum_{i=1}^{4} x_i = 1 + 3 + 4 + 9 = 17$

An iterated sum, which sums a matrix of numbers having n rows and m columns: $\sum_{j=1}^{m} \sum_{i=1}^{n} x_i y_j$

A sequential product of n numbers: $\prod_{i=1}^{n} x_i$

Ex. X = (1, 3, 4, 9)

$\prod_{i=1}^{4} x_i = 1 \cdot 3 \cdot 4 \cdot 9 = 108$
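The notation can be checked with a few lines of Python. This is only a sketch: the vector `x` is the slide's example, while `y` is a hypothetical second vector introduced here to illustrate the iterated sum.

```python
import math

x = [1, 3, 4, 9]   # example vector from the slide
y = [2, 5]         # hypothetical second vector, not in the slides

total = sum(x)            # 1 + 3 + 4 + 9
product = math.prod(x)    # 1 * 3 * 4 * 9

# Iterated sum: add x_i * y_j over every (i, j) pair of the n-by-m grid.
iterated = sum(xi * yj for xi in x for yj in y)

print(total, product, iterated)
```

Because the summand factors, the iterated sum here equals `sum(x) * sum(y)`.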

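Convex and Concave Sets

Convex Set

The set of all the points that are bounded by this curve [figure: convex curve]. The line segment joining any two points of the set stays inside the set.

Concave Set

The set of all points that are bounded by this curve [figure: concave curve]. Some line segment joining two points of the set passes outside the set.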

Probability Concepts

• Experiment

– A repeatable procedure

– Has a well-defined set of possible outcomes

• Sample outcome

– Potential result of an experiment, denoted e

• Sample space

– The set of all possible outcomes, denoted S

• Event

– A subset of the sample space corresponding to the definition of the event, denoted E

Probability of an Event

• Probability is the degree of chance or likelihood that an event will occur in an experiment

• Calculating the probability for a discrete or countable problem (with equally likely outcomes)

1. Count the possible outcomes that satisfy the definition of the event

2. Count the total number of possible outcomes

3. Divide the result in 1 by the result in 2

• In mathematical notation

– $P(E) = \sum_{e \in E} 1 \,/\, \sum_{e \in S} 1$

– P(E) ≤ P(S) = 1

Example

Experiment: Pick a pax from the passenger database

Sample Space: 5 million total pax; 1 million pax are frequent fliers

Event: Passenger is a frequent flier

P(FreqFlyer) = 1MM/5MM = 0.20 or 20%

Note: P(S) = 1

Union and Intersection of Two Events

• Intersection

– The set of sample outcomes that are common to all of two or more events

– A ∩ B

– Typically identified by the word “and” as in A and B

• Union

– The set of sample outcomes that belong to at least one of two or more events

– A ∪ B = A + B - A ∩ B

– Typically identified by the word “or” as in A or B

• Probability of the Union of Two Events

– P(A ∪ B) = P(A) + P(B) - P(A ∩ B)

Example

Experiment: Pick a pax at random from the passenger database

Sample Space: 5 million total pax; 1 million pax are frequent fliers; 2.7 million pax are female; 600 thousand frequent fliers are female

Event: Passenger is a female or a frequent flier

P(Female) = 2.7MM/5MM = 54%

P(FreqFlyer) = 20%

P(Female and FreqFlyer) = 0.6MM/5MM = 12%

P(Female or FreqFlyer) = 54% + 20% - 12% = 62%

Conditional Probability

• Conditional probability is the probability of an event, A, given that a related event, B, has already occurred

• P(A|B) = P(A ∩ B)/P(B)

• Conditional probability effectively reduces the size of the sample space

Example

Experiment: Pick a passenger at random from the passenger database

Sample Space: 5 million total pax; 1 million pax are frequent fliers; 2.7 million pax are female; 600 thousand frequent fliers are female

Event: Passenger is a frequent flier given that the pax is female

P(Female and FreqFlyer) = 0.6MM/5MM = 12%

P(Female) = 54%

P(FreqFlyer | Female) = 0.12/0.54 = 22.2%
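The union and conditional probability examples can be reproduced from the passenger counts. A minimal sketch, using only the counts given on the slides; the variable names are illustrative.

```python
# Passenger counts from the slides.
TOTAL_PAX = 5_000_000
FREQ_FLYERS = 1_000_000
FEMALE = 2_700_000
FEMALE_FREQ_FLYERS = 600_000

p_ff = FREQ_FLYERS / TOTAL_PAX
p_female = FEMALE / TOTAL_PAX
p_both = FEMALE_FREQ_FLYERS / TOTAL_PAX            # intersection

p_either = p_female + p_ff - p_both                # union rule
p_ff_given_female = p_both / p_female              # conditional probability

print(f"P(Female or FreqFlyer) = {p_either:.0%}")
print(f"P(FreqFlyer | Female)  = {p_ff_given_female:.1%}")
```

Note that the conditional probability is the same as dividing 600 thousand by 2.7 million directly, which is the sense in which conditioning shrinks the sample space.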

Calculating Expected Value

• Expectation is a long-run weighted average

• Example

– What is the expected return from running a revenue integrity process?

– The process, which searches for duplicate bookings and expired TTLs, costs $0.10 for every record that the process scans.

– For each duplicate reservation found, we receive $100 in incremental revenue, and for each expired TTL, we receive $25.

– We know that the long-run probability that a reservation will have a dupe is P(D) = 0.15% and that a TTL is expired is P(TE) = 0.1%

– If we use the process to scan 100 K reservations, what is the expected return?

• This requires an expected value computation

E(R) = P(D) * $100 + P(TE) * $25 - $0.10

E(R) = 0.15% * $100 + 0.1% * $25 - $0.10 = $0.15 + $0.025 - $0.10 = $0.075

E(R per 100 K) = $0.075 * 100 K = $7,500
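The expected value formula can be evaluated directly. A minimal sketch using the probabilities, payoffs, and scan cost stated in the example; the constant and function names are illustrative.

```python
P_DUPE = 0.0015          # P(D) = 0.15%
P_TTL_EXPIRED = 0.001    # P(TE) = 0.1%
REVENUE_DUPE = 100.0     # dollars per duplicate found
REVENUE_TTL = 25.0       # dollars per expired TTL found
COST_PER_RECORD = 0.10   # dollars per record scanned


def expected_return_per_record() -> float:
    """Probability-weighted revenue minus the certain scanning cost."""
    return P_DUPE * REVENUE_DUPE + P_TTL_EXPIRED * REVENUE_TTL - COST_PER_RECORD


per_record = expected_return_per_record()   # 0.15 + 0.025 - 0.10
total = per_record * 100_000                # scan 100 K reservations

print(f"per record: ${per_record:.3f}, per 100 K: ${total:,.0f}")
```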

Random Variables

• Random Variables (RV) are characteristics or outcomes that vary from observation to observation

• Independence of two RVs

– Two RVs are independent if the outcome of one does not affect the outcome of the other => P(A|B) = P(A)

• Correlation of two random variables

– Two RVs are correlated if the knowledge of the outcome of one gives us an indication of the outcome of the other

– Positive: X moves with Y

– Negative: as X increases, Y decreases

Probability Distribution of a RV

Consider the unconstrained demand for a leisure class ticket on Flt 102:

Let’s compile the demand for each departure of Flt 102 for a full year and create a frequency histogram.

[Figure: frequency histogram of demand for Flt 102, with the demand axis running from 8 to 47]

Notice that the histogram is mound-shaped and approximates a familiar bell-shaped curve.

Probability Distribution of a Continuous RV

• The bell-shaped curve that we saw on the last slide is a normal density curve

• Using this chart, we could argue that demand for this flight is normally distributed

• Probability calculations with a normal distribution

– Example: What is the probability that demand will be less than or equal to 35 pax?

– First, we standardize the curve with a z-transform, z = (x - mean) / standard deviation, so that it has mean 0 and standard deviation 1 (the area under the curve is equal to 1)

– We then find the area under the curve that satisfies the definition of our event (the interval 0 to 35)

– The area under the curve from 0 to 35 = P(D ≤ 35) ≈ 0.84
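The area calculation can be sketched with the standard library's error function. The slides do not state the mean or standard deviation of demand, so the values below are hypothetical, chosen so that 35 pax sits one standard deviation above the mean, which is consistent with the 0.84 result.

```python
import math


def normal_cdf(x: float, mean: float, std: float) -> float:
    """P(D <= x) for a normal distribution, via the z-transform and erf."""
    z = (x - mean) / std
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


MEAN_DEMAND = 27.5   # hypothetical, not given in the slides
STD_DEMAND = 7.5     # hypothetical, not given in the slides

p_at_most_35 = normal_cdf(35, MEAN_DEMAND, STD_DEMAND)
print(f"P(D <= 35) = {p_at_most_35:.2f}")
```

This sketch ignores the small probability mass below zero; with these assumed parameters it is negligible.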

Calculating a Normal Probability

To find the probability, we find the interval on the horizontal axis and calculate the area under the curve corresponding to that interval

[Figure: normal density f(x) plotted for x from -5.0 to 5.0, with the shaded area A = P(D < X)]

More About Distributions

• Cumulative distribution

– In our demand example, we found a probability for a single value of x

– A cumulative distribution gives us the probability of D ≤ x for any value of x

[Figure: Cumulative Normal distribution, probability D <= x (0 to 1) plotted against x (0 to 50)]

Truncated Normal Distribution

[Figure: normal density f(x) plotted for x from -5.0 to 5.0]

Demand processes that are described by a normal distribution are truncated at 0, because demand cannot be negative

Some Common Distributions

• Poisson: used to describe the arrival pattern (or process) of people to a system

– For example, 5 people request a certain fare class product per hour

– Mean = variance

• Exponential: used to describe service times

– If a process generates Poisson arrivals, the times between arrivals are exponential

– Has special properties that make it a favorite for queueing systems (waiting lines) and reliability

• Uniform (a favorite, because it is easy): the probability is the same throughout

– Example U(10, 20): P(x > 15) = P(10 < x < 15) = 0.5
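Two of these properties are easy to check numerically. The sketch below sums the Poisson probability mass function for the slide's rate of 5 requests per hour to confirm that mean and variance coincide, and computes interval probabilities for U(10, 20). The function names are illustrative, and the Poisson sum is cut off at 100 terms, where the remaining mass is negligible.

```python
import math


def poisson_pmf(k: int, rate: float) -> float:
    """P(exactly k arrivals) when arrivals average `rate` per period."""
    return math.exp(-rate) * rate**k / math.factorial(k)


RATE = 5  # requests per hour, from the slide
ks = range(100)
mean = sum(k * poisson_pmf(k, RATE) for k in ks)
variance = sum((k - mean) ** 2 * poisson_pmf(k, RATE) for k in ks)


def uniform_prob(lo: float, hi: float, a: float, b: float) -> float:
    """P(a < x < b) for x ~ U(lo, hi): interval length over total length."""
    a, b = max(a, lo), min(b, hi)
    return max(0.0, b - a) / (hi - lo)


print(mean, variance)
print(uniform_prob(10, 20, 15, 20), uniform_prob(10, 20, 10, 15))
```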

Statistical Measures

• Central tendency

– Mean = average

– Median = middle value in a sorted list

– Mode = most frequent value, or the highest point of a probability density curve

• Error measures

– e = x - prediction of x

– MAD (MAE): Mean absolute deviation (error): $\sum |e| / n$

– MAPE: Mean absolute percentage error: $\sum (|e| / x) / n$
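The two error measures translate directly into code. A minimal sketch; the actual and predicted values are hypothetical, and MAPE assumes no actual value is zero.

```python
def mad(actuals, predictions):
    """Mean absolute deviation: average of |x - prediction of x|."""
    return sum(abs(x - p) for x, p in zip(actuals, predictions)) / len(actuals)


def mape(actuals, predictions):
    """Mean absolute percentage error: average of |e| / x (x must be nonzero)."""
    return sum(abs(x - p) / x for x, p in zip(actuals, predictions)) / len(actuals)


actuals = [100, 80, 50, 40]       # hypothetical observed demand
predictions = [90, 88, 50, 44]    # hypothetical forecasts

print(mad(actuals, predictions), mape(actuals, predictions))
```

MAD is in the units of the data (here, passengers), while MAPE is unit-free, which makes it easier to compare across flights of different sizes.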

More Measures

• Variance: Measures the dispersion (spread) of the observations

• Standard deviation: The square root of the variance

• Coefficient of variation: The standard deviation divided by the mean, stated as a percentage

• Skew: Measures the asymmetry of a distribution

– mean > median: Positive or right-skewed

– mean < median: Negative or left-skewed

• Correlation: Statistical measure of the relationship between two variables, ranging from -1 (perfect negative correlation) to 1 (perfect positive correlation)
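These measures can be computed with Python's `statistics` module. The demand sample below is hypothetical, and the population forms (`pvariance`, `pstdev`) are an assumption, since the slides do not say whether population or sample formulas are intended. Correlation is written out by hand so the formula is visible.

```python
import math
import statistics

demand = [8, 11, 13, 16, 19, 22, 25, 47]   # hypothetical observations

mean = statistics.mean(demand)
median = statistics.median(demand)
variance = statistics.pvariance(demand)     # population variance (assumed)
std_dev = statistics.pstdev(demand)         # square root of the variance
coeff_of_variation = std_dev / mean         # often quoted as a percentage
right_skewed = mean > median                # the slide's rule of thumb


def correlation(xs, ys):
    """Pearson correlation coefficient, between -1 and 1."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


print(mean, median, right_skewed, f"{coeff_of_variation:.0%}")
```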

Truncated Normal Distribution

• In RM, the tails of the normal demand curve typically extend beyond the capacity of the plane

• This is why we use unconstraining (detruncation) algorithms to approximate the tail of the curve

[Figure: normal demand density f(x) plotted for x from 0 to 200, with capacity marked at Cap = 125]

Operations Research

• “application of mathematical techniques, models, and tools to a problem within a system to yield the optimal solution”

• Phases of an OR Project

– formulate the problem

– develop math model to represent the system

– solve and derive solution from model

– test/validate model and solution

– establish controls over the solution

– put the solution to work

Linear Programs – A Major Tool of OR

• Linear Programs (LPs) are a special type of mathematical model where all relationships between parts of the system being modeled can be represented linearly (a straight line).

• Not always realistic, but we know how to solve LPs.

• May need to approximate a relationship that is slightly non-linear with a linear one.

• When to use: if a problem has too many dimensions and alternative solutions to evaluate all manually, use an LP to evaluate.

Linear Programs – A Major Tool of OR (continued)

• LPs can evaluate thousands or millions of different alternatives to find the one that best meets the objective of the business problem.

– Fleet Assignment Model - assign aircraft to flight legs to minimize cost and maximize revenue

– Revenue Management - set bid prices to maximize revenue and/or minimize spill

– Crew Scheduling - schedule crew members to minimize number of crew needed and maximize utilization

Linear Program Formulation

• Understand the system and environment to which the problem belongs

• Understand the problem and the objective to be achieved

• State the model - clear idea of problem and what can and cannot be included in the model

• Collect Data - get data/parameters/constraints and boundaries of system and interrelationships

• Determine decisions - define decision variables - what do we need the model to tell us?

• Formulate and solve model

Example: RM Network LP

• Problem - how many passengers of each itinerary and fare class should be accepted on each flight to achieve the maximum revenue for the flight network?

• Statement - the model should tell us the above

• Data - demand by itinerary/fare class, aircraft capacity, overbooking levels, expected revenue by itinerary/fare class

• Decisions - how many passengers of each itinerary/fare class to accept on each flight leg

Example: RM Network LP - Data Collection

• Two Flights: SFO-IAH, IAH-AUS

• Two fare classes: Y-high fare, Q-low fare

• Three itineraries: SFOIAH, IAHAUS, SFOAUS

• Flight capacity: SFO-IAH 124, IAH-AUS 94

• No overbooking

• Six fares:

Fares

Market   Y    Q
SFOIAH   400  300
IAHAUS   250  100
SFOAUS   450  320

Example: RM Network LP - Data Collection (continued)

Demand

Market   Y    Q
SFOIAH   30   90
IAHAUS   50   30
SFOAUS   20   50

Example: RM Network LP - Formulation Model

• Data Definition:

– F: set of flights = {SFOIAH, IAHAUS}

• f: index of F (1,2)

– CAP_f: capacity of flight f = {124, 94}

– I: set of itineraries = {SFOIAH, IAHAUS, SFOAUS}

• i: index of I (1,2,3)

– IF_f: set of itineraries over flight f

• IF_1 = {SFOIAH, SFOAUS}, IF_2 = {IAHAUS, SFOAUS}

– C: set of classes = {Y, Q}

• c: index of C (1,2)

– DMD_{i,c}: demand for itinerary i and class c

– FARE_{i,c}: fare for itinerary i and class c

Example: RM Network LP - Formulation Model (continued)

• Now that we have defined all the data that we know about the model, we must define what we want the model to tell us.

• Problem - how many passengers of each itinerary and fare class should be accepted on each flight to achieve the maximum revenue for the flight network?

• Define decision variables:

– X_{i,c}: # pax accepted for itinerary i and class c

• There are 3 itineraries and 2 classes so there are a total of 6 decision variables.

Example: RM Network LP - Formulation Model (continued)

• Many sets of values (collectively called solutions) for the six X_{i,c} variables exist which could satisfy the constraints (formulation coming) of aircraft capacity and maximum demand. These are “feasible” solutions.

• Which solution do we want?

• Problem - how many passengers of each itinerary and fare class should be accepted on each flight to achieve the maximum revenue for the flight network?

• The feasible solution that achieves this maximum revenue is the “optimal” solution.

Example: RM Network LP - Objective Function

$\text{MAX} \sum_{i} \sum_{c} FARE_{i,c} \, X_{i,c}$

Example: RM Network LP - Obj. Function & Constraints

• The Objective Function is an expression that defines the optimal solution, out of the many feasible solutions. We can either

– MAXimize - usually used with revenue or profit or

– MINimize - usually used with costs

• Feasible solutions must satisfy the constraints of the problem. LPs are used to allocate scarce resources in the best possible manner. Constraints define the scarcity.

• The scarcity in this problem involves a fixed number of seats and scarce high paying customers.

Example: RM Network LP - Constraints

Capacity Constraints: $\sum_{i \in IF_f} \sum_{c} X_{i,c} \le CAP_f$ for each f in F

Demand Constraints: $X_{i,c} \le DMD_{i,c}$ for each itinerary i, fare class c

Example: RM Network LP - Constraints (continued)

• Rules for Constraints

– must be a linear expression

– decision variables can be summed together but not multiplied or divided by each other

– have relational operators of =, <=, or >=

– decision variables must be continuous

• Constraints define the “feasible region” - all points within the feasible region satisfy the constraints.

• The feasible region is convex.

• The optimal solution lies at an extreme point of the feasible region.

Example: RM Network LP - Cplex Input File

MAX 400 X_SFOIAH_Y + 300 X_SFOIAH_Q + 250 X_IAHAUS_Y + 100 X_IAHAUS_Q + 450 X_SFOAUS_Y + 320 X_SFOAUS_Q
ST
CAPY_SFOIAH: X_SFOIAH_Y + X_SFOIAH_Q + X_SFOAUS_Y + X_SFOAUS_Q <= 124
CAPY_IAHAUS: X_SFOAUS_Y + X_SFOAUS_Q + X_IAHAUS_Y + X_IAHAUS_Q <= 94
DMD_SFOIAH_Y: X_SFOIAH_Y <= 30
DMD_SFOIAH_Q: X_SFOIAH_Q <= 90
DMD_IAHAUS_Y: X_IAHAUS_Y <= 50
DMD_IAHAUS_Q: X_IAHAUS_Q <= 30
DMD_SFOAUS_Y: X_SFOAUS_Y <= 20
DMD_SFOAUS_Q: X_SFOAUS_Q <= 50
END

Example: RM Network LP - Solution - Constraints

SECTION 1 - ROWS

NUMBER  ROW           AT  ACTIVITY  SLACK ACTIVITY  LOWER LIMIT  UPPER LIMIT  DUAL ACTIVITY
1       obj           BS  58100     -58100          NONE         NONE         1
2       CAPY_SFOIAH   UL  124       0               NONE         124          -300
3       CAPY_IAHAUS   UL  94        0               NONE         94           -100
4       DMD_SFOIAH_Y  UL  30        0               NONE         30           -100
5       DMD_SFOIAH_Q  BS  74        16              NONE         90           0
6       DMD_IAHAUS_Y  UL  50        0               NONE         50           -150
7       DMD_IAHAUS_Q  BS  24        6               NONE         30           0
8       DMD_SFOAUS_Y  UL  20        0               NONE         20           -50
9       DMD_SFOAUS_Q  BS  0         50              NONE         50           0

obj is the objective function value - total revenue from the small network of flights.

CAPY_SFOIAH and CAPY_IAHAUS are the capacity constraints. Both are at UL - upper limit with activities of 124 and 94, respectively (i.e. both flight legs are full).

Dual Activity on each capacity constraint is also known as the Shadow Price of the flight. The SP of SFOIAH is 300 and the SP of IAHAUS is 100. In RM terms, this means that the value of one more seat on SFOIAH is 300 and the value of one more seat on IAHAUS is 100. Alternately, 300 and 100 also define the lowest fare that should be accepted on each leg.

DMD_{SFOIAH,IAHAUS,SFOAUS}_{Y,Q} are the demand constraints. SFOIAH_Y, IAHAUS_Y, and SFOAUS_Y are at upper limit (i.e. accept all Y passengers). Reject some/all of Q.

Example: RM Network LP - Solution - Decision Variables

SECTION 2 - COLUMNS

NUMBER  COLUMN      AT  ACTIVITY  INPUT COST  LOWER LIMIT  UPPER LIMIT  REDUCED COST
10      X_SFOIAH_Y  BS  30        400         0            NONE         0
11      X_SFOIAH_Q  BS  74        300         0            NONE         0
12      X_IAHAUS_Y  BS  50        250         0            NONE         0
13      X_IAHAUS_Q  BS  24        100         0            NONE         0
14      X_SFOAUS_Y  BS  20        450         0            NONE         0
15      X_SFOAUS_Q  LL  0         320         0            NONE         -80

This section of the solution report shows the values for the decision variables at the optimal solution.

The LP tells us to accept 30 SFOIAH Y, 74 SFOIAH Q (reject 16), accept 50 IAHAUS Y, accept 24 IAHAUS Q (reject 6), accept 20 SFOAUS Y, and accept no SFOAUS Q.

Note that the LP cut off all SFOAUS Q booking requests because their fare of 320 is less than the sum of the shadow prices of the two flights (300+100 = 400 > 320).
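The reported solution can be sanity-checked without a solver. The sketch below uses only numbers from the data slides and the solution report. It confirms that the accepted passengers respect capacity and demand, recomputes the revenue, and checks that the dual values (shadow prices times right-hand sides) add up to the same objective, as LP duality requires at an optimum. The dictionary layout is illustrative.

```python
fares = {("SFOIAH", "Y"): 400, ("SFOIAH", "Q"): 300,
         ("IAHAUS", "Y"): 250, ("IAHAUS", "Q"): 100,
         ("SFOAUS", "Y"): 450, ("SFOAUS", "Q"): 320}
demand = {("SFOIAH", "Y"): 30, ("SFOIAH", "Q"): 90,
          ("IAHAUS", "Y"): 50, ("IAHAUS", "Q"): 30,
          ("SFOAUS", "Y"): 20, ("SFOAUS", "Q"): 50}
# Reported optimal activities of the decision variables.
accepted = {("SFOIAH", "Y"): 30, ("SFOIAH", "Q"): 74,
            ("IAHAUS", "Y"): 50, ("IAHAUS", "Q"): 24,
            ("SFOAUS", "Y"): 20, ("SFOAUS", "Q"): 0}
# Each leg: (itineraries that use it, capacity, reported shadow price).
legs = {"SFOIAH": (("SFOIAH", "SFOAUS"), 124, 300),
        "IAHAUS": (("IAHAUS", "SFOAUS"), 94, 100)}
# Nonzero dual values on the demand constraints (signs dropped).
demand_duals = {("SFOIAH", "Y"): 100, ("IAHAUS", "Y"): 150, ("SFOAUS", "Y"): 50}

revenue = sum(fares[k] * accepted[k] for k in fares)
load = {leg: sum(accepted[(i, c)] for i in itins for c in "YQ")
        for leg, (itins, _, _) in legs.items()}
feasible = (all(load[leg] <= cap for leg, (_, cap, _) in legs.items())
            and all(accepted[k] <= demand[k] for k in demand))
dual_value = (sum(cap * price for _, cap, price in legs.values())
              + sum(demand[k] * d for k, d in demand_duals.items()))

print(revenue, load, feasible, dual_value)
```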

Example: RM Network LP - Solving LPs

• A problem that sounds small, like our example, can balloon out into many decision variables and constraints.

• Computer software is available to solve linear programs.

• Cost of programs depends on size of problems to be solved.

• Excel has an Add-in to solve small LPs.

• CPLEX is state of the art, but more expensive.

• LPs with hundreds of thousands of rows and columns can be solved.

Example: RM Network LP - Solving LPs (continued)

• The first method of solving LPs, the SIMPLEX algorithm, was developed by George Dantzig in 1947, shortly after WWII. It is based on convexity theory and the fact that the optimal solution will occur at an extreme point of the solution space

• Newer state of the art algorithms move through the interior of the feasible region using gradient-based (steepest descent style) steps and are called “interior point” methods

• Interior point methods can be extremely fast (much faster than SIMPLEX) for certain structures of problems

Degeneracy

• When an LP has more than one way to reach the optimal objective function value (that is, several different solutions give the same optimal value), we say that the problem is “degenerate”

• LP solvers can detect degeneracy but only report one solution

• It would be nice to see all possible solutions

• Different solvers can “land” on different solutions of a degenerate problem, depending on solution strategy

• The RM Network problem is usually degenerate

Other Types of Linear Optimization

• MIP (Mixed Integer Programming)

– is similar to LP but at least one decision variable is required to take an integer value

– violates the LP rule that decision variables be continuous

– is solved by “branch and bound” - solving a series of LPs that fix the integer decision variables to various integer values and comparing the resulting objective function values

– is done in a smart way to avoid enumerating all possibilities

– is useful, since you cannot have 0.3 of an aircraft

Other Types of Linear Optimization (continued)

• Network problem

– is a special form of LP which turns out to be “naturally integer”

– can be solved faster than an LP, using a special network optimization algorithm

– is very restrictive on types of constraints that can be present in the problem

• Shortest Path

– finds the shortest path from the source (start) to sink (end) nodes, along connecting arcs, each having a cost associated with it

– is used in many applications
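As an illustration of the shortest path problem, here is a sketch of Dijkstra's algorithm, one standard method for arcs with non-negative costs. The slides do not name a specific algorithm, and the small network and its arc costs are hypothetical.

```python
import heapq


def shortest_path_cost(arcs, source, sink):
    """Cheapest total cost from source to sink, or None if unreachable.

    arcs maps each node to a list of (neighbor, cost) pairs, cost >= 0.
    """
    best = {source: 0}
    queue = [(0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == sink:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry; a cheaper route was already found
        for neighbor, arc_cost in arcs.get(node, []):
            new_cost = cost + arc_cost
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return None


# Hypothetical network: nodes are airports, arc costs are arbitrary units.
arcs = {
    "SFO": [("IAH", 4), ("DEN", 2)],
    "DEN": [("IAH", 1), ("AUS", 6)],
    "IAH": [("AUS", 1)],
}
print(shortest_path_cost(arcs, "SFO", "AUS"))
```

In this network the cheapest route runs SFO, DEN, IAH, AUS, which beats the route with fewer arcs.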

Other Optimization Models

• Quadratic Program

– has a quadratic objective function with linear constraints

– can be applied to revenue management, because it allows fare to rise with demand within a problem

• price(OD) = 50 + [5*numpax(OD)]

• max revenue = price * numpax

Other Optimization Models (continued)

• Non-linear Program (NLP)

– can have either non-linear objective function or non-linear constraints or both

– feasible region is generally not convex

– much more difficult to solve

– but it is worth our time to learn to solve them since the world is actually non-linear most of the time

– some non-linear programs can be solved with LPs or MIPs using piecewise linear functions

Deterministic versus Stochastic

• Two broad categories of optimization models exist

– deterministic

• parameters/data known with certainty

– stochastic

• parameters/data known only with uncertainty

• Deterministic models are easier to solve. Our RM LP is deterministic (we pretend we know the demand with certainty).

• Stochastic models are difficult to solve. In reality, we know only a distribution of our demand. We get around this in real life by re-optimizing.

Deterministic versus Stochastic (continued)

• Deterministic optimization ignores risk of being wrong about parameter/data estimates.

• No commercial software packages are currently available to do generalized, stochastic optimization.

Heuristics

• Definition - educated guess

• When you use a heuristic to solve a problem, you have a gut feeling that it is a pretty good solution, but cannot prove it mathematically

• You cannot prove that there is not a better solution out there

• To qualify as an “optimal” solution, there must be a mathematical proof to say that no better solution exists

Types of Heuristics

• “Greedy Algorithms” - also called “myopic” - nearsighted solutions.

• Example: in our RM Network LP, the greedy solution would be to take the highest fare passengers possible on each leg, without looking at the consequences of doing so on the connecting leg. So the greedy solution is to take the SFOAUS Q passengers at a fare of $320. But the optimal solution looks at displacement and says do not take any SFOAUS Q passengers.
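The gap between greedy and optimal can be made concrete with the example's data. The sketch below implements one plausible reading of the greedy rule, which is an assumption: fill requests in descending fare order, taking as many passengers as demand and remaining seats allow. The LP optimum of 58100 comes from the solution report earlier in the slides.

```python
fares = {("SFOIAH", "Y"): 400, ("SFOIAH", "Q"): 300,
         ("IAHAUS", "Y"): 250, ("IAHAUS", "Q"): 100,
         ("SFOAUS", "Y"): 450, ("SFOAUS", "Q"): 320}
demand = {("SFOIAH", "Y"): 30, ("SFOIAH", "Q"): 90,
          ("IAHAUS", "Y"): 50, ("IAHAUS", "Q"): 30,
          ("SFOAUS", "Y"): 20, ("SFOAUS", "Q"): 50}
capacity = {"SFOIAH": 124, "IAHAUS": 94}
legs_used = {"SFOIAH": ["SFOIAH"], "IAHAUS": ["IAHAUS"],
             "SFOAUS": ["SFOIAH", "IAHAUS"]}
LP_OPTIMAL_REVENUE = 58100  # from the LP solution report


def greedy_accept():
    """Accept highest fares first, ignoring displacement on other legs."""
    remaining = dict(capacity)
    accepted = {}
    for key in sorted(fares, key=fares.get, reverse=True):
        itinerary, _ = key
        seats = min([demand[key]] + [remaining[leg] for leg in legs_used[itinerary]])
        accepted[key] = seats
        for leg in legs_used[itinerary]:
            remaining[leg] -= seats
    return accepted


accepted = greedy_accept()
greedy_revenue = sum(fares[k] * accepted[k] for k in fares)
print(accepted[("SFOAUS", "Q")], greedy_revenue, LP_OPTIMAL_REVENUE)
```

Under this rule the greedy policy accepts all 50 SFOAUS Q passengers, who displace higher-value combinations of local passengers on both legs, so its revenue falls short of the LP optimum.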

Heuristics (continued)

• Combinatorial problems can grow exponentially when the number of decisions needed to be made grows linearly. Heuristics can be used in these cases to get a good solution in a reasonable amount of time.

• TSP - Travelling Salesman Problem is a good example of this.

• EMSR is a heuristic. It is provably optimal for two fare classes, but not more. However, it gives a good answer in a finite amount of time and takes probability into account.

Simulation

• When a problem is too complicated to be put into an LP or a solvable non-linear optimization, one way to study the problem is to simulate it under different conditions.

• PODS (Passenger O & D Simulation) is one example.

• Simulation can tell us something about the outcome of a set of parameters (e.g. total revenue, load factor), but does not point us in the direction of an improvement.

Forecasting

• Types of forecasting techniques used in RM:

– pick-up

– booking regression

– exponential smoothing

• Pick-up - adds average future bookings from historical observations to bookings on hand.

• Booking Regression - computes best fit for history of bookings on hand (independent) to final booked (dependent)

– final booked = a + b*(bookings on hand)
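Both techniques fit in a few lines. This is a sketch under stated assumptions: the booking history is hypothetical, and ordinary least squares is assumed as the "best fit" criterion.

```python
def pickup_forecast(bookings_on_hand, historical_pickups):
    """Bookings on hand plus the average historical pick-up."""
    return bookings_on_hand + sum(historical_pickups) / len(historical_pickups)


def fit_booking_regression(on_hand, final_booked):
    """Least-squares a, b in: final booked = a + b * (bookings on hand)."""
    n = len(on_hand)
    mean_x = sum(on_hand) / n
    mean_y = sum(final_booked) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(on_hand, final_booked))
    sxx = sum((x - mean_x) ** 2 for x in on_hand)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Hypothetical history for past departures at the same days-before-departure.
history_on_hand = [40, 50, 60, 70]
history_final = [70, 85, 100, 115]
pickups = [f - h for f, h in zip(history_final, history_on_hand)]

a, b = fit_booking_regression(history_on_hand, history_final)
print(pickup_forecast(55, pickups), a + b * 55)
```

Pick-up adds the same increment whatever the bookings on hand, while the regression lets the increment scale with bookings on hand through the slope b.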

Forecasting (continued)

• Exponential Smoothing - similar to pick-up except the average is weighted. The most recent historical observations are weighted most heavily, with weights decreasing for earlier observations.

– recursive relationship

• Avg Pick Up = a * (pick up_{t-1}) + a * (1-a) * (pick up_{t-2}) + a * (1-a)^2 * (pick up_{t-3}) + ...

• boils down to

– Avg Pick Up = a * (pick up_{t-1}) + (1-a) * (last fcst pick up)

• Problem is how to estimate a, where 0 <= a <= 1

• How much weight should be on the most recent observation?
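The recursive form is straightforward to implement. A minimal sketch: the pick-up observations, the starting forecast, and the smoothing weight `alpha` (the slides' a) are all hypothetical values chosen for illustration.

```python
def smoothed_pickup(pickups, alpha, initial_forecast):
    """Apply: new avg = alpha * latest pick-up + (1 - alpha) * last forecast.

    pickups are ordered oldest first; 0 <= alpha <= 1.
    """
    forecast = initial_forecast
    for observed in pickups:
        forecast = alpha * observed + (1 - alpha) * forecast
    return forecast


ALPHA = 0.5              # hypothetical weight on the most recent observation
pickups = [30, 40]       # hypothetical pick-ups, oldest first
INITIAL_FORECAST = 20    # hypothetical starting forecast

avg_pickup = smoothed_pickup(pickups, ALPHA, INITIAL_FORECAST)

# Unrolling the recursion shows the geometrically decaying weights.
unrolled = (ALPHA * 40
            + ALPHA * (1 - ALPHA) * 30
            + (1 - ALPHA) ** 2 * INITIAL_FORECAST)
print(avg_pickup, unrolled)
```

A larger `alpha` reacts faster to recent changes in booking behavior, while a smaller one gives a smoother and more stable forecast.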


Questions?