Weighted Criteria Monte Carlo

In this last section, we will run a different kind of example to illustrate weighted averages in a Monte Carlo simulation in the tab of the workbook labeled “Weighted Decision.” The background story is that we are trying to decide where to go for a family vacation. We have three different options: camping, the beach, or a theme park. The dimensions that we evaluate our decision on are family time, affordability, fun for kids, and proximity. All of these are important to us. We have surveyed multiple people (you, your spouse, and your grandma who is coming with you) to get their preferences. These are rated on a scale of 1 to 9 with 9 being more highly favored. Each person would specify their individual preferences.

Figure 23.7: Individual trip preferences rated on four criteria.

Next, averages are calculated for each of these different dimensions for each location. In other words, the three tables created by our three raters are combined into one by averages. We also calculate standard deviation since both are needed for a normal inverse function. Once we have that, then we can create a normal inverse distribution of those dimensions for each of the different options. In this example, we are using NORM.INV function. You may come across the NORMINV function (without the period) that works with older versions of Excel in addition to the newer versions. We can use the average and the standard deviation tables as parameters in a new table that uses NORMINV so that the values will refresh and run individual scenarios in the eventual data table.

Figure 23.8: Averages, standard deviations, and normal inverse distributions of preferences.

As you can see in the “Norm Inverse” table, we have added up the dimensions in the bottom row, and the highest one is identified with white font. As we refresh it, most often the highest total is camping, with the occasional iteration of beach or theme park. This small table of inputs would work just fine for a Monte Carlo simulation if all of the dimensions were of equal importance and weighted the same. In real life, this is not often the case.

We can create weights for each of the dimensions and create the normal distribution based on those randomized weights. We can perform a pairwise comparison and specify the relative value of one to another. This table is completed and labeled “Pairwise Process.”

Figure 23.9: The pairwise process specifies the relative value of each criterion.

To create the table, each cell is given 100 points to relate the item to the left to the item above the cell. Across the diagonal, it would be comparing one dimension to itself, so we will put a 50 out of 100 in there. We would then fill in the other cells. For instance, affordability would be compared to family time. A 50 would mean they are equally important. If the number is more than 50, then the dimension to the left is more important than the dimension above. If the number is less than 50, then the dimension to the left is less important than the dimension above. You will also notice that we only have to select values for one side of the table. The values that mirror across the diagonal reflect these values subtracted from the full value of 100. For example, the proximity to family time is 40 in the top right and 60 (100-40) in the bottom left. In this case, proximity is more important than family time.

After we specify these weights, we calculate the ratio for those dimensions. A ratio is a comparison that puts a number to the relative value of two things. This is done with a simple formula in a new table titled “Pairwise Process (Ratios).”

Figure 23.10: Pairwise process ratios.

We take the matching cell (same relative location) and divide it by the result of 100 minus itself. The cell D49 represents the ratio of affordability to family time. The formula would be D49/(100-D49), which is 30/(100-30), or 0.43. So, affordability is just 43% of the importance of family time (family time is 2.33 times more important than affordability), whereas proximity seems to be pretty important relative to family time (1.5 times more important). Proximity also seems to be more valuable than affordability. Fun for kids is more important than affordability at a higher ratio than family time is (4 and 2.33).

After creating ratios from the pairwise process, we can move on to percentiles as seen in the table labeled “Norming (Percentiles).”

Figure 23.11: Normalized values.

On the right end, we have created averages of the importance and standard deviations of each of the dimensions. This is also done with a simple formula. This formula is the matching box in the table above divided by the Totals row of the table above in the same column. For the first cell, this looks like 1/3.93 to get 0.255 or 25.5%. After each cell has been filled, we move to the average column. The formula in that cell is the average of the numbers in the same row to the left. The standard deviation column, which is titled StDev, takes the same four numbers used for the average column and uses the standard deviation formula for calculation. Analyzing the averages shows that proximity is weighted the highest, followed by fun for kids, family time, and affordability. For this family vacation, it is clear that cost is of little consequence compared to the other three dimensions.

Now that we know the weights and standard deviations for the different dimensions, we can sum the normalized weights and ratings. We use the values from the original model labeled “Norm Inverse,” and we copy those down. These numbers were calculated from the original averages and standard deviations from the raters and are not weighted.

Figure 23.12: Normal inverse distribution of preferences.

To weight them, we need to multiply by the averages and standard deviations of our weighted dimensions. The column headed Random Avg. comes from a NORM.INV function with the weighted average and standard deviation as parameters. We did this to get kind of a random distribution around the weights of each of the dimensions in addition to the original randomization. Sometimes those randomized averages in that Random Avg. column do not add up to 100%, so we need to standardize that average with a proportion.

Figure 23.13: Random and standardized average of weights.

Finally, we compute a weighted total for each of the three options (camping, beach, and theme park). This is a sum product of the original numbers from the “Norm Inverse” table multiplied by the Standardized Avg. column. For example, under “Camping,” the original family time number is multiplied by the standardized average for family time and added to the product of the one for affordability, and so on. The total for each vacation option is the Wgt. Total row. The highest value of the three options is stored in the Max row, and then the percentage of best alternative means for each cell in the Wgt. Total row is divided by the cell in the Max row. This is a percentage for each of our options based on our individual preferences and ratings combined with the weighted aspect of the categories.

Figure 23.14: Weighted preferences for each vacation option.

Overall, you can see that our models can become pretty complex and that we can create multiple different distributions around different parameters according to their importance to us or our problem. After this, our Monte Carlo simulation is straightforward. All we do is directly refer to the cells in the Pct of Best row in the cells across the top of the table, as seen in row 0 of the data table. We will end up using the following calculations from the completed simulation as charts for further analysis: average percent of best alternative with 25 trials, number of wins in 25 tries, and a dancing line showing the percentages for each option in each trial and how they varied.

Finally, we can run our Monte Carlo simulation. We will not select the top row filled with words. The equations that we want copied down are in row 0, so our selection starts in row 0 and goes through 25. Then we go to Data, What-If Analysis, and Data Table. Does anything vary as we go across the row? No. Does anything vary as we go down the column? Yes. We will take those values and substitute them somewhere out of the way, such as $N$84, as we did in the “Monte Carlo (Advanced)” tab. Hit OK. Come over to our Formulas and hit Calculate Now.

Figure 23.15: Completed Monte Carlo simulation and associated visualizations.

In this run, camping won 15 separate times, theme park won 8, and beach won 3. The charts show us more interesting information about the runs. We can recalculate this a couple of times to see what happens. In almost every run of our simulation, camping is held out as the best alternative based on our individual preferences with weighted categories or dimensions. If we run it enough times, the beach or theme park will sometimes win. For further insight, we can either keep recalculating it or increase the number of trials. Increasing the number of trials gives us stability. In conclusion, it looks like we should choose camping as our next vacation.

I hope this lesson opens your eyes on how to perform Monte Carlo simulations within Excel. It relies on understanding various aspects (or features) of Excel, but, in and of itself, it is not that complicated. Although some of the models can become quite complicated, the actual running of the simulation itself is quite simple.