Executive Summary
One of the primary virtues of using Monte Carlo analysis for evaluating a retirement plan is that it frames the conversation in terms of the probability of success and the risk of failure, rather than simply looking at how much wealth is left at the end of the plan. As a result, the focus of planning shifts from maximizing wealth to maximizing the likelihood of success and minimizing the risk of failure.
Yet the reality is that while "failure" from the Monte Carlo perspective means the client ran out of money before the end of the time horizon, in truth most clients will not simply continue to spend on an unsustainable path right to the bitter end. Instead, if the plan is clearly heading for ruin, clients begin to make adjustments. Some failures may be more severe than others, and consequently some plans may require more severe adjustments than others.
But the bottom line is that a "risk of failure" is probably better termed a "risk of adjustment" instead. However, when viewed from that perspective, it turns out that the plan with the lowest risk of adjustment may not be the ideal plan for the client to choose!
The inspiration for today's blog post is some analysis I did on the uses and applications of Monte Carlo for the February issue of The Kitces Report. In the process of digging into how we typically frame "probability of success" or "probability of failure" in the plan (naturally trying to maximize success and minimize failure), I realized that there is an inherent danger in doing so blindly, because of the simple fact that not all "failures" are the same, nor do they all require the same adjustments to get back on track.
For instance, imagine a 65-year-old client couple looking at two plan options. Plan A uses a relatively equity-centric portfolio and has a 90% probability of sustaining the plan for 30 years. Plan B uses a more conservative allocation; because it generates a lower average rate of return, it more frequently fails to keep pace with inflation-adjusted spending requirements, and consequently has only an 85% probability of success. Measured by probability of success alone, the path seems clear: Plan A fails only 10% of the time, while Plan B fails 15% of the time, so Plan A is the winner, as shown in the table below.
        Probability of Success
Plan A  90%
Plan B  85%
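The comparison above can be reproduced with a simple Monte Carlo sketch. This is a minimal illustration, not the article's actual model: the return and volatility figures, starting balance, and spending level are all hypothetical assumptions, chosen only to show the mechanics of estimating a probability of success.

```python
import numpy as np

def simulate_success(mean, stdev, n_sims=100_000, years=30,
                     start_balance=1_000_000, spending=45_000,
                     inflation=0.03, seed=0):
    """Estimate the probability that a portfolio sustains
    inflation-adjusted spending for the full time horizon."""
    rng = np.random.default_rng(seed)
    balance = np.full(n_sims, float(start_balance))
    alive = np.ones(n_sims, dtype=bool)
    for year in range(years):
        annual_return = rng.normal(mean, stdev, n_sims)
        withdrawal = spending * (1 + inflation) ** year
        balance = balance * (1 + annual_return) - withdrawal
        alive &= balance > 0  # a path that hits zero has failed
    return alive.mean()

# Hypothetical capital-market assumptions, not the article's actual inputs:
p_a = simulate_success(mean=0.08, stdev=0.15)   # equity-centric "Plan A"
p_b = simulate_success(mean=0.055, stdev=0.07)  # conservative "Plan B"
```

Varying the mean/volatility pair shows how a higher-return, higher-volatility allocation and a lower-return, lower-volatility one can land on different sides of a success-rate comparison.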
In a world where all "failures" are the same, choosing Plan A over Plan B would be a prudent decision. But the reality is that not all failures are the same, and not all failure scenarios require the same adjustment in order to get back on track. For instance, what happens if we look at one of the worst-case scenarios, such as a 2-standard-deviation result? Under a normal distribution, an end result 2 standard deviations below the mean or worse should only happen about 2.3% of the time (we'll round off and treat it as equivalent to the 2nd percentile), so this represents a relatively rare, but certainly not impossible, outcome.
So what happens when we look at the 2nd percentile of results under Plan A? It's pretty ugly. Due to the heavy exposure to equities, a severe bear market is a rather destructive event. In this very negative scenario, the hypothetical client actually runs out of money in year 20. In other words, while the client may succeed in 90% of the scenarios and only fail in 10% of them, a whopping 20% of the failures (2% out of the 10%) are rather catastrophic, as the client runs out of money a whole decade early! If that rare but possible adverse event happens early in the client's plan, getting back on track could require some very draconian cuts to the client's standard of living, since the client is trying to recover from what would otherwise be a 10-year shortfall in the plan. In this case, failure may be relatively uncommon, but if it does occur, it requires a big adjustment.
On the other hand, Plan B has far less exposure to equities and far less overall volatility. As a result, an unfavorable sequence can't be all that unfavorable, and a below average series of returns can't be all that far below average. Consequently, when we look at the 2nd percentile in this case, the client doesn't run out of money until year 27, which in turn means that the adjustments necessary to get back on track are relatively mild. In other words, while the odds are higher that this client will have to make some adjustments (i.e., it is a higher failure rate), the failures themselves are not very severe, and the adjustments necessary to mitigate tough scenarios are far more manageable.
        Probability of Success   2nd-Percentile Failure Year
Plan A  90%                      Year 20
Plan B  85%                      Year 27
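The "failure year" column can be extracted from the same kind of simulation by recording when each path first hits zero and then taking the 2nd percentile across paths. Again, a hedged sketch: the return, volatility, and spending inputs are hypothetical, not the figures behind the table above.

```python
import numpy as np

def failure_years(mean, stdev, n_sims=100_000, years=30,
                  start_balance=1_000_000, spending=45_000,
                  inflation=0.03, seed=0):
    """Return, for each simulated path, the year the money runs out
    (years + 1 for paths that survive the whole horizon)."""
    rng = np.random.default_rng(seed)
    balance = np.full(n_sims, float(start_balance))
    fail_year = np.full(n_sims, years + 1)
    for year in range(1, years + 1):
        annual_return = rng.normal(mean, stdev, n_sims)
        withdrawal = spending * (1 + inflation) ** (year - 1)
        balance = balance * (1 + annual_return) - withdrawal
        newly_broke = (balance <= 0) & (fail_year == years + 1)
        fail_year[newly_broke] = year
        balance = np.maximum(balance, 0)  # broke paths stay broke
    return fail_year

# 2nd-percentile outcome under hypothetical assumptions for each plan:
worst_a = np.percentile(failure_years(0.08, 0.15), 2)
worst_b = np.percentile(failure_years(0.055, 0.07), 2)
```

Comparing the percentile failure year, rather than only the overall success rate, is what surfaces the fast-failure vs. slow-failure distinction the article describes.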
Suddenly, the optimal retirement decision is far less clear. While Plan A has a higher probability of success at 90% (and thus "only" a 10% risk of adjustment), the consequences of those required adjustments can be very harsh, as the client must shore up a 10-year shortfall. Plan B may have a lower probability of success, and therefore a higher probability of adjustment (i.e., a "failure"), but the magnitude of the adjustment will be relatively minor, as the client only has to eke out an extra 3 years of retirement income over the 30-year time horizon even in a highly adverse scenario. In other words, it may be better to follow the plan that leads to a slow failure (which can be easily fixed with mid-course adjustments) than a fast failure, even if the slow-failure scenario is somewhat more likely to require at least some modest adjustments. On the other hand, as was recently highlighted on this blog, it appears that most clients tend to slow down their spending in later years anyway (at least, unless they have so much extra wealth that they just start giving it away when they no longer spend it), which means plans that cause slow failures and require slow, modest adjustments may simply move in sync with an aging client's lifestyle anyway.
The bottom line is that while the benefit of Monte Carlo is to focus on the probability of success and risk of failure (as opposed to just the average final wealth projected with zero-volatility straight-line growth), when we take a deeper look at what those failure scenarios look like, the reality emerges that the highest probability of success and lowest probability of failure may not always be the most desirable plan. Instead, we have to look at both the risk of adjustment, and the potential magnitude of adjustment, to get a clear picture of the risks and opportunities involved.
So what do you think? Is probability of success or risk of failure better framed as the risk of adjustment? Do you ever discuss the magnitude of changes that would be required with an adjustment, in addition to the probability? Would this help your clients make better retirement decisions? Would it change the path any of them are currently taking?
(Editor's Note: This article was featured in the Carnival of Retirement #9 on Sense to Save.)
Manish Malhotra says
I am glad to see this post after our email exchange on this topic of which plan is better. Hope to see comments from other advisors on this topic.
Dylan says
I prefer to frame it as a risk of adjustment. Moreover, I like to refer to the probability of overplanning and the inverse probability as underplanning. Since I'm not actually trying to predict a specific outcome, it seems to me that I'm really measuring the likelihood of overplanning vs. underplanning, the goal being to maintain an appropriate ratio between the two. I find that maintaining a plan that is 4X more likely to be overplanning than underplanning (80th percentile) does not involve severe adjustment, even when some "failing" iterations are catastrophic, as long as you "rebalance" that ratio about once a year.
Ben L. Jennings says
You’re absolutely right, Michael. I have long told clients that there are certain decisions where we should be concerned not only with the *probability* that we are _right_, but also the *consequences* if we are _wrong_! It’s not a new way to frame problems – Blaise Pascal said essentially the same thing about 400 years ago (he was referring to the problem of whether to believe in God, but the approach is widely applicable!). Some of our software tools are better at highlighting this issue than others are; I hope Money Guide Pro’s new version, for example, will have enhanced capability to evaluate these kinds of issues.
Wade Pfau says
Michael,
This issue of probability of failure vs. magnitude of failure is quite important! It will tend to lead toward a slightly lower stock allocation when focusing on magnitude of failure.
The February issue (by Joe Tomlinson) and the new March issue (by me, along with Michael Finke and Duncan Williams) of the Journal of Financial Planning both have articles about retirement income strategies that incorporate the magnitude of failure in addition to the probability of failure.
Alex says
Gentlemen,
Welcome to the world of multi-objective optimization. The problems you are discussing were, in the generic case, solved at least 50 years ago. In the case of two criteria, Reward and Risk, the solution is delivered by the Pareto frontier, a significant generalization of the well-known efficient frontier.
Want to consider the specific content of Reward and Risk for the case of retirement? Perfect. Just position the probability of a successful retirement as the Reward and some percentile of failure as the Risk. Now apply the standard multi-objective optimization technique and get the Pareto frontier.
Do not forget to apply the user's tolerance measure when selecting a particular point on the frontier.
This is a much more powerful solution than just comparing two alternatives with different Risk/Reward pairs.
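The dominance test behind a Pareto frontier can be sketched in a few lines. The (name, reward, risk) triples below are hypothetical, purely illustrative inputs, not output from any actual planning tool.

```python
def pareto_frontier(plans):
    """Keep only plans not dominated by another plan
    (higher reward is better, lower risk is better).

    plans: list of (name, reward, risk) tuples."""
    frontier = []
    for name, reward, risk in plans:
        dominated = any(
            (r2 >= reward and k2 <= risk) and (r2 > reward or k2 < risk)
            for _, r2, k2 in plans
        )
        if not dominated:
            frontier.append((name, reward, risk))
    return frontier

# Hypothetical plans: (name, probability of success, 2nd-percentile years short)
plans = [("A", 0.90, 10), ("B", 0.85, 3), ("C", 0.80, 8)]
# C is dominated by B (lower success AND larger shortfall), so the
# frontier retains only A and B; the A-vs-B choice is left to the client.
```

The frontier filters out plans that are strictly worse on both criteria, but choosing among the remaining points still requires some expression of the client's tolerance for each kind of risk.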
Run the free online Retrian's Retirement Glidepath Optimizer to see how the multi-objective optimization technique described above can be implemented in software for retirement planning. Note that the glide path optimization problem, usually considered extremely difficult, can be solved effectively in a very few seconds.
By the way, forget about Monte Carlo if you want to solve this type of optimization problems.
Alex,
I’m still getting up to speed on much of the math in this regard myself, but as I understand it you generally need some form of utility function to mathematically quantify how the investor weighs/optimizes around these tradeoffs.
Which in turn makes me wonder how much research we still need to do in investigating the “right” utility functions to come up with the right answers?
Just a random musing…
– Michael
Michael,
There is not even a hint of a utility function in my software. They are not needed. Just look at the graphical presentation of the Pareto frontier and in 10-20 seconds you will make a comfortable decision.
Yes, you can spend hours and days deriving personal utility functions, but the final improvement will be negligible. So I do not recommend such an approach. I can tell you a lot about utility functions, why they are widely used in theoretical research, and why I do not rely on them.
Just go to my site and I believe you can get something new. Do you know, for example, that by increasing the long-term retirement success probability you at the same time increase the probability of failure over the short term, i.e., a retirement duration that is 5-7 years shorter than the original long-term period? This is a very fundamental result, but I believe that only the two of us at this moment know about it. This is one of the keys to effective retirement solutions.
I made a comprehensive analysis related to retirement glidepath optimization. Part of it is published on my site. The industry has no commercial software that effectively solves this problem.
Early next week I'll post to one of the forums a long article about the current state of strategic asset allocation optimization software for retirement planning. I'll tweet you its URL.
Best regards,
Alex
Michael –
Yes yes! – It's not just the probability of success/failure, but also how far above or below the investor's projected future needs and goals the results may land.
But beware the forces of financial-mathematics double-talk, who will try to use this as another basis for mathematical obfuscation and prevention of understanding!
The purpose is dollar purchasing power for future needs and goals, which planners and clients can understand. But the obfuscators will try to divert us into measures of mathematical abstraction that real people do not understand.
For decades they’ve done that to us with misleadingly labeled diversions from the purpose such as utility functions that don’t address investment utility for the client, which of course is dollar purchasing power for her future needs and goals. Even today, this diversion is taught in our universities and centers of “fiduciary” training and credentials.
Now, on the very important matter of magnitude of shortfall, the mighty financial-mathematics obfuscation industry is striking again.
The responsible way to address the essential issue you raise is to INFORM the decision-makers – show them comparisons of the alternatives in probabilities FOR THE PURPOSE, WHICH THEY UNDERSTAND — dollar purchasing power for future needs and goals.
This can be done, and should be done, in this two-step inform-the-decision-makers process:
First, compare the conservative-to-aggressive range of portfolios in probability of meeting the dollar-purchasing-power goals. That's your first step, which in your case showed 90% vs. 85%.
Then, compare the best of these in the full probability distribution of purchasing-power dollar results, revealing how the portfolios compare in the likelihoods of how far above or below the goals the results may be. Real people, clients and planners, will see that with the 90%-probability portfolio, possible shortfalls carry a greater danger of falling further below the goal. They will see how different the portfolios are in this respect.
Clients and planners will be INFORMED to make the choice.
That’s our obligation, those of us who offer investment-selection methods and tools – to INFORM the folks who will make the decisions.
Don’t let the forces of financial-mathematics obfuscation divert us from our purpose again.
Dick Purcell
Michael,
In regards to safe withdrawal rate research and Monte Carlo simulations, have you developed a preference or acknowledgement of pros & cons? In your Monte Carlo simulation, is inflation static or derived via a normally distributed standard deviation? Do you prefer actual historical sequence of returns and inflation or randomly distributed returns with static inflation? Is the sample size of historical returns too small?
Or is it all semantics, since a Monte Carlo simulation with a very high probability of success usually has an initial withdrawal rate around 4% anyways? Do you have any general thoughts comparing safe withdrawal rate research and Monte Carlo simulations? http://www.kitces.com/blog/archives/387WhatReturnsAreSafeWithdrawalRatesREALLYBasedUpon.html
I look forward to seeing you in a couple days at FPA NorCal.