
Background

Designing, financing, building, commissioning and operating a PV power plant requires regular estimates of the plant’s energy yield.

It is not sufficient to predict a single value, like 4.8 GWh. That would only be accurate if one knew the future solar irradiance and weather experienced by the plant, and the future behaviour of the plant itself.

Rather than a single value, engineers therefore forecast the energy yield as a probability distribution, like in the image below. This distribution can incorporate the uncertainty in the weather, in the plant behaviour, and even in the decisions of the electricity regulator.

The probability distribution quantifies the likelihood of producing a particular energy yield. The wider the distribution, the less confident we are in our forecast.

A common way to quantify this uncertainty is with ‘P-values’, such as P90. As shown in the figure, P90 represents the energy yield that has a 90% probability of being exceeded.

[Figure: Output uncertainty]
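As a minimal sketch of this definition, the snippet below reads P-values off a set of sampled yields using only Python's standard library. The sample distribution (a Gaussian centred on 4.8 GWh with a 5% spread) and the helper name `p_value` are illustrative assumptions, not values from any real plant or from SunSolve P90.

```python
import random
import statistics

def p_value(yields, exceedance_pct):
    """Return the yield that is exceeded with probability exceedance_pct (%).

    exceedance_pct must be an integer from 1 to 99.
    P90 is the 10th percentile of the distribution; P50 is the median.
    """
    # quantiles(n=100) returns the 99 cut points for percentiles 1..99
    percentiles = statistics.quantiles(yields, n=100, method="inclusive")
    return percentiles[(100 - exceedance_pct) - 1]

# Hypothetical yearly yields in GWh (illustrative numbers only)
rng = random.Random(0)
yields = [rng.gauss(4.8, 0.24) for _ in range(20_000)]

p50 = p_value(yields, 50)
p90 = p_value(yields, 90)
print(f"P50 = {p50:.2f} GWh, P90 = {p90:.2f} GWh")
```

Note that P90 is read from the lower tail: a yield that is exceeded 90% of the time sits at the 10th percentile.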

All stakeholders care about P50 because it represents the best estimate of a plant’s energy yield, and many software tools determine P50, with PVsyst being the best known.

Most stakeholders, however, would also like a conservative estimate of the yield, such as P90 or P95. Conservative forecasts are particularly desirable for those who need to quantify and mitigate financial risk. For example, the financiers of a PV plant are concerned that there will be enough revenue to make the loan repayments during every year of operation, particularly in the early years. Thus, they are interested in the downside of the distribution, with P-values like P90, P95 and even P99. These are the yields that might arise in a year of poor weather or from an unreliable system; they might also arise in an average year for which the modeller has been overoptimistic.

On the other hand, an owner that operates a system over a long period will be interested in the upside and the downside, that is, in forecasts that account for the good and the bad years, as well as the good and the bad plants in their portfolio. Thus, they tend to be interested in all P-values.


The uncertainty in an energy-yield forecast can arise from many constituent sources of uncertainty. Some of the major sources of uncertainty relate to the solar radiation, to modelling accuracy, to soiling, degradation, availability and curtailment. The challenge for the yield forecaster is (i) to accurately quantify those major sources of uncertainty, and (ii) to combine them correctly.

The figure below illustrates how sources of uncertainty have their own probability distributions, and how these combine to give a yield with a probability distribution. If we can determine the yield uncertainty distribution, then we can determine its P-values.

We’ll now describe the two main ways to combine uncertainties: the simple sum-of-squares method and the Monte Carlo method. SunSolve P90 applies the latter.

[Figure: Output uncertainty]

A common way to combine uncertainties is the ‘sum-of-squares’ method. Its great advantage is simplicity, which allows it to be calculated quickly in a spreadsheet and determined independently of the P50 calculation. That is, whether the P50 was determined using PVsyst, SunSolve Yield or any other program, the yield uncertainty can be calculated afterwards, without repeating the P50 calculation.

The sum-of-squares method assumes that all significant sources of uncertainty follow a Gaussian distribution. And in its simplest application, the standard deviations of those distributions, $\sigma_1$, $\sigma_2$, …, $\sigma_n$, combine to give a Gaussian yield distribution with a standard deviation of

$$\sigma_Y = \sqrt{\sigma_1^2 + \sigma_2^2 + \cdots + \sigma_n^2}.$$
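The combination rule above is short enough to sketch in a few lines of Python. The source names and their standard deviations below are hypothetical values chosen for illustration (expressed as fractions of P50), and `combine_sum_of_squares` is a name introduced here, not part of any tool.

```python
import math

def combine_sum_of_squares(sigmas):
    """Combine independent Gaussian standard deviations in quadrature."""
    return math.sqrt(sum(s * s for s in sigmas))

# Hypothetical relative uncertainties (fractions of P50), for illustration only
sources = {
    "solar resource": 0.050,
    "modelling": 0.030,
    "soiling": 0.015,
    "degradation": 0.010,
}

sigma_y = combine_sum_of_squares(sources.values())
print(f"Combined yield uncertainty: {sigma_y:.2%}")
```

Because the terms are squared before summing, the largest source dominates the result; here the combined value is only modestly larger than the solar-resource term alone.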

The figure below illustrates that when the sources of uncertainty are approximated as Gaussian distributions, they combine to give a yield uncertainty that is Gaussian and with an easily calculated standard deviation.

[Figure: Output uncertainty]

This approach, however, contains three important assumptions that are not always justified. The first assumption is that all parameters with a significant uncertainty contribute to the yield as simple multipliers; i.e., the yield depends on those parameters as

$$Y = p_1 \times p_2 \times \cdots \times p_n.$$

In reality, however, some aspects of yield behaviour cannot be well represented as multipliers. For example, the yield depends on the module power $P_m$, which itself depends on the irradiance $\Phi$ and thermal conductance $U$ as roughly $P_m \propto U - C \cdot \Phi / U$ at a given ambient temperature, where $C$ is a constant (see §8 of equations). Thus, when the uncertainty in both $\Phi$ and $U$ is significant, the simple sum-of-squares approach introduces error. While it is possible to apply the sum-of-squares method to more complicated equations such as this, it is not trivial and rarely applied in the PV industry.[1]

The second assumption is that all significant sources of uncertainty are adequately represented by Gaussian distributions. In fact, many sources of uncertainty in a yield forecast have a notable asymmetry (and are therefore non-Gaussian). Examples of asymmetric uncertainties, such as availability and degradation, are discussed in the tutorials.

The third assumption is that the uncertainties are independent. In reality, some uncertainties are co-dependent. For example, if the solar irradiation is higher than expected, the air temperature is more likely to be hotter than expected. And, if the albedo is higher than expected, then snow-shading might be higher than expected. We stress here, however, that SunSolve P90 currently doesn’t account for co-dependent uncertainties.[2]

Whether or not those assumptions are justified depends upon the situation, but we can avoid making them by applying the Monte Carlo method.

The Monte Carlo method involves simulating the yield multiple times, where in each simulation, the inputs are randomly determined from their probability distributions.

In the figure below, we illustrate how the probability distribution of the yield is determined from its sources, and that as the number of simulations increases, the calculated distribution converges on its true distribution.

For energy-yield forecasts, the calculated distribution tends to be sufficiently accurate to determine P90 and P95 once about 1000 to 10,000 simulations have been performed. The error that arises from not running infinitely many simulations is referred to as ‘stochastic error’.
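The procedure described above can be sketched with a toy yield model. Everything specific in this example is an assumption made for illustration: the nominal yield of 4.8 GWh, the three sources of uncertainty, their distributions and parameters, and the function names. It is not SunSolve P90's algorithm. It does show asymmetric inputs being sampled directly, and the P-values settling as the number of simulations grows.

```python
import random
import statistics

def simulate_yield(rng):
    """One Monte Carlo draw of the yearly yield (GWh) from a toy model."""
    irradiation = rng.gauss(1.0, 0.04)               # symmetric, Gaussian
    soiling_loss = rng.triangular(0.00, 0.08, 0.02)  # asymmetric: low, high, mode
    availability = 1.0 - rng.expovariate(1 / 0.01)   # asymmetric: mean loss of 1%
    return 4.8 * irradiation * (1.0 - soiling_loss) * max(availability, 0.0)

def p_value(yields, exceedance_pct):
    """Yield exceeded with probability exceedance_pct (%), an integer 1..99."""
    cuts = statistics.quantiles(yields, n=100, method="inclusive")
    return cuts[(100 - exceedance_pct) - 1]

rng = random.Random(1)
results = {}
for n in (100, 1_000, 10_000):
    yields = [simulate_yield(rng) for _ in range(n)]
    results[n] = (p_value(yields, 50), p_value(yields, 90))
    print(f"n={n:>6}: P50={results[n][0]:.3f} GWh, P90={results[n][1]:.3f} GWh")
```

Re-running with a different seed changes the small-`n` results noticeably more than the large-`n` results, which is the stochastic error in action.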

The Monte Carlo method has been applied to yield forecasts in many academic publications [3][4], but it is rarely used in commercial yield forecasts. A major reason for this is that thousands of simulations are required to determine P90 and P95 with sufficient precision, and that makes it impractical to apply software like PVsyst (which takes 0.1 to 1 minute to solve) or more sophisticated programs like SunSolve Yield (which takes 1 to 10 minutes to solve).

SunSolve P90 has been designed to make this calculation much faster. Its approach is equivalent to running about 50 to 1000 PVsyst simulations per second.

Thus, Monte Carlo yield analyses can be sufficiently fast that they can be applied routinely.
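To put those figures in perspective, here is a rough worked estimate for a Monte Carlo run of 10,000 simulations, using only the solve times and throughput quoted above. The run size is an assumption taken from the upper end of the range mentioned earlier, and the labels $t_\text{conventional}$ and $t_\text{fast}$ are introduced here for convenience.

$$
\begin{aligned}
t_\text{conventional} &\approx 10\,000 \times (0.1\ \text{to}\ 1)\ \text{min} = 1000\ \text{to}\ 10\,000\ \text{min} \approx 17\ \text{to}\ 167\ \text{hours}, \\
t_\text{fast} &\approx \frac{10\,000}{(50\ \text{to}\ 1000)\ \text{s}^{-1}} = 10\ \text{to}\ 200\ \text{s}.
\end{aligned}
$$

On these numbers, a run that would occupy a conventional solver for most of a day or longer completes in seconds to a few minutes.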

[Figure: Output uncertainty]

Determining P-values from a Gaussian yield distribution


We make a brief aside to note that when a yield distribution is Gaussian, its 95% confidence interval is the range $\pm 1.960 \cdot \sigma_Y$, and its P-values are easily determined from the P50 and $\sigma_Y$ as

$$\text{P-value} = P_{50} \times (1 + z \cdot \sigma_Y),$$

where $\sigma_Y$ is expressed as a fraction of P50 and the z-values are given in the table below.

| P-value | z-value | P-value | z-value |
|---------|---------|---------|---------|
| P99     | -2.326  | P1      | +2.326  |
| P95     | -1.645  | P5      | +1.645  |
| P90     | -1.282  | P10     | +1.282  |
| P75     | -0.674  | P25     | +0.674  |
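The z-values in the table are quantiles of the standard normal distribution, so they can be reproduced with Python's standard library rather than looked up. The sketch below assumes $\sigma_Y$ is a relative standard deviation (a fraction of P50); the P50 of 4.8 GWh, the 6% value and the function names are hypothetical.

```python
from statistics import NormalDist

def z_value(exceedance_pct):
    """z-value for a P-value, e.g. P90 maps to the 10th percentile."""
    return NormalDist().inv_cdf(1.0 - exceedance_pct / 100.0)

def gaussian_p_value(p50, sigma_y, exceedance_pct):
    """P-value = P50 * (1 + z * sigma_Y), with sigma_Y as a fraction of P50."""
    return p50 * (1.0 + z_value(exceedance_pct) * sigma_y)

# Hypothetical forecast: P50 of 4.8 GWh with a 6% relative standard deviation
p50, sigma_y = 4.8, 0.06
for pct in (99, 95, 90, 75):
    y = gaussian_p_value(p50, sigma_y, pct)
    print(f"P{pct}: z = {z_value(pct):+.3f}, yield = {y:.2f} GWh")
```

This shortcut is only as good as the Gaussian assumption; for asymmetric yield distributions the P-values have to be read from the distribution itself.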

Some uncertainties are significant and others are negligible. Take, for instance, the contributions of efficiency and area when computing the power of a module. Say the module’s STC efficiency is $\eta = (20.0 \pm 0.8)\%$ and its area is $A = (2.00 \pm 0.01)$ m², amounting to relative errors of 4% and 0.5%, respectively. At first glance, 0.5% does not seem negligible compared to 4%, but recall that, when they’re independent and Gaussian, errors combine as the sum-of-squares. Hence, since the STC module power is $P = \eta \cdot A \cdot 1000$ W/m², its uncertainties combine to give a relative percentage error of

$$\epsilon_P = \sqrt{4^2 + 0.5^2} = 4.03.$$

Thus, in this example, the uncertainty in the module area adds just 0.03 percentage points to the uncertainty in the module power. That is, the module power is (400.0 ± 16.1) W when one includes uncertainty in the area, or (400.0 ± 16.0) W when one excludes it. Put otherwise, the smaller error appeared significant when it was 1/8 of the larger error, but because errors add as squares, its contribution to the combined variance is only 1/64 that of the larger error. For most purposes, this additional error would be negligible.

The significance of this is that the yield forecaster can omit negligible sources of error. Of course, one must be careful. If, in the same example, there were 30 parameters with a relative uncertainty of 0.5%, then the resulting percentage error in the yield would be

$$\epsilon_P = \sqrt{4^2 + 30 \times 0.5^2} = 4.85,$$

and now, although each error is small in itself, combined they make a significant contribution to the total error. To be clear, whatever uncertainty approach a yield forecaster applies, they must have a good knowledge of the sources of uncertainty and of their yield algorithm.
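The two results in this example are easy to check numerically. The short script below reproduces them from the stated efficiency and area uncertainties; the function and variable names are introduced here for illustration.

```python
import math

def combined_relative_error(errors_pct):
    """Root-sum-of-squares of independent Gaussian relative errors (in %)."""
    return math.sqrt(sum(e * e for e in errors_pct))

efficiency_err = 0.8 / 20.0 * 100   # 4 % relative error in efficiency
area_err = 0.01 / 2.00 * 100        # 0.5 % relative error in area

one_small = combined_relative_error([efficiency_err, area_err])
thirty_small = combined_relative_error([efficiency_err] + [area_err] * 30)

print(f"4% with one 0.5% source:     {one_small:.2f}%")
print(f"4% with thirty 0.5% sources: {thirty_small:.2f}%")
```

Varying the count of small sources is a quick way to find the point at which they stop being collectively negligible.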

Answering uncertainty questions with SunSolve P90


By solving the yield with a similar approach to PVsyst, but about 1000 times faster, SunSolve P90 permits a forecaster to routinely apply the Monte Carlo method to their energy-yield analyses. This allows one to introduce and remove sources of uncertainty to quantify their impact on the P-values. This is particularly valuable when the uncertainty distributions are asymmetric and when the parameters are not simple multipliers. For example, take the uncertainty in the wind speed and its impact on yield. If its uncertainty is halved by better metrology, what impact will that have on the yield uncertainty? We describe examples such as this in our tutorials.

  1. The general equation for the error in $Y(p_1, p_2, \cdots, p_n)$ when all errors are Gaussian and independent is $\epsilon_Y = \sqrt{\sum_{i=1}^{n} \left[ \left| \frac{\partial Y}{\partial p_i} \right| \cdot \epsilon_{p_i} \right]^2}$.

  2. This might involve having an uncertainty with a standard deviation $\sigma_1$ that depends on another parameter $p_2$; e.g., $\sigma_1 = 0.02 \times (1 - p_2)$. SunSolve P90 is written such that extensions such as this are possible. Write to us if you’d like that option.

  3. Thevenard, D. and Pelland, S., 2013. ‘Estimating the uncertainty in long-term photovoltaic yield predictions,’ Solar Energy, 91, pp. 432-445.

  4. Prilliman, M.J., Hansen, C.W., Keith, J.M.F., Janzou, S., Theristis, M., Scheiner, A. and Ozakyol, E., 2023. ‘Quantifying Uncertainty in PV Energy Estimates Final Report,’ NREL report.