On the use of Bayesian statistics for an improved back analysis of settlement data

In a previous article I looked at the impact of noisy surveying data on the back analysis of a tunnel settlement trough. Obviously improved data would give a better fit, but often this is not possible or not in the control of the person undertaking the back analysis. In these circumstances it would be useful to have an understanding of the impact of noise on the back analysis. Best fit analyses report values such as R², but in practical terms this means very little when interpreting a particular back analysis.

One approach that I have looked at for giving richer results than a simple non-linear regression is the use of Bayesian probability. Interestingly, this approach can also take into account the assumed magnitude of the tolerance on the survey data.

In this post I will give a very high level overview of how this approach can be applied. I will not go into all of the detail because it requires some fairly extensive calculations, which is more than I want to describe in a simple blog post.

In the simplest version of the model I have used, the following approach has been applied:

  • Determine some specific scenarios for the settlement parameters that are being considered. In this case I have considered 9 different volume loss values and 9 different trough width K values to give 81 possible combinations of volume loss and trough width K.
  • Assign an initial probability to each of these scenarios (to sum to 100%), which is known as the prior. In this simple model I have assumed that each scenario has an equal probability of occurrence.
  • For each combination we cycle through each settlement point used in the back analysis. By assuming errors are normally distributed with the defined standard deviation of the surveying error (see note 1), we can determine the probability that a point could be on the theoretical settlement curve for the combination of settlement parameters being considered.
  • Based on all the probabilities that have been determined, we can use a Bayes table to combine them and define a final probability for each combination of volume loss and trough width K considered.
  • We can then aggregate the results across combinations. If, for instance, one of the volume losses considered is 1%, then we add up the probabilities for all the scenarios that have 1% volume loss to give the total probability that the volume loss is 1%.
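
As a rough illustration of the steps above, here is a minimal Python sketch. It is not the calculation sheet referred to in this post. It assumes the standard Gaussian trough, with trough width parameter i = K × depth and trough volume equal to volume loss × tunnel face area. The tunnel depth, diameter, survey point offsets, parameter ranges and the function names gaussian_trough and grid_posterior are all hypothetical, and the "measurements" are synthetic and noise free. Working with log likelihoods instead of multiplying the individual point probabilities directly is a choice made here for numerical stability.

```python
import numpy as np

def gaussian_trough(x, volume_loss, k, depth, diameter):
    """Settlement (m) at transverse offsets x (m) from the tunnel centreline."""
    i = k * depth                                    # trough width parameter
    v_s = volume_loss * np.pi * diameter**2 / 4.0    # trough volume per metre run
    s_max = v_s / (np.sqrt(2.0 * np.pi) * i)         # maximum settlement
    return s_max * np.exp(-x**2 / (2.0 * i**2))

def grid_posterior(x, measured, vl_values, k_values, sigma, depth, diameter):
    """Posterior probability of each (volume loss, K) combination on a grid."""
    log_like = np.empty((len(vl_values), len(k_values)))
    for a, vl in enumerate(vl_values):
        for b, k in enumerate(k_values):
            resid = measured - gaussian_trough(x, vl, k, depth, diameter)
            # Sum of log normal densities over all points; the constant term
            # is the same for every combination and cancels on normalising.
            log_like[a, b] = -0.5 * np.sum((resid / sigma) ** 2)
    prior = np.full(log_like.shape, 1.0 / log_like.size)   # equal priors
    unnorm = prior * np.exp(log_like - log_like.max())     # Bayes table column
    return unnorm / unnorm.sum()

# Hypothetical geometry and survey layout (assumptions, not from the post)
depth, diameter = 20.0, 6.0          # m
sigma = 0.0005                       # 0.5 mm error standard deviation, in m
x = np.linspace(-30.0, 30.0, 13)     # survey point offsets, m

# 9 x 9 = 81 combinations of volume loss and trough width K
vl_values = np.linspace(0.006, 0.014, 9)
k_values = np.linspace(0.3, 0.7, 9)

# Synthetic, noise-free "measurements" for a 1% volume loss and K = 0.5
measured = gaussian_trough(x, 0.01, 0.5, depth, diameter)

posterior = grid_posterior(x, measured, vl_values, k_values, sigma, depth, diameter)
p_volume_loss = posterior.sum(axis=1)   # aggregate over all K values
p_k = posterior.sum(axis=0)             # aggregate over all volume losses
```

Real survey data would replace the synthetic measured array, and the two aggregated arrays are the kind of distributions plotted for volume loss and trough width K.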

You can find an example of a calculation sheet for this type of calculation here. The complexity of the calculations is clear, which may be one reason why this approach is not typically adopted.

I have run this model for the two scenarios considered in the previous post, where I looked at non-linear regression. The following images give superimposed probability distributions for the 5 different settlement profile cases using the Bayesian model. The vertical lines are the non-linear regression values. When the model's distributions are compared with the non-linear regression values there are clearly some differences in the predictions. It appears that the Bayesian model may be less effective where survey accuracy is good and better where the survey accuracy is poor.

Probability distributions for volume loss for 5 test cases with an error standard deviation of 0.25mm

Probability distributions for trough width K for 5 test cases with an error standard deviation of 0.25mm

Probability distributions for volume loss for 5 test cases with an error standard deviation of 0.5mm

Probability distributions for trough width K for 5 test cases with an error standard deviation of 0.5mm

Whilst this approach gives some useful results in terms of predicting a probability distribution for the input parameters of a settlement curve, the results do not appear particularly strong when compared to non-linear regression. There are however some adjustments to the approach that could give some significant benefits. I will look at one of these in a future post.

1. Note that I'm using the term surveying error as a catch-all term for all the possible sources of error. We know from experience that other sources of error exist (including issues with the Gaussian trough), and for this approach these are assumed to be normally distributed.