Proceedings of the Sixteenth Symposium on Explosives and Pyrotechnics, Essington, PA, April 1997.

M+kS Analysis of Various Threshold Tests

Barry T. Neyer
PerkinElmer Optoelectronics
Miamisburg, OH 45343-0529

Contact Address
Barry T. Neyer
PerkinElmer Optoelectronics
1100 Vanguard Blvd
Miamisburg, OH 45342
(937) 865-5586
(937) 865-5170 (Fax)
Barry.Neyer@PerkinElmer.com

Abstract

Recently R. H. Thompson [1996] performed simulations to study the performance of M+kS type analysis on the Bruceton test. He derived a set of k factors for the common all-fire levels and confidences for various sample sizes. This report performs a similar analysis on the Neyer D-Optimal, Bruceton, and Langlie tests. This effort shows that the M+kS analysis calculates reliable confidence values; as expected, the k factors are strongly dependent on the all-fire level, confidence level, and sample size. In addition, the k factor is also strongly dependent on the test method and on the ratio of the guessed standard deviation to the true population value for both the Bruceton and Langlie tests, but is essentially constant for the Neyer designs. This report also compares the variation of the M+kS all-fire levels obtained with each test method. The data show that there is much less variation in all-fire levels obtained with the Neyer designs than with the comparable Bruceton and Langlie methods. The implications of this effort for the testing of components are discussed.


Introduction

Sensitivity tests are often used to estimate the parameters associated with latent continuous variables that cannot be measured directly. For example, in testing the sensitivity of explosives to shock, each specimen is assumed to have a critical stress level, or threshold. Shocks larger than this level will always explode the specimen, but smaller shocks will not lead to explosion. Repeated testing of any one specimen is not possible because a stress insufficient to cause explosion will nevertheless generally damage the specimen. To measure the probability of response, specimens are tested at various stress levels and the response or lack of response is noted.

Explosive component designers are often interested in determining the all-fire level, usually defined as the level of shock necessary to cause 99.9% of the specimens to fire. (Some designers seek the 99.99% or 99.999% levels.) A distribution-independent method of estimating such a level would require at least one response and one non-response at the specified level. Since sample sizes are usually far smaller than the many thousands required to estimate these extreme levels, the experimenter generally relies on parametric methods. Parametric methods also allow the experimenter to characterize the population as a whole and to evaluate process variation. Parametric designs test specimens at several stress levels. The parameters of the population are most often estimated by maximum likelihood techniques.
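
As an illustration of such a maximum likelihood fit, the following sketch uses hypothetical go/no-go data and assumes a normal threshold distribution (an illustration only, not the software used for this study):

    import numpy as np
    from scipy.optimize import minimize
    from scipy.stats import norm

    def neg_log_likelihood(params, x, y):
        """Probit model: a specimen fires when its threshold is <= the stimulus."""
        mu, sigma = params
        if sigma <= 0:
            return np.inf
        p = norm.cdf((x - mu) / sigma)       # probability of fire at level x
        p = np.clip(p, 1e-12, 1 - 1e-12)     # guard the logarithms
        return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

    # Hypothetical stimulus levels and observed outcomes (1 = fire, 0 = no fire).
    x = np.array([900., 1000., 1100., 950., 1050., 1000., 1100., 900.])
    y = np.array([0, 1, 1, 0, 1, 0, 1, 0])

    fit = minimize(neg_log_likelihood, x0=[1000., 100.], args=(x, y),
                   method="Nelder-Mead")
    M, S = fit.x                             # maximum likelihood estimates
    print(f"M = {M:.1f}, S = {S:.1f}")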

Three different test methods are most commonly used in the explosive test community. The Bruceton Test (Dixon and Mood 1948) was created before the invention of digital computers to simplify the calculation of the parameters of the population. The Langlie Test [1965] was invented to provide a test less dependent on the initial guess of the parameters of the population. The Neyer D-Optimal Test (Neyer 1989, 1994) was designed to extract the maximum amount of information from each test item. A comparison of the various test methods has been performed by several authors (Edelman and Prairie 1966, Neyer 1994, Young and Easterling 1994).

Confidence Level Calculations

Unlike the estimation of the parameters discussed previously, for which maximum likelihood is the standard approach, there are a number of very different methods of estimating confidence intervals for the parameters. These analysis methods generally give very different estimates of the confidence intervals for the parameters of the distribution.

Simulation (Langlie 1965, Edelman and Prairie 1966, Neyer 1994) has shown that the variance of the estimates of the mean, M, and the standard deviation, S, is approximately proportional to the population variance, σ². The variance function method assumes the variances of M and S can be estimated by simple functions of the sample size, N, and the standard deviation. Because σ is not independently known, all methods base their estimates of confidence intervals on the estimate S. This function is generally dependent on both the test design (the type of test: Langlie, Bruceton, etc.) and the initial conditions. The functional dependence is most generally determined by simulation. Langlie [1965] estimated that the variation of the parameters for the Langlie test could be approximated by the equations:

(1)

(2)

More recently, Langlie [1988m] recommends the revised (more conservative) formula:

(3)

for sample sizes larger than 20. Confidence estimates for various percentiles of the distribution are computed by the formula:

(4)

where zα is the 100α percentile of the normal distribution. Several different software programs (Thompson 1987, Langlie 1988o, Neyer 1994la) use this method of obtaining confidence levels.
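
A sketch of the general shape of such variance-function computations (an illustration only; the constants c_M and c_S are assumptions standing in for the test-specific values that simulation would supply):

    \sigma_M^2 \approx c_M \, \frac{S^2}{N}, \qquad
    \sigma_S^2 \approx c_S \, \frac{S^2}{N}, \qquad
    X_\alpha = M + z_\alpha S + z_\gamma \sqrt{\sigma_M^2 + z_\alpha^2 \, \sigma_S^2},

where z_γ is the standard normal percentile for the desired confidence. The first two expressions play the role of Equations (1) through (3); the third plays the role of Equation (4).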

The asymptotic or Cramér-Rao method (Kendall and Stuart 1967) is used by programs such as ASENT (Mills 1980, Neyer 1994a). This method computes the curvature at the peak of the likelihood function; the variation of the parameters and the confidence intervals are deduced from this curvature estimate. The simple sum rule (Dixon and Mood 1948) for analyzing Bruceton tests is based on the asymptotic method. If the conditions for use of the sum rule are met, it yields estimates of the parameters and confidence intervals that are almost identical to the asymptotic values.
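
A minimal numerical sketch of the asymptotic method (the probit likelihood and data are the same hypothetical ones used earlier; the MLE point and the finite-difference step are assumed for illustration):

    import numpy as np
    from scipy.stats import norm

    def nll(params, x, y):
        mu, sigma = params
        p = np.clip(norm.cdf((x - mu) / sigma), 1e-12, 1 - 1e-12)
        return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

    def hessian(f, p0, eps=1.0):
        """Central-difference Hessian of f at the point p0."""
        n = len(p0)
        H = np.zeros((n, n))
        for i in range(n):
            for j in range(n):
                di = np.zeros(n); di[i] = eps
                dj = np.zeros(n); dj[j] = eps
                H[i, j] = (f(p0 + di + dj) - f(p0 + di - dj)
                           - f(p0 - di + dj) + f(p0 - di - dj)) / (4 * eps ** 2)
        return H

    x = np.array([900., 1000., 1100., 950., 1050., 1000., 1100., 900.])
    y = np.array([0, 1, 1, 0, 1, 0, 1, 0])
    mle = np.array([1000.0, 80.0])           # assumed MLE point, for illustration

    # Covariance of (M, S) ~ inverse of the observed information matrix.
    cov = np.linalg.inv(hessian(lambda p: nll(p, x, y), mle))
    print("Var(M) =", cov[0, 0], "  Var(S) =", cov[1, 1])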

The Likelihood Ratio Test Method (Neyer 1992) has been shown to produce reliable confidence interval estimates in all cases. This method calculates the ratio of the likelihood function evaluated at various points on a contour to the likelihood function evaluated at the peak. Simulation (Neyer 1991, 1992) has shown that the Likelihood Ratio Test produces more reliable confidence intervals than the asymptotic method for small to moderate sample sizes. The method is also independent of the test design, the initial guess of the population parameters, and the sample size. The MuSig software (Neyer 1994m) uses this analysis method.
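
A sketch of the likelihood ratio computation under the same hypothetical setup (the grid, confidence level, and two-parameter chi-square cutoff are illustrative choices, not the MuSig implementation):

    import numpy as np
    from scipy.stats import norm, chi2

    def nll(mu, sigma, x, y):
        p = np.clip(norm.cdf((x - mu) / sigma), 1e-12, 1 - 1e-12)
        return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

    x = np.array([900., 1000., 1100., 950., 1050., 1000., 1100., 900.])
    y = np.array([0, 1, 1, 0, 1, 0, 1, 0])

    # Locate the peak of the likelihood on a coarse grid of (mu, sigma) pairs.
    grid = [(m, s) for m in np.linspace(900., 1100., 81)
            for s in np.linspace(5., 300., 60)]
    nll_min = min(nll(m, s, x, y) for m, s in grid)

    # A point lies inside the 95% joint confidence region when twice its
    # log-likelihood drop from the peak is below the chi-square quantile.
    cutoff = chi2.ppf(0.95, df=2)
    region = [(m, s) for m, s in grid
              if 2.0 * (nll(m, s, x, y) - nll_min) <= cutoff]
    print(len(region), "grid points inside the region")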

M+kS Analysis

Most recently, Thompson [1996] has proposed a method of determining all-fire levels by an M+kS type analysis of the Bruceton test. He has derived a set of k factors for the common all-fire levels and confidences for various sample sizes. The procedure for generating these k factors is straightforward. A test is chosen with the design, sample size, and initial guess of the parameters fixed. A random number generator is used to supply a set of thresholds distributed according to a population with parameters μ and σ. A simulated test is performed and estimates of the mean, M, and the standard deviation, S, are computed. New sets of threshold values are chosen and the simulation is repeated a large number of times. A value of k is then chosen such that 95% of the estimates computed from M+kS are greater than the 99.9% value of the population. The preceding discussion describes how to compute the 99.9% level at 95% confidence; other probability or confidence values are computed in the same way.
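
A sketch of the k-factor derivation as just described (the M and S arrays here are placeholders; in the actual procedure they would come from simulating the chosen test design many times):

    import numpy as np
    from scipy.stats import norm

    mu, sigma = 1000.0, 100.0                 # population used in the simulation
    x999 = mu + norm.ppf(0.999) * sigma       # true 99.9% level of the population

    rng = np.random.default_rng(1)
    M = rng.normal(mu, 25.0, 10_000)          # placeholder estimates; real values
    S = np.abs(rng.normal(sigma, 20.0, 10_000))   # come from the simulated tests

    # The smallest k with M[i] + k*S[i] >= x999 is (x999 - M[i]) / S[i]; taking
    # its 95th percentile makes 95% of the M + k*S values exceed x999.
    k = np.percentile((x999 - M) / S, 95.0)
    print(f"k for the 99.9% level at 95% confidence: {k:.2f}")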

The various k factors can be made arbitrarily accurate by performing a sufficient number of repetitions of the simulation. Thus this method can give completely unbiased confidence levels. None of the analysis methods mentioned in the previous section is able to produce unbiased confidence levels. While the Likelihood Ratio Test method gives confidence levels that are close to the requested value for sample sizes of 25 or greater, the asymptotic analysis methods have been shown to require sample sizes larger than 100 (Neyer 1994, 1994h, 1996).

Because the M+kS analysis method is the only known unbiased analysis method, and it is easy to apply, one might think it is the best method. However, there are several difficulties. As the following sections will show, the k factors are not only a function of probability, confidence, and sample size, but are also a function of the test method used and of the relationship between the parameters of the population and the parameters assumed when conducting the test. Because the user does not know the parameters of the population (that is why the test is performed), simulation must be performed with a wide range of assumed parameters. If the resulting k factors are a strong function of the parameters of the population, the utility of this method is greatly diminished. Furthermore, when performing sensitivity tests it is quite possible to arrive at the result S = 0, especially when performing a Bruceton test (Neyer 1994). If these zero values are included, then many high confidence values would require an infinite k factor.

This paper reports a similar analysis for the Neyer D-Optimal and Langlie tests, and repeats the analysis for the Bruceton test. This effort shows that the M+kS analysis calculates reliable confidence values; as expected, the k factors are strongly dependent on the all-fire level, confidence level, and sample size. In addition, the k factor is also strongly dependent on the test method and on the ratio of the guessed standard deviation to the true population value for both the Bruceton and Langlie tests, but is essentially constant for the Neyer test. This report also compares the variation of the M+kS all-fire levels obtained with each test method. The data show that there is much less variation in all-fire levels obtained with the Neyer designs than with the comparable Bruceton and Langlie methods.

Simulation Design

The simulation was performed by picking a set of threshold values from a normal population, and feeding these values to each of the three test methods. Commercial versions of software were used for each of the three tests (Neyer 1994b, 1994l, 1994o).

Each test method requires initial guesses of the parameters of the population so that the test will be conducted efficiently. The Bruceton test requires an initial test level and a step size. Dixon and Mood [1948] and simulation (Edelman and Prairie 1966, Neyer 1989, 1994) show that picking the mean as the starting point and the standard deviation as the step size gives the best results. The Langlie test requires a lower and an upper test level. Langlie [1965] suggests that optimal results are obtained if these are chosen at μ ± 4σ. Simulation (Edelman and Prairie 1966, Neyer 1989, 1994) has confirmed these results. The Neyer test requires three values: lower and upper limits for the mean and a guess for the standard deviation. Optimal results are obtained when the limits are set at μ ± 4σ and the standard deviation guess is set to the true standard deviation.

Previous simulation (Edelman and Prairie 1966, Neyer 1989, 1991, 1994) has shown that the behavior of a test method depends critically on the relationship between the initial guesses and the population parameters. The dependence on the ratio of the guessed standard deviation to the true value (the σg/σ ratio) is especially strong. For this simulation each test was optimized for a population with a mean of 1000 and a standard deviation of 100. Thus the Bruceton tests had test parameters of 1000 and 100, the Langlie test had test parameters of 600 and 1400, and the Neyer test had test parameters of 600, 1400, and 100.
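
For concreteness, a minimal sketch of one simulated Bruceton test under these conditions (the up-down stepping rule is the standard one; drawing thresholds with σ = 50 while stepping with the guessed 100 produces the σg/σ = 2 case discussed below):

    import numpy as np

    def bruceton(thresholds, start=1000.0, step=100.0):
        """Up-down test: step down after each fire, up after each no-fire."""
        level, results = start, []
        for t in thresholds:
            fired = t <= level                # fires when threshold <= stimulus
            results.append((level, fired))
            level += -step if fired else step
        return results

    rng = np.random.default_rng(0)
    one_test = bruceton(rng.normal(1000.0, 50.0, size=50))  # sigma_g/sigma = 2 case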

To ensure that the simulation accurately reflects how sensitivity tests are actually used, it is necessary to perform simulations with the same σg/σ ratios that experimenters encounter in the field. Many labs claim that they can guess σg to within a factor of two of the true value. However, by observing the results of tests performed around the world, I do not believe this to be true. Over the past ten years I have examined sensitivity test data from scores of governmental and industrial laboratories around the world. In the great majority of cases I believe that the σg/σ ratio differs from 1 by more than a factor of two!

In almost every case the test parameters are chosen to be "nice numbers." The step size is 20 or 50, almost never a number like 36.5. Since there is a factor of 2 to 2.5 between nice numbers, the experimenter will be wrong by at least 50% on average just due to rounding. Observing the ratio between the σ estimate from the data and the σg implied by the test design leads me to believe that the test parameters were not optimized, or that the initial guess of σg was wrong by a large factor. Implied σg/σ ratios of more than 5 are not at all uncommon in the data sets that I have observed. Thus, it is imperative that the simulation be carried out for a large range of σg/σ ratios. For this study three test sets were used, with σg/σ ratios of 0.5, 1.0, and 2.0. Although these ratios do not span all the ratios found in the typical laboratory, they are wide enough to illustrate the performance of M+kS analysis.

The simulation was driven by software developed for NSWC Dahlgren. Each test method was optimized for a mean of 1000 and a standard deviation of 100. For each simulated test a set of random thresholds was created, with the population mean uniformly distributed over the range 1000 ± 100. The σg/σ ratios were 0.5, 1.0, or 2.0. Ten thousand samples of size 50 were chosen. The simulations with sample sizes smaller than 50 used the first thresholds from each set; thus, the results for a sample size of 40 represent the effect of adding 5 additional test samples to the original 35. The simulation was later repeated with sample sizes of 500. Because the random number sequence repeats only after 2^31 ≈ 2,000,000,000 iterations, the samples can be considered independent.

The simulation software used commercially available software (Neyer 1994m) to analyze the test sets and stored the individual M and S values and the population parameters in a data file. All tests were analyzed by the same analysis method: finding the maximum likelihood estimates of the parameters. Many experimenters still use the original Dixon and Mood [1948] prescription of simple sums for analyzing Bruceton tests. Where the simple sum rule is valid, both analysis methods yield the same results. Because the simple sum rule is an approximation of the maximum likelihood method, the more general maximum likelihood method should give superior results.

The data file was subsequently analyzed to determine the k factor necessary to produce an M+kS that was larger than the required percentile at the required confidence. Various values of percentile and confidence were analyzed.

To obtain confidence values for various percentiles and confidence levels, the user would follow this prescription:

  1. Analyze the data and arrive at estimates of the mean, M, and the standard deviation, S.
  2. Find the S bias, B, by reading the value from the bias graph for the appropriate test, test conditions, and sample size.
  3. Find the k factor by reading the value from the k factor graph for the appropriate test, test conditions, and sample size.
  4. To find the 99.9% probability value at 95% confidence, construct the value

(5)

The 3.09 value is the 99.9th percentile of the standard normal distribution. If other probability levels or other confidence levels are needed, the user would consult similar graphs for those values.
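
The normal percentiles involved are easy to verify directly; a quick check (using scipy, not the graphs described above):

    from scipy.stats import norm
    for p in (0.999, 0.9999, 0.99999):
        print(p, round(norm.ppf(p), 2))       # prints 3.09, 3.72, 4.26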

Results

One problem with the analysis is that the Bruceton test produces a large number of test results with a zero estimate of the standard deviation. There are two ways to account for these cases. One is to include the zero-S results. Because no finite k factor can make these M+kS values larger than the 99.9% percentile, these cases limit the maximum confidence that can be achieved. Alternatively, these cases can be ignored. This approach was adopted by Thompson [1996]. Throwing out such cases is unrealistic: very few experimenters throw out the results of a large test when they get an S value of zero. Instead, after a number of shots with no overlap, they typically change the step size and continue the test. Because there is no well-defined rule for when to change step sizes, it is impossible to conduct a simulation that matches the way a "typical" experimenter conducts the test. Ignoring these cases also introduces bias, especially when comparing the results of different test methods that have much different probabilities of yielding a zero S. The Bruceton test has the highest percentage of degenerate cases, especially when the σg/σ ratio is larger than 1.

Thus, this paper analyzes the data in two different ways. The data reported in this section were first analyzed keeping all cases where S > 0. Figure 1 through Figure 12 show the k factor, the variation of the 99.9% level at 95% confidence, and the bias as computed by the M+kS method. Also shown is the fraction of simulated cases where S = 0. As Figure 4 clearly shows, the Bruceton test has a much higher probability of yielding degenerate tests. The degenerate results are completely eliminated from the rest of the analysis. In the real world, the experimenter cannot simply discard those cases where S = 0. Instead the test must continue, possibly with a different step size in the case of the Bruceton test, until overlap occurs and the test yields a non-degenerate result.

There are three different curves on each chart: one shows the results when the thresholds were drawn from a population with a σ of 50, one where σ was 100, and one where σ was 200. From the figures it is readily apparent that the σg/σ ratio has a major effect on the k factor, the percentile variation, the Sigma bias, and the percentage of degenerate tests. The only exception is the Neyer D-Optimal test, whose results are essentially independent of the σg/σ ratio. As mentioned previously, analysis of many threshold tests conducted at laboratories across the country shows that errors in the σg/σ ratio of more than a factor of two are extremely common. Thus, great care must be used if this method is to be used to compute accurate all-fire levels for either the Bruceton or Langlie tests.

All the graphs should approach their limiting values, 0 for the k factor, the variation of the 99.9% level, and the fraction of degenerate cases, and 1 for the bias, as 1/N, where N is the sample size. The graphs all have a bottom scale of 1/(N-8). The "8" is used so that the graphs are closer to straight lines (see the figures for the Neyer tests); it represents the loss of information that is present when conducting any type of sensitivity test.

In addition to the individual data points, each graph also shows a straight-line fit. Inspection of the graphs indicates that only the Neyer D-Optimal test fits the straight-line functions well at smaller sample sizes. The Langlie test curves could all be fit to straight lines, but none of them has the correct asymptotic properties. The Langlie results lack the "right" asymptotic behavior because the test concentrates the test points too close to the mean as the sample size increases.

Figure 1: k Factor, Bruceton Test

Figure 2: Variation of Probability Level, Bruceton Test

Figure 3: Sigma Bias, Bruceton Test

Figure 4: Percent Degenerate, Bruceton Test

Figure 5: k Factor, Langlie Test

Figure 6: Variation of Probability Level, Langlie Test

Figure 7: Sigma Bias, Langlie Test

Figure 8: Percent Degenerate, Langlie Test

Figure 9: k Factor, Neyer D-Optimal Test

Figure 10: Variation of Probability Level, Neyer D-Optimal Test

Figure 11: Sigma Bias, Neyer D-Optimal Test

Figure 12: Percent Degenerate, Neyer D-Optimal Test

Truncated Results

The previous section showed the results when all non-degenerate tests were included. Because the percentage of degenerate tests differed so much among the test methods and σg/σ ratios, it is essentially impossible to compare the results of the different tests on that basis.

Both the Langlie and Neyer D-Optimal tests have almost no degenerate tests for reasonable sample sizes because their test levels can get arbitrarily close together. The Bruceton test, however, has test levels that are always a fixed distance apart. Thus, if the spread of the population is much smaller than expected, the Bruceton test will yield a degenerate result, while the Langlie and Neyer D-Optimal tests will yield small values of S. The Bruceton test thus has essentially zero probability of producing a standard deviation substantially smaller than the initial guess; instead it yields S = 0. Keeping the small values of S for the Langlie and Neyer D-Optimal tests in these cases forces their k factors to be larger, and it also increases the variation in probability levels and the Sigma bias. Thus, comparisons between test methods made on this basis are invalid.

However, if instead of removing only the cases where S = 0, all tests where S ≤ C (where C is a cutoff value) are removed, then it is possible to compare the results of the different tests. The analysis in the previous section was repeated, but all cases where S ≤ C were eliminated. Various values of C were studied; the case C = 40 yielded roughly similar numbers of ignored cases for all three test methods.
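
A sketch of this truncated analysis (placeholder M and S arrays again stand in for the simulated estimates):

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(2)
    mu, sigma = 1000.0, 100.0
    x999 = mu + norm.ppf(0.999) * sigma
    M = rng.normal(mu, 25.0, 10_000)          # placeholder estimates, as before
    S = np.abs(rng.normal(sigma, 20.0, 10_000))

    C = 40.0                                  # cutoff value used in this study
    keep = S > C                              # discard every test with S <= C
    k = np.percentile((x999 - M[keep]) / S[keep], 95.0)
    print(f"k = {k:.2f} after dropping {100 * (1 - keep.mean()):.1f}% of the tests")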

Ignoring all tests where S ≤ C does not change the results appreciably for the Bruceton tests, has a minor effect on the Langlie tests, and has a major effect on the Neyer D-Optimal tests. The curves for all three tests now look much the same, especially those for the Bruceton and Neyer D-Optimal tests. Thus, it appears that the Bruceton results of the previous section were so different precisely because so many of the tests were ignored in the analysis.

Because the k factors are so strongly dependent on the σg/σ ratio, it would be impossible in practice to use these curves to determine confidence levels. Rarely does an experimenter know σ before beginning the test. Moreover, even after completion of the test, knowledge of σ is limited. For example, even for sample sizes as large as 50, 5% of the tests yield an estimate S that is at least 40% larger than the correct σ and another 5% yield an estimate that is at least 40% smaller. Figure 1 shows that the k factors differ by a factor of two for a factor of two error in the σg/σ ratio. Thus, it is impossible, even for sample sizes as large as 50, to determine a reliable k factor to use in the analysis, and without a reliable k factor the M+kS method cannot provide reliable analysis. The only exception is the analysis of the Neyer D-Optimal or similar tests. Because these adaptive tests are relatively insensitive to the initial parameter guess (see Figure 9), there is almost no error in picking the correct k factor. Thus, the curve in Figure 9 and similar curves for other probability levels and confidence values could be used to provide reasonable analysis for the Neyer D-Optimal and similar tests.

Figure 13: k Factor, Bruceton Test, Ignore S ≤ 40

Figure 14: Variation of Probability Level, Bruceton Test, Ignore S ≤ 40

Figure 15: Sigma Bias, Bruceton Test, Ignore S ≤ 40

Figure 16: Percent Degenerate, Bruceton Test, Ignore S ≤ 40

Figure 17: k Factor, Langlie Test, Ignore S ≤ 40

Figure 18: Variation of Probability Level, Langlie Test, Ignore S ≤ 40

Figure 19: Sigma Bias, Langlie Test, Ignore S ≤ 40

Figure 20: Percent Degenerate, Langlie Test, Ignore S ≤ 40

Figure 21: k Factor, Neyer D-Optimal Test, Ignore S ≤ 40

Figure 22: Variation of Probability Level, Neyer D-Optimal Test, Ignore S ≤ 40

Figure 23: Sigma Bias, Neyer D-Optimal Test, Ignore S ≤ 40

Figure 24: Percent Degenerate, Neyer D-Optimal Test, Ignore S ≤ 40

 

Summary

Using M+kS analysis to arrive at reliable confidence levels for threshold tests requires accurate knowledge of the ratio between the standard deviation of the population, σ, and the guess used when conducting the test, σg. Because the experimenter rarely knows σ before testing has begun, and cannot determine σ with much accuracy even after completion of a test of reasonably large sample size, M+kS analysis cannot provide reliable results, except for tests, such as the Neyer D-Optimal, whose k factors are insensitive to the initial guess. The Likelihood Ratio Test (Neyer 1991, 1992, 1994m) does not suffer from the inherent limitations of the M+kS method: it can analyze the results of all sensitivity tests, even those where S = 0, and is relatively independent of the test method used and of the σg/σ ratio.

References

W. J. Dixon and A. M. Mood (1948), "A Method for Obtaining and Analyzing Sensitivity Data," Journal of the American Statistical Association, 43, pp. 109-126.

D. A. Edelman and R. R. Prairie (1966), "A Monte Carlo Evaluation of the Bruceton, Probit, and One-Shot Methods of Sensitivity Testing," Technical Report SC-RR-66-59, Sandia Corporation, Albuquerque, NM.

Maurice G. Kendall and Alan Stuart (1967), The Advanced Theory of Statistics, Volume 2, Second Edition, New York: Hafner Publishing Company.

H. J. Langlie (1965), "A Reliability Test Method for 'One-Shot' Items," Technical Report U-1792, Third Edition, Aeronutronic Division of Ford Motor Company, Newport Beach, CA.

H. J. Langlie (1988m), ONE_SHOT PAC Users Manual, CMOS Records, Balboa, California.

H. J. Langlie (1988o), ONE_SHOT PAC, Version 1.2, CMOS Records, Balboa, California.

B. E. Mills (1980), "Sensitivity Experiments: A One-Shot Experimental Design and the ASENT Computer Program," SAND80-8216, Sandia Laboratories, Albuquerque, New Mexico.

Barry T. Neyer (1989), "More Efficient Sensitivity Testing," Technical Report MLM-3609, EG&G Mound Applied Technologies, Miamisburg, OH.

Barry T. Neyer (1991) "Sensitivity Testing and Analysis," Proceedings of the 16th International Pyrotechnics Seminar, June 1991, Jönköping, Sweden.

Barry T. Neyer (1992) "An Analysis of Sensitivity Tests," Technical Report MLM-3736, EG&G Mound Applied Technologies, Miamisburg, Ohio.

Barry T. Neyer (1994) "A D-Optimality-Based Sensitivity Test," Technometrics, 36, 61-70.

Barry T. Neyer (1994h) "How to Learn More from Sensitivity Tests," Proceedings of the 15th Symposium on Explosives and Pyrotechnics, April 19-21, 1994, Essington, Pennsylvania.

Barry T. Neyer (1996) "More Efficient and Reliable Detonator Qualification Testing," Proceedings of the Technology Symposium for High Energy Switches and Electro-Explosive Systems.

Neyer Software (1994a), ASENT Program, Neyer Software, Cincinnati, Ohio.

Neyer Software (1994b), Bruceton Program, Neyer Software, Cincinnati, Ohio.

Neyer Software (1994l), Langlie Program, Neyer Software, Cincinnati, Ohio.

Neyer Software (1994la), LangAnal Program, Neyer Software, Cincinnati, Ohio.

Neyer Software (1994m), MuSig Program, Neyer Software, Cincinnati, Ohio.

Neyer Software (1994o), Optimal Program, Neyer Software, Cincinnati, Ohio.

Neyer Software (1995s), Simulate Program, Neyer Software, Cincinnati, Ohio.

Ramie H. Thompson (1987), L1SHOT, Franklin Research Institute, Philadelphia, Pennsylvania.

Ramie H. Thompson (1996), Explosives and Pyrotechnics, Vol. 29, Numbers 8-12, Franklin Applied Physics, Oaks, Pennsylvania.

L. J. Young and R. G. Easterling (1994), "Estimation of Extreme Quantiles Based on Sensitivity Tests: A Comparative Study," Technometrics, 36, pp. 48-60.