Probability Theory Graduation Thesis: Foreign-Language Translation
University of Jinan, Quancheng College: Graduation Thesis, Foreign-Language Source Material for Translation

Statistical hypothesis testing

Adriana Albu, Loredana Ungureanu
Politehnica University Timisoara, adrianaa@aut.utt.ro
Politehnica University Timisoara, loredanau@aut.utt.ro

Abstract In this article we present statistical hypothesis testing with reference to the Bayesian approach: the theory of testing and the testing process, the use and importance of hypothesis testing in the real world, and notes on conducting a successful test.

Key words Bayesian hypothesis testing; Bayesian inference; Test of significance

Introduction

A statistical hypothesis test is a method of making decisions using data, whether from a
controlled experiment or an observational study (not controlled). In statistics, a result is called statistically significant if it is unlikely to have occurred by chance alone, according to a pre-determined threshold probability, the significance level. The phrase "test of significance" was coined by Ronald Fisher: "Critical tests of this kind may be called tests of significance, and when such tests are available we may discover whether a second sample is or is not significantly different from the first." [1]

Hypothesis testing is sometimes called confirmatory data analysis, in contrast to exploratory data analysis. In frequency probability, these decisions are almost always made using null-hypothesis tests. These are tests that answer the question: "Assuming that the null hypothesis is true, what is the probability of observing a value for the test statistic that is at least as extreme as the value that was actually observed?" [2] More formally, they represent answers to the question, posed before undertaking an experiment, of what outcomes of the experiment would lead to rejection of the null hypothesis for a pre-specified probability of an incorrect rejection. One use of hypothesis testing is deciding whether experimental results contain enough information to cast doubt on conventional wisdom.

Statistical hypothesis testing is a key technique of frequentist statistical inference. The Bayesian approach to hypothesis testing is to base rejection of the hypothesis on the posterior probability. [3][4] Other approaches to reaching a decision based on data are available via decision theory and optimal decisions.

The critical region of a hypothesis test is the set of all outcomes which cause the null hypothesis to be rejected in favor of the alternative hypothesis. The critical region is usually denoted by the letter C.

One-sample tests are appropriate when a sample is being compared to the population from a hypothesis. The population characteristics are known from theory or are calculated from the population.

Two-sample tests are appropriate for comparing two samples, typically experimental and control samples from a scientifically controlled experiment.

Paired tests are appropriate for comparing two samples where it is impossible to control important variables. Rather than comparing two sets, members are paired between samples so the difference between the members becomes the sample. Typically the mean of the differences is then compared to zero.

Z-tests are appropriate for comparing means under stringent conditions regarding normality and a known standard deviation.

T-tests are appropriate for comparing means under relaxed conditions (less is assumed).

Tests of proportions are analogous to tests of means (the 50% proportion).

Chi-squared tests use the same calculations and the same probability distribution for different applications:
  Chi-squared tests for variance are used to determine whether a normal population has a specified variance. The null hypothesis is that it does.
  Chi-squared tests of independence are used for deciding whether two variables are associated or are independent. The variables are categorical rather than numeric. For example, such a test can be used to decide whether left-handedness is correlated with libertarian politics (or not). The null hypothesis is that the variables are independent. The numbers used in the calculation are the observed and expected frequencies of occurrence (from contingency tables).
  Chi-squared goodness of fit tests are used to determine the adequacy of curves fit to data. The null hypothesis is that the curve fit
is adequate. It is common to determine curve shapes to minimize the mean square error, so it is appropriate that the goodness-of-fit calculation sums the squared errors.

F-tests (analysis of variance, ANOVA) are commonly used when deciding whether groupings of data by category are meaningful. If the variance of test scores of the left-handed in a class is much smaller than the variance of the whole class, then it may be useful to study lefties as a group. The null hypothesis is that two variances are the same, so the proposed grouping is not meaningful.

The testing process

In the statistical literature, statistical hypothesis testing plays a fundamental role. The usual line of reasoning is as follows:

1. There is an initial research hypothesis of which the truth is unknown.
2. The first step is to state the relevant null and alternative hypotheses. This is important as mis-stating the hypotheses will muddy the rest of the process. Specifically, the null hypothesis allows attaching an attribute: it should be chosen in such a way that it allows us to conclude whether the alternative hypothesis can either be accepted or stays undecided as it was before the test. [9]
3. The second step is to consider the statistical assumptions being made about the sample in doing the test; for example, assumptions about the statistical independence or about the form of the distributions of the observations. This is equally important as invalid assumptions will mean that the results of the test are invalid.
4. Decide which test is appropriate, and state the relevant test statistic T.
5. Derive the distribution of the test statistic under the null hypothesis from the assumptions. In standard cases this will be a well-known result. For example, the test statistic may follow a Student's t distribution or a normal distribution.
6. Select a significance level (α), a probability threshold below which the null hypothesis will be rejected. Common values are 5% and 1%.
7. The distribution of the test statistic under the null hypothesis partitions the possible values of T into those for which the null hypothesis is rejected, the so-called critical region, and those for which it is not. The probability of the critical region is α.
8. Compute from the observations the observed value t_obs of the test statistic T.
9. Decide to either fail to reject the null hypothesis or reject it in favor of the alternative. The decision rule is to reject the null hypothesis H0 if the observed value t_obs is in the critical region, and to accept or fail to reject the hypothesis otherwise.

Use and Importance

Statistics are helpful in analyzing most collections of data. This is equally true of hypothesis testing, which can justify conclusions even when no scientific theory exists. Real world applications of hypothesis testing include:
  Testing whether more men than women suffer from nightmares
  Establishing authorship of documents
  Evaluating the effect of the full moon on behavior
  Determining the range at which a bat can detect an insect by echo
  Deciding whether hospital carpeting results in more infections
  Selecting the best means to stop smoking
  Checking whether bumper stickers reflect car owner behavior
  Testing the claims of handwriting analysts

Statistical hypothesis testing plays an important role in the whole of statistics and in statistical inference. For example, Lehmann (1992) in a review of the fundamental paper by Neyman and Pearson (1933) says: "Nevertheless, despite their shortcomings, the new paradigm formulated in the 1933 paper, and the many developments carried out within its framework continue to play a central role in both the theory and practice of statistics and can be expected to do so in the foreseeable future."

Significance testing has been the favored statistical tool in some experimental social sciences (over 90% of articles in the Journal of Applied Psychology during the early 1990s). Other fields have favored the estimation of parameters. Editors often consider significance as a criterion for the publication of scientific conclusions based on experiments with statistical results.

Cautions

The successful hypothesis test is associated with a probability and a type-I error rate. The conclusion might be wrong.

The conclusion of the test is only as solid as the sample upon which it is based. The design of the experiment is critical. A number of unexpected effects have been observed, including:
  The Clever Hans effect. A horse appeared to be capable of doing simple arithmetic.
  The Hawthorne effect. Industrial workers were more productive in better illumination, and most productive in worse.
  The Placebo effect. Pills with no medically active ingredients were remarkably effective.

A statistical analysis of misleading data produces misleading conclusions. The issue of data quality can be more subtle. In forecasting, for example, there is no agreement on a measure of forecast accuracy. In the absence of a consensus measurement, no decision based on measurements will be without controversy.

The book How to Lie with Statistics
is the most popular book on statistics ever published. It does not much consider hypothesis testing, but its cautions are applicable, including:
  Many claims are made on the basis of samples too small to convince.
  If a report does not mention sample size, be doubtful.

Hypothesis testing acts as a filter of statistical conclusions; only those results meeting a probability threshold are publishable. Economics also acts as a publication filter; only those results favorable to the aut
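
As a rough illustration of the testing process described earlier (state the hypotheses, choose a test statistic T, fix a significance level, find the critical region, compute the observed value, decide), the following Python sketch runs a two-sided one-sample z-test. It is only a sketch under stated assumptions: the function name one_sample_z_test, the sample values, the hypothesized mean mu0 and the known standard deviation sigma are all invented for this example and do not come from the article.

```python
from math import sqrt
from statistics import NormalDist, fmean


def one_sample_z_test(sample, mu0, sigma, alpha=0.05):
    """Two-sided one-sample z-test of H0: population mean == mu0.

    Assumes independent observations from a normal population whose
    standard deviation `sigma` is known (the z-test conditions).
    """
    n = len(sample)
    # Test statistic T: standardized distance of the sample mean from mu0.
    t_obs = (fmean(sample) - mu0) / (sigma / sqrt(n))
    # Under H0, T follows the standard normal distribution.
    null_dist = NormalDist(0.0, 1.0)
    # Critical region C = {|T| > c}, chosen so that P(C | H0) = alpha.
    critical_value = null_dist.inv_cdf(1.0 - alpha / 2.0)
    # p-value: probability under H0 of a statistic at least as extreme as t_obs.
    p_value = 2.0 * (1.0 - null_dist.cdf(abs(t_obs)))
    # Decision rule: reject H0 when t_obs falls in the critical region.
    reject = abs(t_obs) > critical_value
    return t_obs, critical_value, p_value, reject


# Hypothetical data, invented for illustration only.
sample = [5.1, 4.9, 5.6, 5.8, 5.3, 5.4, 5.7, 5.2]
t_obs, c, p, reject = one_sample_z_test(sample, mu0=5.0, sigma=0.4)
print(f"t_obs={t_obs:.3f}, critical value={c:.3f}, p={p:.4f}")
print("reject H0" if reject else "fail to reject H0")
```

Checking whether t_obs lies in the critical region and checking whether the p-value is below the significance level are two views of the same decision, which is why the function returns both. When the standard deviation is not known, a t-test would be the more appropriate choice, as noted in the list of test types.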