bayes

This quarter I taught some introductory courses in Statistics & Market Research. In past years, the framework was: classical (=frequentist) statistics with SPSS as the tool.

This year, the framework was mixed: Frequentist versus Bayesian approach, and the choice of SPSS or JASP as a tool. My students study marketing and are new to statistics. I’d like to share 3 findings with you.

Finding 1: The students prefer JASP over SPSS. They find it much easier to use, although they miss a lot of functionality. On Bayesian versus Frequentist statistics, the picture is clear as well:

Finding 2: Students find the Bayesian way of thinking logical and attractive. This method can be summarized as:

  • start with a “best guess”, a.k.a. prior, based on previous research, experts’ views, a literature study, your gut feeling, industry reports, and so on.
  • gather new data
  • update the prior by means of the new information. This gives your new “best guess” a.k.a. posterior.
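
The three steps can be sketched in a few lines of Python. This is only a minimal illustration, assuming a Beta prior on a proportion (say, the share of customers who would buy a product) and yes/no survey data. The numbers and the function names `update_beta` and `beta_mean` are made up for the example.

```python
def update_beta(prior_a, prior_b, successes, failures):
    """Conjugate update: Beta(a, b) prior + binomial data -> Beta posterior."""
    return prior_a + successes, prior_b + failures


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Step 1: prior "best guess" (hypothetical): roughly 20% will buy,
# held about as firmly as if we had already asked 10 people.
prior_a, prior_b = 2.0, 8.0

# Step 2: new data (hypothetical survey): 12 of 40 respondents say yes.
successes, failures = 12, 28

# Step 3: update the prior with the data to get the posterior.
post_a, post_b = update_beta(prior_a, prior_b, successes, failures)

print("prior mean:    ", beta_mean(prior_a, prior_b))
print("posterior mean:", beta_mean(post_a, post_b))
```

The posterior mean (14/50 = 0.28) lands between the prior guess (0.20) and the sample proportion (0.30), which is exactly the "updating" idea.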

What appeals to them in this method?

  • the concept that a very outspoken (peaked) prior combined with rather unconvincing data (small sample, small effect size, data all over the place) means that data do not influence the posterior much. “The data is swamped by the prior” so to speak 😉
  • the ideas of “updating knowledge” and “building upon knowledge”
  • the Bayes Factor (“The data support Hypothesis A 80 times more strongly than Hypothesis B”) seems to fit better in a business practice where a manager makes an informed choice than a simple “significant/not significant” statement based on some arbitrary cut-off value (like 0.05)
  • the idea that posterior intervals can finally be interpreted the way a researcher wants: “I am …..% confident that the population mean is in this interval”
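
The "swamping" effect and the posterior interval can be made concrete with a small sketch. It assumes a normal prior on a population mean and normally distributed data with a known standard deviation, which is a textbook simplification. All numbers and the helper names `normal_posterior` and `credible_interval` are hypothetical.

```python
import math
from statistics import NormalDist


def normal_posterior(prior_mean, prior_sd, sample_mean, sigma, n):
    """Normal prior + normal data with known sigma -> normal posterior."""
    prior_precision = 1 / prior_sd**2
    data_precision = n / sigma**2
    post_precision = prior_precision + data_precision
    # The posterior mean is a precision-weighted average of prior and data.
    post_mean = (prior_precision * prior_mean
                 + data_precision * sample_mean) / post_precision
    return post_mean, math.sqrt(1 / post_precision)


def credible_interval(mean, sd, level=0.95):
    """Central posterior interval for a normal posterior."""
    tail = (1 - level) / 2
    dist = NormalDist(mean, sd)
    return dist.inv_cdf(tail), dist.inv_cdf(1 - tail)


# Unconvincing data (hypothetical): only 4 observations, noisy, mean of 60.
sample_mean, sigma, n = 60.0, 20.0, 4

# A very peaked prior around 50 versus a vague prior around 50.
peaked_mean, peaked_sd = normal_posterior(50.0, 1.0, sample_mean, sigma, n)
vague_mean, vague_sd = normal_posterior(50.0, 50.0, sample_mean, sigma, n)

print("peaked prior:", peaked_mean, credible_interval(peaked_mean, peaked_sd))
print("vague prior: ", vague_mean, credible_interval(vague_mean, vague_sd))
```

With the peaked prior the posterior mean barely moves away from 50. With the vague prior it ends up close to the sample mean of 60. In both cases the interval is a direct probability statement about the population mean, given the model and the prior.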

Finding 3: Teaching Bayesian and Frequentist thinking at the same time is very confusing to students. The main culprit, in my view, is the weird way of thinking that is ingrained in Frequentism: to learn something about p(Hypothesis|data), you look at p(data|Strawmanhypothesis).
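
To show the difference between the two quantities, here is a rough sketch with two point hypotheses about a conversion rate. The data, the two rates and the 50/50 prior odds are all assumptions chosen for illustration. Real Bayesian software typically puts a prior distribution on the parameter rather than comparing two fixed values.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Hypothetical data: 18 conversions out of 100 visitors.
n, k = 100, 18
p_strawman, p_alternative = 0.10, 0.20

# Frequentist route: p(data at least this extreme | straw man hypothesis).
p_value = sum(binom_pmf(i, n, p_strawman) for i in range(k, n + 1))

# Bayesian route: compare how well each hypothesis predicts the observed data.
bayes_factor = binom_pmf(k, n, p_alternative) / binom_pmf(k, n, p_strawman)

# Assumed prior odds of 1 (both hypotheses equally plausible beforehand).
prior_odds = 1.0
posterior_odds = bayes_factor * prior_odds
posterior_alternative = posterior_odds / (1 + posterior_odds)

print("p(data or more extreme | straw man):", p_value)
print("Bayes factor (alternative vs straw man):", bayes_factor)
print("p(alternative | data):", posterior_alternative)
```

The p-value only says how surprising the data are if the straw man is true. The last line is the statement students actually want: how plausible a hypothesis is after seeing the data.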

Besides, it’s only sometimes that you are interested in hypotheses. Many times you simply want to know the posterior distribution. Because Frequentism can’t give that, it offers two lame alternatives:
1. hypothesis testing, in a very weird way to boot
2. confidence intervals, which are interpreted by everybody and his grandmother as posterior density intervals. Which they are not. But 95% of textbook authors seem to have missed this clue train 😉
