Here’s an example of Siegfried’s dopiness:
But in fact, there’s no logical basis for using a P value from a single
study to draw any conclusion. If the chance of a fluke is less than 5
percent, two possible conclusions remain: There is a real effect, or
the result is an improbable fluke. Fisher’s method offers no way to
know which is which. On the other hand, if a study finds no
statistically significant effect, that doesn’t prove anything, either.
Perhaps the effect doesn’t exist, or maybe the statistical test wasn’t
powerful enough to detect a small but real effect.
The p-value is a measure of the strength of evidence; it’s not a guarantee. If you’re uncomfortable with p-values in the 5% range, hold out for stronger evidence, say in the sub-1% range. Holding out for more powerful tests to find "small but real effects" is pretty much a mug’s game. A null result explained away as "the test wasn’t powerful enough" doesn’t convince dissertation committees or the FDA.
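To put rough numbers on the power question, here is a minimal sketch using the textbook normal approximation for a two-sided one-sample z-test. The function names, the 80% power target, and the effect sizes (in standard-deviation units) are my own illustrative assumptions, not anything from Siegfried’s article.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def power_two_sided_z(effect_size, n, alpha=0.05):
    """Approximate power of a two-sided one-sample z-test.

    effect_size is the true mean shift in standard-deviation units
    (a hypothetical input); n is the sample size.
    """
    z_crit = _STD.inv_cdf(1 - alpha / 2)
    shift = effect_size * math.sqrt(n)
    # Probability that the test statistic lands in either rejection tail.
    return _STD.cdf(shift - z_crit) + _STD.cdf(-shift - z_crit)


def n_for_power(effect_size, power=0.8, alpha=0.05):
    """Smallest n reaching the target power (ignores the far tail)."""
    z_crit = _STD.inv_cdf(1 - alpha / 2)
    z_pow = _STD.inv_cdf(power)
    return math.ceil(((z_crit + z_pow) / effect_size) ** 2)


if __name__ == "__main__":
    for d in (0.5, 0.1):
        for a in (0.05, 0.01):
            print(f"effect={d} alpha={a} n needed={n_for_power(d, alpha=a)}")
```

Under this approximation the required sample size scales as one over the squared effect size, so detecting a 0.1 standard-deviation effect takes about 25 times as many observations as a 0.5 standard-deviation effect. Tightening the threshold from 5% to 1% raises the requirement further.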