I've been working on something that, I think, ought to perform 30%-200% better than the current solutions, depending on the workload and how the measurement mixes values for average/typical/best-case/worst-case time and space. Some workloads might get more than 200% improvement, and I hope less than 30% will be rare.
The very first time I tried to measure the entire system, however, my measured result wasn't 30% improvement or 200%, it was a little over 15000%. Three zeroes. And I wasn't even trying to make a misleading benchmark — I just wanted to measure how well my code worked and I suppose I unconsciously concentrated on parts where the differences should show up clearly.
15000% certainly is a clear difference.
It's also totally meaningless. It measures something you'll never, ever do. However, it's a fantastically big number, I know that I was honest, and it has taught me that even if benchmark results are laughably unrealistic, it's not always because someone tried to brag or mislead. Maybe they are just myopic. Focused on the details of their work.
I've spoken harsh words about other people's benchmarks in the past. I don't think I will do that any more.
Update: And the best way to represent it is with a logarithmic bar graph. Most people do not really understand a log scale, but the difference still looks very large and so the meaning is preserved. The label on the y axis makes it formally correct.