Thanks, I have to admit I didn't think of that (I saw VOLUME commented out in your Dockerfile, but it didn't occur to me that you might have moved it to a run script).
My point remains that Docker benchmarks are especially easy to spin any way you want, since usually they are really benchmarks of the underlying system configuration (with some parts glued together by Docker, but most parts outside of its control).
I've updated the blog post to specify that, since it was obviously unclear.