It can be shown that the simple pre-training task of predicting which caption goes with which image is an efficient and scalable way to learn SOTA image representations from scratch on a dataset of 400
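The task of matching captions to pictures described above can be illustrated with a small sketch. The text names only the task, not the loss, so everything here is an assumption: the symmetric cross-entropy over cosine similarities, the temperature value of 0.07, and the hypothetical function names `caption_matching_loss` and `predict_caption`. The embeddings are taken as given arrays; no encoder is modelled.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def caption_matching_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric cross-entropy for a batch of N paired embeddings (assumed loss).

    Row i of image_emb is assumed to belong with row i of text_emb.
    The temperature default is an illustrative choice, not from the source.
    """
    img = l2_normalize(image_emb)
    txt = l2_normalize(text_emb)
    logits = img @ txt.T / temperature  # shape (N, N): image i vs. caption j
    idx = np.arange(logits.shape[0])
    # For each image, the correct caption sits on the diagonal of its row.
    loss_image_to_text = -log_softmax(logits, axis=1)[idx, idx].mean()
    # For each caption, the correct image sits on the diagonal of its column.
    loss_text_to_image = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_image_to_text + loss_text_to_image)


def predict_caption(image_emb, text_emb):
    """Return, for each image, the index of the most similar caption."""
    similarity = l2_normalize(image_emb) @ l2_normalize(text_emb).T
    return similarity.argmax(axis=1)
```

In this sketch the loss is low when each image embedding is closest to its own caption embedding and rises when the pairing is shuffled, which is the signal a model would be trained on.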