Reading the Hierarchical Reasoning Model paper right now and had the thought that @ylecun might actually be right about autoregressive models...