From r/MachineLearning
Thinking Deeper, Not Longer: Depth-Recurrent Transformers for Compositional Generalization [R]
Paper: https://arxiv.org/abs/2603.21676
I found this interesting as another iteration of the TRM approach:
- Shows decent out-of-distribution (OOD) generalization on two of three tasks.
- (But why does it fail beyond 2x, and why is performance on unstructured text so much worse?)
- Explains why intermediate step supervision can hurt generalization.
- Such supervision makes statistical heuristics "irresistible" to the model, discouraging investment in genuine "reasoning."
- I buy this, and would go further: it captures an insidious weakness of foundation models, and may even explain the trap expert humans fall into when they lean on their expansive experience to generate intuition rather than thinking a situation through with fewer heuristics and more explicit reasoning.
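To make the "thinking deeper, not longer" idea concrete, here is a toy sketch (purely illustrative, not the paper's code): depth recurrence reuses a single weight-tied block for a variable number of steps, so compute depth at inference time can exceed the depth seen during training. The `block` function here is a hypothetical stand-in for a transformer layer.

```python
# Toy sketch of depth recurrence (illustrative assumption, not the paper's
# architecture): one weight-tied block, applied a variable number of times.

def block(state, w=0.5, b=1.0):
    """One weight-tied refinement step (stands in for a transformer block)."""
    return [w * s + b for s in state]

def depth_recurrent_forward(x, steps):
    """Apply the SAME block `steps` times; more steps = deeper computation,
    with no new parameters -- the model 'thinks deeper, not longer'."""
    state = list(x)
    for _ in range(steps):
        state = block(state)
    return state

# Same parameters, different compute depth chosen at inference time:
shallow = depth_recurrent_forward([0.0], steps=2)  # -> [1.5]
deep = depth_recurrent_forward([0.0], steps=8)     # -> [1.9921875]
```

Because the block is a contraction here, extra steps refine the state toward a fixed point, which loosely mirrors how recurrent-depth models can spend more iterations on harder inputs without growing the parameter count.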
Tagged with
#Depth-Recurrent Transformers
#compositional generalization
#reasoning
#OOD generalization
#foundation models
#intermediate step supervision
#generalization
#statistical heuristics
#genuine reasoning
#explicit reasoning
#unstructured text