MachineLearning | Tolgahan Cakaloglu

Oct 09, 2025	Tiny Recursive Model ; Small, Simple… and Surprisingly Strong 🤖🧩
Jul 29, 2025	Group Sequence Policy Optimization (GSPO); A Smarter Approach to RL for LLMs and MoE Models
Jun 30, 2025	The Illusion of Thinking; Apple's Latest Paper Exposes LLM "Reasoning" Limits
Apr 15, 2025	Llama 4 ; Meta Scales MoE, Online RL, and Multimodal Innovation 🦙💡
Mar 15, 2025	Cosmos-Transfer1 ; NVIDIA’s Model for Next-Gen Conditional World Generation 🤖✨
Feb 25, 2025	Native Sparse Attention ; Hardware-Aligned Breakthrough for Long-Context LLMs 🤖✨
Feb 20, 2025	🏅📐 AlphaGeometry2 ; AI Reaching Gold Medal Level in IMO Geometry!
Feb 15, 2025	EvalPlanner ; Meta’s Transparent & Accurate LLM Evaluation Approach 🌟
Feb 10, 2025	SFT vs RL ; Generalization Power in Foundation Models 🚀🤖
Jan 30, 2025	Janus-Pro ; DeepSeek’s Next-Gen Multimodal Model for Vision & Text-to-Image 🖼️🤖
Jan 25, 2025	Memory Layers in Large Language Models ; Boosting LLM Performance 🧠
Jan 15, 2025	CosyVoice 2 ; Streaming Speech Synthesis with Human-Like Naturalness 🎤
Dec 30, 2024	DeepSeek-V3 ; 671B-Parameter MoE LLM Setting New AI Benchmarks 🌟🤖
Dec 25, 2024	Byte Latent Transformer ; Meta’s Tokenizer-Free LLM for Raw Byte Understanding 🔥