LLM

an archive of posts with this tag

Aug 05, 2025 AlphaGo Moment for Model Architecture Discovery ; The Rise of Autonomous AI Scientists 🤖🚀
Jul 30, 2025 Reinforcement pre-training - baking the cherry into the cake
Jul 29, 2025 Group Sequence Policy Optimization (GSPO); A Smarter Approach to RL for LLMs and MoE Models
Apr 15, 2025 Llama 4 ; Meta Scales MoE, Online RL, and Multimodal Innovation 🦙💡
Mar 20, 2025 Qwen2.5-Omni ; Alibaba’s Multimodal Model Elevates Real-Time AI 🧠🎤🖼️
Mar 15, 2025 Cosmos-Transfer1 ; NVIDIA’s Model for Next-Gen Conditional World Generation 🤖✨
Feb 25, 2025 Native Sparse Attention ; Hardware-Aligned Breakthrough for Long-Context LLMs 🤖✨
Feb 15, 2025 EvalPlanner ; Meta’s Transparent & Accurate LLM Evaluation Approach 🌟
Jan 25, 2025 Memory Layers in Large Language Models ; Boosting LLM Performance 🧠
Dec 30, 2024 DeepSeek-V3 ; 671B-Parameter MoE LLM Setting New AI Benchmarks 🌟🤖
Dec 25, 2024 Byte Latent Transformer ; Meta’s Tokenizer-Free LLM for Raw Byte Understanding 🔥
Dec 20, 2024 Alignment Faking ; Can LLMs Fake Alignment with Human Values? 🤔
Dec 15, 2024 InternVL 2.5 ; Open-Source Multimodal LLM Raising the Bar ✨
Dec 10, 2024 Motion Prompting ; Breakthrough in Video Generation from Google DeepMind ✨📹
Dec 08, 2024 Prompt Formatting ; Does It Really Matter for GPT Models? ✨🤔
Nov 30, 2024 Star Attention ; Supercharging LLM Inference with Speed & Accuracy 🚀✨
Nov 25, 2024 Automated Red Teaming ; OpenAI’s Novel Methods for LLM Attack Simulation