- AI
- LLM
- MultimodalAI
- Innovation
- OpenSource
- MachineLearning
- Research
-
InternVL 2.5: Open-Source Multimodal LLM Raising the Bar ✨
InternVL 2.5 is an open-source multimodal LLM covering reasoning, math, OCR, video, and more, built on a scalable three-stage training pipeline.
-
Motion Prompting: Breakthrough in Video Generation from Google DeepMind ✨📹
Motion Prompting enables controllable video generation via spatio-temporal trajectories, opening new possibilities in creative AI.
-
Prompt Formatting: Does It Really Matter for GPT Models? ✨🤔
New research shows that prompt format can dramatically affect GPT performance, especially in smaller models, which strengthens the case for model-specific prompt engineering.
-
Genie 2: Google DeepMind’s Foundation World Model for Interactive 3D AI 🌟🚀
Genie 2 from Google DeepMind generates interactive 3D worlds from a single image prompt, enabling scalable training for embodied AI.
-
Allegro: Open-Source Text-to-Video Generation Redefining Quality 🎥✨
Allegro by Rhymes.AI advances open-source text-to-video generation, combining VideoVAE and VideoDiT for improved quality and consistency.