AI News Daily - 2025/10/11

👨‍🔬 Kai Zhang, Xiangchao Chen, Bo Liu, Tianci Xue, Zeyi Liao, Zhihan Liu, Xiyao Wang, Yuting Ning, Zhaorun Chen, Xiaohan Fu, Jian Xie, Yuxuan Sun, Boyu Gou, Qi Qi, Zihang Meng, Jianwei Yang, Ning Zhang, Xian Li, Ashish Shah, Dat Huynh, Hengduo Li, Zi Yang, Sara Cao, Lawrence Jang, Shuyan Zhou, Jiacheng Zhu, Huan Sun, Jason Weston, Yu Su, Yifan Wu

Dream to Recall: Imagination-Guided Experience Retrieval for Memory-Persistent Vision-and-Language Navigation

Academic paper · ArXiv · Importance: 6

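Memoir uses a world model to imagine future states and selectively retrieves environmental and behavioral memories, improving both the effectiveness and efficiency of memory-persistent vision-and-language navigation.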

👨‍🔬 Yunzhe Xu, Yiyuan Pan, Zhe Liu

VideoNorms: Benchmarking Cultural Awareness of Video Language Models

Academic paper · ArXiv · Importance: 6

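Introduces the VideoNorms benchmark for evaluating the cultural awareness of VideoLLMs, and finds that models show significant gaps in areas such as norm violations and cross-cultural understanding.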

👨‍🔬 Nikhil Reddy Varimalla, Yunfei Xu, Arkadiy Saakyan, Meng Fan Wang, Smaranda Muresan

On the optimization dynamics of RLVR: Gradient gap and step size thresholds

Academic paper · ArXiv · Importance: 6

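Theoretically analyzes the optimization dynamics of RLVR, introducing the concept of a gradient gap and a step-size threshold to explain its convergence conditions and the mechanism behind performance collapse, with validation in LLM experiments.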

👨‍🔬 Joe Suk, Yaqi Duan

Kontinuous Kontext: Continuous Strength Control for Instruction-based Image Editing

Academic paper · ArXiv · Importance: 5

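Kontinuous Kontext introduces a scalar edit-strength control that enables continuous strength adjustment in instruction-based image editing, supporting a range of edits from subtle to strong.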

👨‍🔬 Rishubh Parihar, Or Patashnik, Daniil Ostashev, R. Venkatesh Babu, Daniel Cohen-Or, Kuan-Chieh Wang

SpatialLadder: Progressive Training for Spatial Reasoning in Vision-Language Models

Academic paper · ArXiv · Importance: 5

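Proposes SpatialLadder, a progressive training framework that improves the spatial reasoning of VLMs through three-stage learning and achieves state-of-the-art performance on multiple benchmarks.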

👨‍🔬 Hongxing Li, Dingming Li, Zixuan Wang, Yuchen Yan, Hang Wu, Wenqi Zhang, Yongliang Shen, Weiming Lu, Jun Xiao, Yueting Zhuang

📊 Today's Trend Summary

Why Boring Businesses Outlast AI Hype Cycles

The AI Crackpot Index

Ask HN: What's the pain using current AI algorithms?

Ask HN: Is the rate of progress in AI exponential?

Ask HN: Anyone concerned about NYC Local Law 144?

NLP, AI, ML, bots – a passing trend or much more? What's your take on this?

Ask HN: What would you read to learn about "artificial intelligence"?

Ask HN: Dipping my toes with artificial intelligence and what to expect? (CS)

Common Lisp + Machine Learning Internship at Google (Mountain View, CA)

Bioinformatician

Show HN: Startup Raising capital through Book Sales

The Next Bill Gates or Albert Einstein in AI "Chris Clark" – Yourobot

BLAZER: Bootstrapping LLM-based Manipulation Agents with Zero-Shot Data Generation

NovaFlow: Zero-Shot Manipulation via Actionable Flow from Generated Videos

ArenaBencher: Automatic Benchmark Evolution via Multi-Model Competitive Evaluation

MATRIX: Multimodal Agent Tuning for Robust Tool-Use Reasoning

How to Teach Large Multimodal Models New Skills

SciVideoBench: Benchmarking Scientific Video Reasoning in Large Multimodal Models

Agent Learning via Early Experience

Dream to Recall: Imagination-Guided Experience Retrieval for Memory-Persistent Vision-and-Language Navigation

VideoNorms: Benchmarking Cultural Awareness of Video Language Models

On the optimization dynamics of RLVR: Gradient gap and step size thresholds

Kontinuous Kontext: Continuous Strength Control for Instruction-based Image Editing

SpatialLadder: Progressive Training for Spatial Reasoning in Vision-Language Models

📅 Archive of Past Daily Reports