The past few days I've felt that first-tier liquidity is much worse than before; both second-stage plays and the inner game feel much harder... But Codec's token price is still holding very steady at 30M, and I still hold some positions, which is encouraging, mainly because the robotics track has been so hot in Web2 lately.

• Recently, Remi Cadene, a core research scientist at Hugging Face, has been in talks to raise roughly $40 million in seed funding for "Uma", a Paris-based robotics startup.

• Robotics R&D companies like this are favored by investors: global robotics funding will exceed $160 billion in 2025, approaching last year's full-year total of $172 billion.

After all, Codec's developers are major contributors to open-source VLA work, and now that they're in Web3, they know the robotics track well.

I've been busy with housework lately, but I still try to earn a little every day, and my assets keep grinding out small ATHs. My WLFI position is too small, but this time I'm not very anxious; with a stable mindset I can take a little satisfaction from each day.

By the way, I quickly vibe-coded a website for daily asset statistics. Personally I find it very convenient. I'll put the Github link in the comments; just download it to your computer and open the index web page to use it.
CodecFlow
August 22, 2025
VLAs are still very new and a lot of people find it difficult to understand the difference between VLAs and LLMs. Here is a deep dive into how these AI systems differ in reasoning, sensing, and action. Part 1.

Let's break down the key distinctions and how AI agents wrapped around an LLM differ from operator agents that use VLA models:

1. Sense: How they perceive the world
Agent (LLM): Processes text or structured data (e.g., JSON, APIs) and sometimes images. It's like a brain working with clean, abstracted inputs. Think reading a manual or parsing a spreadsheet. Great for structured environments but limited by what's fed to it.
Operator (VLA): Sees raw, real-time pixels from cameras, plus sensor data (e.g., touch, position) and proprioception (self-awareness of movement). It's like navigating the world with eyes and senses, thriving in dynamic, messy settings like UIs or physical spaces.

2. Act: How they interact
Agent: Acts by calling functions, tools, or APIs. Imagine it as a manager sending precise instructions like "book a flight via Expedia API." It's deliberate but relies on pre-built tools and clear interfaces.
Operator: Executes continuous, low-level actions, like moving a mouse cursor, typing, or controlling robot joints. It's like a skilled worker directly manipulating the environment, ideal for tasks requiring real-time precision.

3. Control: How they make decisions
Agent: Follows a slow, reflective loop: plan, call a tool, evaluate the result, repeat. It's token-bound (limited by text processing) and network-bound (waiting for API responses). This makes it methodical but sluggish for real-time tasks.
Operator: Operates in a tight feedback loop, making stepwise decisions. Think of it like a gamer reacting instantly to what's on screen. This speed enables fluid interaction but demands robust real-time processing.

4. Data to Learn: What fuels their training
Agent: Trained on vast text corpora, instructions, documentation, or RAG (Retrieval-Augmented Generation) datasets. It learns from books, code, or FAQs, excelling at reasoning over structured knowledge.
Operator: Learns from demonstrations (e.g., videos of humans performing tasks), teleoperation logs, or reward signals. It's like learning by watching and practicing, perfect for tasks where explicit instructions are scarce.

5. Failure Modes: Where they break
Agent: Prone to hallucination (making up answers) or brittle long-horizon plans that fall apart if one step fails. It's like a strategist who overthinks or misreads the situation.
Operator: Faces covariate shift (when training data doesn't match real-world conditions) or compounding errors in control (small mistakes snowball). It's like a driver losing control on an unfamiliar road.

6. Infra: The tech behind them
Agent: Relies on a prompt/router to decide which tools to call, a tool registry for available functions, and memory/RAG for context. It's a modular setup, like a command center orchestrating tasks.
Operator: Needs video ingestion pipelines, an action server for real-time control, a safety shield to prevent harmful actions, and a replay buffer to store experiences. It's a high-performance system built for dynamic environments.

7. Where Each Shines: Their sweet spots
Agent: Dominates in workflows with clean APIs (e.g., automating business processes), reasoning over documents (e.g., summarizing reports), or code generation. It's your go-to for structured, high-level tasks.
Operator: Excels in messy, API-less environments like navigating clunky UIs, controlling robots, or tackling game-like tasks. If it involves real-time interaction with unpredictable systems, VLA is king.

8. Mental Model: Planner + Doer
Think of the LLM Agent as the planner: it breaks complex tasks into clear, logical goals.
The VLA Operator is the doer, executing those goals by directly interacting with pixels or physical systems. A checker (another system or agent) monitors outcomes to ensure success.

$CODEC
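The planner/doer/checker split described in the thread can be sketched in a few lines of Python. Everything here (`llm_agent`, `vla_operator`, `TOOL_REGISTRY`, the mock flight-booking tool, the one-dimensional cursor environment) is a hypothetical stand-in, not part of CodecFlow's actual stack: it just contrasts the agent's slow plan/tool-call/evaluate loop over structured data with the operator's tight observe/act feedback loop over raw state.

```python
import json

# --- LLM Agent: slow, reflective plan -> tool call -> evaluate loop ---

TOOL_REGISTRY = {
    # Stand-in for a real tool registry; here a single mock "flight booking" API.
    "book_flight": lambda args: {"status": "booked", "flight": args["flight"]},
}

def llm_agent(goal):
    """Token-bound loop: plan, call a named tool, evaluate the structured result."""
    plan = [("book_flight", {"flight": goal})]      # a trivial one-step plan
    results = []
    for tool_name, args in plan:
        result = TOOL_REGISTRY[tool_name](args)     # deliberate, API-style call
        results.append(result)
        if result.get("status") != "booked":        # evaluate; would re-plan here
            break
    return results

# --- VLA Operator: tight feedback loop over raw observations ---

def vla_operator(env, max_steps=20):
    """Stepwise control: observe raw state, emit a low-level action, repeat."""
    trajectory = []
    for _ in range(max_steps):
        obs = env["cursor"]                         # stand-in for raw pixels/sensors
        action = 1 if obs < env["target"] else (-1 if obs > env["target"] else 0)
        env["cursor"] += action                     # continuous low-level action
        trajectory.append(action)
        if action == 0:                             # goal reached
            break
    return trajectory

# --- Checker: a separate system that monitors both outcomes ---

def checker(agent_results, env):
    return agent_results[-1]["status"] == "booked" and env["cursor"] == env["target"]

if __name__ == "__main__":
    results = llm_agent("NYC->PAR")
    env = {"cursor": 0, "target": 5}
    moves = vla_operator(env)
    print(json.dumps({"ok": checker(results, env), "moves": moves}))
```

Note how the two failure modes from point 5 map onto the loops: the agent breaks if its plan references a tool or state that doesn't exist, while the operator breaks if its observations drift from what its policy was trained on.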