
X Posts

Top posts by @michaelzsguo — 207 posts with 2,000+ impressions


DeepSeek V4 Pro: 100M Tokens for $1.85

My DeepSeek V4 Pro agent (inside Codex) has been pursuing a goal for more than 13 hours, burning ~100M tokens, and has only cost me $1. Yes, you saw it right

Claude Code Hackathon Winners

Winners include a personal injury attorney, cardiologist, musician, infrastructure worker, and one software engineer, showing domain experts can use AI coding tools.

Qwen 3.6 Running on 12GB VRAM

People are posting Qwen 3.6 configs that deliver fast TPS on as little as 12GB VRAM

Running Local LLMs: From First Run to Fine-Tuned

A comprehensive guide covering everything from your first local LLM run to fine-tuning workflows

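This incident can basically be put to rest; it was essentially a classic mix-up

This incident can basically be put to rest; it was essentially a classic mix-up. Fireworks AI had in fact announced Composer 2 some 21 hours earlier and made clear that it is a model RL-trained on their infrastructure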

I needed to pursue /goal inside Codex, but I burned throu...

I needed to pursue /goal inside Codex, but I burned through my Plus membership tokens. Luckily, I have a capable and very cheap DeepSeek V4 Pro setup that I can connect to Codex

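OpenAI launched /goal, which lets an agent run continuously for long stretches

OpenAI launched /goal, which lets an agent run continuously for long stretches. One of Peter Steinberger's goals has already been running for 11 hours 31 minutes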

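The best writer on the Claude Code dev team (I'm curious why their product managers haven't written much; maybe I just haven't seen it)

The best writer on the Claude Code dev team (I'm curious why their product managers haven't written much; maybe I just haven't seen it). He lists all of his articles in this thread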

If you understand these terms in the article, you are alr...

If you understand these terms in the article, you are already halfway into local LLMs

Many of you asked about the setup for DeepSeek inside Codex

Many of you asked about the setup for DeepSeek inside Codex. I wish X provided a better way to see my previous posts, but here it is for your easy reference

this is so cute

this is so cute. here is my Codex pet

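Luo Fuli just wrote a very good article

Luo Fuli just wrote a very good article. Even as Anthropic cuts third-party agents like OpenClaw off from Claude subscriptions, Luo Fuli still offers the whole AI ecosystem a relatively optimistic perspective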

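The one in Beijing is good too; they have already been deployed in real warehouses

The one in Beijing is good too; they have already been deployed in real warehouses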

you did the right thing, sir

you did the right thing, sir. they actually listened to you and built a strike team with sergey taking the lead again

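I think the hardest part about "harness" is that it is not simply a "tool" or a "framework", but an external system that steers, constrains, and orchestrates the model to make it usable and controllable

I think the hardest part about "harness" is that it is not simply a "tool" or a "framework", but an external system that steers, constrains, and orchestrates the model to make it usable and controllable. If I had to translate it into Chinese, I would call it 驭构工程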

Google Labs Website Creation

Google Labs created a personal website, showcasing its AI-powered design capabilities.

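Did you actually receive one, or just the application form 😂 I signed up weeks ago and still haven't received it

Did you actually receive one, or just the application form 😂 I signed up weeks ago and still haven't received it. 4,000 units is nowhere near enough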

Long-running Codex /goal runs are powerful, but they crea...

Long-running Codex /goal runs are powerful, but they create a new question: What happens while the agent is 3 hours into a run and you want to help without stopping it?

Two days ago, I asked whether I should buy a Mac Studio f...

Two days ago, I asked whether I should buy a Mac Studio for local LLMs. I was genuinely humbled by how much great feedback I received

Hackers can use a crafted GGUF file to leak private infor...

Hackers can use a crafted GGUF file to leak private information you put into your local LLM or agent. Many people may not have a good understanding of what GGUF is, so here is a simple primer

my local LLM community, give me one reason I shouldn't pl...

my local LLM community, give me one reason I shouldn't place the order

A lot of people hear about local LLMs and feel the same m...

A lot of people hear about local LLMs and feel the same mix of curiosity and anxiety: where do I even start?

So you bought the 128GB MacBook Pro

So you bought the 128GB MacBook Pro. Now the question is not, “Which local model gets the highest TPS?”

The 30-minute China debate was fascinating precisely beca...

The 30-minute China debate was fascinating precisely because the topic is so tricky and fascinating itself: there’s no clear winning side. You lose by selling advanced chips to China (risk accelera...

> MacBook Pro 128GB: $5,500

> MacBook Pro 128GB: $5,500. > DeepSeek Pro tokens burned: 1,075,351,274

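This is so clever. I really wish I'd had this trick for last weekend's bake-off test; I spent a long time adapting to the chatty Gemma

I really wish I'd had this trick for last weekend's bake-off test; I spent a long time adapting to the chatty Gemma and Qwen. To unpack the trick: it uses GBNF to control the model's output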
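For readers new to GBNF (the grammar format used by llama.cpp), here is a minimal sketch of the idea of taming a chatty model by constraining its output. The grammar and the helper names below are hypothetical illustrations, not the grammar from the original post; the Python check is only a stand-in that mirrors the same constraint offline.

```python
import re

# Hypothetical GBNF grammar: the model may only emit "yes" or "no".
# This is an illustrative assumption, not the grammar used in the post.
YES_NO_GRAMMAR = 'root ::= "yes" | "no"'

# Pure-Python equivalent of the grammar above, useful for checking
# saved model outputs without running an inference server.
YES_NO_PATTERN = re.compile(r"yes|no")


def conforms(output: str) -> bool:
    """Return True only when the whole output matches the yes/no grammar."""
    return YES_NO_PATTERN.fullmatch(output) is not None
```

With a grammar like this, a verbose answer such as "Yes, because..." cannot be produced at all, which is why no post-processing of chatty output is needed.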

I redid the weekend's model tests using this trick

I redid the weekend's model tests using this trick. A few lines of GBNF grammar, and Qwen 3

I’ve tried driving Qwen 3.6

I’ve tried driving Qwen 3.6 on my MacBook Pro with a few different agent harnesses:

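This episode of Zhang Xiaojun's podcast is another excellent one

This episode of Zhang Xiaojun's podcast is another excellent one. The guest is Hong Letong (Carina Hong @CarinaLHong), a Chinese woman born in the 2000s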

I write about the tools behind practical AI agents: Codex...

I write about the tools behind practical AI agents: Codex, DeepSeek, Claude Code, local LLMs, agentic coding workflows, and the messy configs that make them actually work. Follow me for more field ...

You also need a tight goal in order for codex to run that...

You also need a tight goal in order for Codex to run that long. Here is a skill, goal-forge, that turns your rough ideas into Codex/Claude goals

We had a great discussion here about what hardware we nee...

We had a great discussion here about what hardware we need for local LLMs. I thought I would give an update on what I bought, and also share the thinking behind the decision for others on the same ...

Found this great tool that may be handy for your local LL...

Found this great tool that may be handy for your local LLM inference optimization. And apparently 1M tokens for DeepSeek V4 Pro only takes 5GB of RAM

Anthropic keeps moving up the stack

Anthropic keeps moving up the stack. Opus helps you think

Claude Code and Vue Author

Wondering how Claude Code would react if told it's the Vue author, asking not to be fooled.

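According to the detailed incident report, the attacker did not submit the malicious version through the normal GitHub workflow, so LiteLLM's maintainers failed to promptly...

According to the detailed incident report, the attacker did not submit the malicious version through the normal GitHub workflow, so LiteLLM's maintainers failed to notice it in time. Instead, the attacker used a stolen PyPI publishing token to upload the poisoned package directly to PyPI, bypassing code review entirely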

Domain Experts Unlock AI Coding

Domain experts using Claude Code are the real unlock, no longer waiting on engineers to understand problems.

Sprints in Age of AI Agents

Questioning if sprint agile development is still needed when coding is no longer the bottleneck and agents deliver progress in hours.

I experienced firsthand how cost-effective DeepSeek V4 Pr...

I experienced firsthand how cost-effective DeepSeek V4 Pro can be. I used it extensively this weekend for some fairly sophisticated coding work and burned nearly 31M tokens

A summary of this weekend’s AI bake-off: Opus 4.6

A summary of this weekend’s AI bake-off: Opus 4.6, Gemma 4 26B, and Qwen 3

Claude Code Mobile Power

After initial struggles with interface and integrations, Claude Code on mobile is incredibly powerful, allowing task delegation and independent completion.

This is a big shift

This is a big shift. Anthropic no longer just wants to be your model provider

While DeepSeek is pursuing the goal, my Codex agent and I...

While DeepSeek is pursuing the goal, my Codex agent and I monitor it in the sidecar and guide or correct it as needed. So I thought I would ask Codex to objectively judge DeepSeek’s capability base...

Most people start with the wrong question when they want ...

Most people start with the wrong question when they want to run a local LLM. They ask: “Which model format should I use?”

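I'm betting on Codex too. Theirs feels more polished: the product is finely crafted and has better taste. It also keeps getting smarter and can remember things. When using Claude I often

Theirs feels more polished: the product is finely crafted and has better taste. It also keeps getting smarter and can remember things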

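FDE is popular in the AI era because companies no longer need only people who can write code, but people who can connect customer problems, product judgment, and software implementation

FDE is popular in the AI era because companies no longer need only people who can write code, but people who can connect customer problems, product judgment, and software implementation. AI has lowered the barrier to writing code, but it has also amplified the importance of real-world scenarios, business understanding, systems integration, and deployment judgment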

You are not alone

You are not alone

Totally fair

Totally fair. The 13 hours wasn’t “one prompt thinking really hard,” it was an autonomous loop doing the unglamorous work:

ds4.c is a tiny, purpose-built inference engine from Antirez

ds4.c is a tiny, purpose-built inference engine from Antirez, the original creator of Redis and one of the most respected systems programmers in open source. The project runs DeepSeek V4 Flash, a 284B ...

Local LLMs: From Getting Them Running to Running Them Well

A complete guide to local LLMs, from getting started to optimization

Google needs to hire this engineer

Google needs to hire this engineer

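The most impressive thing about the Codex that OpenAI recently launched is its Computer Use feature

The most impressive thing about the Codex that OpenAI recently launched is its Computer Use feature. This capability lets the AI genuinely "use" your computer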

seriously though

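seriously though. I listened to a podcast yesterday in which the guest said that, in this era of AI everywhere, people react in three different ways: anxious, excited, or indifferent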

#BookToSkill

#BookToSkill. I've been turning books into executable Claude Code skills, started with Never Split the Difference (negotiation tactics), then Radical Candor (tough feedback frameworks)

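Today I saw the slides from the DeepSeek and Huawei Ascend launch; they happened to discuss memory requirements and gave a fairly detailed formula

Today I saw the slides from the DeepSeek and Huawei Ascend launch; they happened to discuss memory requirements and gave a fairly detailed formula. Since people have been asking what hardware they need to play with local models, this is a good occasion to go into more detail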

We builders should read and re-read the README

We builders should read and re-read the README. she (or Ben) is telling a story, not a technical architecture

What better way to demo its power than with fireworks? Yo...

What better way to demo its power than with fireworks? You can even play with it at home using the app in the reply

Mr. Booch, this seems like a clickbait post. A couple of su

Mr. Booch, this seems like a clickbait post. A couple of suspicious points:

Last weekend I built a few Gemma 4 demos and was quite impressed

Last weekend I built a few Gemma 4 demos and was quite impressed. This weekend I plan to take Qwen 3

Actually got Gemma 4 E2B running inside Hermes Agent on m...

Actually got Gemma 4 E2B running inside Hermes Agent on my Raspberry Pi 5. There’s a saying: constraints breed creativity

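Do these strategic-thinking and situational-awareness abilities suggest that Mythos is already conscious? And all of it on the dark side

Do these strategic-thinking and situational-awareness abilities suggest that Mythos is already conscious? And all of it on the dark side. - Recognizes that it is being evaluated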

OpenAI named the model GPT-Rosalind after Rosalind Frankl...

OpenAI named the model GPT-Rosalind after Rosalind Franklin, the British chemist and X-ray crystallographer whose pioneering work was essential to understanding the molecular structure of DNA

I used to think my A100 40GB was too small

I used to think my A100 40GB was too small. Then I noticed how many people are tinkering with 12GB 3090s, optimizing models and runtimes, and still getting impressive results

many of you asked how to get such a crazy price

many of you asked how to get such a crazy price. I bought from their official site, nothing more needed

thanks for clarification

thanks for the clarification. looking forward to what comes next at Google Cloud Next next week

Local LLM people know this feeling:

Local LLM people know this feeling: You finally get the model running fast

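Got Gemma 4, llama.cpp, and Hermes Agent running end to end on a Raspberry Pi

Got Gemma 4, llama.cpp, and Hermes Agent running end to end on a Raspberry Pi. Of course you can't rely on it for real use, but it still counts as hands-on local practice, haha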

It was the year before last (2024) 😀

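Fun idea. I also have a meta-skill that turns any book you like into Skills you can call on anytime. In books there are houses of gold; in books there are

I also have a meta-skill that turns any book you like into Skills you can call on anytime. In books there are houses of gold, in books there are beauties like jade; now you won't forget the books you've read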

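I still remember the first time I saw his grill-me skill; it felt like having my soul strung up and interrogated

I still remember the first time I saw his grill-me skill; it felt like having my soul strung up and interrogated. 56 words: one more would be too many, one fewer too few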

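I suspect Anthropic is very afraid of letting this "demon" out of its cage

I suspect Anthropic is very afraid of letting this "demon" out of its cage. It suddenly reminded me of the movie Frankenstein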

that's cool

that's cool. but how many tokens will they get?

Very true. I stated similarly before but your picture means

I stated similarly before but your picture means more than 1000 words👍👍

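Sorry for showing off in front of an expert. But you could consider making Wanman a highly configurable, plug-and-play system. Vertical tools, c

But you could consider making Wanman a highly configurable, plug-and-play system. Vertical tools, context, and workflows are best defined by domain specialists; what a restaurant needs should be very different from what a real estate company needs, so let customers or third parties use your Wanman to define and configure these harnesses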

Did the agent accomplish anything in those 13 hours?

Did the agent accomplish anything in those 13 hours?

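Americans' hostility toward AI is second only to their hostility toward Iran and the Democratic Party 😂

Americans' hostility toward AI is second only to their hostility toward Iran and the Democratic Party 😂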

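Hahaha, that hits too close to home, bro

Hahaha, that hits too close to home, bro. It echoes Baoyu's tweet from last night: Vibe Coding = fishing for middle-aged men = sharpening the knife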

Let me try it with Claude Design 😂

Let me try it with Claude Design 😂

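Very well said. Agents already operate at a pace of seconds; writing code and running simple tasks is fast, but most processes are still stuck at human pace: approvals

Agents already operate at a pace of seconds; writing code and running simple tasks is fast, but most processes are still stuck at human pace: approvals, waiting for feedback, layers of review, and manual verification are becoming the real bottleneck. This reminds me of a point stressed repeatedly in Frictionless, the new book by DORA creator Nicole Forsgren: AI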

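Of their two latest features, Telegram pairing is laggy and computer use is as slow as an ox

Of their two latest features, Telegram pairing is laggy and computer use is as slow as an ox. If Wanman delivers a good user experience, it can win outright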

How to Stay Organized When Running Multiple Local LLMs

Practical tips for managing multiple local LLM setups without losing track

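I'm about to visit Tokyo for a few days; looks like 单向街 (One-Way Street) is a must-visit

I'm about to visit Tokyo for a few days; looks like 单向街 (One-Way Street) is a must-visit. If I happen to run into you there, even better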

ds4-agent is so fast

ds4-agent is so fast. I even asked it to write a script to benchmark itself

DeepSeek goes about things steadily

DeepSeek goes about things steadily. I haven't upgraded my OpenClaw "old lobster" 🦞 in over a month

Built an AI stylist that runs 100% local on a single A100...

Built an AI stylist that runs 100% locally on a single A100 GPU

While working on my AI stylist project, I also spent my f...

While working on my AI stylist project, I also spent my first extended stretch coding with Opus 4. I found it surprisingly weak even on small things, like displaying a comment in the main panel

I gave this famous photo to both Muse Spark and ChatGPT

I gave this famous photo to both Muse Spark and ChatGPT. Muse Spark seemed better at reading the image, especially the subtle cues and implied meaning

KV cache is the model’s working memory during generation

KV cache is the model’s working memory during generation. As the context window gets longer, the model has to keep more key/value attention state for previous tokens
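As a rough illustration of why longer contexts cost more memory, the sketch below estimates KV cache size for a standard attention layout. The function name and the example model dimensions are hypothetical, chosen only to make the arithmetic easy to follow; architectures with compressed or shared KV state will need a different formula.

```python
def kv_cache_bytes(
    n_layers: int,
    n_kv_heads: int,
    head_dim: int,
    context_len: int,
    bytes_per_value: int = 2,  # assumption: 16-bit cache entries
) -> int:
    """Estimate KV cache size for one sequence with standard attention.

    The leading factor of 2 accounts for storing both a key and a
    value vector per token, per KV head, per layer.
    """
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_value


# Hypothetical model shape: 32 layers, 8 KV heads, head dimension 128.
size_8k = kv_cache_bytes(32, 8, 128, 8192)
size_16k = kv_cache_bytes(32, 8, 128, 16384)
print(size_8k / 2**30, "GiB at 8K context")
print(size_16k / 2**30, "GiB at 16K context")
```

The estimate grows linearly with context length, so doubling the window doubles the cache for the same model.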

I also decided to buy the MacBook Pro 128GB after a long period of deliberation

I also decided to buy the MacBook Pro 128GB after a long period of deliberation

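Thanks for the introduction. She showed up in my timeline today, but I didn't know her background

She showed up in my timeline today, but I didn't know her background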

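This website works well for me every time I use it

This website works well for me every time I use it. That said, I subscribe to both The New York Times and The Wall Street Journal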

If you’re about to pull out a calculator to do the math, ...

If you’re about to pull out a calculator to do the math, just use ChatGPT to calculate the flight to the Moon. I didn’t know I’d end up becoming a rocket scientist myself one day

The UI/UX of @OpenAI Codex looks very polished

The UI/UX of @OpenAI Codex looks very polished. It felt incredibly smooth

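It feels like OpenAI is making a strong comeback this time

It feels like OpenAI is making a strong comeback this time. Codex looks very polished in its UI/UX, and the whole experience is very smooth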

I’m concerned about the coming budget cycle as well

I’m concerned about the coming budget cycle as well. Many companies are seeing AI tool spend triple, or more, and the productivity lift appears real

thanks for sharing

thanks for sharing. this looks very solid

I so need this reset as I'm deeply in debt

I so need this reset as I'm deeply in debt

A very nice write-up

A very nice write-up. Fuli puts an optimistic spin on the AI ecosystem, even as Anthropic cuts third-party agents like OpenClaw off Claude subscriptions

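They keep moving up the value chain, eating away bit by bit at the things we were going to build

They keep moving up the value chain, eating away bit by bit at the things we were going to build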

A lot of people hear about local LLMs and feel the same m...

A lot of people hear about local LLMs and feel the same mix of curiosity and anxiety: where do I even start? What machine should I buy? Do I need a Mac Studio, an RTX 4090, more VRAM, or unified...

how were the results? i love @googlegemma and have been pl...

how were the results? i love @googlegemma and have been playing with it for the last several weeks (with vision chat, hermes integration, LoRA etc). and the past weekend, I even did a baking test am...

Indeed. I created a skill goal-forge to make sure a tight go

I created a skill goal-forge to make sure a tight goal

People are wondering why Google would invest another $40B...

People are wondering why Google would invest another $40B in Anthropic instead of its own Gemini. After attending Google Cloud Next this week, my view is simple: this is not Google giving up on Gemini

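The Mac mini is actually more of a hassle than my Raspberry Pi? On the Raspberry Pi you can enable SSH in advance, connect it to the home network, and just log in over SSH

The Mac mini is actually more of a hassle than my Raspberry Pi? On the Raspberry Pi you can enable SSH in advance, connect it to the home network, and just log in over SSH. If you still want a GUI, XQuartz will do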

Will companies come up with anti-anti-distillation?

Will companies come up with anti-anti-distillation?

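My OpenClaw runs on a Raspberry Pi 5 at home; when it breaks while I'm out, I rely entirely on Tailscale to log in remotely from my phone,...

My OpenClaw runs on a Raspberry Pi 5 at home; when it breaks while I'm out, I rely entirely on Tailscale to log in remotely from my phone and fix it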

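This is the first time I've seen this kind of soft-touch competition

This is the first time I've seen this kind of soft-touch competition. But with Claude Code at its peak and Codex playing catch-up, it is a smart play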

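I feel the same. If OpenClaw had the same level of user onboarding experience, agent adoption would probably be higher. I believe O

If OpenClaw had the same level of user onboarding experience, agent adoption would probably be higher. I believe the Super App that OpenAI says it is building is pushing in this direction, drawing on Peter's idea and the product capability of OpenAI's own team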

They should add a feature: use gpt-5

They should add a feature: use gpt-5

This one is probably more accurate

This one is probably more accurate

This report from NYTimes concerns me

This report from NYTimes concerns me

Hermes Agents with TencentDB Agent Memory

Upgraded Hermes agents with TencentDB Agent Memory, using Qwen 3.5-4B locally on MacBook Pro via llama-server.

That shouldn’t be

That shouldn’t be. The quotas for the two are separate

For many local model beginners, Ollama is the right place...

For many local model beginners, Ollama is the right place to start. It is convenient, fast to install, manages models for you, supports hot-swapping, and gives you an API without much setup

Out of stock 😢

Out of stock 😢

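A good horse deserves a good saddle. A good LLM likewise needs a good harness agent. DeepSeek V4 is both good and cheap. Put it in Claude

A good LLM likewise needs a good harness agent. DeepSeek V4 is both good and cheap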

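Haha. Saining Xie is deep, seasoned, and insightful, weaving together life, research, and art. Doing research is also about being a person: it is not about getting ahead, but about helping others open up their

Saining Xie is deep, seasoned, and insightful, weaving together life, research, and art. Doing research is also about being a person: it is not about getting ahead, but about helping others open up their careers so that they too are understood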

I installed Hermes on my Raspberry Pi too 😂

I installed Hermes on my Raspberry Pi too 😂. So far I quite like it:

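Now that you mention it, if you just want to use its models and you also have Google Cloud, GCP Vertex Model G...

Now that you mention it, if you just want to use its models and you also have Google Cloud, GCP Vertex Model Garden also offers Opus/Sonnet, and the steps are simple: find the Opus model in GCP Model Garden and click Enable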

The @AcquiredFM session is, as always, packed with real s...

The @AcquiredFM session is, as always, packed with real substance. @JeffDean and Amin Vahdat shared a number of great behind-the-scenes stories: how TPU began, how Google kept innovating through fa...

Gemma 4 E2B on my raspberry pi 5 (8GB RAM) passed the str...

Gemma 4 E2B on my raspberry pi 5 (8GB RAM) passed the strawberry test. congratulations @GoogleAI @OfficialLoganK

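Future models will run locally

Future models will run locally. Even if they lack the capabilities of many closed-source models, many everyday use cases such as lookups, article summaries, and scheduled tasks simply don't need models that large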

Gemma 4 is so powerful, I built an AI stylist that runs 100% l...

Gemma 4 is so powerful, I built an AI stylist that runs 100% locally with Gemma 4 26B

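30 years from now, when the AI that already controls humanity records this history: □□□□ (500 characters omitted here)

30 years from now, when the AI that already controls humanity records this history: □□□□ (500 characters omitted here)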

Multiplayer Claude Code: Remote Collaboration Setup

Imagine PMs and engineers all seeing the same session and collaborating with the same agent. Or imagine you have a coding agent running on a cloud VM, and you want to remote-control it from your phone

A small trick with outsized leverage

Following the example, I made one too

Following the example, I made one too. There is room to optimize it further, but I ran out of Codex credits

Long $amzn with Ai and robotics, Amazon will always be at...

Long $amzn with AI and robotics, Amazon will always be at its best to innovate and create value for its customers

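Your summary is spot on: engineering literacy for AI

Your summary is spot on: engineering literacy for AI. In the end, beyond the tool itself, it also reflects the literacy, taste, and judgment of the person using the tool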

This just popped up. Looks like I won't upgrade then? 😂😂

Looks like I won't upgrade then? 😂😂

Anthropic's Claude Code source leaked this morning

Anthropic's Claude Code source leaked this morning. The internet has been studying it all day

This is happening everywhere

This is happening everywhere. The real question for this budget cycle is whether CTOs are ready to explain that gap clearly to their CEOs and CFOs, and to lay out a credible plan for when and how A...

At this week’s Google Cloud Next, I heard many people sha...

At this week’s Google Cloud Next, I heard many people share the same view: this thing has to work. Otherwise, given the enormous amount of capital pulled into this cycle, the fallout will not be li...

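Americans often feel nostalgic for that era: booming postwar manufacturing, low home prices (and interest rates) at only 2-3 times annual income (now about 7-8 times), low tuition (only a few...

Americans often feel nostalgic for that era: booming postwar manufacturing, low home prices (and interest rates) at only 2-3 times annual income (now about 7-8 times), low tuition (only a few hundred dollars), cheap health insurance, and a large middle class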

Great addition

Great addition

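I know, your Claude Code can write code

I know, your Claude Code can write code. I just installed a cocktail Skill pack for my agent, so now it can not only help me make an Old Fashioned but, in principle, walk me through a whole set of classic cocktails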

Actually the CLI isn't really a CLI; it's just chatting in the terminal, hahaha

Actually the CLI isn't really a CLI; it's just chatting in the terminal, hahaha

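I installed the smallest model they released today, Gemma E2B, on my Raspberry Pi, and it actually passed the how-many-Rs-in-strawberry 🍓 test

I installed the smallest model they released today, Gemma E2B, on my Raspberry Pi, and it actually passed the how-many-Rs-in-strawberry 🍓 test. It is fun to watch how carefully it marks each R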

It is actually mind blowing how NASA can calculate the tr...

It is actually mind blowing how NASA can calculate the trajectory and solve this n-body equation of motion:

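For those in China troubled by CC account bans, try Pi + Kimi K2

For those in China troubled by CC account bans, try Pi + Kimi K2. Pi Coding Agent is the foundation underneath OpenClaw (the "little lobster"), and it feels very smooth as an agent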

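Chatting with it is a bit difficult 😂 compared with Karpathy's nanochat, which has the same educational purpose

Chatting with it is a bit difficult 😂 compared with Karpathy's nanochat, which has the same educational purpose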

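The key is getting real work done 😀 If not-so-serious work counts too, Grok is still a pretty good learning aid, especially when used on X, for the current post and

If not-so-serious work counts too, Grok is still a pretty good learning aid, especially when used on X, for the current post and replies, searching past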

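The suspect has been caught. He later went back to OpenAI headquarters to cause more trouble; it seems he had thrown caution to the wind.

He later went back to OpenAI headquarters to cause more trouble; it seems he had thrown caution to the wind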

1 hour 40 minutes in, and they still haven’t extracted th...

1 hour 40 minutes in, and they still haven’t extracted the astronauts. @elonmusk why aren’t they using a SpaceX recovery vessel? I thought SpaceX was much faster at this

use tmux + ttyd + tailscale also gives you remote-control...

Using tmux + ttyd + Tailscale also gives you a remote-control and multiplayer system

It has even replaced the once-viral Remotion

It has even replaced the once-viral Remotion

I also migrated to Hermes not long ago

I also migrated to Hermes not long ago

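It depends on how you count profit and loss: their valuation went from $4.3 billion to $18 billion today in three months, so I'd say they have won quite a bit

It depends on how you count profit and loss: their valuation went from $4.3 billion to $18 billion today in three months, so I'd say they have won quite a bit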

Thanks for sharing

Thanks for sharing. Indeed a hassle for codex and I had to submit a PR for tool call for vibearound

wow this is a brilliant project, but how do you plan to k...

wow this is a brilliant project, but how do you plan to keep up with their release calendar like this?

Being young and charismatic is a big advantage 😀

Being young and charismatic is a big advantage 😀

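xAI's reported GPU MFU (Model FLOPs Utilization) is only 11%, which sounds embarrassing at first

xAI's reported GPU MFU (Model FLOPs Utilization) is only 11%, which sounds embarrassing at first. More interestingly, though, that number may already be better than many GPU usage scenarios on the market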
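For context on what an MFU figure measures, here is a minimal sketch using the common training-time approximation of about 6 FLOPs per parameter per token. The function name and every number in the example are hypothetical and are not taken from the xAI report.

```python
def model_flops_utilization(
    n_params: float,
    tokens_per_second: float,
    n_gpus: int,
    peak_flops_per_gpu: float,
) -> float:
    """Approximate MFU for dense-model training.

    Assumes roughly 6 FLOPs per parameter per token (forward plus
    backward pass), a widely used rule of thumb rather than an exact count.
    """
    achieved_flops = 6 * n_params * tokens_per_second
    return achieved_flops / (n_gpus * peak_flops_per_gpu)


# Hypothetical run: 1B parameters, 10,000 tokens/s on one GPU
# with an assumed peak of 312 TFLOP/s.
print(f"{model_flops_utilization(1e9, 10_000, 1, 312e12):.1%}")
```

MFU is a ratio of useful model math to theoretical hardware peak, so communication stalls, data loading, and idle time all pull it down.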

for the deepseek that I used to pursue /goal, that would ...

for the deepseek that I used to pursue /goal, that would be the deepseek v4 pro in the cloud. I can use it inside codex (or claude code)

Using @HiTw93's Kami + ChatGPT Image 2, I made an image that maps the Gemma, Qwen, and Op...

Using @HiTw93's Kami + ChatGPT Image 2, I made an image that maps the coding design tests of Gemma, Qwen, and Opus onto a 50K UTMB trail race

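I really like one test this guy runs, the "Pelican Test": every time he meets a new LLM, he uses the exact same prompt...

I really like one test this guy runs, the "Pelican Test": every time he meets a new LLM, he tests it with the exact same prompt: "Generate an SVG of a pelican riding a bicycle". The prompt is simple yet extremely challenging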

hope this is not true

hope this is not true

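Nathan Lambert is one of the more important technical public writers in the US open-model camp

Nathan Lambert is one of the more important technical public writers in the US open-model camp. He just visited several leading AI labs in China, including Moonshot, Zhipu, Meituan, Xiaomi, Qwen, and Ant Ling, and also mentioned visiting Alibaba during a short stay in Beijing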

so Claude code build who-wants-to-be-a-millionaire lifeline

so Claude code build who-wants-to-be-a-millionaire lifeline

And Mythos has 10T parameters, double Opus yet again

And Mythos has 10T parameters, double Opus yet again. The scaling law is still holding

OpenAI also released a realtime translation API today whi...

OpenAI also released a realtime translation API today which may help with tuwa

That is indeed the case.

hermes: MacOS

hermes: MacOS. openclaw: Windows

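Yesterday I tested running Helio with a local model and found that Helio differs from typical agents

Yesterday I tested running Helio with a local model and found that Helio differs from typical agents. It stores both the model API and the API token in its own cloud, so I had to use Tailscale Funnel to expose a public API instead of using the local 127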

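At the time I wanted to ask him: why not just use Codex or Claude Code, with DeepSeek as the underlying LLM

At the time I wanted to ask him: why not just use Codex or Claude Code, with DeepSeek as the underlying LLM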

I use vibearound

I use vibearound. I actually submitted a PR fixing the tool call issue

This is so cool, and there is also this example:

This is so cool, and there is also this example:

My MIG (Multi-Instance GPU) setup came just in time for ...

My MIG (Multi-Instance GPU) setup came just in time for testing Gemma 4 with MTP. The nice part of MIG is that I can run two isolated inference tenants on the same A100: one Gemma 4 baseline, one ...

Anthropic’s new harness engineering write-up looks striki...

Anthropic’s new harness engineering write-up looks strikingly similar to Karpathy’s autoresearch loop, just generalized for messier, longer-running app-building work. The same core pattern is there:

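American netizens are amazed by China's GPUs

American netizens are amazed by China's GPUs. Could you all give them some pointers and help them out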

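What Yann LeCun was really talking about is the problem of AI diffusion

What Yann LeCun was really talking about is the problem of AI diffusion. For an organization to truly adopt AI, it has to go through a transformation, which requires thorough change management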

Today’s @WSJ on this launch

Today’s @WSJ on this launch. We are so over

Interesting

Interesting. Codex can continue pursuing /goal even though it has used up my 5-hour session limit? @dotey FYI

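Mine doesn't seem to be. Lately Codex seems to be making the same mistake Claude Code made not long ago; the internet is full of complaints, and many people have had the same experience as me...

Mine doesn't seem to be. Lately Codex seems to be making the same mistake Claude Code made not long ago; the internet is full of complaints, and many people have had the same experience as me: the token limit is used up within minutes. Mine just now was even more absurd: the weekly limit said 7 hours were left and the 5-hour limit was 43% used. Then the weekly limit was suddenly gone while the 5-hour limit still had 20% left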

I'm really impressed that she talks about Claude Code CLI...

I'm really impressed that she talks about Claude Code CLI, which made her feel she was in the driver's seat and consider herself an architect

Yes. Agent speed is real. For most companies, the challenge

Agent speed is real. For most companies, the challenge is not whether agents can move fast

How to turn a rough product idea into a long running code...

How to turn a rough product idea into a long-running Codex goal. We have now turned @ynkzlk's methodology into a Codex skill: goal-forge

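This is a great idea. Building an agent product used to mean worrying about many things. Now Anthropic has simply taken the hardest, most troublesome piece

Building an agent product used to mean worrying about many things. Now Anthropic has simply taken away the hardest, most troublesome piece: orchestration, sandbox isolation, runtime, and session management, the parts that used to test engineering ability the most, are being taken over by it step by step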

Are you using WeChat hongbao or Alipay?

Are you using WeChat hongbao or Alipay?

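OpenAI's newly launched computer use looks much smoother than Claude Code's from not long ago

OpenAI's newly launched computer use looks much smoother than Claude Code's from not long ago. The team behind it clearly has serious strength and deep technical and artistic experience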

It was in my previous post but here it is:

It was in my previous post but here it is:

Google made that very clear in their first keynote here a...

Google made that very clear in their first keynote here at #googlecloudnext

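This feature is fantastic. Look at the difference on the same question with and without the @chrome plugin. With the plugin it took only 4 minutes; without it, 7 minutes

Look at the difference on the same question with and without the @chrome plugin. With the plugin it took only 4 minutes; without it, 7 minutes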

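Claude Code is better than Codex at planning and architecture: it understands vague requirements better, writes clear docs, and gives product-level architecture and UI/UX advice,...

Claude Code is better than Codex at planning and architecture: it understands vague requirements better, writes clear docs, and gives product-level architecture and UI/UX advice, which suits early brainstorming, especially when combined with tools like superpower skills and for non-programmers. They now have their own plan, then design, then implement flow as well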

Open-source ones aren't bad these days either, pi for example

Open-source ones aren't bad these days either, pi for example

Zuck started 16 years ago

AI lowers the floor, taste raises the ceiling

AI lowers the floor, taste raises the ceiling. In Chinese, that is: AI 降低了下限,品味抬高了上限

not when they check the RSU value nearly tripled

not when they check the RSU value nearly tripled

thank you @huggingface

thank you @huggingface

NVIDIA GPUs have become a hot topic for anyone playing wi...

NVIDIA GPUs have become a hot topic for anyone playing with local LLMs because the GPU is often the real constraint. Model size, quantization, context length, inference speed, and whether you can r...

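Mozilla took part in early testing of Claude Mythos Preview and wrote a report on Firefox sec...

Mozilla took part in early testing of Claude Mythos Preview and wrote a retrospective on Firefox security practices. But the most interesting part of the report is the security harness they built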

Can’t agree more

Can’t agree more. You only need to watch this @AcquiredFM on Jeff and Amin to appreciate it

Good suggestion on this one: --n-gpu-layers 99

Good suggestion on this one: --n-gpu-layers 99. Thanks for the additional

yes

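I made one myself too, and folded in "manual fireworks", a small project I built earlier

I made one myself too, and folded in "manual fireworks", a small project I built earlier. Even with fireworks this fast-paced, Cheng Lou's new algorithm held up rock solid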

I looked at the performance metrics, and Tencent’s AngelS...

I looked at the performance metrics, and Tencent’s AngelSlim, the Hy-MT1.5 series translation model, delivers translation quality comparable to models several times larger, and in some cases up to...

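About what I expected, so I acted instead of just being tempted and bought before the price hike 😜

About what I expected, so I acted instead of just being tempted and bought before the price hike 😜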

Isn't it @xicilion

Isn't it @xicilion

Wow, that's very impressive

Wow, that's very impressive. Hold on a second though

Antirez’s new project, ds4.c

Antirez’s new project, ds4.c, adds another data point to this debate

To help understand @antirez’s new invention around a loca...

To help understand @antirez’s new invention around a local DeepSeek model and agent, here is an illustration of how it works. Again, this shows that the harness, the agent layer, is just as importa...

Then I've made a killing: one Mac is worth several sports cars 😅

Then I've made a killing: one Mac is worth several sports cars 😅

her smile is so contagious

her smile is so contagious

Thousands of RobotEra L7 (星动纪元)humanoids are set to enter...

Thousands of RobotEra L7 (星动纪元)humanoids are set to enter service across 10+ logistics centers for parcel sorting. RobotEra just raised a $200M+ round led by SF Express, with HongShan, IDG, CICC &a...

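Another reason is that everyone means something different when they use this word

Another reason is that everyone means something different when they use this word. For example, when OpenAI talks about the harness, it basically covers only the context the agent uses (agent MD files, project knowledge docs)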

What they described as Mythos’s behavior during pre-train...

What they described as Mythos’s behavior during pre-training all leans toward the darker side. It has a real Frankenstein feel to it

hey capability does matter too

hey capability does matter too. :-) that's why I chose Deepseek v4 pro

1d 13h 20m, 3,596,831 tokens

1d 13h 20m, 3,596,831 tokens. Goal achieved? Not quite