Discovery of ignorance and the exploration loop

While rereading Sapiens over Christmas, one idea stuck with me: the engine behind most of the world’s progress over the past 500 years is a compounding loop among science, capitalism, and empire. Surprisingly, this loop was kicked off by a mindset shift: the discovery of ignorance.

The scientific revolution was not a revolution of knowledge, but the discovery that humans are ignorant. Europeans began to openly admit collective ignorance about important questions, in contrast to pre-modern times, when many assumed that God or tradition already knew everything that mattered.

Maps of ignorance

We can see this mindset shift by comparing two maps: A 1459 world map (Fra Mauro) had mythical creatures show up in the margins. Then you get the Salviati Planisphere around 1525 and something changes: parts of the world are just… left blank. Not decorated or explained away—blank on purpose. That blank space is a new kind of public honesty, and an invitation to go find out.

Fra Mauro (1459)

Salviati Planisphere (1525)

This small detail signals a psychological and ideological breakthrough of scientists and conquerors: they admit they’re ignorant of large parts of the world. So they need to go out to discover—which expands both knowledge and territory.

Compounding loop: Science ↔ Capital ↔ Empire

This is where the compounding loop forms: Science turns blank space into knowledge. Capitalism / capital funds exploration before there’s proof it will work. Empire turns discovery into durable advantage (routes, legitimacy, treaties, control). And then it compounds: advantage brings more capital; more capital funds more exploration.

Ignorance admitted → capital funds voyages → science updates the map → empire expands reach/resources → capital grows → more voyages
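
To make the compounding concrete, here is a toy sketch of the loop in Python. It is an illustration, not a historical model: the function name, the share of capital spent on exploration, and the payoff multiplier are all made-up assumptions.

```python
def simulate_loop(capital, rounds, explore_share=0.5, payoff=1.4):
    """Toy model of the loop (all parameters are illustrative assumptions).

    Each round, a share of capital funds exploration, and exploration
    returns `payoff` times what was spent on it.
    """
    history = [capital]
    for _ in range(rounds):
        spent = capital * explore_share              # capital funds voyages
        capital = capital - spent + spent * payoff   # discoveries pay back
        history.append(capital)
    return history


with_loop = simulate_loop(100.0, rounds=10)
no_exploration = simulate_loop(100.0, rounds=10, explore_share=0.0)
print(round(with_loop[-1], 1), no_exploration[-1])
```

Under these assumed numbers, each round multiplies capital by 1.2, while the version that funds no exploration stays flat. The point is only the shape: small per-round gains compound when they are fed back into the next voyage.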

In 2025, the 'Empire' is often a corporation that can fund long cycles of exploration, translate discovery into products people actually use, and defend and scale its advantage through energy contracts, proprietary data sets, and default distribution.

What would this look like in the age of AI?

The AI map is also mostly blank. We don’t fully know what models will reliably do in the wild. We don’t know what people will trust. We don’t know what becomes habit vs. novelty. So my hypothesis is: Enduring advantage in AI will come from teams that can own the exploration loop, rather than teams that get a single breakthrough.

Capital and infrastructure fund and enable exploration. Exploration produces knowledge. Knowledge creates power and advantage. Advantage attracts more capital. Model talent matters, but the dominant advantage comes from owning the loop (compute, data, distribution, real-world feedback).

In the 1500s–1800s, “exploration” meant ships, navigators, maps, ports, financiers, and state backing. In AI, exploration means running huge numbers of experiments (training + inference), but the constraints are different: compute, energy, deployment surfaces, and feedback loops.

For example: energy access is one physical gate that decides whether capital becomes real experimentation or stays theoretical. Whoever secures it early can run more experiments, iterate faster, deploy more capacity, and get more real-world feedback. That can translate into higher quality, broader distribution, more revenue, stronger habits, and then more capital to secure more infrastructure.

Examples: industrial-scale loop vs tight loop

OpenAI × Microsoft is the industrial-scale version: capital, compute, distribution, and governance intentionally linked. Microsoft has explicitly described a “multiyear, multibillion dollar” investment partnership, and the relationship is designed around turning frontier exploration into real-world deployment at scale.

Midjourney is the tight-loop version: a small but mighty team exploring a narrower knowledge gap (what people want in images / taste). They built a capital loop through subscriptions (steady funding to buy compute and keep iterating). Importantly, they built distribution and feedback through a community workflow (Discord), and as they moved into more compute-intensive territory (video), they explicitly priced it as much more expensive than images.

Photo by NEOM on Unsplash

If you’re building / investing, here’s what I’d watch

For builders

  • Audit for ignorance. What’s your blank space?

  • Pick a loop you can sustain. Don’t build a loop that dies before it learns.

  • Choose “good revenue.” Money that also teaches you.

For investors

  • Where’s their science? What are they actually learning?

  • Where’s their capital? Who funds exploration when it gets expensive?

  • Where’s their empire / corporation advantage? What channel, contract, or platform position makes adoption hard to replace?

Photo by NEOM on Unsplash

What would prove this hypothesis wrong?

  • If small teams repeatedly win frontier capability without privileged access to compute/energy/distribution—meaning the “capital + empire” advantage stops mattering.

  • If distribution empires can win with mediocre AI (defaults/bundling) without needing real learning—meaning “science” becomes optional.

  • If exploration gets radically cheaper (efficiency leaps) so the loop no longer needs heavy capital, and advantage shifts mainly to taste/community.

There are real stress tests to this idea. Efficiency jumps (DeepSeek is the loudest recent example) suggest frontier-ish capability might become less capital-gated. Reuters reported DeepSeek said training its R1 model cost about $294,000 (with caveats around what’s included), which is the kind of number that makes people rethink the “only giants can play” narrative.

And big platforms are also clearly pushing default distribution (Apple Intelligence is now default-on; Microsoft is auto-installing Copilot), which could make “control” matter more than “learning” in some contexts. The open question is whether these are exceptions, or the early shape of what comes next.

Christmas tree: Gesture-controlled interactive hologram

A small holiday experiment: a digital tree made of particles, light, and hand movements, exploring how hand-tracking and particle systems can be used to create playful, ambient digital objects. Created with Google AI Studio & Gemini 3 Pro.

The tree responds to simple gestures:

  • Close fist → collapse

  • Move hand → rotate

  • Pinch → focus
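
The project code is not shown in this post, so the following is only a hypothetical sketch of how gesture labels could map to the tree’s behavior. The gesture labels, the `TreeState` fields, and the 180-degree rotation scale are assumptions for illustration; a real version would get its gesture labels from a hand-tracking library.

```python
from dataclasses import dataclass


@dataclass
class TreeState:
    # Hypothetical state for the particle tree.
    collapsed: bool = False
    rotation_deg: float = 0.0
    focused: bool = False


def apply_gesture(state, gesture, hand_dx=0.0):
    """Update the tree for one detected gesture.

    `gesture` is an assumed label ("fist", "move", "pinch");
    `hand_dx` is the assumed horizontal hand movement, from -1.0 to 1.0.
    """
    if gesture == "fist":
        state.collapsed = True       # close fist -> collapse
    elif gesture == "move":
        # move hand -> rotate, scaled so a full sweep is half a turn
        state.rotation_deg = (state.rotation_deg + hand_dx * 180.0) % 360.0
    elif gesture == "pinch":
        state.focused = True         # pinch -> focus
    return state


tree = apply_gesture(TreeState(), "move", hand_dx=0.5)
print(tree)
```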

Music: https://pixabay.com/music/christmas-christmas-christmas-434436/

Does writing scale or limit cognitive thinking?

Blog post #31

Thanksgiving felt like the perfect 3-day reading break to revisit Harari’s Sapiens. I first read it eight years ago back in college and it’s one of those books that has been formative to how I think. Back then, “collective imagination” was the big unlock, realizing that money, religion, and corporations are shared fictions that enable large-scale coordination.

This time, a different set of themes stood out: the role of corporations in history, the relationship between science and imperialism, capitalism’s growth logic, the discovery of ignorance, and the emergence of new forms of energy. This post focuses on one thread: how writing reshaped human cognition, and how that maps to the shift toward multimodal AI we’re seeing today.

Writing expands our cognitive limits

Humans scaled cooperation through imagined orders and scripts, external memory systems that extended our cognitive capacity.

Early writing systems like Egyptian hieroglyphs, Chinese logographs, and the Inca quipu weren’t created for poetry. They were invented to manage complexity: taxes, grain storage, inventories, land ownership, political coordination. They complemented spoken language with something more structured, durable, and precise. Civilizations that mastered writing became strong archivists, able to catalog and retrieve information at scale.

Over time, script didn’t just record thought, it reshaped it. It moved humans from free association, our natural cognitive mode, toward categorization, administration, and compartmentalized thinking.

Will multimodal AI interfaces restore our natural way of thinking?

Human cognition is inherently multimodal. We process the world through images, tone, gesture, texture, narrative fragments, and nonlinear jumps. Script, mathematics, and binary code compressed that complexity into narrower formats machines could understand.

Interestingly, multimodal AI might reverse that trend. My grandma and her friends (all in their 60s and 70s) almost exclusively use voice messages in their group chats. For people newer to technology, voice is intuitive, emotional, social, and low-friction. It feels natural. It builds trust. And it mirrors how we actually think.

If writing expanded our capability through structure, multimodal AI expands it again by restoring expressiveness. It shifts technology from something we adapt to into something that adapts to us.

Language also shapes what we can think

Humans invent tools to expand what we can do, but those same tools shape—and often limit—what we can perceive. Each representational system—language, script, math, binary code—extends our capability but also defines the boundaries of the world we notice. Language narrows attention to what can be said. Script narrows thought to what can be recorded.

Every tool enlarges capability, but every tool also creates a frame. People who speak multiple languages, or who can switch between different representational systems (text, visuals, diagrams, narrative), gain access to different slices of the same idea. The concept of “home,” “honor,” or “freedom” shifts meaning across languages. The same is true in design: a diagram reveals relationships that a paragraph obscures; a voice note carries emotion that text flattens.

Photo by Sergio Li on Unsplash

Multimodal AI, in that sense, widens the frames again—bringing machines closer to the full range of human expression we started with. If script once expanded our capacity through structure, multimodal AI may expand it again through expressiveness. The question isn’t whether machines can think like humans, but how we design the cognitive frames we build with them.

Maybe happy ending: modular design for emotion

My sister visited New York last weekend, and we watched the Broadway musical Maybe Happy Ending, a show I’ve been eyeing ever since I first heard about it on the beloved Korean TV series Witty Mountain Village Life (《机智山村生活》) back in 2021. On the show, Jeon Mi-do, who originally played Claire in the 2015 Korean production, performed a sneak peek of “When you’re in love (Korean ver.).”

Maybe Happy Ending was inspiring for so many reasons, not only because it won the 2025 Tony Award for Best Musical, but also because of the story behind how this musical came to life. I kept wondering: how did a story written in two languages, set in futuristic Seoul, find its way to Broadway, and what kind of collaboration made that possible?

Photo credit: https://www.forbes.com/sites/jerylbrunner/2025/05/16/the-visionary-design-behind-the-broadway-musical-maybe-happy-ending/

A story about robots, but really about us

Set in Seoul in the 2060s, the story follows two helper robots, Claire and Oliver, who discover an unexpected friendship, and perhaps something deeper, after being left behind by their human owners.

It sounds like a sci-fi premise, but the story is actually about what it means to be human in a world increasingly shaped by technology, a question that feels ever more relevant today.

Minimalist, modular design

One of the most striking aspects of the production was how modular its design felt—from the stage to the characters to the narrative itself. It reminded me of the IKEA approach: minimal, yet versatile and powerful.

With only four actors and a few movable set pieces, the stage seamlessly transformed from a bustling city street to a ferry ride, and to a quiet forest filled with fireflies. The use of technology was equally thoughtful: lighting, projection, and spatial design amplified the performances, creating a sense of scale and emotional depth far beyond the size of the cast.

It echoed the game design principles I explored in my last blog post: identifying the essential elements that define an experience and make it special is key to effective narrative, and that applies just as much to entertainment design more broadly.

Photo credit: https://www.forbes.com/sites/jerylbrunner/2025/05/16/the-visionary-design-behind-the-broadway-musical-maybe-happy-ending/

What was it like to write songs in two languages?

Creating an original musical not based on existing IP is already rare, let alone one written in two languages. Maybe Happy Ending was created by Will Aronson (music & book) and Hue Park (book & lyrics), a Korean-American duo who have spent the past decade refining it across stages and cultures.

In an interview, they shared their music creation process: Will composed the music first, then Hue, a K-pop lyricist, wrote the lyrics. This approach gave the lyricist more control because he is interpreting the music and turning it into lyrics. They began with full Korean lyrics before adapting them into English.

This reveals something interesting about bilingual storytelling: in English, lyrics often need to rhyme and tend to be more straightforward and specific to the narrative. In Korean, lyrics lean toward sound and feeling, with words chosen for their musicality and their meaning. These linguistic and cultural differences shape how the lyrics interact with the music to drive emotion and story.

Photo credit: https://www.forbes.com/sites/jerylbrunner/2025/05/16/the-visionary-design-behind-the-broadway-musical-maybe-happy-ending/

Walking out of the theatre, I kept thinking about the emotional resonance the show created through such a minimal, modular design. It’s a great example of design for emotion: achieving depth through simplicity. That balance is something I aspire to bring into the things I design every day as well.

Two weeks after moving to New York

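A light rain is falling outside the window, and autumn has officially arrived in New York. Two weeks after moving here, my body and mind are still slowly adjusting. Moving from SF to New York is one of the bigger decisions I’ve made in recent years, and my feelings are mixed: leaving the good friends who kept me company in SF for so many years is of course sad. Although moving to New York wasn’t objectively necessary, making the decision felt like being pushed along, as if I had no choice but to do it. This week, while recording a podcast about the move at Tina’s place, I realized there was an important question: since I had wanted to go to New York right after finishing my undergraduate degree, kept saying “I’m moving to New York next year” during my first two years in the Bay Area, and always knew that the city’s atmosphere and cultural diversity would suit my personality and interests better, why did I wait six years to move?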

A heart-shaped moon at dusk, seen from the office

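Frankly, the conditions just weren’t ripe yet. During my six years in SF, I was still figuring out and learning to handle more basic things: foundational training at work, understanding myself, and building and maintaining close relationships. Just when I thought these areas were gradually stabilizing, the difficulties I ran into at work early this year (the exhaustion of frequently switching teams, the inefficiency and loneliness of collaborating remotely with a team in Europe, and doubts about how well I fit in a big company) forced me to re-examine whether my current life was really making me happy. In April, all kinds of pain peaked, and I seriously questioned why I was still staying in SF. Work takes up a large part of life, and SF is a city overshadowed by the tech industry and work, so the mismatch between the city and me became more obvious. Since the ups and downs of work are a reality I have to accept, if I want my state to be more stable and sustainable, I at least hope to live in a happier environment, so that my overall sense of well-being is steadier. So changing cities may look like breaking a certain stability, but for me it is an attempt to increase my inner sense of stability.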

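Deciding to move felt like a survival instinct to actively break out of a rut, and in the past few months two conditions happened to be ready. First, over the past two years I have come to understand my own state and what I want to do more clearly, and I now have the capacity to handle a change like moving to a new city alone. Second, I finally have enough courage to actively create the life I want, rather than just describing my hopes for it to other people. It sounds simple, but it took me a long time to truly grasp it and to be able to act on it step by step. Without changing jobs, breaking my inertia by moving to a new city is easier for me than making these changes while staying in SF, including testing the stability of the main thread of my life, becoming more action-oriented, and updating how I look at problems and how I socialize.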

Catching the tail end of summer in Central Park

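I still believe that “know thyself” is the most important question. As Naval Ravikant said, “What you do, who you’re with, and where you live are the three most important decisions in life.” Many people don’t give enough weight to choosing where to live, so I want to try out how well different cities fit me while I don’t yet have a partner and the cost is relatively low.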

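Whenever work or something I’m hoping for stalls, I feel a mild anxiety and want to escape my current environment, believing that a new environment will solve the problem. This is also a lesson I’m still learning: to accept life’s ups and downs with more patience, to trust the natural flow of timing, and to trust my intuition more. Everyone grows in a different way, and I need the feedback from colliding with new environments and new variables to get to know myself.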

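In these two weeks in a new environment, I’ve realized that I have fairly high requirements for the comfort of my surroundings and for change; that being with friends and leaving enough time to rest and write still bring me a lot of ease and joy; that I’ve become more patient and can take things slowly without rushing; that I’m better at actually spending time on the things I want to do; and that when I want to try something I do it right away, making only weekly plans, not yearly ones. On the other hand, the things I need to keep practicing are: expressing my needs clearly, not being lazy about independent thinking and organizing my thoughts, being content with a simplified, usable version, setting boundaries more comfortably, and no longer avoiding conflict.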

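Moving to New York is, for me, a kind of self-rescue, just like what starting this blog meant to me three years ago. Of course, I’m still not sure whether it’s the right decision, but simply by trying I will understand myself better, so either way it is progress. Even if I eventually decide to move back to California or leave New York, that too will be something gained from this journey.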

Hudson river park

Neighborhood bakery

Designing AI like a game

One of my favorite lenses in The Art of Game Design by Jesse Schell is this: game design is decision making. Every choice, from rules and pacing to risk, reward, and visual feel, is a deliberate decision shaping how someone experiences a world. Increasingly, this same mindset applies to AI designers and builders, who are making small and big decisions that define the nature of artificial experiences every day.

AI product design is shifting away from deterministic, linear interfaces toward something more dynamic and game-like. Think of the shift from scrolling through Instagram to co-creating stories with Sora: the experience becomes less about consuming information and more about interacting, improvising, and discovering. As AI tools begin to support play, curiosity, and connection, not just productivity, users gain greater agency, and designers must think more like game makers.

1. Designing relationships, beyond functionalities

How do people relate to games? What is it that makes them so compelling? “I like playing with my friends,” “I like the physical activity,” “I like feeling immersed in another world,” or “I like solving problems.” What makes designing a game different from designing a tool, like a trip planner? You could argue that a trip planner does feel like a game because travel planning is often full of aspiration and fun. But when you think about the games you’ve played, the experience is often deeply personal. Some games could be much more personally significant, memorable, and compelling for one player, yet mean little to another. That’s because gameplay enables imaginary experiences that are often unsharable and uniquely significant.

Unlike tool-based experience design, games offer something harder to define: a sense of freedom, responsibility, accomplishment, play, friendship, and emotional connection. These feelings aren’t outcomes, they’re relationships we form through interaction, feedback, and immersion.

As AI becomes more embedded in our lives, it begins to inherit the same tensions that game designers have long navigated: balancing user agency with automation, structure with freedom, guidance with exploration. A good game has the right amount of tension, challenge, and reward, whereas a bad game has too few or too many challenges. This resembles behavioral science's approach to nudge design: the right amount of friction creates meaningful engagement. Not everything should be "streamlined." Sometimes, intentional pauses, detours, and small resistances are what make an experience feel alive.

Xiangqi, or Chinese chess 象棋

2. From linear to non-linear narratives

One of the biggest shifts in designing for AI is moving from linear, deterministic flows to non-linear, probabilistic systems, where outcomes are more varied and randomized. With the same prompt or input, a generative AI tool might produce different outputs each time. Unlike traditional UX flows with clear cause and effect, where there is a fairly direct mapping between what designers create and what the reader or viewer experiences, games, and increasingly AI tools, require thinking in multi-dimensional interaction spaces.

Designers don’t just specify what happens; they define how things might happen, under what conditions, and with what probabilities. As in games, we give users more control over the pacing, sequence, and outcome of events. The craft becomes less about locking down exact UI and more about defining the criteria that guide a meaningful system in which the user co-creates the experience.
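
As a small illustration of why the same prompt can lead to different outputs, here is a minimal Python sketch of weighted random sampling. It is not how any particular model works internally: the candidate continuations, their probabilities, and the `generate` function are hypothetical stand-ins.

```python
import random

# Hypothetical continuations and their probabilities for a single prompt.
CANDIDATES = {
    "a quiet forest": 0.5,
    "a busy street": 0.3,
    "a ferry ride": 0.2,
}


def generate(prompt, rng):
    """Pick one continuation at random, weighted by its assumed probability."""
    options = list(CANDIDATES)
    weights = list(CANDIDATES.values())
    choice = rng.choices(options, weights=weights, k=1)[0]
    return f"{prompt} {choice}"


rng = random.Random()  # unseeded, so repeated runs can differ
for _ in range(5):
    print(generate("The scene opens on", rng))
```

The designer’s lever in this sketch is the table of weights rather than any single output, which is one way to picture “defining the criteria” instead of locking down the exact result.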

Black Myth: Wukong 黑神话:悟空 (2024)

3. Capturing the essence of an experience

How do you recreate the fight with your sister for the last piece of watermelon? Maybe it’s the heat of the summer air, the fan running loud in the background, the rules of rock-paper-scissors negotiated on the fly, or the sudden dash as someone grabs the fruit and runs. Sound, visuals, pacing, conflict, and rules all work together to not only convey a memory, but a felt experience.

The goal of a game designer is to figure out the essential elements that define an experience and make it special. Similarly, when we design AI products that help people learn, shop, create, or navigate, we’re designing for the feeling of accomplishment after taking a baby step in learning, the delight of self-expression and exploration, and the confidence of getting to places as planned. There are emotional layers beyond utility that define the essence of an experience, especially as experiences become more personal: what relaxes one person might overwhelm another, and what feels expressive to one might feel superficial to someone else.

4. Fun is about generating new questions

So how do we design play? Jesse Schell defines a game as a problem-solving activity approached with a playful attitude. The essence of play isn’t just action—it’s also curiosity. For example, when an assembly line worker tries to answer the question “Can I beat my record,” the reason for his activity is not just to earn money, but to indulge his curiosity about a personal question.

Activity feels more like “play” than “work” when it attempts to answer questions like “What happens when I turn this knob?” “Can we beat this team?” “What can I make with this clay?” “What happens when I finish this level?” When we design AI experiences, let’s ask: What questions does this experience raise for the user? What gets them to care? And what might spark even more questions?

The Legend of Sword and Fairy 4 仙剑奇侠传四(2007)

5. A tree is just a means to an end

We often talk about being human-centered. As experiences become more personal (think of the game that deepened your bond with a friend, or the one that inspired you to see things differently), I love this framing from Jesse Schell:

“If a tree falls in the forest, and no one is there to hear it, does it make a sound?”

“Well, what is a sound? … If our definition of sound is the experience of hearing a sound, then the answer is no, the tree makes no sound when no one is there.”

“The tree is just a means to an end. And if no one is there to hear it, well, we don’t care at all.”

As designers, we don’t care about the tree and how it falls, we care about the experience of hearing it. The design itself, the buttons, flows, algorithms, is just the container, and not the end. How people relate to it, experience it, remember it, is the end that we truly care about. That’s where game design and AI converge: they both ask us to care less about the system itself, and more about the human on the other end.

Design notes on Shanxi’s ancient architecture

This post was originally written in Chinese.

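I only became curious about Shanxi’s ancient architecture last year, because of the huge popularity of the game Black Myth: Wukong, and during my July vacation I decided to make the trip despite the scorching heat. The moment I landed in Shanxi from hot, humid Guangdong, I could feel a cool breeze; no wonder people like to come to Shanxi to escape the summer heat. Before setting off I knew very little about ancient architecture, with only a vague impression left from my undergraduate architectural history class, so I caught up with documentaries along the way. I especially recommend 《千年一窟看云冈》 and 《千年一塔看应县》 (roughly, “A Thousand-Year Grotto: Yungang” and “A Thousand-Year Pagoda: Yingxian”), presented by Tsinghua architecture professor Wang Nan: they are lively and fun, and he is very endearing.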

Mahavira Hall, Huayan Temple (Photo credit to Bonnie Luo)

The Auspicious Goddess (Jixiang Tiannü), Shanhua Temple (Photo credit to Bonnie Luo)

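Traveling from Taiyuan to Xinzhou, then on to Mount Wutai, Dai County, and Datong, I felt I had only glimpsed a corner of Shanxi’s hundreds of temples. Even in unremarkable small towns, you casually pass magnificent wooden pagodas or pavilions, with carefully considered plaques and roofs. Shanxi’s ancient architecture is best represented by Tang dynasty timber structures; the temple complex on Mount Wutai alone has more than three hundred temples, in a style that leans simple and weighty, unlike the intricate splendor of the Ming and Qing periods. Here are the notes I took along the way, visiting ancient temples and museums through the lens of modern design: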

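1. The philosophy of timber construction: Why did China favor wooden buildings while Greece favored stone? This was the first key question for understanding the difference between Chinese and Western architecture in my architectural history class. Beyond differences in natural resources, Chinese architectural philosophy places more emphasis on harmony with nature, pursuing “spirit resonance and vitality” (qiyun shengdong) and fluidity in spatial layout. Wood feels alive and can respond to the seasons and the passage of time. The Chinese architectural tradition sees buildings as dynamic, and values passing them on through renewal and repair rather than making them last forever. Greek architectural philosophy, by contrast, prizes perfect proportion and the eternal beauty of geometry (such as the “golden ratio” in sculpture), using stone as a medium to express rational order and immortality. Because timber buildings are hard to preserve over the long term, another interesting difference is that China’s surviving artifacts are actually concentrated underground: more weight was placed on preserving grand mausoleum complexes and burial goods (the Terracotta Army, ritual vessels, jade), in the hope that the dead could continue the standard of living and comforts they enjoyed in life, as a symbol of family power and status. Greece’s heritage is mainly above ground, such as temples, sculptures, theaters, and public buildings, which use public art to express the relationship between humans, gods, and heroes rather than an individual’s life after death. So tombs were relatively plain and did not try to rebuild the world of the living.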

Northern Qi Mural Museum

Dancer and musician figurines, Shanxi Museum

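2. The wishes a roof carries: In ancient times, Buddhist temples were vessels for people’s imagination of the divine, and the fun part is that their decoration also varied with each school’s method of practice. Jingtu Temple in Ying County was built according to the practice of the Pure Land school (one of the schools of Chinese Buddhism), whose core belief is practicing through reciting the Buddha’s name and visualizing the Western Pure Land. So the temple’s designers rendered their vision of the Western Pure Land on the main hall’s caisson ceiling (zaojing, the decorative top of the ceiling), which is more ornate than in other temples, and layer upon layer of dougong brackets were designed on the roof as well, symbolizing “heavenly palace pavilions.” In most temples the caisson ceiling is plainer; depending on the school and its method of practice, more attention might go to the forms of the arhats or the proportions of the bodhisattvas. I used to find that temples all looked alike, and visiting them left me at a loss. Seen from a modern perspective, the design logic of a temple is like a design centered on fulfilling visitors’ goal of worship, where both the visitors and the deities are users of the building. The temple’s decoration, circulation, and modes of interaction (burning incense, almsgiving, vegetarian meals, evening chanting) are then all vehicles that help people complete the task of worship.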

Caisson ceiling, Jingtu Temple, Ying County (Photo credit to Bonnie Luo)

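3. God-centered design: At work we always pursue “human-centered” design, but walking into Jinci Temple and ancient Buddhist temples, it took some getting used to that so much of the design is “god-centered.” Take the position of the opera stage at Jinci: at first I was puzzled about why it was placed that way, and only later learned that the stage was for performing operas for the deities, not for people. The door frames of many temple halls are also designed as “picture frames” through which the deities watch the passing crowds and the courtyard landscape. For visitors, of course, this also marks the boundary between the mundane world and the Buddha realm, creating a tranquil atmosphere of entering sacred ground layer by layer. Although these buildings were built for the gods, people can’t exactly interview deities to understand their needs, so the architecture mostly reflects people’s imagination of a god’s good life. Take the “heavenly palace pavilions” at Jingtu Temple: maybe the gods don’t live in palaces at all, and don’t eat the peaches of immortality either (laughs).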

Door frame of the main hall, Shanhua Temple (Photo credit to Bonnie Luo)

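4. An era of many cultures: The biggest surprise of the trip was learning that, starting in the Northern Wei period, Han culture in Shanxi blended with the cultures of the Western Regions from India (through which Buddhism reached China) and Persia (via the Silk Road). The murals, ornamental patterns, and sculptural styles in the grottoes and temples all reflect the religious and cultural diversity of the time. Merchants from Central and West Asia and the Western Regions brought glass, silverware, and artistic motifs into the daily lives of the residents of Pingcheng (today’s Datong), and also influenced the popular clothing and the dance and music styles of the Xianbei and Han peoples. This echoes the wave of globalization that began in the early 20th century: the close flow of people, technology, and culture between different communities and countries. As a place where the Western Regions, the Hu peoples, and the Han came together, Pingcheng at that time truly deserved the saying “美美与共,天下大同” (when all forms of beauty are appreciated together, the world is in great harmony).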

Northern Qi Mural Museum

Appendix

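5. How did Buddhism become localized in China? It was only on this visit to the Shanxi Museum that I realized Buddhism first spread into China during the Wei, Jin, and Northern and Southern Dynasties, the era when Datong in Shanxi (then called Pingcheng) was the capital. The Northern and Southern Dynasties were a time of constant war, and people were willing to believe in the Buddha because Buddhist teachings and practices encouraged enduring the suffering of this life in exchange for entering the Land of Ultimate Bliss in the next, so Buddhism actually flourished in wartime. Its localization is also reflected in sculpture and architecture: in the Yungang Grottoes, for example, the Buddha statues began to wear Han-style clothing rather than the traditional kasaya, so that Han believers could relate to them more easily. Past or present, the way to build connection and trust turns out to be the same: even a deity has to be down to earth.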

Murals inside the East Hall of Foguang Temple, Xinzhou: arhat friends in springtime Morandi colors (Photo credit to Bonnie Luo)

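6. Arhats in Morandi colors: Foguang Temple in Xinzhou was my favorite temple of the trip. It sits at the foot of Mount Wutai, with a simple, rustic architectural style and a quiet, comfortable courtyard full of green. The arhats in the East Hall are lively and charming, their expressions and clothing brimming with vitality; it’s hard to imagine that the craftsmen of that time already knew how to use spring and summer Morandi colors.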

Inner courtyard of Foguang Temple, Xinzhou (Photo credit to Bonnie Luo)

Back to the basics: when breath becomes air

Over the past few months, I’ve been rethinking what brings me joy and learning to be more in tune with how my body feels. Back to the basics. I don’t have all the answers yet, but I’ve gathered enough courage to take small steps forward without full clarity—trusting that clarity comes from action, not speculation.

2025 has been the year I’ve made the most progress in understanding myself, owning both the good and the bad. It’s freeing to feel more honest, accepting, and comfortable in my own skin. The reassuring news is confirming, yet again, that my anchors are simple and easily accessible: reading to explore and writing to make sense of it in my own way. These have quietly guided how I make decisions and how I spend my time.

The Philosopher’s Walk in Toronto, my favorite footpath in college (June 2025)

This past weekend, I picked up When Breath Becomes Air by Paul Kalanithi because it somehow felt like the right timing and his writing was indeed deeply moving. Even though I’m not in the medical field, I felt strangely connected to Paul’s motivation for practicing neurosurgery and how he viewed his role as a medical practitioner. Like him, I also double majored in Literature and Life Science, and was struck by the way he brought a multidisciplinary lens to his work. His approach reminded me of what I strive for in my own role as a researcher: bringing more human-centered perspectives into how we apply emerging technologies to real-world problems.

Paul sees medicine and neuroscience as disciplines where biology, morality, literature, and philosophy intersect. It had never occurred to me that some of life’s biggest questions, about identity, death, and meaning, often arise most urgently in medical contexts. Paul originally considered studying biological philosophy, but chose to gain direct experience through practicing medicine instead. That tension between abstract critique and hands-on impact echoes my own journey, where I’ve also chosen to be closer to applied research and product development rather than staying solely in behavioral science papers and literary theories. I wanted more direct impact and to learn from the messiness of implementation.

“The highest ideal was not saving lives—everyone dies eventually—but guiding a patient or family to an understanding of death or illness.” That line stayed with me. It’s not about heroism, it’s about guiding someone to make sense of the hardest things: what kind of life is worth living, and what quality of life is acceptable after treatment. In a strange way, it reminded me of my own work. Not because it’s equally weighty, but because research, too, is about helping teams understand perspectives that are often ambiguous, subjective, and deeply human. It’s about making space for hope, fear, love, beauty, envy, striving—things that don’t show up in dashboards or metrics, but are central to the experience of technology.

I was also struck by Paul’s reflection on the early meaning of the word patient: “one who endures hardship without complaint.” It’s a gentle reminder to meet people where they are. To see people not as problems to be solved, but as whole beings to be understood.

Toronto’s summer in King’s College Circle (June 2025)

Hope you’re enjoying the summer green as much as I am! (As you can tell, all the images are green and blue.) See you in the next post.

Keeping Moore's Law alive

Semiconductors, better known as “chips,” might sound abstract if you don’t work in hardware. But they power nearly everything in our daily lives: phones, laptops, cars, and increasingly, the infrastructure behind AI. The real challenge today isn’t just about having enough data, it’s also about having the computing power to process it. Chips are the bottleneck, and producing them is staggeringly complex, capital-intensive, and geopolitically sensitive.

Like many people in tech, I’ve heard the word “semiconductors” thrown around all the time, but never really understood why they’re so central to everything. A few friends recommended Chip War by Chris Miller last year, and I finally finished it a few weeks ago and loved it. Here are four ideas that stuck with me:

1. Moore’s Law becomes an industry growth roadmap

Moore’s Law is often described as an observation: the number of transistors on a chip doubles roughly every two years, driving exponential increases in computing power.

What I didn’t realize is that it became a self-fulfilling growth roadmap for the semiconductor industry—a shared goal that governments, investors, and chipmakers aligned around. Despite concerns about physical and economic limits, companies organized roadmaps around making this "law" true, treating it less like physics and more like a shared mission. It’s a powerful example of how a narrative turns a forecast into a coordination mechanism for a global industry.
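
To get a feel for what “doubling roughly every two years” implies, here is a small Python sketch of the arithmetic. The function name and the example numbers are mine, and the two-year period is the rule of thumb quoted above, not a measured figure.

```python
def projected_transistors(initial_count, years, doubling_period_years=2.0):
    """Project a transistor count under an idealized doubling rule.

    Assumes perfectly regular doubling every `doubling_period_years`,
    which real chips only approximate.
    """
    return initial_count * 2 ** (years / doubling_period_years)


# Relative growth over 20 years: ten doublings, i.e. 2**10.
growth_20y = projected_transistors(1, 20)
print(growth_20y)  # 1024.0
```

Ten doublings in twenty years is roughly a thousandfold increase, which helps explain why an industry would organize its roadmaps around keeping the pace.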

Photo by SpaceX on Unsplash

2. Early demand rarely points to the final use case

When transistors were first invented, few people knew what to do with them. Beyond replacing bulky vacuum tubes, their potential seemed limited, much like how it’s cognitively hard for people to envision how AI could fundamentally change our lives today.

What changed everything was an unexpected early adopter: the U.S. military. Defense agencies and NASA needed compact, high-performing electronics for missiles and space exploration in the 1960s. That early niche demand gave semiconductors a launchpad to scale production. As costs dropped, chips moved into everyday consumer products: radios, calculators, and eventually, personal computers.

What struck me most was a surprising parallel with modern UX and product strategy: Fairchild Semiconductor didn’t just wait for demand to emerge. They actively imagined it, creating detailed blueprints of future consumer devices powered by chips before the market even existed. It was a way to reduce uncertainty and spark demand, much like today’s visionary product mockups or AI pitch decks that help people visualize what doesn’t exist yet.

Photo by NASA on Unsplash

3. Why did Intel fall behind in the AI race while Nvidia took the lead?

Intel led the personal computing revolution in the 1970s, driven by Bob Noyce’s bold bet on microprocessors and his belief in the future of personal computing, a vision few shared at the time. But in the AI and graphics era, Intel struggled to keep up, especially in advanced chip manufacturing and AI infrastructure, where Nvidia and TSMC moved faster and captured the momentum.

Despite early investments in foundational technologies like EUV tools that enabled GPU development, Intel was slow to pivot towards AI computing. Nvidia, on the other hand, recognized the opportunity early and bet aggressively on AI acceleration, developing CUDA and positioning its GPUs as the backbone of AI computing. What began as a graphics company transformed into a core infrastructure player for AI.

Beyond technical challenges and leadership strategy, company culture played a key role in this divergence. Intel’s structured, risk-averse environment prioritized predictability and incremental progress, a pattern consistent with the classic innovator’s dilemma, where incumbents hesitate to disrupt their own successful models even as new paradigms emerge. In contrast, Nvidia built a fast-moving, mission-driven culture with a flat hierarchy and tight feedback loops. Under Jensen Huang’s leadership, the company has been able to move quickly and shape the AI landscape. Building a timeless company isn’t about one single bold move; it’s about making the right bets at the right time, again and again.

Photo by TangChi Lee on Unsplash

4. How Asia broke into the high-value part of the supply chain

When we think of semiconductors, we often picture Silicon Valley. But today, the center of gravity for advanced chip manufacturing lies in Asia.

Taiwan produces nearly 40% of the world’s new computing power each year. South Korea dominates memory chips, and Japan supplies critical materials like silicon wafers and specialty gases. Europe and the U.S. still lead in chip design and manufacturing equipment, with ASML’s EUV lithography machines and Arm’s architectures as examples, but the most complex and valuable manufacturing steps are concentrated in Asia.

This shift wasn’t accidental. Asian governments took a proactive, hands-on approach, shaped by a Confucian-influenced philosophy of state-guided development. They funneled capital and pushed banks to fund strategic sectors, hired US-trained engineers, kept their exchange rates undervalued, and secured tech transfer through partnerships. In Taiwan and South Korea, support from the U.S., motivated in part by geopolitical rivalry with Japan, also played a key role.

Today, power in chips isn’t just about who makes them; it’s also about who buys them. China, though behind in cutting-edge chips, controls massive demand for lower-end components. That market power gives it leverage, as it remains both the U.S.’s biggest customer and its biggest competitor. It’s a complex balance of dependency and rivalry, one shaped as much by market dynamics as by politics and culture.

It’s fascinating to learn how one of today’s most critical industries has been shaped not just by technology, but by the interplay of markets, culture, and geopolitics. As we explore emerging use cases for AI, the history of the chip industry offers a reminder that technological shifts are rarely just about engineering; they’re about timing, narrative, and the systems we build around them.

The River Siddhartha Met

Years after first reading Hermann Hesse's Siddhartha in college, I revisited the novel and noted a few themes that stood out for me. This time, I resonated with Siddhartha much more around understanding the limitations of seeking wisdom from others, the idea that life’s meaning lies in the act of living itself, and the depth of insights offered by nature and fictional narratives.

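This week I reread Hermann Hesse’s Siddhartha, a novel I have loved since my student days. Back then I had experienced too little, so I followed the plot but only half understood the metaphors behind it. Reading it at twenty-seven, I still have questions about some of Siddhartha’s views, but I resonate with and understand far more of what he goes through at different stages of his life.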

1 The limits of the sages’ teachings

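As a young man, Siddhartha realizes he must search for the “Self” (Atman), so he leaves home to follow the Samanas, hoping to escape the self through discipline and meditation. Before long, however, he finds that learning meditation and self-denial only offers a temporary escape from life’s pain and sense of meaninglessness, and cannot bring real peace. The wisdom of sages is, in the end, a summary of someone else’s experience and cannot replace one’s own realization. Many people follow Gautama (Shakyamuni) and treat him as their faith, but if believers have no teaching and law of their own within, they will struggle to truly find salvation. Only by walking your own path and honestly facing your inner longings can you find peace. It is hard to create your own game rather than play one that others have set up, but perhaps that is the path closest to the right answer.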

Photo by 雨空 on Unsplash

2 Living is the meaning of life

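Our generation is often encouraged to search for the meaning of life. I still think the search matters, but I have also begun to accept that life may have no fixed meaning, and that living and continuing to experience things is the meaning itself. Getting to know yourself better by colliding with different people and events, and being able to face and accept yourself honestly, is already a hard enough lesson. So when I read that Siddhartha turns from seeking wisdom from sages to “taking myself as my teacher and coming to know the mysterious Siddhartha,” it resonated deeply. To the earlier Siddhartha, forests, stars, animals, and rivers held no meaning, so he looked past all things. Once he stops asking about meaning, he can see the world clearly, simply appreciate the beauty of nature, and observe more keenly. The “Self” cannot be captured by thinking alone or by following others’ experience; it takes actually living, and listening to your inner voice and the signals your environment gives you. I used to look past these signals too and was not very sensitive to them, but I increasingly agree that important decisions require waiting for a signal, and that apart from waiting patiently and strengthening your fundamentals, there is not much else to do. This follows a similar logic to something 学霸猫 once said: “Apart from taking good care of yourself, a person has nothing else to do.”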

3 Behind a playful mindset is detachment from real life

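Living with a playful attitude also means looking on coldly, settling for having fun, and being unwilling to throw yourself fully into life and work. For a long time I was in this not-quite-serious, detached state myself, not caring enough about many things and feeling that it was fine as long as they were fun. As a knowledge worker, even though I love theory and building frameworks to deconstruct how people and the world operate, and take pleasure in it, over the past few months at work I have increasingly felt that theories and concepts have limited value on their own; they create real value only when the ideas are realized. A relaxed attitude may look like a healthier mindset on the surface and does reduce anxiety, but it also reflects a lack of inner conviction and an avoidance of responsibility. This is my homework for the second half of the year: learning to be more “present” and being willing to take on real risk and responsibility.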

4 What listening to the river reveals

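After Siddhartha enters worldly life as a merchant and gambler, he sinks into despair under a dull and meaningless existence. He comes to the river, intending to end his life. Standing before it and hearing the vitality in the murmuring water, he suddenly realizes that his suffering comes from chasing material things and indulging in comfort, which cost him the ability to love and to see all things. He discovers the river’s secret: “It flows on without rest, yet is always here. It is forever the same river, yet new at every moment.” This brings to mind two metaphors of the river. First, a person’s growth is like the river’s constant flow and cycle: starting over again and again, learning something and then making mistakes again, going through disappointment and pain, and then getting back up. Even when the way is muddy, one follows it wholeheartedly; flow and instability are the norm. Second, the thousands of voices Siddhartha hears in the water, “the voice of a king, the voice of a soldier, the voice of a bull, the voice of a nightingale, the voice of a woman giving birth, the voice of one who sighs,” symbolize the diversity and complexity of life. Listening to the river is also listening to the voice of life. The water rushes toward the lake, the rapids, and the sea; it reaches its goal and then rushes toward a new one, which is also a metaphor for the human life cycle, generation after generation.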

5 The freedom and breadth of fiction

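Finally, looking back at my reading preferences over the years, my view of fiction and nonfiction has changed a great deal. As a student I read a lot of fiction and even chose literature as one of my undergraduate majors. But starting in my senior year, for work and professional reasons, I began reading more applied nonfiction, including a great deal of behavioral economics and social science. Only in the past few months, as my work entered a new stage, did I happen to rediscover my interest in fiction and realize that it has more freedom to explore metaphors and questions that reality cannot reach, such as the river in Siddhartha or the library and the garden in Borges’s short stories. They let us step outside the frame of reality and give us room to think about broader questions.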

P.S. Check out this song 河流(River)

The world as reinforcing cycles

Ray Dalio’s Principles for Dealing with the Changing World Order: Why Nations Succeed and Fail presents a comprehensive, longitudinal approach for understanding the world, one that our recency bias sometimes forgets. For those who have primarily experienced periods of growth or focused on the post-WWII era, including myself, it can be difficult to envision a world radically different.

To understand and navigate the complexities of our time, it’s important to explore a broad range of historical examples of how nations rise and fall, which would help uncover the fundamental, timeless patterns that shape these cycles. Dalio’s method of analyzing the intricate forces at play and synthesizing the cause-and-effect relationships behind historical progression is a powerful model. Personally, it has inspired me to rethink how we might study the complexity of user and market behaviors, especially how we could distill principles and patterns to better understand and guide the seemingly complex behaviors of AI/ML models as UX and product builders.

Dalio, Ray. Principles for Dealing with the Changing World Order

01 A 1400-year perspective

Looking back 1,400 years (~600 CE), human productivity has steadily increased global wealth and living standards. While different societies rose at different times, the reasons were consistent — education, inventiveness, work ethic, and economic systems turned ideas into output. For example, wealth once centered on agricultural land, then on machine output, and now on digital data and information processing.

Dalio, Ray. Principles for Dealing with the Changing World Order

Personal notes

  • Time scale: Dalio’s 1,400-year perspective is a powerful reminder that our current experiences are just a tiny part of a much larger cycle. Understanding our position within these cycles is crucial for discerning what truly matters amid the noise.

  • Diverse and global perspective: Drawing insights from a diverse, large sample size across space and time is essential. Too often, we only focus on a single country, missing valuable lessons that a global perspective can surface.

  • Cause-and-effect relationships: As we shift towards building probabilistic experiences with ML and AI, our role as designers and product builders increasingly involves defining and communicating the underlying cause-and-effect relationships that guide model behavior. Seeing how Dalio studied and presented the cause-and-effect patterns that drive historical progression is an inspiration for effective communication of complex insights.

Guiding question

  • How might we build collective intuition for long-term thinking?

02 Reinforcing nature of rises and declines

Productivity evolves steadily but doesn’t cause sudden shifts in wealth and power. These shifts come from cycles driven by logical cause-and-effect relationships, such as booms, busts, revolutions, and wars.

My biggest takeaway is the reminder that strengths and weaknesses are mutually reinforcing. For example, education, competitiveness, economic output, and share of world trade each contribute to the others being strong or weak, for logical reasons. This also reflects the old Chinese saying, “That which is long divided must unify; that which is long unified must divide.” (分久必合,合久必分)

Dalio identifies eight key determinants of a nation’s strength: education, competitiveness, innovation and technology, economic output, share of world trade, military strength, financial center strength, and reserve currency status. These determinants reinforce each other, driving a nation’s rise, peak, and decline:

  • Rise: Strong leadership, inventiveness, and education foster a strong culture and efficient resource allocation, leading to economic growth, strong markets, and financial centers.

  • Peak: The nation enjoys prosperity with low debt and minimal gaps in wealth, values, and politics, under a stable world order. However, within capitalist systems, uneven financial gains widen the wealth gap.

  • Decline: Excessive borrowing and financial bubbles weaken the nation as debt rises and wealth, values, and political divides grow. Emerging rivals challenge the nation, leading to a painful restructuring.

Dalio, Ray. Principles for Dealing with the Changing World Order

Personal notes

  • Reinforcing dynamics: Although I’ve long heard the saying that our weaknesses tend to hide behind our strengths, it wasn’t until reading this book that I truly saw how powerful this is at the scale of history: across humans in aggregate, strengths and weaknesses reinforce each other in a cyclical pattern.

  • Mirroring business lifecycles: The rise and fall of nations closely mirrors the lifecycle of a business—from growth to maturity to decline. Similarly, a strong founding team that allocates resources efficiently is more likely to achieve product-market fit, driving rapid growth. However, at its peak, a business may develop inefficiencies that undermine its strengths. Its ability to remain a market leader depends on managing these growth factors effectively.

Guiding question

  • What are the key factors that drive and hinder a company’s growth, and how can we accurately assess them to ensure long-term investment in the right areas?

03 Measuring real value

Dalio’s articulation of how the debt cycle works is the best I’ve seen, so it’s worth going into more detail in this section. In a capitalist system, money, credit, and economic growth are the biggest influences on how wealth and power rise and decline. The gap between real and market value varies at different points in the cycle, and a typical long debt cycle goes as follows:

  • Early stages: With little or no debt, hard money like gold is used for transactions because no trust/credit is required. Later, to avoid the risks and inconvenience of carrying metal money, credible parties issue paper claims on hard money, which soon function as money itself.

  • Middle stages: Initially, the number of paper claims matches the hard money in reserve. Over time, the appeal of credit and debt grows, leading to trouble when income can’t cover debts, or when claims on money outpace the growth of actual assets or goods to back them up, making debt repayment impossible.

  • Late stages: In a debt crisis, printing money becomes the quickest way to reduce debt, allowing the credit/debt cycle to restart. This approach, though not well understood, seems beneficial because it alleviates debt, obscures the harm to holders of money and debt assets, and inflates asset values in a depreciating currency, giving the illusion of increased wealth.

Dalio, Ray. Principles for Dealing with the Changing World Order

Personal notes

  • Measuring real value: Evaluating real value is crucial in early-stage research, where the goal is to uncover the honest, unfiltered opinions and behavior of users. Observing where real and market value diverge or align helps guide investment in products and infrastructure, especially in overhyped areas like generative AI and agents.

  • Aligning with ground truth: As builders of probabilistic ML models, how do we evaluate if our predictions actually align with ground truth to guide model training and iteration? This is a fascinating area to be further explored to keep our products truly user-centered.

Guiding question

  • How might we create a clear feedback loop for model iteration and collaboratively define principles for ML model behavior, involving UX, product, engineering, and data teams?

Being able to combine multi-disciplinary thinking to expand our perspective has been my ongoing passion, and Dalio sets a great example for the kind of in-depth, longitudinal studies needed to unpack the complexity of our world, uncovering clear cause-and-effect relationships that are easy to understand and learn from.

Ending on a personal note: sharing a photo of the Chapel of Souls (Capela das Almas) in Porto with y'all since I've been traveling in Portugal lately. Humans are so small in front of this ;)

Paris: French Kaleidoscope

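In May I took a vacation in France with my family, revisiting Paris and the south of France. My last visit was six years ago. After a long time in California, my perspective and aesthetic sense can become narrow. Paris is a city that deeply respects beauty and the humanities, and I love the free, elegant, and relaxed spirit of Parisians.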

01 Design in Paris

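Paris is about to host the Olympic Games this July, and watching the city prepare felt refreshing. A fun detail is that the Olympic venue painted the traditionally brick-red running track a pale purple (“shades of lavender”). This seemingly small, convention-breaking choice is creative without adding much cost, and it shows off French design skill perfectly. In addition, outside the National Assembly (Assemblée Nationale) there is a special set of sculptures that reimagine the armless Venus de Milo as Greek deities competing in different sports: one plays tennis, one shoots a bow, and one surfs. Unlike the ivory white of traditional Greek statues, these use bright, highly saturated Olympic colors, which gives them both contrast and energy. These two are the most creative Olympic designs I have seen so far, and my favorites.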

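The Jardin des Tuileries is still my personal favorite park. I first paid attention to its design because of photographer Guillaume Lavrut’s fountain series. Before the trip I happened to be reading the chapter on park design in Jane Jacobs’s The Death and Life of Great American Cities, and walking through the garden this time I could clearly feel its people-centered design. The heart of the garden is not the landscaping but the tree-lined promenade for people to stroll along. The central fountain is elegant and simple, which draws attention to the green reclining chairs around it. Every day, large numbers of visitors and locals sit around the fountain chatting, eating baguettes, daydreaming, going on dates, or getting some air after the Louvre; these lively people around the fountain are the real protagonists of the park. All kinds of people gather around the fountain and naturally form a stage center, where those seated are both audience and performers. As Jacobs notes in the book, a good park needs a sufficiently rich mix of commercial and residential areas around it, so that people with different needs come and go at different times of day, and it also needs a stage center that naturally draws people together, so that the park feels lively and full of energy. I think this is the design philosophy of the Tuileries as well.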

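Another memorable example is the design of Paris Charles de Gaulle Airport. The tables and chairs in the terminal pair velvet upholstery with vintage brass handles, and the lamps are shaped like fireworks about to burst. The overall design blends modern simplicity with classical opulence; it must count among the most beautiful airports in Europe.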

Jardin des Tuileries

Assemblée Nationale

Jardin des Tuileries

Charles de Gaulle airport

02 The French Open: Disneyland for tennis fans

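The French Open far exceeded my expectations; it felt like a Disneyland for tennis fans. The grounds have three main show courts where the star players compete. Inside the main venue there is a broadcast studio, with journalists reporting on the matches on site. Outside are a number of outdoor practice courts where spectators can watch players up close, and spectators can come and go freely after each point. Walking through the grounds, for the first time I truly felt the warmth of tennis fans from around the world gathering for a single sport.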

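The day-session ticket I bought was for the third round, and at one of the main courts, Court Philippe-Chatrier, I watched the men’s singles match between Sinner of Italy and Kotov of Russia. Watching them give everything to play at their best and calmly answer every attack, while the crowd followed every movement on court with full attention, was moving. That intense, focused atmosphere is a unique experience that is hard to replicate through a screen.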

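A line is inscribed in the stadium: “Victory belongs to the most tenacious.” It struck me deeply at the time. It is not only encouragement and a reminder for the players; doing anything well takes focus, stamina, and tenacity of will. Watching Sinner and Kotov, I could clearly see each player’s strengths and weaknesses from the stands, and I kept thinking that a person’s strength really does hide their weakness. Sinner, for example, is good at hitting balls close to the lines, which are hard to return but also easily go out; Kotov likes shots that barely clear the net, which are also hard to return, but he often loses points when the ball is hit too softly and fails to clear it. Watching them, I realized that the form and style we show in sport is also a metaphor for how we work and understand our own strengths and weaknesses in life. I later learned that by the semifinals Sinner was in fact the top-ranked player in men’s singles by points; the match looked evenly balanced only because he needed to play just well enough to beat his opponent. Indeed, victory belongs to whoever can stay on court the longest.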

Stade Roland-Garros

Court Philippe-Chatrier

Stade Roland-Garros

03 Human masterpieces in the Louvre

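Revisiting the Louvre, I was once again struck by the caliber of its collection. I am especially fond of group portraits, and my favorite is Jacques-Louis David’s The Coronation of Napoleon (1807). Standing before a work on this scale, you feel your own smallness, which sets off the grandeur of the world it depicts. The crowd in the painting watches Napoleon crown the empress, each with a different expression and private thoughts. This is why I like group portraits: everyone lives through certain historical events, and even as seemingly insignificant bystanders, their varied individual perspectives are meaningful in themselves.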

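After visiting Italy, my appreciation of sculpture improved a lot. This time at the Louvre I especially loved the ancient Greek statue Winged Victory of Samothrace (Victoire de Samothrace, c. 190 BCE). It expresses the idea of victory to the fullest. Although the statue has lost both arms, the wings behind her and her forward-leaning pose convey an unforgettable grandeur, at once elegant, light, and powerful.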

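What art collections and competitive sports share is people’s love of exploring human potential. In the vast Louvre, the museum map marks only a dozen or so “must-see” works. Like champions in sport, these works have survived more than a thousand years of history, and only works of the highest caliber still resonate with universal values today. This is also why the subjects of enduring works revolve around the most fundamental human desires, such as the pursuit of truth, goodness, beauty, power, love, and war. The art movements of each era have different media and modes of expression, but the underlying human desires and emotions they express do not change.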

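This also corresponds to a simple, practical standard for judging art: when you stand in front of a work and something about it moves you, it has value. Personal preferences gradually and imperceptibly make up the values of contemporary society, and the taste of each era selects the masterpieces on the Louvre map that have stood the test of time.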

Victoire de Samothrace

Vénus de Milo

04 A mirror for experiencing the human world

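The best immersive experience was the Pinault private museum (Bourse de Commerce - Pinault Collection), which mainly shows the contemporary and emerging art François Pinault has collected over more than fifty years. The building was formerly Paris’s commodities exchange and was later renovated in a contemporary architectural style by Japanese architect Tadao Ando. Inside is a central exhibition space that gives visitors more varied routes and viewing angles.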

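My favorite this time was the mirror installation by Korean artist Kimsooja in the new exhibition Le Monde Comme Il Va (The World As It Is). The exhibition is named after Voltaire’s philosophical tale, in which an angel sends an envoy to the human world to observe human behavior. Faced with the contradictions and uncertainties of human society, the gods are unsure whether humans deserve to go on living or should be destroyed to create a better civilization. In the end, the angel decides to let the world be as it is, trusting that humans can take charge of their own fate.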

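In response to this story, Kimsooja covered the floor of the rotunda with an enormous mirror, and visitors can put on shoe covers and walk on it. The mirror seems to be a transparent medium, yet it honestly reflects the surroundings, the many different people in the space, and the blue sky through the museum’s glass dome. Standing on the mirror and seeing the world upside down, your awareness of the environment intensifies. I love the metaphor of the mirror: by continually going through experiences, meeting different people, and colliding with different environments, we use all of these as mirrors to reflect our own state. This is especially true in intimate relationships, where our perception of the other person often just mirrors what is in our own mind. The mirror also reminds us to pay attention to real people and concrete situations, rather than seeing only the side we want to see. This is also my life lesson for this year: to truly know and improve myself by accepting and engaging with real environments and other people.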

Bourse de Commerce - Pinault Collection

Bonus: Château La Coste

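About 20 minutes’ drive north of Aix-en-Provence in Provence is an art winery, Château La Coste. Its founder is a Northern Irishman who loves wine and art, and the concept is very similar to Donum Estate in California, combining a winery with an art center. The estate has a number of buildings designed by Tadao Ando, a walkway designed by Ai Weiwei, and one of Louise Bourgeois’s giant spider sculptures. It was nearly sunset when we arrived; we had found the place on the map by chance just before leaving, which felt like fate. I hope to stay a few days next time.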

Crouching Spider, by Louise Bourgeois (2003)

Drop, by Tom Shannon (2009)

Art Centre, by Tadao Ando (2011)

Mater Earth, by Prune Nourry

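During two weeks of switching off in France, my body was traveling but my mind was still digesting thoughts from work in the first half of the year. In a new environment I was still sensitive to the design logic of things and the role people play within it, but I also realized that what I see is really just what I care about. My eyes always land on human-centered design, convention-breaking creativity, how people operate, how works spread, and so on. I also notice the abilities I have long wanted, such as tenacity in the arena and the openness to accept the real world. Seeing that people in different cultures are all moving forward in confusion made me feel that I am not struggling alone. Perhaps, as the exhibition The World As It Is at the Pinault museum suggests, what matters is staying aware of the real world, knowing yourself, and trusting that human agency will keep carrying us forward.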

What makes a great park?

Why are some parks so lively and popular, while some are so lonely and even unsafe? When we think about how to improve our neighborhood, many would say we need more parks and open space. Parks have been perceived as a cure that can uplift a neighborhood, stabilize real estate value, and bring the community together — but that is a false reassurance because park behaviors are actually pretty volatile and extreme.

In the book The Death and Life of Great American Cities, one of the most influential books in the history of American city planning, Jane Jacobs talks about the use of neighborhood parks and the drivers that are critical in the making of a vibrant, well-loved park. She advocates for community-based planning and the importance of preserving diverse, mixed-use neighborhoods.

At the time, this book was written as a critique of the top-down city planning approach advocated by Robert Moses, the most powerful urban planner in NYC in the mid-20th century, who believed in large-scale urban renewal and modernizing the city at the expense of disrupting existing neighborhoods. Just like humans, parks have distinct “park behaviors” and layers of complexity, shaped by a mix of design, urban planning, and psychology.

01 Mixture of Primary Use

The top driver of a park’s success is a mixture of primary uses surrounding it. When you think of a lively park, what matters most is having enough people who enter and leave the park at different times. That requires a mix of primary uses around the park, including residential, office, and small business. That’s why parks in a financial district tend to be less lively: people there all operate on the same daily schedule, so they enter the park at the same time, then leave after work hours. For most of the day and the evening, the park is empty. When an area has a single, dominant use, it imposes a limited schedule, which leads to the vicious circle of an unpopular park.
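To make the scheduling argument concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: the `VISIT_HOURS` table is invented rather than taken from Jacobs or from any survey, and the two helper functions are hypothetical names. It only shows how a single dominant use leaves long empty stretches in a park's day, while a mix of uses fills them.

```python
# Hypothetical hours (24h clock) when each kind of neighbor tends to use the park.
# These numbers are made up for illustration only.
VISIT_HOURS = {
    "office": {12, 13, 17},
    "residential": {7, 8, 18, 19, 20},
    "small business": {10, 11, 14, 15, 16},
}


def active_hours(uses):
    """Return the sorted hours in which at least one surrounding use brings visitors."""
    hours = set()
    for use in uses:
        hours |= VISIT_HOURS[use]
    return sorted(hours)


def longest_empty_stretch(uses, open_from=7, open_until=21):
    """Return the longest run of consecutive empty hours while the park is open."""
    active = set(active_hours(uses))
    longest = current = 0
    for hour in range(open_from, open_until):
        current = 0 if hour in active else current + 1
        longest = max(longest, current)
    return longest


if __name__ == "__main__":
    for uses in (["office"], ["office", "residential", "small business"]):
        print(uses, "active hours:", len(active_hours(uses)),
              "longest empty stretch:", longest_empty_stretch(uses))
```

With these invented schedules, the office-only park sits empty for hours at a time, while the mixed-use park is rarely without visitors. Real neighborhoods would need observed visit data in place of the table.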

Photo by Lison Zhao on Unsplash

02 Diversity of Park Design

Besides schedule and usage diversity, Jane Jacobs identified 4 essential design elements that bring diversity to a lively park at different levels. The first is what she termed intricacy: at eye level, a vibrant park offers enough stimulation for different uses and moods. When the park is too small, or its design is so flat that you can see the whole park at a glance, there isn’t enough stimulation at eye level to keep people around. Changes in the rise of the ground or the presence of various focal points introduce subtle differences at eye level that keep people curious to explore.

San Francisco’s hills and valleys are a great example of this principle: the ups and downs introduce intricacy at eye level. Another is the classical gardens of Suzhou, where landscape features, rocks, hills, and rivers are all strategically placed to offer subtle visual stimulation from every angle.

The other related elements are centering and enclosure. Just like a good story, a park has its climax, as in the hero’s journey. We can think of a park as a stage with a center, where everyone is both spectator and performer at once. And finally, the sun is also important; without it the park feels gloomy and sad, which attracts fewer people.

03 Differentiation & Demand Good

The third driver of a lively park is thinking about how it differentiates. There are many parks in a city and sometimes they have similar purposes to each other. Just like building a product, we need to think about what specific, differentiated value a park provides, because there are only so many people in a city and parks essentially are fighting for attention and limited time, just like TikTok and Instagram. 

Jane Jacobs encouraged us to think about the “demand good” for a specific park. For example, a nice landscape by itself isn’t a demand good, but sports fields, swimming pools, or activities like carnivals are. We can also figure out the demand good by observing a park’s natural use. This again resembles product development, where we identify product-market fit by observing how real needs are met by our offerings. This is the beauty of multi-disciplinary learning: we see similar patterns in how things operate across seemingly different fields.

Ultimately, the making of a beloved park isn’t just about the design of the park itself; it’s about nurturing diversified neighborhoods capable of using and supporting parks. It is a great example of how a design’s long-term popularity depends on designing the ecosystem around it and considering its overall context.


There is a video version of this post on YouTube if you prefer a visual walkthrough.

HCI paper review: alignment in the design of interactive AI

In 2024, I set the goal to learn more about designing for human-centered AI and would love to share my learnings from reading academic papers in the field as part of the journey. The hope is to make knowledge in design, behavioral science, and human-computer interaction friendly and accessible for everyone.

In this blog post, I’ll share my review notes for the paper AI Alignment in the Design of Interactive AI: Specification Alignment, Process Alignment, and Evaluation Support by Michael Terry, Chinmay Kulkarni, Martin Wattenberg, Lucas Dixon, and Meredith Ringel Morris. 2023. arXiv:2311.00710.

User Interface Shifts in Computing History

For those who are unfamiliar with user experience (UX) or human-computer interaction (HCI), here is a high-level overview of how user interfaces (UI) have evolved in the past 60 years:

  1. Batch processing: The first general-purpose computer was introduced around 1945. The UI was a single point of contact where people needed to submit a batch of instructions (often a deck of punched cards) to a data center, then they would pick up the output of their batch the next day. It was common to need multiple days to fine-tune the batch to produce the desired outcome.

  2. Command-based interaction: Around 1964, the advent of time-sharing (multiple users sharing a computer’s resources for tasks) led to command-based interaction, where users and computers can take turns, one command at a time. In particular, graphical user interfaces (GUI), using visual elements that convey information and actions a user can take, have become the dominant UX since the launch of the Mac in 1984. A strength of GUI is that it shows the status after each command if designed well. Users don’t need to have a fully specified goal initially because they can reassess the situation and modify their goal/approach as they progress.

  3. Intent-based goal specification: With the third UI shift, represented by current generative AI (e.g. ChatGPT, Gemini), the user tells the computer what outcome they want but does not specify how it should be accomplished. Today, users primarily interact with the system by issuing rounds of prompts to gradually refine the outcome, a form of interaction that is still poorly supported and leaves rich opportunities for usability improvements and innovation.

From batch processing to command-based interaction, the speed of fine-tuning the desirable outcome improved drastically. However, with the third shift to human-AI interaction, the lack of transparency of how the AI performs a task, especially for the increasingly complex and high-consideration scenarios, presents new UX challenges for the HCI community today.

Interaction cycle for human-AI systems

The ultimate goal of human-AI interaction is to efficiently achieve a desirable goal for the user. Today, this process involves 3 basic steps: user input, system processing, and system output.

Different from the traditional command-based interaction, where a user monitors and gives commands at every step in the process, with an AI system, the user’s skills shift to focus on (1) being clear and effective at articulating the goal and providing input, and (2) once the output is available, being able to assess if their goal has been achieved.

As an analogy, the human’s role switches from being the executor (taking main control to execute) to being the manager (telling another person to execute for you). It requires a different set of skills and mindset, just as when an individual contributor switches to a manager role. For a team to be effective, the manager can’t micromanage every step; otherwise, overall productivity decreases. In this case, what are the key touch points where humans (the manager) need to intentionally “align” with the AI system (the executor) to ensure the interaction is effective?

Overview of the paper

To ensure an AI produces desired outcomes, without undesirable side effects (also termed “AI Alignment”), Terry et al. introduce 3 dimensions to consider as we address user interface challenges with AI systems: Specification alignment, Process alignment, and Evaluation support.

  • Specification alignment is the first step in human-AI interaction, where the user defines the desired outcome for the AI system to execute. In addition, the paper points out the importance of specifying constraints (e.g. safe, cost-effective, aligned with human values). As an extreme example, consider the paperclip thought experiment, where an AI is tasked to produce as many paperclips as possible. The AI may eventually start destroying computers, refrigerators, or anything made of metal to make more paperclips, which is not aligned with how humans would achieve the goal.

  • Process alignment refers to giving users the ability to view and/or control the AI’s underlying execution process. The paper proposes mechanisms that (1) let the user understand how the system executes the task, in terms humans can follow (“means alignment”), and (2) let the user modify those choices (“control alignment”).

  • Evaluation support is the final step, where users validate that the AI’s output meets their goals. As AI becomes increasingly capable of difficult and complex tasks, a significant challenge is evaluating its outputs. The problem of evaluation can be further divided into two problems: (1) verification, confirming that the AI’s output correctly and completely fulfills the user’s intent, and (2) comprehension, understanding the AI’s output, with comprehension being a much more important problem to solve.
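The three dimensions above can be read as three checkpoints in one interaction loop. Below is a minimal Python sketch of that reading. It is my own illustration, not code from the paper: `Specification`, `paraphrase`, `review_plan`, and `evaluate` are hypothetical names, the "AI plan" is a hard-coded list standing in for a real model, and the checks are whatever predicates the user chooses to supply.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Specification:
    """What the user wants, plus constraints the outcome must respect."""
    goal: str
    constraints: list[str] = field(default_factory=list)


def paraphrase(spec: Specification) -> str:
    """Specification alignment: restate the request so the user can correct it."""
    limits = "; ".join(spec.constraints) or "none stated"
    return f"Goal: {spec.goal} | Constraints: {limits}"


def review_plan(plan: list[str], rejected: set[int]) -> list[str]:
    """Process alignment: the user sees each planned step and can veto steps by index."""
    return [step for i, step in enumerate(plan) if i not in rejected]


def evaluate(steps: list[str],
             checks: dict[str, Callable[[list[str]], bool]]) -> dict[str, bool]:
    """Evaluation support: run each user-supplied check and report pass or fail."""
    return {name: check(steps) for name, check in checks.items()}


if __name__ == "__main__":
    spec = Specification("make as many paperclips as possible",
                         ["use only the supplied wire"])
    print(paraphrase(spec))  # the user confirms or edits this before any work starts

    # A stand-in for the plan an AI system might propose (hard-coded here).
    plan = ["order wire from the supplier",
            "melt down office computers for metal",
            "run the bending machine"]
    kept = review_plan(plan, rejected={1})  # the user vetoes the second step

    report = evaluate(kept, {
        "no property destroyed": lambda steps: not any("melt down" in s for s in steps),
    })
    print(kept, report)
```

The point of the sketch is where the human appears: once before execution (confirming the paraphrase), once during (vetoing steps), and once after (running checks), rather than at every command.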

Personal notes

1. Cognitive challenges with defining outcome. Counterintuitively, this step can be tricky because humans are not good at knowing or describing what they want initially, especially for complex and high-consideration tasks. Considering human cognitive limitations, it’s important to account for the process by which users learn, then gradually understand and become able to describe their goal. This resembles a classic decision-making challenge when people shop. Although you know the goal is to buy a vacuum, you still need to go through the lengthy process of reading articles to learn about its major categories and functionalities and talking to friends and family before you know what you truly need and want. As with shopping research, the learning process is where we gradually build confidence in our judgment. Open question: how might we help users learn while maintaining efficiency in the process? One idea could be dynamic, personalized support for more or less explanation as users specify the requirement.

2. Verifying interpretation upfront. One way to improve specification alignment for general-purpose AI is to let users verify and correct the AI’s interpretation of the intended outcomes before it proceeds. I love this direction because it resembles how real-life human collaboration works. Think about the manager and IC example: to ensure your project goal is aligned with what your manager has in mind (which could sometimes be under-specified or ambiguous), paraphrasing the requirement and sharing your plan of action beforehand helps confirm that you and your manager are on the same page. For future research, it will be interesting to explore (a) how real-life human collaboration and communication best practices can be applied to human-AI interaction, and (b) the right balance between efficiency and the effort spent on verification.

3. Bridging the Process Gulf with a Surrogate Process. The paper introduces the concept of Process Gulf, as an extension of Norman’s concepts of the Gulfs of Execution and Evaluation, that highlights the gulf that can arise between a person and an AI due to the qualitatively different ways in which each produces an outcome. For example, a diffusion model for image generation transforms an image of statistical noise into a coherent image, an image creation process unfamiliar to most people. To bridge the Process Gulf, the paper proposes creating a simplified, separately derived, but controllable representation of the AI’s actual process, also termed a Surrogate Process. With a more accessible representation of the set of choices the AI needs to make in the process, the user can better intervene and guide the execution. Open question: since AI systems can be understood at many levels of abstraction, what’s the right level of explainability so that humans can easily understand and control how AI solves a problem?

4. In-context evaluation and learning. Today, an AI tasked to recommend clothes you like would simply show you visuals of the clothes for at-a-glance evaluation. However, when the task becomes complicated, like creating code for an app, the AI system may provide comments, a natural language summary, or an architectural diagram of the code produced to help you evaluate. Future research: exploring ways to provide simple, dynamic, and accessible explanations (e.g. visuals, links to learn more) of the outcome produced would be useful for in-context evaluation and learning. As the paper alluded to, this also helps users understand the state of the problem after the AI performs some work.

5. Control mechanisms inspired by real-life tools. The importance of control mechanisms has been discussed extensively in the HCI community, and I especially love the principles outlined in the People + AI Guidebook. When thinking about the appropriate levels of control, the common mechanism is providing parameters for a user to play with. For example, in Midjourney (a text-to-image model), users can adjust the “chaos” parameter to produce variations of the image. However, no support is currently provided to understand how a particular value will impact the generated images. Relatedly, as an interesting research exploration, PromptPaint lets users influence text-to-image generation through paint medium-like interactions, using the paint palette metaphor to provide more control. As a result, it helps users specify their goals at greater granularity and gives them the ability to modify the choices involved as the AI is producing the image. Future research: based on the specific task, what other real-life metaphors can be referenced as inspiration for control mechanisms (like the paint palette for image generation)?

In PromptPaint, the user can specify the area of generation by brushing (dark grey) with a prompt stencil. When the user completes brushing, the tool starts generating a part of the image while showing the process to the user (Chung and Adar, 2023)

6. Interactive alignment for multi-users. The paper primarily discusses the user interface challenges and opportunities of a single user interacting with a single AI. As the paper alluded to, it would be useful to consider alignment for interactions that include multiple parties, which introduces additional dimensions and complexity. For example, consider an AI engaged in a music creation task involving two people. Future research: how would the alignment goals, processes, and dimensions evolve for a wider range of collaboration scenarios?

Thanks for reading

This post covers a broad set of themes in the AI alignment problem space. In upcoming HCI paper reviews, I’d love to explore specific use cases and verticals in the field. If you have any thoughts or suggestions, please leave a comment or get in touch!

Thanks to Bonnie Luo and Benjamin Yu for helpful discussions and feedback.

References

Jakob Nielsen. 2023. AI: First New UI Paradigm in 60 Years. https://www.nngroup.com/articles/ai-paradigm. Accessed: 2024-03-01.

John Joon Young Chung and Eytan Adar. 2023. PromptPaint: Steering Text-to-Image Generation Through Paint Medium-like Interactions. In Proceedings of UIST 2023. Association for Computing Machinery, New York, NY, USA, 17 pages. https://doi.org/10.1145/3586183.3606777

Michael Terry, Chinmay Kulkarni, Martin Wattenberg, Lucas Dixon, and Meredith Ringel Morris. 2023. AI Alignment in the Design of Interactive AI: Specification Alignment, Process Alignment, and Evaluation Support. arXiv 2023, arXiv:2311.00710.

Negotiation theory and human agency

Negotiation is a term I used to associate mostly with business or politics, involving intense debates and advocating for the interests of each party. However, I began to appreciate and explore this concept more intentionally last year, when I was exposed to a more diverse set of collaboration scenarios. I realized that negotiation is everywhere, and that understanding its history, philosophy, and practice is important for thinking about how humans interact in a world of complexity. With a background in behavioral science, human-computer interaction, and design research, I began to see deeper connections between negotiation and each of these fields.

Okochi Sanso Garden, Kyoto 京都大河内山庄

Evolution of negotiation theory

Two notable milestones in negotiation literature are Getting to Yes (1981) by Fisher and Ury, and Never Split the Difference (2016) by Chris Voss. The former focuses on identifying interests and creating value for both parties, while the latter recognizes the emotional nature of negotiation and emphasizes the importance of building tactical empathy to gather information and influence the other party's thinking.

The shift from objectively identifying a win-win solution to challenging the idea of seeking a compromise is fascinating and counterintuitive at first. As the title Never Split the Difference suggests, Voss believes it’s better to not make a deal if compromise is involved. Instead, drawing from his experience as a former FBI hostage negotiator, he focused on uncovering Black Swans, which are hidden pieces of information that can change the course of a negotiation and push the other party towards a deal. This became his primary strategy for finding unconventional solutions.

This evolution in negotiation philosophy is an interesting parallel with the shift from classical economics to behavioral economics — both evolved to recognize the limitations of purely rational and utility-maximizing models. Similar to Never Split the Difference, behavioral economics shifts the focus from simplified, rational economic models to a more nuanced understanding of human behavior, which is shaped by emotions, biases, and heuristics.

Konchi-in Temple, Kyoto 京都金地院

Human agency at heart

People want to be heard, understood, and respected. In Never Split the Difference, building tactical empathy in negotiation means ensuring sufficient trust and safety for a real conversation to begin. Since change represents uncertainty and people want to be in control, saying no to a proposal is the easiest way to maintain that control and the status quo. This completely changed my perspective on the nature of negotiation because it’s ultimately about addressing fundamental human needs with psychological principles. It’s not just about fighting for individual interests; it’s much more about building connections, helping each other feel in control, and identifying creative solutions together.

Another memorable idea is that “Yes” has multiple layers (i.e. counterfeit, confirmation, and commitment), while “No” is the gateway to “Yes.” Saying “No” allows us time to pivot and adjust, creates an environment for the one “Yes” that matters, and gives us an opportunity to convince others that the proposed change is more advantageous than maintaining the status quo. Negotiation, then, is the process of helping the other party feel protected and safe, so they can consider other possibilities with a relaxed mindset.

This also resembles the dynamics of how humans interact with technology, especially with AI systems. When systems (e.g. algorithms) collect human input (e.g. data) without making people feel heard, respected, or in control, it becomes difficult to establish a genuine conversation (e.g. engagement). An effective feedback and control mechanism needs to account for human motivation and provide a clear incentive structure, so that the value and impact of input is meaningful. When considering human-computer interaction through the lens of human-machine negotiation, it’s interesting that we’re applying similar psychological principles to help individuals maintain their agency as a foundational need.

Practice of tactical empathy

When it comes to concrete steps for building tactical empathy or uncovering Black Swans, the approach in Never Split the Difference shares a lot of similarities with user experience research moderation practices. Methods like asking calibrated questions, focusing on discovery and uncovering insights, and active listening are all familiar to researchers. Although the relationship between a user and a researcher isn’t a negotiating one, the process and desired outcome are similar. Both the negotiator and the researcher aim to uncover insights about the other party to deeply understand their needs, so they can identify unconventional solutions or framing that change the course of the conversation or strategy.

Finally, the practice of emotional labeling reminds me of methods used in psychotherapy. It involves identifying and verbalizing the predictable emotions of a situation, which helps build empathy and insights for both parties. Once an emotion is labeled, we can talk about it without getting wound up, because using language to objectify negative thoughts makes them less frightening and disrupts their raw intensity.

Nanzen-ji Temple, Kyoto 京都南禅寺

At its core, negotiation is not about being competitive and skillful in applying complex methods or tactics. It is all about creating the right environment for genuine connection and conversation to begin.

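Notes on Japanese design

I spent the end of the year on vacation in Japan, and throughout the trip I kept noticing the distinctive Japanese design aesthetic. I deeply admire the willingness to think everything through, with care and patience, down to the last detail. From grand architecture and landscapes to the smallest details of food and tableware, the same design philosophy comes through: respect for nature, a return to simplicity, and a skillful blend of the modern and the traditional.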

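01 Landscape and architecture

When visiting Shoren-in Temple in Kyoto, the moment I stepped onto the bamboo mats in the inner hall of Kachoden, I was drawn in by the lush green garden, my gaze extending from the inner courtyard to the pines, pond, and bamboo grove of the outer garden. The design blends interior and exterior views, treating the garden and its natural surroundings as a whole; the structure of the building really exists to bring out the beauty of the garden. Sitting on the mats, the green that fills your view makes you quiet down without noticing. Shoren-in's design perfectly embodies the principle of Seijaku (静寂, tranquility) in Japanese aesthetics: it invites calm contemplation, with nothing showy or restless about it.

Kachoden hall, Shoren-in Temple

Inner garden, Shoren-in Temple

Main gate of Chion-in Temple, Kyoto

Gate of Jojakko-ji Temple, Arashiyama

Whenever I visit a temple, I like to linger in front of and behind the main gate, because I enjoy watching the gate act as a picture frame for the moving scenery inside it. With visitors coming and going, the two sides of the gate become a scene of their own. The gate of Chion-in in Kyoto is especially spectacular, larger and grander than the gates of other temples. Standing at the bottom of the steps and looking up at the gate, the view inside the frame keeps changing with every few steps forward: the bamboo grove grows larger as you approach, and a little further on you can see travelers about to step over the threshold, looking left and right.

Azabudai Hills, Tokyo

Beyond the structural beauty of traditional temples that merge interior and exterior views, Azabudai Hills, a development in Tokyo newly completed at the end of 2023, is also striking in its lines and shapes. The design concept behind Azabudai Hills is the "Modern Urban Village": it uses latticed planters, pergolas, and curved lines, and its roofs are covered with greenery, like a green hill in the middle of the city. It was designed by British designer Thomas Heatherwick, who also designed the Vessel in New York. Azabudai Hills embodies the Japanese aesthetic principles of Fukinsei (不匀整, asymmetry) and Datsuzoku (脱俗, freedom from convention). The former means making a design irregular, asymmetrical, never rigid, and full of surprise; the latter means a free and unconstrained style that goes beyond tradition and convention with fresh thinking.

Kiyomizu-dera Temple, Kyoto

Kinkaku-ji Temple, Kyoto

Japanese architecture has a calm, quiet side, but also a side that uses vivid color boldly. Unlike the vermilion favored in Chinese imperial architecture, Japanese temples often use livelier orange-red, rose red, or bright green, with higher saturation. Kiyomizu-dera and Fushimi Inari Taisha in Kyoto both use orange-red as their main color.

Unlike Chinese imperial architecture, which is often decorated with ornate patterns, the Shariden, the main building of Kinkaku-ji, has no ornate patterns, yet its exterior walls are covered extensively in gold leaf, producing a beauty that is magnificent and at the same time rigorous and restrained. This contrast keeps the design simple and free of excess decoration, while one or two clever elements give it a surprising, bold, and creative style.

I first heard of Kinkaku-ji as a child, reading a short-story anthology from the magazine Shaonian Wenyi (《少年文艺》). One story was about two people who met on a BBS forum and arranged to meet at Kinkaku-ji, and that gave me my first impression of the name. Now that I have experienced the beauty of Kinkaku-ji at dusk in person, I feel more connected to that story.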

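02 Cuisine, tableware, and menu design

Tofu rice in broth (in front of Jojakko-ji Temple, Arashiyama)

Chilled Kyoto eggplant (ここら屋 烏丸店)

Japanese cuisine says a lot about the Japanese attitude toward life. Finding food in Japan is very forgiving: even in a hidden little shop tucked away in a corner, you can enjoy simple, ordinary deliciousness. Modern life tends to be optimized for efficiency in every respect, so leaving room while traveling to try new things, and letting more chance and the unknown into your life, is a rare experience. Japan offers exactly the kind of dining environment where you can explore with peace of mind, without worrying about a bad pick.

What stayed with me most in Japan were actually the simplest small dishes. Seemingly plain ingredients such as tofu, eggplant, daikon, dried fish, and scallions were rich and fragrant, very much like the architecture: simple, yet able to withstand scrutiny, and memorable.

We stayed four nights in Kyoto, and had dinner on two of them at an izakaya we loved, ここら屋 烏丸店 (Kokoraya Karasuma), which specializes in Kyoto home cooking and which we found by chance. I especially liked the design of its handwritten menu: the visual hierarchy is clear, the orange-red headings echo the orange-red common in Kyoto temples, the handwriting is fluid and elegant, the vertical layout has a classical feel, and the signature dishes and the shop name are emphasized with vermilion seals. I quietly admired it for a long while when ordering.

Menu (ここら屋 烏丸店)

Plum ochazuke (ここら屋 烏丸店)

We happened to welcome the New Year in Tokyo and tried osechi ryori, the traditional Japanese New Year food. It is said that since the Heian period, to avoid angering the god of fire, people in Japan have tried not to cook with fire during the New Year, so boxes of food that keep for a long time are prepared in advance to be eaten over the first three days of the year, which also gives homemakers a proper rest. The boxes mostly hold small dishes, all made from ingredients with auspicious meanings. The word for black beans is a homophone for "diligent," symbolizing a healthy, hardworking, and steady year. Herring roe symbolizes prosperity for one's descendants, as numerous as the eggs. Dried sardines were originally used as fertilizer, so they symbolize a good harvest in the coming year. Shrimp stand for longevity in Japan, with the meaning of "living until your back is bent."

Osechi ryori box

Crab chawanmushi (Miyake Akira, Tokyo)

Iolite Coffee Roasters, Kyoto

Beyond the food itself, the design and pairing of Japanese tableware is also thoughtful. Two examples stood out. The first was a crab chawanmushi at an omakase dinner in Tokyo, where the dish and its vessel cleverly echoed each other: the chawanmushi was served in a crab shell, matching the crab legs on the plate. The second was Iolite Coffee Roasters, a colorful cafe in Kyoto, where everyone's coaster and cup come in lively "dopamine" colors, and when serving water the owner first sets down a coaster that matches the color of the coffee cup to come. The beans are excellent and intensely fragrant; it is a small shop I hope to revisit.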

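03 Maps and guides

I have loved studying maps since I was a child, because it is always interesting to see how the complex, multidimensional real world can be simplified from different angles. While visiting gardens, temples, museums, and airports in Japan this time, I came across some striking examples of maps. Good design, it turns out, makes you marvel every time, "This is really useful, it is exactly what I need," and it has raised my standards for similar experiences.

Mountain view guide at Okochi Sanso Garden, Arashiyama

Map of Konchi-in Temple, Nanzen-ji, Kyoto

Mountain view guide at Okochi Sanso Garden, Arashiyama, Kyoto: based on where you are standing, the guide introduces the mountains in the foreground, middle ground, and background. It reminded me of past hikes to a summit: I could take in the distant panorama, and the open view was stunning every time, but I knew nothing about the buildings, rivers, and mountains in front of me and did not know what exactly I was admiring. It was only a vague sense of the vast beauty of nature. Once you know what lies in each direction, you have a clearer picture of the scenery in front of you, and scenery and feelings that can be described are easier to remember.

Oil-painting garden map at Konchi-in Temple, Nanzen-ji, Kyoto: a garden map done as an oil painting is rare. It shows the actual scenery in miniature on the map, unlike the maps common at scenic spots in China, which tend to abstract out only the routes. The painting of the garden pond even has a hint of Monet's water lily pond.

Junior Guide to an exhibition at the Mori Art Museum, Tokyo

Gate map at the security checkpoint, Haneda Airport

Junior Guide at the Mori Art Museum: it is rare to see a guide made specifically for young visitors, and I was impressed when I saw it. It also shows the benefit of accessibility design: although the main purpose is to help young people understand exhibit descriptions that are sometimes obscure, in practice every visitor benefits, including adult tourists like me who are unfamiliar with the subject. The readability of the Junior Guide gives all amateurs at the exhibition content that is easier to understand, which enriches the overall experience. It is an excellent example of human-centered design.

Gate map at the security checkpoint, Haneda Airport, Tokyo: Haneda was packed at the end of the year, and the security line was long. Standing in line, you could look up and see a map of the boarding gates, which made clear roughly which direction each gate was from security and how far away. The map looks simple, but placing it at security is clever: it is convenient for people who are short on time and need to rush to their gate right after security, and it eases the anxiety of not knowing where the gate is while waiting.

These map examples show how deeply Japanese designers understand user needs, and how they avoid a classic pitfall in user research: many people ask users directly which features they want, but people find it hard to describe or predict their own preferences and needs accurately, which is what makes understanding user needs truly difficult. Good design meets the real needs that users cannot articulate themselves, like showing the gate map at airport security or including a junior guide when curating an exhibition. These designs look simple, but when you see them you feel looked after.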

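04 Human-centered design combined with tradition

Circular frame in a shop window

Plant-shaping clip

Netting over gravel

While wandering the streets in Japan, I also noted some design details that look simple but made me marvel:

  • A shop window includes a circular frame; pulling down the blind also changes the shape of the frame, making the view through the window from either side lively and playful

  • In a cafe, I saw elegant green clips used to shape plants; compared with the usual wire or green ties, the clips have beautiful lines and a green that matches the plants, so the overall effect is more harmonious

  • The grounds of Japanese temples are mostly covered in gravel; laying netting over the gravel keeps fine stones from being kicked up in strong wind

  • Haneda Airport provides slippers at security for travelers who need to take off their shoes, which is both hygienic and considerate; as a visitor I felt taken care of

Shoren-in Temple, Kyoto

Afterword

After leaving Japan, while organizing my notes, I realized that behind these aesthetic details lies wabi-sabi (侘寂), a design idea rooted in Zen philosophy: a Japanese aesthetic and worldview that is simple, restrained, and centered on accepting transience and imperfection. Because of its geography, Japan often faces sudden, irregular, and devastating natural disasters, and so had to think early about how to live with nature as a force that gives vitality and inspiration but is at times dangerous and destructive. This also explains why respect for nature runs through Japanese design. Interestingly, the residents of the Hawaiian Islands, also in a volcanic region, revere nature too, but they live with its impermanence through an attitude of living in the moment and enjoying life, along with warm communities, dance, and meditation. In Japan, by contrast, people use the quieter, more restrained aesthetic of wabi-sabi to understand and face the inevitable passing of life and the imperfection found in nature. Thank you, Japan, for the design inspiration. Until next time.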

Elon Musk's production philosophy

Reading notes from Elon Musk, by Walter Isaacson.

Photo by NASA on Unsplash

01 Musk’s Production Algorithm

First-principles thinking, a well-established practice, involves questioning all assumptions about a problem and creating new solutions from the ground up. Reading about how Elon Musk applies it extensively in design, engineering, and manufacturing, from building rockets to designing car factories, made me appreciate even more its elegance in finding creative solutions to complex problems.

Here is Elon Musk’s version (“Algorithm”) of applying first-principles thinking:

  1. Question every requirement. Each should be attributed to the person who made it. Think for yourself and don't simply follow instructions. Never accept a requirement blindly just because it comes from a department.

  2. Delete any part or process you can. You may have to add them back later. In fact, if you do not end up adding back at least 10% of them, then you didn’t delete enough.

  3. Simplify and optimize. This should come after step two. A common mistake is to simplify and optimize a part or a process that should not exist.

  4. Accelerate cycle time. Every process can be speeded up. But only do this after you have followed the first three steps. In the Tesla factory, Musk mistakenly spent a lot of time accelerating processes that he later realized should have been deleted.

  5. Automate. That comes last.

Noteworthy examples of applying first-principles thinking:

  • Innovate on material and product design. When designing the Cybertruck, the Tesla team initially considered using titanium for its durability. However, Musk was reevaluating the material choice for SpaceX's rocket ship at the time and realized stainless steel could be a viable option, which could also be used for a pickup truck. A stainless steel body eliminated the need for painting and could bear some of the vehicle’s structural load. This opened up new possibilities for a more futuristic and edgier design, featuring straight planes and sharp angles, which pushed the team to explore new ideas.

  • Cut costs in the auto and rocket industries. Musk believed that reusable rockets were essential for establishing a multi-planetary civilization, particularly for sending humans to Mars. To achieve this, he introduced the concept of the "idiot index," which measures the ratio of a component's total cost to the cost of its raw materials. A high idiot index indicates overly complex design or inefficient manufacturing processes. By reducing the idiot index, Musk aimed to lower the cost of rocket production and make space travel more affordable.

  • Reinvent the policy incentive structure. Musk proposed an alternative incentive process to the traditional "cost-plus" contracts used by NASA and the Defense Department. Instead of providing detailed specifications and awarding contracts to large companies, SpaceX introduced a new method where private companies bid on specific tasks or missions. This approach allowed SpaceX to have more control over the design and construction of their rockets. They took on financial risk and were only paid upon successfully completing milestones, which incentivized results and fostered innovation.
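The "idiot index" mentioned above is a simple ratio, so it can be sketched in a few lines. The snippet below is only an illustration: the `Component` class, the function names, and all cost figures are hypothetical and do not come from the book. It shows how ranking parts by the ratio of total cost to raw-material cost surfaces the ones most worth questioning.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """A part with its finished cost and raw-material cost (same currency)."""
    name: str
    total_cost: float
    raw_material_cost: float


def idiot_index(component: Component) -> float:
    """Ratio of a component's total cost to the cost of its raw materials."""
    if component.raw_material_cost <= 0:
        raise ValueError("raw_material_cost must be positive")
    return component.total_cost / component.raw_material_cost


def rank_by_idiot_index(components: list[Component]) -> list[tuple[str, float]]:
    """Return (name, index) pairs sorted from highest ratio to lowest."""
    scored = [(c.name, idiot_index(c)) for c in components]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical figures, for illustration only.
parts = [
    Component("bracket", total_cost=120.0, raw_material_cost=10.0),
    Component("valve", total_cost=900.0, raw_material_cost=300.0),
    Component("nozzle", total_cost=5000.0, raw_material_cost=200.0),
]

for name, index in rank_by_idiot_index(parts):
    print(f"{name}: {index:.1f}")
```

In this made-up example the nozzle has the highest ratio, so it would be the first candidate for a simpler design or a cheaper manufacturing process, even though the valve has a larger absolute cost than the bracket.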

Another important production philosophy of Musk is to have tight end-to-end quality and cost control through vertical integration, while also applying first-principles thinking:

  • Design-manufacturing feedback loop: Musk follows the principle instilled by Steve Jobs and Jony Ive at Apple, where design is not just about aesthetics but also about connecting the looks of a product to its engineering. However, Musk takes it a step further by applying this obsession not only to product design but also to the underlying science, engineering, and manufacturing. This approach highlights the importance of connecting the art form with its underlying science, which is one of the key themes in Zen and the Art of Motorcycle Maintenance.

  • Redesigning the manufacturing process: While creating a good car is important, Musk believes that creating efficient manufacturing processes and factories is even more crucial. In order to have tight control over the manufacturing process, Musk redesigned the assembly line. This involved questioning every requirement, making quick decisions to change or remove elements, and iterating on a daily basis. This design-manufacturing feedback loop has given Tesla a competitive advantage in its manufacturing process, resulting in solutions that are simple in design, communication, and cost.

02 Inspiration from science fiction, toys, and games

Musk often thinks at the scale of what truly impacts humanity, and this includes endeavors in space travel, internet infrastructure, sustainable energy, and artificial intelligence. He believes technology does not automatically progress; it requires human agency. While he sees the mass production of electric cars as inevitable, he believes that becoming a space-faring civilization is not. For example, although America sent men to the moon between 1969 and 1972, there had been little progress until Musk founded SpaceX.

Musk had founded SpaceX, he liked to say, to increase the chances of human consciousness surviving by making us a multi-planetary species. Tesla and SolarCity were established to lead the way towards a sustainable energy future. Starlink was created to promote freedom of information, while Optimus and Neuralink were launched to develop human-machine interfaces that would protect us from malevolent artificial intelligence.

Photo by Leyre on Unsplash

Beyond thinking at the grand, historical scale of humanity, I really enjoyed learning how he took inspiration from science fiction, toys, and games, exemplifying combined creativity at the intersection of arts and technology.

Inspiration from science fiction

The most notable source of sci-fi inspiration was The Hitchhiker’s Guide to the Galaxy, which features a supercomputer designed to figure out the “Answer to The Ultimate Question of Life, the Universe, and Everything.” Moreover, the idea for Neuralink was inspired by the Culture novels by Iain M. Banks, space-travel stories that feature a human-machine interface technology called “neural lace” that is implanted into people and can connect all of their thoughts to a computer.

Inspiration from toys

Musk mentioned getting ideas from the design of toys and the production process of the toy industry. For example, a little model car inspired him to make real cars using big casting presses, and Legos helped him understand the importance of precision manufacturing. When Musk inspected the Lego factory floor, he learned that each piece is accurate and identical to within ten microns, which means any part can easily be replaced by another. Precision is not expensive; it’s mostly about caring to make it precise. On the production process, Musk also learned that toy companies need to produce things very quickly and cheaply without flaws and manufacture them all by Christmas.

Inspiration from video games

When someone proposed something conventional during the design of new Tesla models, Musk would push back and point to the vehicles from the video games Halo and Cyberpunk 2077, or from Ridley Scott’s movie Blade Runner, as design inspiration. The other genre of games Musk loves is strategy games, including Civilization, Warcraft: Orcs and Humans, and Polytopia. Players in these games compete to win a military or economic campaign using clever strategy, resource management, and decision-tree tactical thinking. Musk’s passion for strategy games provides a window into his intensity, focus, competitiveness, die-hard attitude, and love of strategy in business.

Photo by NASA on Unsplash

03 Bridging virtual and physical AI

Solving full autonomous driving means solving real-world AI. Musk attempted to bridge virtual and physical artificial intelligence with Tesla and Twitter/X, which could provide the data sets and the processing capability for both approaches: teaching machines to navigate in physical space and to answer questions in natural language. In his grand vision, with Full Self-Driving, the Optimus robot, and the Dojo ML supercomputer, Tesla will not just be a car or clean-energy company; it will be an artificial intelligence company that operates not only in the virtual world of chatbots but also in the physical world of factories and roads.

Musk sees his ventures as different, yet connected experiments for exploring AI.

  • Tesla: Besides freeing people from the drudgery of driving, Tesla aims to eliminate the need for people to own cars. Musk envisions a future dominated by Robotaxis, driverless vehicles that can be summoned, take passengers to their destinations, and move on to the next customer. While some Robotaxis may be owned by individuals, most would be owned by fleet companies or Tesla itself.

  • X/Twitter: The video footage from self-driving cars and the posts on Twitter offer vast flows of real-time data for training and analysis. Musk sees the Twitter feed as a representation of humanity's collective knowledge, capturing real-life human conversations, news, interests, trends, arguments, and lingo.

  • Neuralink: A device to explore the human-machine interface by connecting our brains to computers through a skull-implanted chip. This allows for faster information exchange and promotes collaboration between humans and machines.

  • OpenAI: Musk initially had the vision to make OpenAI truly open, allowing many people to build systems based on its source code. He believes that the best defense against AI misuse is to empower as many individuals as possible with AI technology.

Photo by NASA on Unsplash

04 The other sides of Elon Musk as a human

Much more can be said about Elon Musk's achievements as a technology entrepreneur. However, beyond his accomplishments, the biography also sheds light on other aspects of Elon Musk's life, including his role as a partner, father, and friend. It explores his risk-taking nature, maniacal sense of urgency, recklessness, mood swings, and occasional toxicity towards people around him. Elon Musk exemplifies the human experience: daring to dream big and change the world, while also grappling with his limitations and weaknesses.

If anything, reading Elon Musk made me realize the greatness we can achieve as humans, while reminding us of the underlying trade-offs we’re constantly making for ourselves, our loved ones, and the environment.

Finite and infinite games as modern analogies

I was first introduced to Finite and Infinite Games in 2016. The book is by James P. Carse, who was a Professor of History and Literature of Religion at NYU. At the time, its influence was cited by Chinese tech entrepreneurs like Wang Xing, founder of Meituan, and by Kevin Kelly, cofounder of Wired magazine and author of Out of Control. Recently, I picked up the book out of curiosity, to rethink the familiar concept of gameplay in everyday life.

Photo by eleonora on Unsplash

01 Boundary differentiates finite and infinite games

As the title suggests, Carse argues that there are two types of games in the world: finite and infinite. A finite game is played for the purpose of winning, with an agreed winner and an end, whereas an infinite game is played for the purpose of continuing the play.

To play a finite game, players must agree to a set of externally defined spatial and numerical boundaries. For example, a game is played in that place, with those people. Each game is defined by its rules, or its range of limitations on the players, which allow considerable room for choice within those restraints and by which the players can agree who has won.

In contrast, an infinite game does not have such boundaries. The rules of an infinite game evolve to prevent anyone from winning the game and to bring as many people as possible into the play. This resembles the mechanics of open-world games like Minecraft and The Legend of Zelda, where players can freely choose how to approach the game without the traditional linear structure. In short, finite players play within boundaries; infinite players play with boundaries.

Emotionally, it’s interesting to note that a finite game feels serious because of its competitive, zero-sum nature, whereas an infinite game comes with laughter and feels like play because new possibilities are continuously discovered and explored with other players.

02 Power through title is won in a finite game

In a finite game, what one wins is a title. When a person is known by title, the attention is on a completed past and may take a person out of play. On the other hand, infinite players are only known by their names and the attention of others is open to the possibility of their future interactions. This is a gentle reminder to focus on the concrete, specific person, instead of the abstract titles.

Carse argues that titles are theatrical, where each title has a specified ceremonial form of behavior. This reminds me of a common improv acting technique: assigning titles to your partner to shape the relationship in the narrative. For example, using titles such as Captain, Mrs., Professor, Comrade, Father, or Secretary signals the mode (e.g. appropriate respect), the content (e.g. only certain subjects are suitable for discussion with the District Attorney), and the manner (e.g. shaking hands, bowing, averting the eyes) of address.

Unsurprisingly, titles also convey power. Power can only be measured in relation to others and is determined by the amount of resistance one can displace within spatial and temporal limits. Those around a titleholder are expected to withdraw their opposition and conform to the titleholder's will in the area (i.e. the game) where the title was won. The validity of these titles depends on the repeatability of the game.

The finite player plays to be powerful, whereas the infinite player plays with self-sufficient strength. Power refers to the freedom people have within limits; strength is the freedom people have with limits. Strength is allowing others to do what they wish in the course of my play with them, whereas power is a matter of how much resistance I can overcome relative to others.

Carse further argues that society is a finite game whereas culture is an infinite game. A society preserves its memory of past winners with record-keeping functions like large bureaucracies to maintain social order. Culture, on the other hand, has no boundaries, and anyone can participate in and shape it. Cultural deviation does not return us to the past, but continues what was begun and not finished in the past. In contrast, social convention requires that a completed past be repeated in the future.

03 Storytelling can be seen as an infinite game

Storytelling can be seen as an infinite game. A good story presents a vision that moves and inspires you with its underlying belief. The interactive and engaging elements of storytelling are what truly connect the speaker and the listener. The end of a compelling story is the new beginning of the listener’s imagination and reflection on their own journey, which is similar to how an infinite game continues the play without a true ending.

When comparing storytelling with explanation, Carse argues that explanation settles issues, showing that matters should reasonably end as they have. Narrative raises issues and inspires reflections in others. In this case, explanation sets the need for further inquiry aside, whereas narrative invites us to rethink what we thought we knew. I see where Carse’s argument is coming from, but am not convinced that explanation is a finite game in this sense.

The concept of explanation actually makes me think of its inherent role in interpreting and predicting the future. If the rules governing past events can be discovered and explained, we can make better predictions about the future. This captures an important philosophy of user experience research as a discipline: if we can understand the motivation that guides human behaviors and perceptions, we can meaningfully derive aggregate patterns of human needs to inform investments for the future. Investing in research is a way to play the infinite game, where the focus is to co-create a long-term vision on the horizon. The more insightful our framework for understanding a problem space based on existing behaviors, the better we can predict and build for the future. In this example, explanation is also an infinite game that opens up new possibilities.

04 Garden is an infinite game, while machine isn’t

Carse further extends the finite and infinite games framework to the machine and garden analogy. Think of a gardener who uses a machine as a tool to help with gardening. A machine is a finite game because it is operated to complete a task. When it is most effective, the tool becomes invisible and eliminates itself because the effort is minimal. A garden, on the other hand, is a place of growth and maximized spontaneity. “To garden is to design a culture capable of adjusting to the widest possible range of surprise in nature.”

Machinery can exist in the garden just as finite games can be played within an infinite game. Technology is a tool that helps with gardening as a means to an end, not the end itself. The question is not one of restricting machines from the garden but of asking whether a machine serves the interest of the garden.

Additionally, the relationship between the machine and its operator is very much like how humans interact with technology today. We often think of a machine as a tool, the extended arms and legs of the operator. However, Carse suggests that “to use the machine for control is to be controlled by the machine.” For example, when using a search engine, many start with a broad query, then gradually refine it and add keywords as they review the results. This is not how we naturally talk to others when we look for things. Search engines are designed as tools to help us look up information, but we as operators are also trained to interact with them in such specific ways.

05 Celebrate spontaneity and forgo control

In the garden analogy, gardeners celebrate variety and spontaneity, which may seem chaotic and out of control on the flip side. But vitality comes from an abundance of styles and sources of change. Gardeners are acutely attentive to the deep patterns of natural order, while having the freedom to choose how to play with nature and its force.

If we play the finite game, the more power we exercise over nature, the more powerless we become before it. In a matter of months we can cut down a rainforest that took tens of thousands of years to grow, but we are helpless in repulsing the desert that takes its place. The human desire to control and organize chaos means transforming the remote into the familiar. When we attempt to take control of nature, we’re essentially reducing an unpredictable vitality to a predictable mass. Sometimes, the desire for control, just like the need to declare war, is a way for us to re-identify ourselves.

Ultimately, Carse gently nudges us to rethink the type of game we are playing. When it comes to interacting with other humans or with nature, it’s easy to go straight to playing the finite game, so we can gain the immediate reward quicker. Carse reminds us to think about the trade-offs behind these finite games and whether they truly serve the interest of the garden—the infinite game—that we are working towards.