Inside the fish bowl: the vibe shift in AI
This piece reflects my own views only, and was largely written to help document and structure my own thinking during a weird but important time. I hope someone else enjoys it!
For all the hype of 2022–23’s “ChatGPT moment” and the decade of progress that telegraphed it, 2024 is when using AI actually felt like a vibe shift, especially in my actual job as a product manager, which now focuses on building tools around it.
In 2023, it seemed like everyone in tech was launching an AI startup, repositioning their products around AI, or being told to reprioritize to “do AI” by their boss. My own projects were no different. In my day-to-day life, however, actual tools deploying the new class of LLMs still felt limited and gimmicky, and I hardly ever found ways to meaningfully speed up tasks at work. Sure, it was impressive that I could fake my friend’s voice absolving me of blame for a decade-long feud, and it was fun generating motivational ballads around everyday activities, but it’s not that often I need to do this. I found all kinds of ways to use ChatGPT as a better Google, but it was hard to get mileage on tasks that required any type of thinking vs. summarizing.
2024 felt different. The models themselves do not feel fundamentally different from 2023, or at least not in the way that going from GPT-3 to 3.5, or from 3.5 to 4, felt like jumping from elementary school to high school quality of work overnight. 2024’s advancements felt more like a high school senior steadily progressing through each year of college, and college students are quite capable. We hire college students to support research, build and maintain websites, write for newspapers, and do all kinds of things we don’t ask high school students to do. Most tasks in the knowledge economy are geared toward a college education, so while the leaps have felt smaller and less flashy, they are proving useful on many more of the use cases I actually deal with. Models also got way better at interacting with the tools and primitives where knowledge work occurs (spreadsheets, websites, PDFs, etc), so everything is just easier.
In practical terms, this has meant I have started to use AI for more tasks with commercial value, not just for fun. High school level reasoning with infinite and instant access to the world’s knowledge is already useful for low-stakes research, so by the end of 2023 I was already using ChatGPT at least as often as Google. In 2024, I began using AI (including more tools like Claude and Perplexity) for things like:
- Analyzing options against a set of provided criteria, or generating more options based on those criteria
- Assessing risks, trade-offs, and pros/cons, or extrapolating how some scenario might play out
- Generating examples for some project or idea, given some set of parameters
- Reviewing, categorizing, tabulating, or summarizing some type of content (logs, feedback, transcripts, etc) based on a set of provided criteria, or identifying examples that meet those criteria (roughly what the sketch after this list does)
- Role-playing and simulation to test ideas, critique work, and anticipate objections
- “Pinch hitting” on basic tasks outside my area of expertise (e.g. write an SQL query)
- Navigating complex topics with basic competence (e.g. understand applicable regulation)
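To make that cluster of bullets concrete, here is roughly what the “categorize some content against criteria I provide” pattern looks like when scripted instead of typed into a chat window. This is only a minimal sketch using the OpenAI Python client; the model name, categories, and feedback snippets are placeholders, not anything from my actual workflow.

```python
# Minimal sketch: ask an LLM to bucket a few pieces of feedback into categories I define.
# Assumes the `openai` package is installed and OPENAI_API_KEY is set in the environment.
from openai import OpenAI

client = OpenAI()

# Placeholder criteria and content, standing in for whatever I'm actually reviewing.
categories = ["pricing", "reliability", "missing feature", "other"]
feedback = [
    "The export kept timing out on large files.",
    "I'd pay more if it synced with my calendar.",
]

prompt = (
    "Assign each piece of feedback to exactly one of these categories: "
    + ", ".join(categories)
    + ". Return one line per item in the form 'category: feedback'.\n\n"
    + "\n".join(f"- {item}" for item in feedback)
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; any capable chat model works
    messages=[{"role": "user", "content": prompt}],
)

print(response.choices[0].message.content)
```

Nothing fancy, and that’s the point: the same pattern works for logs, transcripts, or any other pile of content I’d rather not skim by hand.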
Unlike “give me a motivational pump-up speech to go exercise in the style of Bridgerton,” these tasks have a more limited range for “correct,” and I have some intuition about what correct means. Importantly, they are also all “input” tasks rather than “output” tasks: I am asking the AI to help me think through a problem, with criteria I have provided, where I retain enough control and context to know when it’s not what I am looking for. When that happens, I can nudge it in the direction I want, or abandon it if it’s becoming too complicated. If there’s a hallucination buried in there, it’s unlikely to become a singular “crux” on which an entire project depends, so the stakes are low. It only takes a few minutes of tinkering to see if AI will be useful, so my marginal cost is very low and my marginal benefit is unbounded.
Did these tasks take up the majority of my day pre-ChatGPT? Not really. I still find AI output too generic and unconvincing when it comes to actually making a plan or case about what to do, or anything where it’s important to be specific (let alone creative or contrarian) about what, why, or how amid many plausible options. Coincidentally, I’ve found my reaction to AI output on these tasks is often identical to feedback I have to give more junior peers: “I mean, sure, I guess, but why this?” What was articulated is clearly related to the goal, but the strategy or causal hypothesis running through it feels shallow. This is some of the hardest feedback to give humans, because it’s hard to identify what’s missing without thinking through it yourself, and it’s just as hard with AI. Except with AI, saying “be smarter” sometimes kind of helps and doesn’t hurt anyone’s feelings.
Overall, AI is probably only saving ~30% of the time on ~10–20% of the tasks I used to do, and that share of tasks would be closer to 30% if not for various workflow frictions like tooling access, data policies, and even basic reliability with certain content like websites. The bigger impact is that it’s making 30% of my tasks (sometimes the same ones) 2–3x better quality. On tasks with more importance, leverage, or scrutiny, I would have done something similar to what I now do with AI (or recruited help), but there was never time to do that for everything. Most tasks got much fuzzier, more heavily caveated methods (searching, skimming, approximate tabulations, etc), or just gut feel. Now, I have a team of consultants to find and analyze every piece of feedback or data, stress test ideas against example personas or edge cases, and assess risks and trade-offs: instantly, at no marginal cost, for whatever I want.
Where these “AI consultants” once felt like student interns, their work is increasingly about what I would expect from an experienced consultant given the same context.
When looking past the hype of what people say in sales calls, webinars, Twitter posts, and earnings reports, this is primarily what I see occurring: people using AI to make themselves a little bit faster, quite a bit smarter, and much less bogged down by a lack of specialized skills or knowledge. That doesn’t just go for tech workers in closer proximity to AI development, either. I talk to small business owners almost every day for my job, and most of them report using ChatGPT with some regularity. If anything, they use AI more to unblock themselves on tasks requiring specialized knowledge (contract law, marketing copywriting, etc), because by nature they have a lot of needs, very little in-house specialization, and very limited resources to buy it. There is definitely some flashy automation occurring, especially in areas like coding and support, but AI is permeating the mainstream through quiet, mundane utility.
This is today, with current models — the dumbest and least integrated we’ll ever have — and ignoring all the insane activity in multimodal and autonomous agents. Even if progress stopped immediately, it seems like we can already bake in the fact that almost any form of “first pass thinking” is now easier, cheaper, and faster. It just takes less time now to generate the equivalent of “a lot of research,” or leverage knowledge of fields that are complex but not necessarily complicated (e.g., most people with the same education or experience will come to a similar conclusion). I can get the equivalent of an average lawyer’s initial consult, an average doctor’s evaluation of labs or symptoms, an average engineer’s prototype, or whatever else — on almost any topic, for free, whenever I want, as a consumer or collaborator. That’s not the same as having the best in the field’s deeply reasoned work on the most complicated problem, but if you think that’s what you are getting (or need!) most of the time, you are either extremely wealthy, extremely delusional, or both.
Cheap, on-demand, average-quality, first-pass thinking on any topic is already enough to reshape many jobs, including mine.
“AI IQ” is clearly continuing to increase across the board, and it’s approaching superhuman levels in many fields. However, I’m not sure the average person will know what to do with a much smarter, but ultimately highly obedient, assistant. There is clearly still work to do to increase the share of intellectual tasks AI can take on from humans, but many of the gaps seem related to skills other than pure reasoning: agency, soft skills, and the fuzzy concept of “taste.” Those are all showing progress, but let’s say somehow they stay unresolved. What would I even ask someone with a PhD in every field that I wouldn’t ask someone with a bachelor’s in every field? Without a PhD in the field of my questions, will I even notice the difference?
Some will figure this out, as there are clearly many examples of people leading others who are much smarter than they are. As a species, we also seem to have an unbroken track record of finding harder problems to work on, and I am sure competition and capitalism will find a way to make use of all that intellectual horsepower. However, in my experience, most people are terrible managers of intelligence in any form. To add economic value with our brains, will we eventually all need to become better managers, or shift into some type of AI training role over a dwindling subset of tasks?
I’m not sure. There are lots of reasons things don’t play out the way that seems obvious, and I don’t have much confidence in anyone claiming certainty about AI, especially those peddling post-religious prophecy. How can you predict anything if you say it will change everything? I won’t offer another prediction here, and there are already plenty of studies on individual job impact. Still, I want to take note of several areas where I have already personally observed changes in how I or my network (primarily but not exclusively in tech) think about different staffing decisions:
- Roles that trade grunt work for experience or a foothold. The old trade (for almost every junior role or internship) is just not the same when AI can do the grunt work, especially grunt work that requires some specialization (e.g. basic coding). Anecdotally, within my own network I’ve seen very few junior roles open up, and a pretty strong default toward asking whether some AI tool would suffice. To a smaller degree, I have seen an increased willingness to look at less traditional candidates who display very high agency, betting that AI can round out their gaps.
- Roles primarily built around advice, analysis, or interpretation. It is just incredibly easy now to get decent advice, on demand, on any topic, at near-zero cost, and it’s only getting easier. Personally, I ask for help less, and when I do ask, I come with much better questions and have much better conversations. While this may sound like a doomsday prediction for such roles, it is likely to play out extremely unevenly, depending on whether there is unmet demand held back by price or effort, and whether what people are really paying for is something other than advice (e.g. confidence, validation, someone to listen). If I were in such a role and not nearing retirement, I would be obsessing over how to be on the right side of this.
- Roles predicated on specialized knowledge, but where tasks are routine. It’s nice to believe that our education and experience mean we’re bringing unique judgment or creativity, but often we’re just bringing gate-kept know-how that drives up the cost of everything. I know the tech world best, and we love to think we’re all doing deeply technical work that changes the world. But for every person spending their days deep in thought about how to build the perfect solution for an unmet need, there are five others working on some internal migration, translating a doc into a Figma or a Figma into HTML/CSS, or fixing bugs. Tasks with a fixed scope and a fixed ceiling for impact, but high costs due to specialization, create extreme incentives to automate. Not coincidentally, they are also where I’ve seen the most measurable impact so far.
- Roles predicated on inflated team sizes. Teams are generally built around achieving some type of goal, and doing so usually requires assembling many different types of specialized skills or knowledge. People can only have so many specialties, so specialization on its own tends to inflate the number of people involved in any project. With more people always comes more “work about the work,” ranging from rote tasks like gathering updates to complex ones like managing and aligning stakeholders, until eventually entire jobs exist just to absorb coordination costs. If achieving a typical goal requires less specialization, we should eventually see smaller teams, and with that, (blissfully) fewer coordination costs.
Again, that is what already seems baked in, with the dumbest and least integrated models we’ll ever have. 2025 will surely shatter more barriers and create a different picture, but in the meantime DALL-E can already make pretty good ones: