SpaceX to buy Cursor for $60B

by itsmarcelg | 1145 points | 1696 comments | 2026-06-16 05:44:24 Central

Comments

01100011
I stopped using Cursor when I started getting comfortable
with Codex/Claude. Cursor is just annoying with the
constant popups and it's just not as good. Now my workflow
is to use my normal editor, add a todo describing what I
want, and then ask Codex+gpt-5.5 to implement it. It
absolutely nails it. Using codex is so much more like
working with a partner vs the noise and annoyance of
Cursor. That said, I think we're in a narrow window of time
right now where any of this matters. Prompt "engineering"
and working around your tools will be over in a year or
so. Fwiw I am a C/C++ systems engineer. I think anyone
mentioning anecdotal experience like this should clarify.
Maybe frontend JavaScript folks have a totally different
take and that's expected.

  > ghshephard
I use cursor 8+ hours/day at work, and have full (and
effectively unlimited) access to Claude Code and Codex
- tools which I also use personally. I suspect that
your "constant popups" were when you were using the
editor - a mode that I'll confess I haven't touched in
3+ months. Workflow in Cursor is actually awesome - I'm
a little outdated in how I use it - I still establish
goals/objectives, rather than managing the loop which
does so - but if you can think broadly enough - I find
it's pretty efficient. Key things I like about Cursor
(and I recognize I'm dating myself a bit here):
- Plan Mode is really solid - I shift-tab, have it go
create the plan using whatever insanely expensive SOTA
model is available - I will usually spend 5-10 minutes
on the Plan - review it, maybe even tweak it a little.
(though 90% of the time it's fine out of the gate)

- Ability to select any model for every task - I'll
switch between Opus 4.8 High/xHigh/... I'll even
switch to 1M context for the planning phase upfront.

- It does an *excellent* job managing permissions and
looping the agents and spinning up sub-agents for you
- you set the goal, run the plan mode - and then let
it churn for however long is required - pretty common
to have a 30-45 minute run and come back to a fully
created/tested product.


The nice thing about Cursor (and honestly Claude Code,
Codex) - there isn't really any "prompt engineering"
involved. You just say, "Go Build me x - it should
have y,z features - and build it in golang for me" -
and that's it - the 3-4 page Plan comes back - usually
pretty credible - and then you click "build".

    > > embedding-shape
> there isn't really any "prompt engineering"
involved
You should run an experiment; take
someone who never used any LLMs or agents, and
tell them to use it for the first time in front of
you, and tell them to build something like a
calculator program or whatnot. Bonus points if
they're ICs or at least not-managers. I think there
is a lot we engineers take for granted, when it
comes to communicating via text, how to state
things clearly and what we think/reason when we
read things. A lot of people don't have those
"skills" innate, and the first time they use LLMs,
they basically don't know how to interact with
them, until they realize what they're able to do
and not. Then they also learn what to say to steer
the model into the right way, this is quite
literally a "prompt engineering" skill they're now
learning.

      > > > hibikir
You don't even have to go outside engineers. I
have teammates that get very little out of
Claude Code because the way they integrate
their own knowledge doesn't allow them to
think of what Claude might not know. They'd
say a task was impossible with the tooling,
and I'd get instant answers, because I
understand what is weird internal business
logic sitting 6 repos away, and what is
knowledge Claude has by default. I can commit
Claude.md files for them, but I have to
include EVERYTHING, because otherwise they'll
let Claude make assumptions and waste minutes,
if not hours. It's a big part of what, in my
experience, is separating the very good
engineer from the iffy one: Do you have a good
mental model, and can you put yourself in the
shoes of people sitting in a different mental
model? It makes you a better dev, and even
more so when it comes to AI tools, which have
their own kind of alien brain.

        > > > > gcanyon
Coding LLMs are distilling developers.
It's like the old experiment where you
have someone write down the steps to make
pancakes and they don't tell you to crack
the eggs before adding them to the batter:
it takes a particular mindset to be able
to make a model of what is supposed to
happen and deconstruct that to the level
appropriate for implementation. Until now,
the actual act of writing code
(terminology, syntax, etc.) was a
significant hurdle, and that underlying
mindset was a very useful skill, but one missing in
a surprisingly large number of developers.
Now with LLMs doing the work of
"translate this into code," increasingly
the only thing that matters is that exact
ability. And developers that don't have it
or can't develop it won't be developers
for long.

          > > > > > rimliu
Or, when LLMs can no longer run on
non-existent money, the
scenario will be the opposite.

        > > > > ohmahjong
Thanks for putting into words what I have
been seeing a lot at work and haven't been
able to put my finger on. We tend to have
quite diverse _workflows_ between devs at
my company, and success seems to correlate
with injecting better context earlier in
the process. I like to chat with Claude
about how to approach a given problem,
bring in extra context, etc, before even
really drafting up a plan, while other
people dive into implementation
immediately and go on wild goose
chases. 90% of the time we end up in the
same place in roughly the same amount of
time, and there are obviously tradeoffs to
spending more time planning vs
implementing. I'm oversimplifying as well.

        > > > > acron0
I couldn't agree more. Socratic
methodology, domain modelling, systems
thinking, pipes-and-arrows problem solving
etc. These are the skills that get real
work done in coding agents these days.

      > > > whstl
This makes a lot of sense and explains why
some people are so captivated by modern
models, while others see progress as merely
incremental.

        > > > > jeremyjh
I'm sure that explains some of it but I
really don't think it explains most of the
people who have been AI-pilled in the last
nine months. There was no amount of
context I could give GPT-4o that would
make it a net benefit to use that for
agentic development. I tried it with quite
sophisticated prompt systems and much
simpler ones, compendiums of code &
business analysis and sparser ones. Yet it
just wasted my time - still there were
people using Cursor with that model and
saying it was life changing. I didn't have
that experience until Opus 4.5 - it's
possible I could have had it earlier but
that was when I happened to try it again.

          > > > > > ghshephard
I think many of the people who have
become "AI Pilled" (I'll include
myself here) had it happen in the last
3 months. Even over the Christmas
break, when the Wiggums loop got so
much coverage - I still wasn't that
blown away going into
January/February - 50%+ of the time I'd
just write the code myself. I like
coding. But - I don't know if it was
April, or May - but very recently -
the coding harnesses paired with
decent SOTA models like Opus 4.8/GPT
5.5 - just started showing a lot more
consistency, and completeness, and
sometimes downright clever behavior -
that they started to become way more
useful. Just one out of a hundred+
examples - I gave Claude Code (Opus
4.8 High) a complex task that involved
Consul and Vault - but I had neglected to
give it sandbox permission to download
from hashicorp.com. So - it created an
entire test harness that simulated
the behavior of both Vault and Consul
- created all its test cases,
verified that they passed - and when I
came back 40 minutes later said that
it was all done. Its test harness so
accurately simulated the behavior of
Vault/Consul - that on first try - no
refactoring whatsoever - all of the
protobuf/AESGCM/API behavior (that has
varied significantly between versions)
- worked. This was something that would
have taken me, someone super super
familiar with the code and tools and
APIs - a minimum of 3 solid days of
work - and that would likely involve
hundreds of attempts and refactors as
I unwound all the weird encryption and
packaging layers. It zero-shotted a
full solution without having an API to
test against. If these agents actually
have a real test harness - it's
honestly hard to imagine what they
can't do - subject only to imagination
and budget at this point. Speaking
personally - something changed between
January and, let's say, May - in which
instead of seeing these things as
mostly interesting technology
demonstration, in which the flaws
outweighed the benefits - I now
genuinely think they are the future of
programming. I'm dubious that I'll
write much software manually in the
future - beyond what I do for personal
pleasure.

            > > > > > > fragmede
Asked to write a driver for macOS
for some thing that didn't have
macOS support, GPT-5.5 found Linux
OS firmware on the vendor's site,
downloaded it, ran binwalk,
extracted out the driver, got
halfway to reimplementing it on
macOS with barely any help from
me. I did need to dive into it
somewhat to get it across the
line, but it showed some ingenuity
along the way.

            > > > > > > Jagerbizzle
Fantastic post. This sums up my
experience perfectly with a near
identical time frame to yours.

        > > > > jmalicki
Which way do you think that goes? Are the
ones who "get it" the ones who are
captivated or see them as incremental?

          > > > > > whstl
I guess all of them? Some people "got"
LLMs back in 2022, others needed it to
evolve a bit. It's not unlike
computers. I started using them back
in the 90s and absolutely nobody I
knew was interested, while today
everyone carries one in their
pockets...

      > > > gyanchawdhary
By that same logic (and I'm agreeing with you
as of now), engineers shouldn't get too
comfortable treating "being good at text
communication" as a lasting edge. With how
quickly agentic coding is evolving, it's worth
considering the possibility that many of the
prompting and steering skills we view as
valuable today could become far less important
in a matter of weeks or months.

      > > > thatjoeoverthr
Recently I have the SEO guy governing the
mostly static, public site with Claude Code.
He loves it but you would never imagine the
level of mental illness Claude comes up with.
If it were an employee I'd literally throw him
out the front door, labor laws be damned. And
as always, every insane thing it does is some
direct echo of its concept and training.

    > > UncleOxidant
But what's the $60B differentiator here? There are
so many similar tools out there. I generally use
Opencode, but also Claude code, antigravity and
sometimes Kilo code on VS Studio. How can cursor
be worth even 10% of 60B?

      > > > matt-p
I don't know what Cursor's market share is but
it feels like 20-25% to me. That is not worth
nothing. Then:
1) The data they have flowing through the system
that enabled them to build Composer (which is much
better than stock Kimi 2.5) and is presumably
allowing the training of a new model on SpaceX's
compute.
2) Cursor's new 'GitHub' replacement.
3) Enterprise sales/traction.
If you look at all of these
together, it's not implausible that they end
up mostly 'owning' coding in 5 years time. If
they replace GitHub with something more
compatible with agentic coding and bring it
into their whole ecosystem providing cloud and
local agents, PR review and own frontier
coding model. It's specialised vs 'borg', isn't
it. One way of thinking is that the world is
owned by Anthropic/OpenAI and coding is just
one of many things their model and software
does. Another view is we have a 'coding with
LLMs' company that specialises in this field
of endeavour. Hard to say which wins, but I
think they have a shot. Personally my only
objection to Cursor is that it's more
expensive. That's it, otherwise it is great to
be able to choose say GPT-5.5 when I want to
work on backend and Opus when I want to work
on front end. Great to have PR review built
in. If they were able to get Composer 3 to be as
good as GPT-5.5 / fable at the price of
Composer 2.5 they'd be winning on price again.

        > > > > pqtyw
> If you look at all of these together,
it's not implausible that they end up
mostly 'owning' coding
They really need to
change their trajectory then? And
regardless, being owned by xAI, a failed AI
company which turned into a datacentre
operator, probably won't help them to
achieve that.
> Hard to say which wins, but
I think they have a shot.
The market for
"coding harnesses" and "AI IDEs" is
already oversaturated and they are
effectively a commodity at this point, you
can use any of them with any provider more
or less interchangeably.

          > > > > > matt-p
> They really need to change their
trajectory then?
They need to step up progress sure.
> And regardless being owned by xAI, a
failed AI company which turned into a
datacentre operator probably won't
help them to achieve that.
I think near
unlimited access to compute is exactly
what they need to train a frontier
level coding model and serve it
cheaply and profitably.
> The market
for "coding harnesses" and "AI IDEs"
is already oversaturated
I think my
entire point was that it's not just an
AI IDE. It's a coding focused model
(currently Composer 2.5, soon
hopefully something better), a GitHub
replacement, PR review/Bug Bot, Cloud
Agents and so on and so forth. It's an
ecosystem. An enterprise signs an MSA
with you and gets everything they need
all in one place.

            > > > > > > pqtyw
> unlimited access to compute
Yes, because Grok failed and they now
have "unlimited" compute they can
sell to others. I mean you are
right that if they did X, Y and Z
they could be very successful but
there is no indication that might
happen. In any meaningful way it
seems like Cursor peaked a
while ago.
> An enterprise
Well, either they are the type of
company which just buys whatever
Microsoft is selling OR they let
their developers mostly pick
what they feel is the best tool
for the job on their own. I don't
think there is that much in
between (and it's a cutthroat
market, e.g. GitLab).
> a Github
Replacement, PR review/Bug Bot,
Cloud Agents
Those things are a
dime a dozen, you can vibe code
them in weeks/months and there are
plenty of options on the market
already. Well, not GitHub of
course, but there are various
reasons for that which have little
to do with product quality and
features (not that I think there
are many companies which could
build a meaningful GH replacement
in a realistic time period despite
its many flaws).
I just don't
really see a huge income stream
for dev tools companies (just like
there never was). They can skim
something off the top by
reselling AI models (generally at
zero or negative margins..) but
that's not the most lucrative
business model when you have no
real moat.

            > > > > > > ballon_monkey
How did grok 'fail' ? This is news
to me.

            > > > > > > pqtyw
By not succeeding? It's an also-ran,
a closed proprietary model
which is behind Anthropic, OpenAI,
Google and a bunch of Chinese
companies. How do you make money
with a product like that? (besides
the absurd IPO of course...)

            > > > > > > hackermanai
At least it didn't succeed yet.
They should drop a model
somewhere, beating something else
in some use case, and maybe people
would use.

            > > > > > > XorNot
My company has Claude. People were
excited to use Claude. Absolutely
no one, despite the option,
considered a grok model.

            > > > > > > Saline9515
For a lot of people, Grok is the
first AI they got to use through
Twitter. Grok does get quite a lot
of usage, and isn't out of the
game - coding tools aren't the
only use case for AI.

            > > > > > > bdangubic
this is like saying people still
use google glass. sure, some
people might but AI-wise it is as dead of
a product as it gets

            > > > > > > Saline9515
Google glass has been
discontinued? Besides, many people
use it on Twitter everyday. Usage
is not limited to what you can see
on the Openrouter dashboard.

            > > > > > > bdangubic
many people use copilot inside
outlook to auto-complete their
sentences as well :)

            > > > > > > Saline9515
Meaning that Copilot is actually a
success, even if you don't like it
;-)

            > > > > > > pqtyw
Because Microsoft managed to sell
it to a huge number of companies,
not directly because people are
using it. Hardly anyone is paying
for Grok.

            > > > > > > Saline9515
US government is paying for Grok
to help it send bombs on Iranians.
That's a use case.

            > > > > > > bdangubic
with this thinking I wish that all
your products are as successful as
copilot is ;-)

            > > > > > > Saline9515
A bad product can be successful
with the right distribution. This
is what happens with Grok and
Copilot.

            > > > > > > pqtyw
> Twitter everyday
So what? How
much money is that making?

            > > > > > > jerojero
These users are probably losing
the company money. The failure is
in converting regular people into
actual AI product consumers.
Companies are realising that the
money is not in regular consumers
but in enterprise, and enterprises are not
considering Grok as a serious
alternative. If anything, the name,
the branding and the X/Twitter
affiliation have hurt adoption from
money makers rather than helped
it. So yes, people know it, but no
one is willing to pay for it.

            > > > > > > Saline9515
Depends, Grok stimulates
engagement and pushes to stay on
the platform and feed it data. If
anything, it helped justify a
massive valuation for SpaceX,
which is a metric of success for
most corpos.

            > > > > > > pqtyw
It helped the valuation, but
just like SpaceX's hallucinations
about the space data centres did.
Doesn't mean it's not a crappy low
end model itself. Btw is Twitter
even making any money?

            > > > > > > ballon_monkey
"my company doesn't use it so no
one uses it" - typical out of
touch HN commenter.

            > > > > > > mthoms
The fact that Grok is selling all of
the compute capacity from its
flagship data centre out to a
direct competitor sorta speaks for
itself. Does it mean they are out
of the race? I have no idea, but
things don't look great.
https://news.ycombinator.com/item?id=48037986

            > > > > > > rtehfm
They're only selling compute from
Colossus 1.

            > > > > > > scheme271
There's an HN article and
discussion about Anthropic
expanding to use Colossus 2:
https://news.ycombinator.com/item?id=48214017
I think it's fairly
clear that Grok isn't using as
much compute as expected.

            > > > > > > bjelkeman-again
Seriously though. I haven't heard of
anyone using Grok in a software
engineering context. Maybe I live
under a rock.

            > > > > > > youre-wrong3
There are more uses to AI than
just software engineering...

            > > > > > > pqtyw
So far seems like none of those
use cases have generated
meaningful income streams? The
consumer/non-developer market is
mostly dominated by OpenAI and
Google anyway...

          > > > > > fnord123
> The market for "coding harnesses"
and "AI IDEs" is already oversaturated
and they are effectively a commodity
at this point, you can use any of them
with any provider more or less
interchangeably.
Yes and no. I've used
a few different harnesses with closed
and open models and there is
definitely something going on that
makes some harnesses work better than
others. Many of the differences are
hard to pin down and some are things
people don't care about. But I
wouldn't say they are commodified just
yet.

1. Memory use. I have colleagues
complaining that Claude Code uses
several GB of memory. Meanwhile I
haven't heard about that regarding
Codex or Goose, or even OpenCode for
that matter.

2. Suitability for local
models. When you use Anthropic models,
you use Anthropic as a provider. They
can have software between the model
and your harness that will fix issues
with the model. One notable thing that
even the best open weights models
struggle with is broken tool calls.
There is a lot that a harness can do
to fix broken tool calls when working
with a straight up ollama running a
raw GGUF file.

3. Ease of use with non
mainstream models. OpenCode has GREAT
coverage of models/providers. Goose,
less so as it relies on people to set
up their own Anthropic or OpenAI
compatibility settings. E.g. Zed
doesn't let you use Z.ai (which, if
you speak British English, sounds
ironic because "zed ai" isn't directly
supported by Zed the editor).

4. Worktree support. OpenCode and
probably all the TUI harnesses work
in a local directory - so you need the
terminal to be in the worktree. Zed,
however, works centrally on your git
repo and tracks the worktrees so you
can bounce around your work in a
single window.

Of these, '2' is maybe
the most important one but also the
hardest to pin down as a feature. '3'
is a one time cost. Of course '1'
could be a blocker for someone using a
macbook air or neo.
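
A minimal sketch of the kind of repair point 2 above hints at, assuming the model emits its tool call as JSON text. Everything here is hypothetical and for illustration only: the `repair_tool_call` helper, the `KNOWN_TOOLS` names, and the `name`/`arguments` shape are not taken from any particular harness.

```python
import json
import re
from typing import Optional

# Hypothetical tool names; a real harness would load these from its registry.
KNOWN_TOOLS = {"read_file", "run_command"}


def repair_tool_call(raw: str) -> Optional[dict]:
    """Best-effort parse of a possibly malformed JSON tool call.

    Returns a dict with 'name' and 'arguments', or None if unrecoverable.
    """
    # Keep only the outermost {...} span, dropping surrounding chatter.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        return None
    candidate = raw[start:end + 1]

    # Remove trailing commas before a closing brace or bracket (naive).
    candidate = re.sub(r",\s*([}\]])", r"\1", candidate)

    try:
        call = json.loads(candidate)
    except json.JSONDecodeError:
        return None
    if not isinstance(call, dict):
        return None

    # Reject calls to tools the harness does not know about.
    name = call.get("name")
    if not isinstance(name, str) or name not in KNOWN_TOOLS:
        return None

    # Some models emit arguments as a JSON-encoded string; decode it.
    args = call.get("arguments", {})
    if isinstance(args, str):
        try:
            args = json.loads(args)
        except json.JSONDecodeError:
            return None
    if not isinstance(args, dict):
        return None

    return {"name": name, "arguments": args}


if __name__ == "__main__":
    messy = 'Calling tool: {"name": "read_file", "arguments": {"path": "a.txt",},} done'
    print(repair_tool_call(messy))
```

A real harness would likely do more (schema validation, re-prompting the model with the parse error). The trailing-comma regex is deliberately naive and could also alter a comma that sits inside a string value.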

        > > > > sg0nzalez83
I agree, Composer Fast 2.5 is getting
really good. I started using it for a
personal project after I had to switch
from Sonnet because I hit the API limits,
and I was surprised by how good it has
become.

        > > > > chocrates
Have you looked at GitLab lately? They
have a ton of AI features built in. I'm not
a GitLab user, just learning it, so I
can't say how half baked they are or
not. At a high level though it seems like a
huge step ahead of GitHub.

      > > > arcanemachiner
I believe they have some very good training
data because of all the data generated by
people using the service. This is the same data
they used to finetune Kimi K2.5 to make their
newer Composer models, which benchmark
substantially better than Kimi K2.5. I've heard
they also want to build their own base models,
which will also benefit from their large
amount of high-quality training data. Which
will solve Grok's model quality problem. This
is all unsourced conjecture of course. But
it's what I've heard.

        > > > > ifwinterco
Also from what I understand (not my day
job) we're now at the point where the
post-training tuning (RLHF etc.) is
increasingly important since pre-training
no longer scales. So it's not really fair
to call it "fine tuning", it's an
important part of building a coding model
in 2026, and Cursor have done a pretty
good job with Composer.

      > > > woobar
> How can cursor be worth even 10% of
60B?
Maybe because SpaceX paid with monopoly
money (all stock deal)?

        > > > > nwienert
It's the data. To do RL.

      > > > Romario77
They are paying for market share/customer base.
Cursor has a good chunk of it. xAI overbuilt
their data centers - they can't find paying
customers for them, that's the reason they
made deals with other companies like Google to
use their own datacenters. Cursor has the
opposite problem of not having enough
capacity. So this works well for them
together. Whether it's worth it - if you
believe that AI will solve every problem then
having a piece of the pie early on might be
worth it. Remember how when Google bought
YouTube for 1.65 billion people thought they
were crazy? Or when Facebook bought
Instagram. 60B is a crazy number but might be
worth it for someone fighting for world
dominance :)

        > > > > iririririr
You are completely mistaken on most
points. xAI is on the line to deliver
capacity they already sold to Google and
most analysts think they are 50/50 on
actually meeting it. The only proof they
have capacity is that Musk claims all the
money they are burning is going to
datacenters and GPUs (mostly because if he
put it on anything else the lie would be
obvious).

          > > > > > imtringued
Musk is the type of person that would
raise billions in funds for a
datacenter in space and then just
build a datacenter on the ground.

        > > > > nix0n
> Remember how when google bought youtube
for 1.65 billions people thought they are
crazy? Or when facebook bought instagram.
I think these are good examples: in both of
those cases the buyer had a plan to
monetize. If you are a user of Cursor,
expect to pay more for it or switch.

        > > > > 01100011
> they are paying for marketshare/customer
base
Or are they paying for talent? It
seems like xAI is sorely lacking in
talent, most likely due to the CEO and
folks' aversion to him. By throwing around
some SpaceX monopoly money he can trap
some talent with retention clauses and try
to invigorate his failed AI business.

      > > > ghshephard
I think the argument for Cursor is that it's
the dominant tool that enterprises are using
for coding, so the theory is Cursor wins that
as the "model agnostic" option, and it has a phenomenal
Enterprise Sales Team. From a valuation model -
$4B ARR with rapid growth, and the ability to
shift traffic to internal models (honestly, a
massive amount of the time "composer" - their
internal model - is fine, and obviously going to
get better). Say a 17x multiple, which isn't
unheard of for a rapidly growing startup with
solid future structural profit elements
(moving to internal model) - that gets you to
$68B.
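
The back-of-envelope arithmetic above can be written out as a quick sketch. The function names are made up for illustration; the $4B ARR, the 17x multiple and the $60B price are simply the figures quoted in this thread, not verified numbers.

```python
def implied_valuation(arr_billion: float, multiple: float) -> float:
    """Valuation implied by a revenue multiple, in billions of dollars."""
    return arr_billion * multiple


def required_multiple(price_billion: float, arr_billion: float) -> float:
    """Revenue multiple that a given purchase price corresponds to."""
    return price_billion / arr_billion


if __name__ == "__main__":
    # Figures as quoted in the comment: $4B ARR at a 17x multiple.
    print(implied_valuation(4, 17))   # 68
    # The headline price of $60B against the same ARR figure.
    print(required_multiple(60, 4))   # 15.0
```

On these numbers the $60B headline price would correspond to a 15x multiple, a little under the 17x used in the comment.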

        > > > > lwhi
The fact it's agnostic has to be
useful. Being able to compare outcomes for
workflows involving competitors will
obviously be v v v v useful.

        > > > > UncleOxidant
> so the theory is Cursor wins that as the
"model agnostic"
But there are many model
agnostic harnesses out there: OpenCode,
Roo, Cline, and many others. And even
Claude Code can be set up to use
non-Anthropic models.

          > > > > > mikestorrent
As a Cursor user, I don't have to
think about the providers behind the
compute - I get name brand Claude, or
cheap Kimi, or Grok, and it's all got
roughly the same agentic experience,
and only one bill. Enterprises love
this.

            > > > > > > AgentMasterRace
You get all that at the price of
Cursor. Enterprises do love to
spend money, that's true. OpenRouter's
prices are no different
than Cursor's and you can use any
harness you want. Big brain, small
brains? Hmmm

            > > > > > > gjulianm
Cursor has BYOK support too. I
think it also has Bedrock support.

          > > > > > sumedh
> And even Claude Code can be setup to
use non-Anthropic models.
Too much
friction though, with Cursor it's out
of the box.

        > > > > rimliu
Terminal is also model agnostic. Does it
matter where you enter your prompt text?

        > > > > pqtyw
> $4B ARR
If you resell something worth $5
for $5 while having to pay for R&D and
operating expenses that's not exactly
comparable with a company that's selling
actual products.
> Say 17x Multiple
On an
extremely low margin business it is unheard of. Then
again, that wouldn't be the stupidest thing
in today's market.

      > > > fuzzfactor
>How can cursor be worth even 10% of 60B?
It
can't as long as there is plenty of AI without
it. The real differentiator is that if $60B
today turns out to be all thrown away in a
worst-case scenario, it would be easily more
affordable and there would be less negative
impact than $47B at the time if it was all
thrown away on Twitter.

      > > > trhway
Their revenue is 3B, and 20x is pretty
typical. We're in the new era where startups
boast about and are bought based on revenue and
not on just a number of users with unclear
path to monetizing as it had been for the
previous couple decades. We can also note that
we see Thrive Capital (Kushner) again in a
win.

      > > > ryanjshaw
Where else are you going to get access to a
real-time fresh high quality stream of human
intelligence to grow your baby AGI? You can't
buy Codex, Claude, Copilot, so what's left?

        > > > > 05
Chinese transfer stations?

          > > > > > ayewo
> Chinese transfer stations?
For anyone
that doesn't get the reference, please
start here [1].
[1] https://www.chinatalk.media/p/how-to-buy-cheap-claude-tokens...

      > > > dakolli
How are you switching between like 5 different
editors lol. Bro sloppers will do anything to
get their fix. Like the old people at the
casino switching slot machines all day based
on some occulted understanding that only they
think they have.

    > > flyingcircus3
There is most certainly still prompt engineering
involved. How there can be both the responsivity
to different cues like "plan this", "write this",
"analyze this", "defend this", "poke holes in
this", but not responsivity to the various
terminology you provide in your explanations of
"this", where to get information about
specs/standards/requirements, what details I care
about, and therefore can't compromise on, vs what
details I'm willing to accept whatever the top
reddit post from 4 years ago recommends. I don't
see how these systems can have the ability to be
effectively expressive about all of the minutiae,
and not have all of the various different possible
expressions lead to vastly different outcomes.

      > > > ghshephard
I think all of the cues that you just
described are in the plan. For example - I
might write (real world example from this
morning):
"Create a script that installs
hashicorp vault and consul, store the data on
consul. Then create a helper script that will
fill the vault server with sample data. Add
HTTPS support. Now write a framework that
reads and decrypts the encrypted data in
consul. Support old (pre 1.3) and new (post
1.3 vault)."
That generates a 6 page plan
using Opus 4.8 w/1M context, including notes
on what to prioritize, what format to create
the scripts in, etc... (My cursor guidance
already has a couple months of hints as to
what I want in terms of scaffolding unit
tests, canonical linux, performance, security,
etc...)
That 6 page plan is the "Prompt" - but
it's entirely generated by Cursor/Opus. It's
there to tweak if you want to emphasize, or
provide some taste - but, honestly - it
probably does a better job than I would - so
~90% of the time I just accept the plan as is.

      > > > smoe
I would say prompt engineering, in the sense
of people claiming you need to include in
every prompt magic incantations like "You are
a senior engineer from a superintelligent
alien species" and "take a deep breath and
make no mistakes" doesn't really do that much
for everyday work, I feel, or it is all
already included in the system prompt maybe. I
reckon it can still edge out a few percentage
points in automation. What actually matters is
the ability to communicate well in general,
not anything LLM-specific. Being able to state
what you want clearly and unambiguously, and
having a sense for what additional information
you need to dump, even when the other side
claims they already have everything they need.

    > > hackermanai
> You just say, "Go Build me x - it should have
y,z features - and build it in golang for me" -
and that's it - the 3-4 page Plan comes back -
usually pretty credible - and then you click
"build.".
What you're describing seems like a
workflow for building toys only. There's currently
no reality in which someone would actually know
what the y,z features are before making them. A
plan generated in 5min would likely suggest a
suboptimal solution compared to what a good
solution would look like (which might take a year
or two to figure out, for a human, so still a week
or so for SOTA models if at all possible).
Building something in golang is cute, but hard to
be convinced until more novel applications are
being generated from prompts. The data submitted by
Cursor's users though, that seems to be very
valuable.

    > > 01100011
Yes, I tried to use Cursor as an editor. Terrible
idea in hindsight.

So your workflow now looks like
mine except I prefer a different editor and only
use the latest and greatest model so Cursor
basically offers nothing over Codex.

I disagree
about prompt engineering, but it's one of those
things that probably varies because of what
language you use, what problems you solve, and the
degree to which you care about the output. Unless
I'm writing tests, I keep AI on a very short leash
because I'm writing critical code used by a very
large number of users. I have noticed big
differences in output quality depending on how I
steer AI. Without steering, it will happily leave
in dead code, change the use of variables so they
need to be renamed, assume or fail to assume
invariants, etc. As I said in another comment, I
think we won't need to do that for very much
longer, but right now it seems essential.

    > > davedx
But that sounds like the same workflow as Codex or
Claude, except Cursor is only a harness without
its own model? (Or do they have their own model?)

      > > > ghshephard
You nailed it - in fact, most of Anthropic's
early revenue came from Cursor - many of
Claude Code's programming components are
essentially feature copies of Cursor's, so it
makes sense they are similar.

Cursor does have
its own model - it's a heavily reworked
version of Kimi K2, called "composer" - that I
use a lot of the time when I have fairly
straightforward tasks that don't require a lot
of exploration or independent thought. It's a lot
cheaper - the
Input/CacheWrite/CacheRead/Output costs of
Opus 4.8 are $5/$6.25/$0.5/$25 per million tokens,
vs $0.5/-/$0.2/$2.5 for composer.
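As a rough sketch of what those per-million-token prices mean in practice, the Python below totals the cost of one workload under both price lists. The token counts are invented for illustration (not measured usage), and the "-" cache-write price is treated as "not listed" and skipped.

```python
# Prices in USD per million tokens, as quoted in the comment above.
# composer's cache-write price was given as "-", so it is None here.
PRICES = {
    "opus": {"input": 5.0, "cache_write": 6.25, "cache_read": 0.5, "output": 25.0},
    "composer": {"input": 0.5, "cache_write": None, "cache_read": 0.2, "output": 2.5},
}


def cost_usd(model, tokens):
    """Sum the cost over token categories, skipping any with no listed price."""
    total = 0.0
    for kind, count in tokens.items():
        price = PRICES[model][kind]
        if price is not None:
            total += price * count / 1_000_000
    return total


# Hypothetical workload: these token counts are assumptions, not real data.
usage = {"input": 2_000_000, "cache_read": 10_000_000, "output": 500_000}

opus = cost_usd("opus", usage)
composer = cost_usd("composer", usage)
savings = 1 - composer / opus

print(f"opus: ${opus:.2f}  composer: ${composer:.2f}  savings: {savings:.0%}")
```

For this made-up mix the totals come to $27.50 vs $4.25; a different ratio of input, cached, and output tokens would shift the gap.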

    > > sramam
> Key things I like about Cursor (and I recognize
I'm dating myself a bit here)

What a world we live
in - "dating oneself" is measured in weeks/months!
:)

    > > kopirgan
Not trying to be funny but seriously, if these
tools can produce a tested 'product' in 45m,
shouldn't we be seeing millions of them out there?
I mean how far are we from a fully AI built Oracle
ERP or even a notepad or helix?

      > > > z3t4
Have you ever heard "I can do that in a
weekend"? And they usually can. The difficult
part is not building the product, it's selling
and marketing, the business part. It's quite a
common business tactic to outright copy
someone else's product or business.

      > > > ghshephard
It's a solid question - and to some degree
what https://programbench.com/ tries to
measure.

Some of the issues (off the top of my
head):
- Note - that my "product" was about
3,000 lines of code - so tiny. But
https://metr.org/ should give you some insight
into the complexity the models are capable
of.
- You have to be able to imagine the
product. If I have the time, and energy, to
imagine what I want - the model will build it.
Here is an example of a much better programmer
than I and something he wanted built -
https://www.boatbomber.com/blog/claude-fable-5
- These are the first drafts. On average - any
complex system needs about 10 years and at
least 1000 active users who are enthusiastic
about reporting bugs to really get robust code.
Writing it via LLM doesn't (at least so far in
my experience) help that much in reducing bugs
if you were previously following any semblance
of TDD. Lots of bugs in the code - the
products you listed above have literally tens
of millions of years of user experiences and
bug reports that got them to where they are
today. No silver bullet yet - just faster,
less effort - and it enables non-technical
people to create (still buggy) products.

      > > > __patchbit__
Millions of produced verified software
engineered products in 45 minutes in the
likeness of Oracle ERP or notepad++, helix are
small potatoes when you see the unbounded
ambitions of SpaceX in full.

The end point may
squeeze quality of operations at the subminute
time span for ground control environment
seriously launching Starship rockets one an
hour, for example.

    > > gigatexal
I think I do this with Claude every day. I don't
see why I need to pay for cursor to get this too.

      > > > ghshephard
You absolutely don't. I use all three
products. My preference is Claude Code for my
personal project. The one at work is kind of
sandboxed off - but does have the benefit of
an MCP for every enterprise service we have
(Kibana, Victoria Metrics, Grafana, Jira,
etc...) - which is nice.

Over time - I expect
Composer will be cheaper than Opus 4.8 - but
the nice thing about Cursor - you can flick
between models.

And (this is purely a personal
thing) - I really like the extensive
collection of "Plans" that cursor tracks -
there isn't really a similar thing in Claude
Code - but I really like the Claude.AI
interface for everything else. It's also a
much better general knowledge agent - the
Cursor Chat interface isn't as nice.

        > > > > gigatexal
I'm not sure what you're on about. I had
Claude doing swarm engineering using
different models. It would write specs
that haiku would implement, it would check
itself etc etc. with a simple phrase it
goes into planning, multi agent mode, and
chews on a problem until it's done. It's
pretty autonomous.

Maybe you haven't looked
deeper into what modern Claude can do?

          > > > > > ghshephard
The Different Model approach is where,
from task to task, I can switch between
Opus 4.8, GPT 5.5 and (very often)
composer 2 at 1/10th the cost.

It's not perfect, btw - to some degree
you are at the mercy of which models
they support - currently only 27 from
Gemini, OpenAI, Anthropic, Grok, and
Kimi (Just K2.5) - presumably because
they have commercial arrangements with
them. The "Bring your own Model" model
requires you plunk in your API key -
which sucks. And only one at a time.

To the best of my knowledge, Claude Code
only supports one model at a time if
it's not one from Anthropic (which
will use the entire suite of
Anthropic Models depending on the
task) - and you have to override it to
a single model with an environment
variable at startup - no ability to
flick between models from task to
task.

Depending on your workflow - you
can save 70-90% on costs just by
choosing a reasonable model for really
extensive tasks that don't require
thinking, max context, etc....

          > > > > > baq
Different models aren't subagents -
they're completely orthogonal. I use
Gemini subagents for code review in
cursor, but mostly use gpt for actual
coding.

  > tombert
Same.

When I first used Cursor, I hadn't used any of
the "Vibe Code" tools out there, so it was pretty neat
to have an assistant directly tied to the editor.

Once I learned how to use Codex, I just used a tmux split
with NeoVim and have the effect I wanted. I haven't
felt compelled to use Cursor at work since.

  > redox99
I also work with C++, and I use Codex (desktop) which
writes 99.99% of my code, plus Visual Studio, which is
nice for reading and navigating code. For webdev I do
VSCode + Codex.

I started with Cursor back in the day,
but switched to Claude Code and then Codex when Cursor
got too expensive.

If price wasn't an issue, maybe I'd
prefer Cursor only because I can easily switch between
models. But that's it. I always disliked the
"accept/reject" workflow in cursor, but that's
probably optional nowadays I guess?

    > > digitaltrees
I love the accept reject flow because I still
constantly have to stop AI models from writing
awful architecture or reimplementing code we
already wrote elsewhere

      > > > flyingoat
Yeah, I have found the same. A lot of times it
does get things right, but if it deviates, man,
it can just drift hard.

For example, sometimes
Claude just obsessively reads files and goes
on massive tangents. Then when I stop it and
ask, "why are you doing that?", it kindly
apologizes and admits it shouldn't have gone
on a tangent.

The token burn if I don't stop it
would be quite high.

Granted, this might be
because I'm not giving it optimal
prompt/negative-prompt instructions though.

      > > > chamomeal
I just check the git diff after claude code
writes stuff. Stage things before letting it
run wild so I can undo whatevs.

        > > > > noworriesnate
That's expensive though. The sooner you
stop it from acting out the less you spend
on a rabbit trail.

      > > > tclancy
How is it different from Keep / Discard in
other tools? I've been slowly converting my
git repositories to jj locally because that
gives me more granular fallback and mix and
match options.

        > > > > digitaltrees
Well I tried Claude Code for the first
time in a while (I am building my own
coding app www.propelcode.app so I can
code on my phone when I take my kids to
classes and such) and it literally ignored
my question and suggestion and just kept
coding away.

      > > > imtringued
I hate the accept reject flow, because I want
a conventional code review workflow where I
can write comments on specific lines of code
and maybe edit the code myself.

If I reject,
then the AI will struggle to modify just the
parts I disagree with; if I accept, the AI
will tend towards adding code rather than
updating the bad code.

At that point copy paste
without agentic coding tends to work much
better.

    > > echelon
Fable makes any IDE AI integration almost entirely
unnecessary. Claude one shots pretty much
everything, and fixing any small errors is easier
when just talking to Claude again.

Anthropic is
going to offer better pricing using their agentic
harness. Why pay more for less?

An IDE at this
point is best as a tool for code review. They need
to start building better code review tools.

      > > > hakfoo
I can't quite understand the "fixing small
errors is easier when just talking to Claude"
flow.

I tried having it write some tests today.
It got very close to what I want, but picked a
stupid set of input values (two fields that
look independent that should only be used with
related values). I thought about "how do I
explain this" and then just went in and fixed
it myself.

How is it easier to write "Okay, go
back to testBlah and change xxx to yyy" versus
clicking on XXX in the IDE and typing YYY by
hand? Maybe if you had 500 faulty tests and
were forbidden from using search-and-replace
for some reason.

It makes sense when code
generation is the limiting factor, but I end
up with a lot of changes where the actual code
delta is smaller than the necessary prompt to
convince the bot to produce it.

        > > > > tobyhinloopen
Try the superpowers plugin, let it write a
spec (what do you want?) and a plan (how
is it implemented). Then let it implement
the plan.

Review each step as much as you
care. These things take time so you can
just do other stuff while it's
cooking.

With proper isolation of projects
you can easily have multiple sessions in
parallel. I frequently have 4 to 8
parallel Claude Code sessions, each with
whole trees of agents reproducing,
speccing, planning, implementing and
reviewing things.

For common mistakes, you
can make it remember things or rely on
reviews.

      > > > slopinthebag
Some of us are working on things that Claude
can't one shot. Like, not even close.

Also
https://xcancel.com/mitchellh/status/2066657032938442833#m

I really don't see IDEs going out
of fashion anytime soon.

      > > > hackermanai
> Claude one shots pretty much everything

What?

> An IDE at this point is best as a tool for code
review.

I heard from a
friend that most devs building serious stuff
still write code. It's shocking but true. (No
code review needed.)

  > baq
the reason to use cursor nowadays isn't the IDE
(though it's helpful perhaps once a week), but how it
makes running models from multiple providers trivial
out of the box. I don't have to juggle keys or drop to
a shell tool call, it supports calling out to e.g.
gemini in a subagent natively. I have multiple models
cross-reviewing plans and diffs as a matter of
course.

claude code was seriously annoying with the
flickering, maybe it's fixed now, I don't know.

cursor also has a (bad) cli if you need it, it seems it's
mostly used to set up remote agents, but it does the
job in a pinch.

    > > stavros
OpenCode and Pi do those things as well, and
without a whole annoying IDE bundled in.

      > > > infecto
OpenCode is miserable from a security
perspective. Well clarification the plans they
offer where they bundled in free models that
train on your use. You are then left to use
OpenRouter, which I find pretty flaky for at
least the leading Chinese models.

        > > > > jeremyjh
I doubt most people use OpenCode coding
plans, nor do they use OpenRouter. I use
subscription plans from ChatGPT, z.ai,
MiniMax & Xiaomi with OpenCode. It handles
authentication with all of them
seamlessly. I switch between models based
on task/subtask and based on usage limits.
You can get the most value out of a lot of
these plans at their second-tier and they
are often switching in value relative to
each other, so it makes sense to arbitrage
them like this.

Most of that switching is
automated (oh-my-openagent - defaults
sub-tasks to different roles, so for
example I use MiniMax for explorer tasks
and GPT 5.5 for deep design & review
tasks, and GLM 5.2 for general
orchestrator & most coding). If I hit
usage limits it switches to a backup for
that task. I'm not sure Cursor
authenticates with all the subscription
coding plans from all those companies -
but if it does it can't be doing it any
better.

I run it in a sandbox and it's not
phoning home.
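A minimal sketch of the role-based routing with fallback described above, in Python. This is not how oh-my-openagent or OpenCode actually implement it; the `ROUTES` table, the `pick_model` helper, and the backup choices are hypothetical, and only the primary model per role follows the comment.

```python
# Hypothetical role -> ordered list of preferred models.
# Primary choices mirror the comment above; backups are assumptions.
ROUTES = {
    "explorer": ["minimax", "glm-5.2"],
    "design_review": ["gpt-5.5", "glm-5.2"],
    "orchestrator": ["glm-5.2", "gpt-5.5"],
}


def pick_model(role, exhausted):
    """Return the first model for this role that has not hit its usage limit."""
    for model in ROUTES[role]:
        if model not in exhausted:
            return model
    raise RuntimeError(f"no model with remaining quota for role {role!r}")


# Normal case: the primary model for the role is used.
print(pick_model("explorer", exhausted=set()))

# Usage limit hit on the primary: fall through to the backup.
print(pick_model("explorer", exhausted={"minimax"}))
```

Keeping the preference order in a plain table means swapping plans as their relative value changes is a one-line edit rather than a code change.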

          > > > > > infecto
Which is cool but I think it's an
important callout because it's shady
how they do it in my opinion.

        > > > > stavros
I just use my ChatGPT subscription with
it. Not sure what you mean about security.

          > > > > > infecto
"Well clarification the plans they
offer where they bundled in free
models that train on your use."

Just
what I said. They offer paid plans
through their tool. Said paid plans
are kind of a dark pattern where it's
not immediately obvious the models are
training on your data. The harness is
fine but that kind of business turns
me off and I am usually pretty neutral
about those sorts of things.

    > > loufe
For what it's worth, flickering in CC has been
fixed since around the beginning of the year.

      > > > g42gregory
I still saw a lot of flickering in VS Code (I
simply use CC as a terminal in VS Code,
without the plugin) as of 2 weeks ago. I think
it's a combination of CC bugs + Electron(?)
rendering that VS Code uses for the terminal.

Moved on to Zed (native Rust rendering) 2 weeks ago
-> nothing flickers.

Sadly, with Fable 5
cutoff, I am actively exploring CC
alternatives. Pi/OMP.sh works great as an
agent (definitely better than CC). GPT is
seemingly not as good as Opus, but with better
agent and better skills, it probably won't
matter anyway. GPT lets you use any agent on
Pro subscription.

      > > > xdennis
Maybe flickering, but it's still broken in
various ways. Only a few days ago I had an
issue where the text I was typing was outside
of the textbox frame. Resizing the terminal
still maintained the broken view.

      > > > dwaltrip
The rendering still breaks many times a day
for me, in fairly catastrophic ways. Usually
because I have the audacity to resize my
terminal window.

Ctrl+c -> new tab -> `claude --resume`
is deeply ingrained at this point.

  > marcuschong
It's curious that the person claiming LLMs will soon
skip code entirely and go straight to binary is
willing to spend $60bn on Cursor.

    > > zs234465234165
I'm sure he has a good reason

  > sergiotapia
On the flipside, I enjoy Cursor now and came back to
it after leaving it over a year ago. The 2.5 model is
fast as hell and very good. And whatever harness they
have it's terrific, great results. I also really enjoy
the fact that I can open my website in the Cursor
in-app browser and just click and reference stuff.
It's a really cracked workflow. The models can only
get better for them.

    > > jr3592
I would also add that Cursor's "Debug" harness is
incredible. Hit "Tab" in the AI editor to Tab
through the options (Plan, Multitask, Ask, etc.)

If you do any kind of on-device work, it will spin up
a local HTTP log server, and pipe logs from your
real device (phone, hardware, etc.) to the server
and do realtime debugging.

Claude will mostly
guess, have you copy + paste logs, etc.

    > > chasd00
> I can open my website in the Cursor in-app
browser and just click and reference stuff.

I've never used cursor and have only seen it in a
couple work lunch and learn demos. I've never seen
that feature. I have a lot of use cases where I'm
asking cc to move a widget down a little bit or
make a data table full width etc. Being able to
reference the actual UI would be useful.