OpenAI Acquires Rockset (openai.com)
249 points by colesantiago 9 months ago | 86 comments



Very unexpected acquisition. I don't think that Rockset is a suitable infrastructure for RAG, a purpose-built inverted index would be far more efficient (both in terms of compute and storage), so I'm not sure how much of the technology would actually be useful for them.

I can think of two options

- Pure acqui-hire: virtually all of Rockset engineering leadership is ex-Meta, and OpenAI has been hiring several senior infra engineers from Meta, so these are all people that have worked together previously.

- OpenAI is building some product where customers can ingest large amounts of data, which could be managed by the Rockset infrastructure as source of truth, and then indexed by their RAG systems.


Giuseppe! Long time no see. Rockset’s architecture changed somewhat since we last talked— not in fundamental ways, but in ways that would alleviate your concerns.

If you want to talk (not secret) technical details, you know where to find me :)

-Tudor.


I guess I stand corrected then :)

(Hi!)

EDIT: I forgot to say, with the recent hires and the Rockset team, OpenAI is building quite the infra dream team :)


RAG doesn't have to involve vector search.

The (very thin) blog post said "Enhancing our retrieval infrastructure" - my guess is this is more about other forms of retrieval, like constructing and executing SQL queries and using the results to help answer questions.
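To make the "SQL as retrieval" idea concrete, here is a minimal sketch using only Python's standard `sqlite3` module. The `orders` table, the hand-written query, and the helper names `retrieve_rows` and `build_prompt` are all made up for illustration. In a real system the SQL would presumably be generated by the model and the prompt sent to an LLM; both of those steps are left out here, and none of this is a claim about how OpenAI or Rockset actually do it.

```python
import sqlite3


def retrieve_rows(conn, sql, params=()):
    """Run a SQL query and return the result rows as plain dicts."""
    conn.row_factory = sqlite3.Row  # rows become addressable by column name
    cur = conn.execute(sql, params)
    return [dict(row) for row in cur.fetchall()]


def build_prompt(question, rows):
    """Paste the retrieved rows into the prompt as grounding context."""
    context = "\n".join(
        ", ".join(f"{key}={value}" for key, value in row.items())
        for row in rows
    )
    return f"Context rows:\n{context}\n\nQuestion: {question}"


# Hypothetical data: a tiny in-memory table standing in for a real warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("acme", 120.0), ("acme", 80.0), ("globex", 50.0)],
)

# Hand-written here; an LLM would normally produce this from the question.
sql = (
    "SELECT customer, SUM(total) AS spend "
    "FROM orders GROUP BY customer ORDER BY spend DESC"
)
rows = retrieve_rows(conn, sql)
prompt = build_prompt("Which customer spent the most?", rows)
```

The retrieval step is an ordinary aggregate query, with no embeddings or vector index involved, and the "augmentation" is just the query result placed ahead of the question.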


> RAG doesn't have to involve vector search.

This. Not sure why RAG triggers vector search for everyone. Retrieval Augmented Generation is as generic as it can get.


Most likely for the same reason that so many people seem to think they need a vector-specific database and a framework like langchain to build any type of GenAI-enabled application... the content marketing is working.


Last time I heard of Rockset was at the Snowflake Summit where they positioned as a faster DWH.

Looking at the landing page now it seems they almost pivoted into semi/unstructured data.

To your point, I feel like nobody knows exactly how to do RAG really well (fast and accurate). I also doubt the Rockset team has it figured out but it seems like there is an opportunity to build a new kind of database/memory system and OpenAI believes the Rockset team can help.


I think OpenAI also realized they're an AI major without a dance partner, when it comes to context.

Google (Android, Gmail, Maps, G Office), Apple (iPhone, Mail, Maps, Productivity), Microsoft (Office365, Windows, XBox).

In terms of moat and lock-in, that leaves OpenAI vulnerable to last mile customer hijacking.


OpenAI has billions of dollars and nothing but GPUs to spend it on. This isn’t strategic per se, it’s just rollup. Good place to be in for any data-adjacent product company.

Google and Amazon followed the same strategy for over a decade just buying anything that was possibly helpful.


I would speculate that OpenAI is in a phase where speed of delivery is make-or-break, and any bloat would be a distraction. I bet they're extremely deliberate in their acquisitions.


Does OpenAI use Rockset internally? I feel like I have some vague memory about that… in which case, the acquisition would make sense from a continuity of business perspective.


they were using qdrant for RAG as of November 2023. Not sure if it's changed since then.

https://x.com/simonw/status/1722011967886688696


A database is core to your infrastructure...finding out your database is going away is a horrifying situation. Finding out that the time you have to migrate is a few months. Agh.

As others will say, there are options. Rockset helpfully posts links to a bunch of comparisons on their website, and these alternatives include ClickHouse, Elasticsearch, Druid, etc.: https://rockset.com/real-time-analytics-comparison/

I'm inherently biased (as a member of the ClickHouse team). But do check ClickHouse out.

You can always come hang out in our Slack (clickhouse.com/slack) and, of course, the combination of hosted ClickHouse (clickhouse.com/cloud) and the open-source (github.com/clickhouse) may add a bit of comfort when your vendor up and disappears via acquisition.


To anyone else who may be confused like I was: Rockset will, in fact, "gradually transition current customers off Rockset". The OpenAI announcement linked above doesn't say so, but the Rockset announcement does:

https://rockset.com/blog/openai-acquires-rockset/

Month-to-month users have been given until September 30, which is a very short amount of time for a major infrastructure transition. Enterprise users are given a vague "talk to your account manager" answer:

https://docs.rockset.com/documentation/docs/faq

In other words, the above isn't just FUD from a competitor, there legitimately are going to be a lot of frantic refugees in the coming months.


I’ve no skin in this particular game that I know of but this migration period is really, really short.


Especially if you’re in certain parts of Northern Europe where it’s common to take the whole of July off work.


Jesus, having read the releases, this comment should go up more, given that I suspect a lot of shops will not have enough time to migrate.


https://docs.rockset.com/documentation/docs/faq

Rockset is offboarding existing customers. Definitely sucks, since we spent the last 3 months adopting it. We used it to replicate DynamoDB in near real time for ad hoc & reporting queries. The schemaless architecture was very easy to work with.


Ideally, success of customers should mean success of company but the incentives are wildly misaligned here. It seems perfectly "okay" from the company perspective since they are in it to make money but doing that by telling their customers to "leave because we got acquired and no longer care about what we sold you last week" is really harsh.


If you're a corporate customer, then you take that risk when you sign up for a month to month contract. Now usually a month to month works very well for you, but it is a risk you accept.

A lot of corporate customers will seek longer term contracts, a year or even longer, so they can lock in a price and various service guarantees. Even in the case of this acquisition it's only customers on a month to month plan that have to migrate by September, customers on a long term contract will continue to have access and support for the duration of their contract.


Sure but all of their contract customers still need to find a replacement


If you want a service guarantee for longer than 1 month don't sign a contract for 1 month at a time.

If you want a guarantee of more time to transition you are expected to pay for the privilege.


Makes sense. Thanks.


Yikes! 30 Sep cut off? That's not a lot of lead time, given the amount of work database systems require to data model, benchmark and migrate. My apologies if this seems inappropriate but given the urgency, my employer, StarTree, has a free tier if anyone needs to try out alternate solutions.

https://startree.ai/saas-signup

I understand Rockset-to-StarTree (Apache Pinot) is not a 1:1 drop-in replacement. But hopefully it's a port in a storm.

Whether you end up on StarTree or another suitable alternate, I hope everyone has as painless a migration as possible. Reminds me a bit of how FoundationDB customers found themselves without a home when Apple acquired them back in 2015[0].

[0] https://news.ycombinator.com/item?id=9259986


Hi PeterCorless! (We're friends IRL - it's Greg)

While we're putting in plugs for open source alternatives, I'll recommend looking at StarRocks. https://www.starrocks.io/

I share Peter's sentiment for wishing everyone an easy transition, whatever you choose.


Seconding the StarRocks project, best performance out there and the community is great. Tons of support.


Just adding to what Greg mentioned: if you want to learn more about StarRocks or have any questions, feel free to reach out to us in the StarRocks community on Slack: https://try.starrocks.com/join-starrocks-on-slack.


Thank you, Peter. This timeline is brutal due to how deeply we integrated Rockset into multiple applications. This definitely obliterates any plans I had for Q3 or my summer.

I'm worried more platforms like StarTree, SingleStore, etc will follow suit in the next 24 months.

Any thoughts or assurances on this? Thank you for your post.


Don't these M&As need to be cleared by regulators? Seems premature to tell your customers you're discontinuing your product before the acquisition has actually cleared (and, for that matter, passed OpenAI's due diligence).


Tiny acquisitions which don't affect anything don't need sign off from any regulator. They likely signed the deal and closed it in one go.


How exactly does that work? Is there a formal threshold for when a regulator will want to take a look?


Yeah, the FTC and DOJ publish guidelines with thresholds, and you basically just follow them.

E.g. https://www.ftc.gov/enforcement/premerger-notification-progr...

https://www.ftc.gov/enforcement/premerger-notification-progr...

There are like 10k+ mergers and acquisitions done in the US each year (ballpark). It requires real analysis to figure out if something should be blocked (practically none have any real effect on anything and shouldn't be) and there are only so many folks at the regulators who can do that analysis (and honestly... they aren't good at it...).


According to [0] the acquisition has already been completed.

[0] https://rockset.com/blog/openai-acquires-rockset/


Usually no.

I’ve been around for a few M&A that horrendously fucked customers of the “acquired” company and the regulator doesn’t care. Even if the acquirer is under regulatory observation.


> How long will service remain available?

> Month-to-month customers without an active contract will have until Monday,

> September 30th, 2024, 5 PM PDT to off-board.

I'd love to hear from someone with expertise in vendor onboarding and business continuity risk: how do vendor contracts typically protect customers in situations like this?

I'm sure customers will be super frustrated with a datastore vendor change, which would need nontrivial resources, from product development to system migration, in such a short span of time.


Vendor contracts typically have both termination and change of control clauses. In general, you can negotiate for more security, but you will pay heavily for it. The typical contract though contains very little in the way of customer protection.

Technically, when companies choose a vendor, they should consider risks like a company suddenly being acquired. In practice, it’s quite hard to assign an actual number to that column - it’s almost always a risk but it’s extremely hard to quantify. You’ll often hear things like “every vendor could be acquired so that counts equally for all choices” when that risk gets discussed.


> I'd love to hear from someone with expertise in vendor onboarding and business continuity risk: how do vendor contracts typically protect customers in situations like this?

The short answer: they generally don't, unless you negotiate for it. I run a company and dealing with this kind of situation (or better still anticipating it) is part of my job.

The Rockset situation generally falls under "termination for convenience" where a party of the contract terminates for reasons other than cause, e.g., bankruptcy or the other party violating contract terms. Taking the Rockset TOS as an instructive example, it covers customer's termination for convenience in Section 15.2. [0] However, there's nothing about Rockset terminating for convenience.

Normally this ambiguity could cause legal problems for the vendor, but Rockset added a 'get out of jail free card' in Section 2. They can just change the ToS.

> 2. Changes to Agreement or Services. Rockset may update this Agreement at any time, in its sole discretion. If Rockset does so, it will let Customer know either by posting the updated Agreement on the Site or through other communications. If Customer continues to use the Services after Rockset has posted updated Agreement, Customer agrees to be bound by the updated Agreement. Because the Services are evolving over time, Rockset may change or discontinue all or any part of the Services, at any time and without notice, at its sole discretion.

I am not a lawyer but (a) this is an awful contract for customers and (b) all is not lost. Your best recourse is to check with a lawyer to see what footholds you can use (probably few), then make a public stink about destroying your data. I sincerely doubt the new owners will want to deal with that and will extend support.

The next time around, get a lawyer to help and negotiate terms. Pro tip: Smaller vendors are often more flexible, but all of them negotiate if they want the deal badly enough.

[0] https://rockset.com/legal/terms-of-service/


> how do vendor contracts typically protect customers in situations like this?

That's in the termination clause.


There is typically a change of control clause too.


As others have mentioned, this acquisition leaves many Rockset customers in a tough spot with a short timeline to migrate. I'd like to bring attention to a potential alternative: RisingWave (https://risingwave.com/). RisingWave is an open-source streaming database designed for real-time analytics and data processing. Like Rockset, it offers PostgreSQL compatibility and impressive ability to handle both streaming and batch data.

What sets RisingWave apart is its focus on stream processing while maintaining SQL compatibility. This could be particularly valuable for users leveraging Rockset's real-time capabilities. RisingWave offers several features that may appeal to Rockset users. It's built to scale in cloud environments and can ingest data from a large variety of sources. The database supports materialized views for efficient query processing and ensures data consistency with ACID transactions. For those concerned about vendor lock-in after this experience, RisingWave's open-source nature (Apache 2.0 license) provides an extra layer of assurance. There's also a managed cloud offering for those who prefer a hands-off approach.

I encourage impacted Rockset users to explore RisingWave as part of their evaluation process. The project has a welcoming community(join at risingwave.com/slack) and extensive documentation to help with the transition. [Disclosure: I'm associated with RisingWave. Happy to answer any questions or provide more details about how it compares to Rockset for specific use cases.]


I'll always remember Rockset for their ridiculous comparison page: https://rockset.com/real-time-analytics-comparison/

Maybe they should rename it to their migration options page. Or maybe I'll just ask ChatGPT what the best alternative is...

Still, pretty useful stuff, but it also feels like Rockset had been moving a little too slowly in recent years, but congrats to them on finding a new home.


It was funny seeing their front page saying "World's faster analytical and search database" has 90MB/s streaming ingest speed..


These pages are done for SEO. You get loads of inter-linked pages rich with keywords that match user searches exactly.


OpenAI Eng Mgmt: "Hey, we really like this rockset thing we've been using, but we don't have the people to build it out as fast as you want."

OpenAI Leadership: "Ok, buy Rockset and have them build anything you need."

OpenAI Eng Mgmt: "... Ok. You want to run a db service?"

OpenAI Leadership: "No. Dump all the existing customers. They build for us now."


Yes, that's how acqui-hiring goes. Idk. Anything is noteworthy when OpenAI does it, I suppose?


It is surprising every time you see a non-profit behave exactly like a for-profit. You'd think there would be some difference, but no apparently we see there is basically none.

At least I have never seen a non-profit acquihire before.


OpenAI isn’t a nonprofit.

https://openai.com/our-structure/


That and the fact that Rockset has been around for a while and has accumulated a decent number of users who are now in search of alternatives.


I have tested Rockset for competitive analysis.

Good parts:

It has a slick and nice-looking UI. Good documentation. Many data loading options (including S3).

SQL support is good (Calcite?). Types are inferred on data loading. But you have to choose one "timestamp" column.

Bad parts:

First data load attempts failed (after 24 hours, it showed something like "too many retries").

I've loaded around 500 million rows, and the storage limit ran out.

Query performance did not shine. Storage size was very large (it seems they create many indices automatically).

Considerations:

The technology is not open-source. It is rocksdb + secondary indices + object storage + SQL engine.
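For anyone wondering what "key-value store + secondary indices" looks like in practice, and why indexing every field inflates storage, here is a toy sketch. It is not Rockset's implementation: a plain Python dict stands in for RocksDB, and the class name `TinyIndexedStore` and the key layout are invented for illustration.

```python
class TinyIndexedStore:
    """Toy document store: one flat key-value map holds docs and index entries."""

    def __init__(self):
        # Stand-in for a sorted KV store such as RocksDB (assumption: keys are
        # tuples here; a real store would encode them as byte strings).
        self.kv = {}

    def put(self, doc_id, doc):
        # Drop index entries of any previous version so lookups stay correct.
        old = self.kv.get(("doc", doc_id))
        if old is not None:
            for field, value in old.items():
                self.kv.pop(("idx", field, value, doc_id), None)
        # Primary record.
        self.kv[("doc", doc_id)] = doc
        # One secondary-index entry per field: every field is indexed.
        for field, value in doc.items():
            self.kv[("idx", field, value, doc_id)] = b""

    def find(self, field, value):
        # A sorted KV store would serve this with a prefix scan over
        # ("idx", field, value); the dict forces a linear filter instead.
        return sorted(
            key[3]
            for key in self.kv
            if key[0] == "idx" and key[1:3] == (field, value)
        )


store = TinyIndexedStore()
store.put("a", {"city": "SF", "plan": "free"})
store.put("b", {"city": "SF", "plan": "pro"})
matches = store.find("city", "SF")
total_keys = len(store.kv)  # 2 documents plus 4 index entries
```

Each document with n fields costs 1 + n keys in this layout, which is one plausible reason an index-everything design ends up with a large storage footprint.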


> The technology is not open-source

Not entirely fair - see https://github.com/rockset/rocksdb-cloud (a fork of RocksDB with a separation of storage and compute, using S3 and Lambda-based compaction)


It's clear OpenAI badly wants to get to a place where the Support and R&D departments of big companies can dump every disjointed scrap of info they've been collecting for years, into a massive bucket, let OpenAI's servers cook it for a while and then like magic, let managers ask the Borgian result.. stuff. Why is this process failing? What's not relevant? What stuff that we've demoted in importance, isn't? etc etc etc


This is exactly it here.

It’s been very common to see startups, many of whom have never set foot in an enterprise, push this idea that you can drop an LLM on top of company data and ask questions like it was ChatGPT. The reality is that most company data is a mess with little funding/will to fix it and so the results are unusable. So if OpenAI wants to be anything more than a chatbot they will need to start to tackle this problem.

Amazing to watch their aspirations go from such lofty heights to being just another enterprise data infrastructure SaaS company.

And should be a clear sign that the AI hype train has run out of steam.


Congrats to the team. IIRC their CTO was the creator of RocksDB.


RocksDB is a fork of LevelDB created by Jeffrey Dean and Sanjay Ghemawat at Google.


LevelDB was like their hobby project and was built mostly for Chrome's Indexed DB. RocksDB brought it to a much higher level with a lot of dedication.


Rockset users scrambling to find a new real-time analytics home....

I'm putting together a public/free Rockset feature comparison matrix.

I want to help educate customers (and other vendors) on what Rockset did so well and what considerations are needed to find a replacement.

https://bit.ly/rockset-feature-comparison

HN'ers I realize you are wizards, don't shoot the messenger. Just trying to help.


Great initiative making a list of possible Rockset replacements. Would it be possible to open the Notion page for guest contributions?

I would like to add CrateDB (I work there) to the list. CrateDB is a distributed SQL database purposely built for real-time analytics across large datasets of structured and semi-structured data. Similarly to Rockset, it indexes all data in real-time (text, vector, geospatial, time-series, and JSON) for the most efficient search and fast ad hoc query execution at any scale. It is built on top of Apache Lucene and unlike Rockset is open-source (https://github.com/crate/crate).

Rockset frequently comes up among other solutions our users were looking at before choosing CrateDB. For example https://cratedb.com/customers/govspend.


Yes, Sergey, you are now an editor. Thank you for contributing. I’m going to dig into CrateDB soon.


Looks like we'll finally be able to search our past ChatGPT convos without having to Ctrl-F the data export


AI is a difficult space to be a customer in. All customers/investors/etc want you to add "AI" to your products, but for the majority of people that means using a vendor, and the churn in the space is shocking.

It's this point where my gratitude for Llama and Meta is extremely high.


I want to add my sympathies for those looking at migration in the near future. At least there are tons of good projects out there these days!

If it is helpful to anyone our company provides a data platform designed for high speed analytics at scale. It's certainly not a 1:1 alternative, but if you are looking for an all-in-one solution with a lot fewer moving parts we might be the right answer. We're happy to provide an extended free trial through September 30 for anyone looking at migration alternatives:

https://docs.minusonedb.com/#start-your-free-trial

In any case we wish everyone an easy transition to your next solution!


Q3 just got derailed for all Rockset customers.

These migrations are going to be complicated as there is no 1-1 drop-in replacement. They will touch every aspect of the data lifecycle, from ingestion and transformation to serving. Query optimization will also have to be redone/rethought.

At Propel, we just announced our Rockset migration service to help customers through this process: https://www.propeldata.com/blog/rockset-migration-service


At what point is a comment just an ad?


Is this a death knell to many of the vector DBs pushing RAG solutions right now?

(Or maybe it’s validation in RAG and these companies should rejoice)


"RAG" is more of a concept than a specification. So the Cambrian explosion of how to actually do it will continue unabated.

Likewise, I don't think it's going to stem the tide of adding vector indexes and similarity search techniques to traditional databases.

Instead, if anything, I think this is a validation that traditional databases aren't going anywhere — OLAP or OLTP. Behind all the LLM models you're still going to need true, authoritative data in databases to avoid (or at least minimize) the hallucination problem.

AI needs, if anything, even more programmatic ways to get at that data.


Nah, that was Postgres vector extensions


Your workload is my opportunity - OpenAI, probably


Not a good look for OpenAI. Shows a lack of confidence in their internal prospects to push the needle if they’re already considering inorganic growth alternatives.


They couldn’t get GPT5 to write a clone for them?


Seems a bit of an odd fit for OpenAI. But I assume they had good reasons.


I wonder how much money OpenAI dangled in front of Rockset C-level execs and board to agree to acquisition. Seems company was founded in 2016 (8 years ago) [1]

With investment from vulture capitalists to the tune of $117M. [2] I would assume they want a sizeable return on investment, so maybe a $250-350M cash deal?

Doesn’t seem like this would be a unicorn, but it’s a payoff. Certainly will cover the losses from a few bad investments.

[1] https://venturebeat.com/ai/openai-acquires-rockset-to-streng...

[2] https://www.crunchbase.com/organization/rockset/company_fina...


A lot of stock


Just trying out my first comment on Hacker News in a move away from X. Please ignore this. Thanks


What, in particular, are you getting away from? And why HN (or have you been lurking)?


Hello!


> [flagged] [dead] OpenAI Acquires Rockset (openai.com)

I vouched for this because it seems relevant, and I saw no reason in the comments to flag it.


Really surprised to hear that they will be shutting down the SaaS business and all existing customers will need to offboard by the end of September.

Quite a few of my customers build on top of Rockset and it won’t be a smooth transition.


Classic VC funded saas.

Why people would build actual businesses on top of these fly-by-night SaaS companies funded by VC money is beyond me.


It's not really classic. There are many different kinds of exits, this one is pretty uncommon in my experience, especially for a company that's been around as long as it has and with the customers they have.


For another very similar example see FoundationDB, which was acquired and shuttered by Apple: https://news.ycombinator.com/item?id=9259986

More happily FoundationDB was later released as open source and is successfully used by many companies beyond Apple. Snowflake is a prominent example. [0]

[0] https://www.snowflake.com/blog/how-foundationdb-powers-snowf...


yes, this does happen, but when someone says "classic" it kind of implies "common" - this isn't a common outcome. I am directly affected by this one, so, trust me, I know, it sucks.


Can we all agree that OpenAI should be banned from any kinds of corporate acquisitions? Ditto for Microsoft/Google/Meta, obviously


Why so? At the end of the day it's all business. If we start talking about ethics that way, most of the productivity or developer SaaS acquisitions would not happen, since they are done by FANG only.


No, why on earth?



