Very unexpected acquisition. I don't think that Rockset is a suitable infrastructure for RAG; a purpose-built inverted index would be far more efficient (both in terms of compute and storage), so I'm not sure how much of the technology would actually be useful for them.
I can think of two options
- Pure acqui-hire: virtually all of Rockset engineering leadership is ex-Meta, and OpenAI has been hiring several senior infra engineers from Meta, so these are all people that have worked together previously.
- OpenAI is building some product where customers can ingest large amounts of data, which could be managed by the Rockset infrastructure as source of truth, and then indexed by their RAG systems.
Giuseppe! Long time no see. Rockset’s architecture has changed somewhat since we last talked: not in fundamental ways, but in ways that would alleviate your concerns.
If you want to talk (not secret) technical details, you know where to find me :)
The (very thin) blog post said "Enhancing our retrieval infrastructure" - my guess is this is more about other forms of retrieval, like constructing and executing SQL queries and using the results to help answer questions.
Most likely for the same reason that so many people seem to think they need a vector-specific database and a framework like langchain to build any type of GenAI-enabled application... the content marketing is working.
Last time I heard of Rockset was at the Snowflake Summit where they positioned as a faster DWH.
Looking at the landing page now, it seems they almost pivoted into semi/unstructured data.
To your point, I feel like nobody knows exactly how to do RAG really well (fast and accurate). I also doubt the Rockset team has it figured out but it seems like there is an opportunity to build a new kind of database/memory system and OpenAI believes the Rockset team can help.
OpenAI has billions of dollars and nothing but GPUs to spend it on. This isn’t strategic per se, it’s just a rollup. Good place to be in for any data-adjacent product company.
Google and Amazon followed the same strategy for over a decade just buying anything that was possibly helpful.
I would speculate that OpenAI is in a phase where speed of delivery is make-or-break, and any bloat would be a distraction. I bet they're extremely deliberate in their acquisitions.
Does OpenAI use Rockset internally? I feel like I have some vague memory about that… in which case, the acquisition would make sense from a continuity of business perspective.
A database is core to your infrastructure...finding out your database is going away is a horrifying situation. Finding out that the time you have to migrate is a few months. Agh.
As others will say, there are options. Rockset helpfully posts links to a bunch of comparisons on their website, and these alternatives include ClickHouse, Elasticsearch, Druid, etc. https://rockset.com/real-time-analytics-comparison/
I'm inherently biased (as a member of the ClickHouse team). But do check ClickHouse out.
You can always come hang out in our Slack (clickhouse.com/slack) and, of course, the combination of hosted ClickHouse (clickhouse.com/cloud) and the open-source (github.com/clickhouse) may add a bit of comfort when your vendor up and disappears via acquisition.
To anyone else who may be confused like I was: Rockset will, in fact, "gradually transition current customers off Rockset". The OpenAI announcement linked above doesn't say so, but the Rockset announcement does:
Month-to-month users have been given until September 30, which is a very short amount of time for a major infrastructure transition. Enterprise users are given a vague "talk to your account manager" answer:
Rockset is offboarding existing customers. It definitely sucks that we spent the last 3 months adopting it. We used it to replicate DynamoDB in near real time for ad hoc and reporting queries. The schemaless architecture was very easy to work with.
Ideally, success of customers should mean success of company but the incentives are wildly misaligned here. It seems perfectly "okay" from the company perspective since they are in it to make money but doing that by telling their customers to "leave because we got acquired and no longer care about what we sold you last week" is really harsh.
If you're a corporate customer, then you take that risk when you sign up for a month to month contract. Now usually a month to month works very well for you, but it is a risk you accept.
A lot of corporate customers will seek longer term contracts, a year or even longer, so they can lock in a price and various service guarantees. Even in the case of this acquisition it's only customers on a month to month plan that have to migrate by September, customers on a long term contract will continue to have access and support for the duration of their contract.
Yikes! 30 Sep cut off? That's not a lot of lead time, given the amount of work database systems require to data model, benchmark and migrate. My apologies if this seems inappropriate but given the urgency, my employer, StarTree, has a free tier if anyone needs to try out alternate solutions.
I understand Rockset-to-StarTree (Apache Pinot) is not a 1:1 drop-in replacement. But hopefully it's a port in a storm.
Whether you end up on StarTree or another suitable alternate, I hope everyone has as painless a migration as possible. Reminds me a bit of how FoundationDB customers found themselves without a home when Apple acquired them back in 2015[0].
Just adding to what Greg mentioned: if you want to learn more about StarRocks or have any questions, feel free to reach out to us in the StarRocks community on Slack: https://try.starrocks.com/join-starrocks-on-slack.
Thank you, Peter. This timeline is brutal due to how deeply we integrated Rockset into multiple applications. This definitely obliterates any plans I had for Q3 or my summer.
I'm worried more platforms like StarTree, SingleStore, etc will follow suit in the next 24 months.
Any thoughts or assurances on this? Thank you for your post.
Don't these M&As need to be cleared by regulators? Seems premature to tell your customers you're discontinuing your product before the acquisition has actually cleared (and, for that matter, passed OpenAI's due diligence).
There are like 10k+ mergers and acquisitions done in the US each year (ballpark). It requires real analysis to figure if something should be blocked (practically none have any real effect on anything and shouldn't be) and there are only so many folks at the regulators who can do that analysis (and honestly... they aren't good at it...).
I’ve been around for a few M&A that horrendously fucked customers of the “acquired” company and the regulator doesn’t care. Even if the acquirer is under regulatory observation.
> Month-to-month customers without an active contract will have until Monday,
> September 30th, 2024, 5 PM PDT to off-board.
I'd love to hear from someone with expertise in vendor onboarding and business continuity risk: how do vendor contracts typically protect customers in situations like this?
I'm sure customers will be super frustrated with a datastore vendor change, which would need nontrivial resources, from product development to system migration, in such a short span of time.
Vendor contracts typically have both termination and change of control clauses. In general, you can negotiate for more security, but you will pay heavily for it. The typical contract though contains very little in the way of customer protection.
Technically, when companies choose a vendor, they should consider risks like a company suddenly being acquired. In practice, it’s quite hard to assign an actual number to that column - it’s almost always a risk but it’s extremely hard to quantify. You’ll often hear things like “every vendor could be acquired so that counts equally for all choices” when that risk gets discussed.
> I'd love to hear from someone with expertise in vendor onboarding and business continuity risk: how do vendor contracts typically protect customers in situations like this?
The short answer: they generally don't, unless you negotiate for it. I run a company and dealing with this kind of situation (or better still anticipating it) is part of my job.
The Rockset situation generally falls under "termination for convenience" where a party of the contract terminates for reasons other than cause, e.g., bankruptcy or the other party violating contract terms. Taking the Rockset TOS as an instructive example, it covers customer's termination for convenience in Section 15.2. [0] However, there's nothing about Rockset terminating for convenience.
Normally this ambiguity could cause legal problems for the vendor, but Rockset added a 'get out of jail free card' in Section 2. They can just change the ToS.
> 2. Changes to Agreement or Services. Rockset may update this Agreement at any time, in its sole discretion. If Rockset does so, it will let Customer know either by posting the updated Agreement on the Site or through other communications. If Customer continues to use the Services after Rockset has posted updated Agreement, Customer agrees to be bound by the updated Agreement. Because the Services are evolving over time, Rockset may change or discontinue all or any part of the Services, at any time and without notice, at its sole discretion.
I am not a lawyer but (a) this is an awful contract for customers and (b) all is not lost. Your best recourse is to check with a lawyer to see what footholds you can use (probably few), then make a public stink about destroying your data. I sincerely doubt the new owners will want to deal with that and will extend support.
The next time around, get a lawyer to help and negotiate terms. Pro tip: Smaller vendors are often more flexible, but all of them negotiate if they want the deal badly enough.
As others have mentioned, this acquisition leaves many Rockset customers in a tough spot with a short timeline to migrate. I'd like to bring attention to a potential alternative: RisingWave (https://risingwave.com/). RisingWave is an open-source streaming database designed for real-time analytics and data processing. Like Rockset, it offers PostgreSQL compatibility and an impressive ability to handle both streaming and batch data.
What sets RisingWave apart is its focus on stream processing while maintaining SQL compatibility. This could be particularly valuable for users leveraging Rockset's real-time capabilities. RisingWave offers several features that may appeal to Rockset users. It's built to scale in cloud environments and can ingest data from a large variety of sources. The database supports materialized views for efficient query processing and ensures data consistency with ACID transactions.
For those concerned about vendor lock-in after this experience, RisingWave's open-source nature (Apache 2.0 license) provides an extra layer of assurance. There's also a managed cloud offering for those who prefer a hands-off approach.
I encourage impacted Rockset users to explore RisingWave as part of their evaluation process. The project has a welcoming community (join at risingwave.com/slack) and extensive documentation to help with the transition.
[Disclosure: I'm associated with RisingWave. Happy to answer any questions or provide more details about how it compares to Rockset for specific use cases.]
Maybe they should rename it to their migration options page. Or maybe I'll just ask ChatGPT what the best alternative is...
Still, pretty useful stuff. It also feels like Rockset had been moving a little too slowly in recent years, but congrats to them on finding a new home.
It is surprising every time you see a non-profit behave exactly like a for-profit. You'd think there would be some difference, but no, apparently there is basically none.
At least I have never seen a non-profit acquihire before.
Not entirely fair - see https://github.com/rockset/rocksdb-cloud (a fork of RocksDB with a separation of storage and compute, using S3 and Lambda-based compaction)
It's clear OpenAI badly wants to get to a place where the Support and R&D departments of big companies can dump every disjointed scrap of info they've been collecting for years, into a massive bucket, let OpenAI's servers cook it for a while and then like magic, let managers ask the Borgian result.. stuff. Why is this process failing? What's not relevant? What stuff that we've demoted in importance, isn't? etc etc etc
It’s been very common to see startups, many of whom have never set foot in an enterprise, push this idea that you can drop an LLM on top of company data and ask questions like it was ChatGPT. The reality is that most company data is a mess with little funding/will to fix it and so the results are unusable. So if OpenAI wants to be anything more than a chatbot they will need to start to tackle this problem.
Amazing to watch their aspirations go from such lofty heights to being just another enterprise data infrastructure SaaS company.
And it should be a clear sign that the AI hype train has run out of steam.
Great initiative making a list of possible Rockset replacements. Would it be possible to open the Notion page for guest contributions?
I would like to add CrateDB (I work there) to the list. CrateDB is a distributed SQL database purposely built for real-time analytics across large datasets of structured and semi-structured data. Similarly to Rockset, it indexes all data in real-time (text, vector, geospatial, time-series, and JSON) for the most efficient search and fast ad hoc query execution at any scale. It is built on top of Apache Lucene and unlike Rockset is open-source (https://github.com/crate/crate).
AI is a difficult space to be a customer in. All customers/investors/etc want you to add "AI" to your products, but for the majority of people that means using a vendor, and the churn in the space is shocking.
It's this point where my gratitude for Llama and Meta is extremely high.
I want to add my sympathies for those looking at migration in the near future. At least there are tons of good projects out there these days!
If it is helpful to anyone our company provides a data platform designed for high speed analytics at scale. It's certainly not a 1:1 alternative, but if you are looking for an all-in-one solution with a lot fewer moving parts we might be the right answer. We're happy to provide an extended free trial through September 30 for anyone looking at migration alternatives:
These migrations are going to be complicated as there is no 1-1 drop-in replacement. They will touch every aspect of the data lifecycle, from ingestion and transformation to serving. Query optimization will also have to be redone/rethought.
"RAG" is more of a concept than a specification. So the Cambrian explosion of how to actually do it will continue unabated.
Likewise, I don't think it's going to stem the tide of adding vector indexes and similarity search techniques to traditional databases.
Instead, if anything, I think this is a validation that traditional databases aren't going anywhere — OLAP or OLTP. Behind all the LLM models you're still going to need true, authoritative data in databases to avoid (or at least minimize) the hallucination problem.
AI needs, if anything, even more programmatic ways to get at that data.
Not a good look for OpenAI. Shows a lack of confidence in their internal prospects to push the needle if they’re already considering inorganic growth alternatives.
I wonder how much money OpenAI dangled in front of Rockset's C-level execs and board to agree to the acquisition. Seems the company was founded in 2016 (8 years ago) [1]
With investment from vulture capitalists to the tune of $117M. [2] I would assume they want a sizeable return on investment, so maybe a $250-350M cash deal?
Doesn’t seem like this would be a unicorn, but it’s a payoff. Certainly will cover the losses from a few bad investments.
It's not really classic. There are many different kinds of exits, this one is pretty uncommon in my experience, especially for a company that's been around as long as it has and with the customers they have.
More happily FoundationDB was later released as open source and is successfully used by many companies beyond Apple. Snowflake is a prominent example. [0]
yes, this does happen, but when someone says "classic" it kind of implies "common" - this isn't a common outcome. I am directly affected by this one, so, trust me, I know, it sucks.
Why so?
At the end of the day it's all business and talking about ethics
That way most of the productivity or developer SaaS acquisitions would not happen since they are done by FANG only