For years, Vyas Sekar would call up Muckai Girish, an old friend from undergrad, to talk about potential startup ideas and get Girish’s opinion. The two usually talked about an idea and ended the conversation at that. When Sekar called Girish with an idea involving synthetic data in early 2022, the conversation didn’t just end when they hung up the phone.
Sekar and fellow Carnegie Mellon University colleague Giulia Fanti had been working on building synthetic data to fix the reproducibility crisis, or inability to recreate data, within academia. While Sekar was seeing the need for a solution in academia, Girish knew his customers at the time were facing the same problem. After talking to a few enterprises, the thesis was further validated.
“At that time, it felt that this was very real and there was an opportunity,” Girish, CEO, told TechCrunch. “So that’s what got us started and over the next couple of months we spoke to some investors, people we knew, and more crucially enterprises and realized this was a meaningful problem and it is worth putting, you know, an entire life behind it.”
The result was Rockfish, a startup that uses generative AI to create synthetic data for operational workflows to help enterprises break down their data silos. Rockfish integrates with database providers including AWS and Azure, among others, and helps users pick the best configuration for their data based on company policies or uses for the data.
Synthetic data has increasingly become a hot topic in the world of AI, but there was already growing momentum for it when the company got started in June 2022. Girish said that Rockfish wanted to make sure that it was building a product that was differentiated from its peers and also a solution enterprises would be using daily, not just every once in a while.
That’s why the company’s product is designed to ingest data continuously and is focused on operational data, which includes data on things like financial transactions, cybersecurity, and supply chains. These areas are constantly producing data for companies and are also constantly changing. Girish thinks focusing here helps Rockfish stand apart from other competitors.
Now the company works with a handful of enterprise clients, Girish said, including streaming analytics platform Conviva, in addition to government departments including the U.S. Army and the U.S. Department of Defense.
Rockfish is announcing a $4 million seed round led by Emergent Ventures with participation from Foster Ventures, TEN13, and Dallas VC, among others. This brings the company’s total funding up to about $6 million.
Anupam Rastogi, a managing partner at Emergent Ventures, told TechCrunch that he had been tracking Sekar long before the founding of Rockfish. He said that what caused the firm to invest was “team, market, and product, in that order.” Plus, Rockfish’s focus on building for enterprises made it a better fit for Emergent than some of the other players in the space.
“The team is super high-quality data scientists, multiple PhDs,” Rastogi said. “This is a space that we think is very technically sophisticated and having that technical strength around the table is really critical. They have done a lot of the foundational work in the space, not just in the company, but the whole industry.”
While Rockfish hopes its focus helps give it a moat amongst competitors, it doesn’t change the fact that synthetic data will likely be an increasingly crowded market. AI companies are turning toward synthetic data as multiple players think the market has exhausted other AI training data.
There are already many startups looking to tackle the market, including Tonic AI, which has raised more than $45 million in venture funding; Mostly AI, which has raised $31 million in VC funding; and Hazy, which raised $14.5 million before being acquired by SAS in 2024, just to name a few.
Girish said the company looks to add on to its approach to synthetic data by incorporating other types of models like state space models, mathematical models that use state variables. The company also looks to improve its end-to-end features.
“It’s not like you take random data from the internet and create synthetic data,” Girish said. “There is no guarantee that it’ll do well. But if you put all of this together for enterprises, it actually is very relevant and realistic. So that’s the key to this, and then being able to do that on a continuous basis is what we find to be advantageous.”