Fireworks.ai is a California-based artificial intelligence (AI) startup offering a distinctive solution for enterprises. The AI firm does not build large language models (LLMs) or foundation models from scratch; instead, it fine-tunes open-source models and serves them through an Application Programming Interface (API) so that businesses can deploy the AI capabilities seamlessly. The fine-tuning narrows the scope of a model and focuses it on a specific functionality, which reduces instances of AI hallucinations and significantly improves the model’s capabilities on that functionality.
The AI firm was co-founded by Lin Qiao, who also serves as its CEO. After working as Senior Director of Engineering at Meta on AI frameworks and platforms, Qiao and her team founded the startup in October 2022, as per her LinkedIn profile. In a conversation with TechCrunch, she explained Fireworks.ai’s business model, highlighting the fine-tuning service it provides. She said, “It can be either off the shelf, open source models or the models we tune or the models our customer can tune by themselves. All three varieties can be served through our inference engine API.”
This puts the firm in a distinctive position: while it is not innovating at the foundation model level, it is bridging the gap between an LLM and a business-ready product that can be deployed seamlessly. With its primary focus on building APIs, Fireworks.ai lets enterprise clients plug in any open-source AI model from its catalogue. As per the TechCrunch report, the company also lets businesses experiment with different AI models to pick the one that fits their needs.
At present, the startup claims to host 89 open-source models, including Mixtral MoE 8x7B Instruct, Meta’s Llama 2 70B Chat, Google’s Gemma 7B Instruct, Stability AI’s Stable Diffusion XL, and more. The AI firm offers the models either in a serverless format, which does not require businesses to configure hardware or deploy models themselves, or as on-demand models available for dedicated deployments, served on reserved GPU configurations according to business needs.
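To give a sense of the plug-and-play model, the sketch below shows how an enterprise client might query one of the catalogued models through a serverless, chat-completions-style inference endpoint. The endpoint URL, model identifier, and FIREWORKS_API_KEY environment variable are illustrative assumptions, not details confirmed in the report; the actual values would come from Fireworks.ai’s documentation.

```python
import os
import requests

# Illustrative sketch of calling a hosted open-source model through an
# inference API. The endpoint URL and model identifier are assumptions
# for demonstration purposes only.
API_URL = "https://api.fireworks.ai/inference/v1/chat/completions"  # assumed endpoint
API_KEY = os.environ["FIREWORKS_API_KEY"]  # assumed environment variable

payload = {
    # Assumed model identifier; the catalogue lists models such as
    # Mixtral MoE 8x7B Instruct under provider-specific names.
    "model": "accounts/fireworks/models/mixtral-8x7b-instruct",
    "messages": [
        {"role": "user", "content": "Summarise our refund policy in two sentences."}
    ],
    "max_tokens": 200,
}

response = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,
    timeout=30,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```

Because the request is served from the provider’s shared infrastructure, the client does not configure any hardware; swapping the model identifier is all it takes to experiment with a different catalogued model.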
For the on-demand format, Fireworks.ai has three payment plans: Developer, Business, and Enterprise. The Developer plan comes with a pay-per-usage structure and a rate limit of 600 requests per minute, while the Enterprise tier offers custom pricing and unlimited rate limits. The serverless format is billed per token, with different models fetching different prices depending on whether they are text-only, image-only, or multimodal.
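As a rough illustration of how per-token billing works on the serverless format, the snippet below computes the cost of a single request from its token counts. The model names and per-million-token rates are placeholders invented for the example, not Fireworks.ai’s published prices.

```python
# Hypothetical per-token billing illustration; the rates below are
# placeholders, not Fireworks.ai's actual published prices.
PRICE_PER_MILLION_TOKENS = {
    "small-text-model": 0.20,  # USD per 1M tokens (assumed)
    "large-text-model": 0.90,  # USD per 1M tokens (assumed)
}

def serverless_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Return the cost in USD of one serverless request under the assumed rates."""
    total_tokens = prompt_tokens + completion_tokens
    return total_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS[model]

# Example: a request with 1,200 prompt tokens and 300 completion tokens.
print(f"${serverless_cost('large-text-model', 1_200, 300):.6f}")
```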