Serve Models From Replicate Locally

Replicate
AI model serving platform
Local hosting
macOS
Linux
Cog
Continue
This article discusses the process of hosting AI models locally on macOS and Linux systems using Replicate, an AI model serving platform. It highlights the use of Cog for packaging and serving large models, as well as Continue, an open-source Copilot alternative, to connect local LLMs with VSCode.
Published

March 1, 2024


Replicate internally uses Cog for packaging and serving large AI models in Docker containers. Currently, Cog only supports macOS and Linux.
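As a minimal sketch, you can install the Cog CLI and run a published Replicate model locally through its `r8.im` registry (this assumes Docker is installed; the `replicate/hello-world` model reference is just an illustrative example):

```shell
# Download the Cog binary for this OS/architecture (macOS or Linux only)
sudo curl -o /usr/local/bin/cog -L \
  "https://github.com/replicate/cog/releases/latest/download/cog_$(uname -s)_$(uname -m)"
sudo chmod +x /usr/local/bin/cog

# Pull a model image from Replicate's registry and run one prediction locally
cog predict r8.im/replicate/hello-world -i text="hello"
```

The first run downloads the Docker image, so it can take a while; subsequent predictions reuse the cached image.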

According to the documentation, Cog offers nearly the same functionality as Replicate, such as API calls and fine-tuning.
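For example, a running Cog model container exposes an HTTP prediction endpoint (port 5000 by default); the `prompt` input name below is an assumption and depends on the model's own schema:

```shell
# Send one prediction request to a locally running Cog container.
# Assumes the model container was started with something like:
#   docker run -d -p 5000:5000 <your-model-image>
curl -s http://localhost:5000/predictions \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"input": {"prompt": "a photo of a cat"}}'
```

The response is a JSON object containing the model's output, much like the hosted Replicate API.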


You can connect your local LLM to VSCode using Continue, an open-source Copilot alternative.
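As a rough sketch, Continue is configured through `~/.continue/config.json`; the entry below assumes your local model is served behind an OpenAI-compatible endpoint, and the `title`, `model` name, and `apiBase` URL are placeholder assumptions you would replace with your own setup:

```json
{
  "models": [
    {
      "title": "Local model",
      "provider": "openai",
      "model": "local-model",
      "apiBase": "http://localhost:5000/v1"
    }
  ]
}
```

After reloading VSCode, the model should appear in Continue's model selector.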