This BentoML example project shows how to serve and deploy open-source Large Language Models (LLMs) using LMDeploy, a toolkit for compressing, deploying, and serving LLMs. See here for a ...
Added ElevenLabs POST endpoint. Added local TTS with https://huggingface.co/unity/inference-engine-jets-text-to-speech/tree/main. Planning to add custom POST endpoint ...
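As a rough illustration of what a POST call to ElevenLabs for speech synthesis might look like, the sketch below builds the request without sending it. It is not this project's code. The URL pattern, the `xi-api-key` header, and the `text` body field are assumptions based on ElevenLabs' public REST API and should be checked against the current documentation. `build_tts_request` and its parameters are hypothetical names introduced only for this example.

```python
import json
import urllib.request

# Assumption: URL pattern follows ElevenLabs' public text-to-speech REST API.
ELEVENLABS_TTS_URL = "https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"


def build_tts_request(text: str, voice_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request asking for speech synthesis.

    Hypothetical helper for illustration only; not part of this project.
    """
    # The JSON body carries the text to synthesize.
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        ELEVENLABS_TTS_URL.format(voice_id=voice_id),
        data=payload,
        headers={
            "xi-api-key": api_key,  # assumed authentication header name
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values; a real call needs a valid voice ID and API key.
    req = build_tts_request("Hello there", "VOICE_ID", "API_KEY")
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send the request. The sketch stops before that step so it can be inspected without credentials or network access.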