There is a wide variety of open-source models. For example, there has been a lot of interest in Llama and its variations such as Alpaca and Vicuna, as well as Falcon, Mistral, and others. Hosting these models is a challenge because they require GPUs, which are expensive, so customers often want to compare cloud providers to find the hosting option that best meets their needs. In this example we will work with Google Cloud Platform.

 

In addition, customers may want to integrate with the same cloud provider that hosts their VPC. That way they can ensure proper authentication and restrict access to within their VPC. While this example authenticates over the public internet, users should be able to extend it to leverage Google's cloud infrastructure and adapt it to their own architectural needs, including provisioning scale-out policies.

 

Finally, because Vertex AI is used as a managed service, it can integrate with the customer's existing infrastructure-level monitoring. For example, instances can be labelled to match the customer's billing attribution policies, or logs and analytics can be set up to be written to their BigQuery for monitoring and analysis.

Last update: 01-19-2024 01:27 AM