
AWS Bedrock

Introduction

Mistral AI's open and commercial models can be deployed on the AWS Bedrock cloud platform as fully managed endpoints. AWS Bedrock is a serverless service, so you don't have to manage any infrastructure.

As of today, the following models are available:

  • Mistral Large (24.07, 24.02)
  • Mistral Small (24.02)
  • Mixtral 8x7B
  • Mistral 7B

For more details, visit the models page.

Getting started

The following sections outline the steps to deploy and query a Mistral model on the AWS Bedrock platform.

The following items are required:

  • Access to an AWS account within a region that supports the AWS Bedrock service and offers access to your model of choice: see the AWS documentation for model availability per region.
  • An AWS IAM principal (user or role) with sufficient permissions: see the AWS documentation for more details.
  • A local code environment set up with the relevant AWS SDK components, namely:

Requesting access to the model

Follow the instructions in the AWS documentation to unlock access to the Mistral model of your choice.

Querying the model

AWS Bedrock models are accessible through the Converse API.

Before running the examples below, make sure to:

  • Properly configure the authentication credentials for your development environment. The AWS documentation provides an in-depth explanation of the required steps.
  • Create a Python virtual environment with the boto3 package (version >= 1.34.131).
  • Set the following environment variables:
    • AWS_REGION: The region where the model is deployed (e.g. us-west-2),
    • AWS_BEDROCK_MODEL_ID: The model ID (e.g. mistral.mistral-large-2407-v1:0).
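
The two variables above can be checked up front, before any request is sent. The following is a minimal sketch, assuming a hypothetical helper named load_bedrock_settings (it is not part of boto3 or of the original example); it only reads the environment and does not contact AWS.

```python
import os


def load_bedrock_settings(env=None):
    """Return (region, model_id), or raise if a variable is missing or empty.

    Hypothetical helper for illustration; `env` defaults to os.environ and
    can be replaced by a plain dict when testing.
    """
    env = os.environ if env is None else env
    required = ("AWS_REGION", "AWS_BEDROCK_MODEL_ID")
    # Collect every missing name so the error reports all of them at once.
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["AWS_REGION"], env["AWS_BEDROCK_MODEL_ID"]
```

Calling region, model_id = load_bedrock_settings() at the top of a script surfaces a missing variable immediately, rather than as an SDK error later when the request is built.
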
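# Minimal Converse API request using the environment variables listed above.
import os

import boto3

region = os.environ.get("AWS_REGION")
model_id = os.environ.get("AWS_BEDROCK_MODEL_ID")

# The Converse API is served by the "bedrock-runtime" client.
bedrock_client = boto3.client(service_name="bedrock-runtime", region_name=region)

# A conversation is a list of messages; each message holds a list of content blocks.
user_msg = "Who is the best French painter? Answer in one short sentence."
messages = [{"role": "user", "content": [{"text": user_msg}]}]
temperature = 0.0
max_tokens = 1024

params = {
    "modelId": model_id,
    "messages": messages,
    "inferenceConfig": {"temperature": temperature, "maxTokens": max_tokens},
}

resp = bedrock_client.converse(**params)

# Print the text of the first content block of the assistant's reply.
print(resp["output"]["message"]["content"][0]["text"])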
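
The example above prints only the first text block of the reply. As a hedged sketch of how the rest of a Converse response might be inspected, the helper below (summarize_converse_response, a hypothetical name introduced here) joins all text blocks and also returns the stopReason and usage fields of the response. It works on the response dictionary alone, so it can be tried without calling AWS.

```python
def summarize_converse_response(resp):
    """Extract the reply text, stop reason, and output token count.

    Hypothetical helper for illustration; `resp` is the dictionary
    returned by the bedrock-runtime client's converse() call.
    """
    blocks = resp["output"]["message"]["content"]
    # Keep only text blocks; other block types are skipped.
    text = "".join(block["text"] for block in blocks if "text" in block)
    return {
        "text": text,
        "stop_reason": resp.get("stopReason"),
        "output_tokens": resp.get("usage", {}).get("outputTokens"),
    }
```

With the example above, summarize_converse_response(resp) could be used in place of the final print. A stop_reason of "max_tokens" suggests the reply was cut off by the maxTokens limit, in which case raising max_tokens may help.
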

Going further

For more details and examples, refer to the following resources: