Streaming Response with Converse
Overview
To demonstrate the text generation capability of Amazon Bedrock, we will explore the use of the Boto3 client to communicate with the Amazon Bedrock Converse API. We will demonstrate the different configurations available, as well as how a simple input can lead to the desired output.
Context
In this notebook we show you how to use an LLM to generate an email response to a customer who provided negative feedback on the quality of customer service they received from a support engineer.
We will use Amazon Bedrock's Amazon Titan Text Large model via the Boto3 API.
The prompt used in this example is called a zero-shot prompt because we are not providing any examples alongside the instruction; the model must complete the task from the prompt alone.
Pattern
We will simply provide the Amazon Bedrock API with an input consisting of a task, an instruction, and some input for the model, and let the model under the hood generate an output without any additional examples. The purpose here is to demonstrate how powerful LLMs easily understand the task at hand and generate compelling outputs.
Use case
To demonstrate the generation capability of models in Amazon Bedrock, let's take the use case of email generation.
Implementation
To fulfill this use case, in this notebook we will show how to generate an email with a thank-you note based on the customer's previous email. We will use the Amazon Titan Text Large model via the Amazon Bedrock Converse API with the Boto3 client.
Prerequisites
Before you can use Amazon Bedrock, you must carry out the following steps:
- Sign up for an AWS account (if you don't already have one) and create an IAM role with the necessary permissions for Amazon Bedrock; see AWS Account and IAM Role.
- Request access to the foundation models (FMs) that you want to use; see Request access to FMs.
Setup
Info
This notebook should work well with the Data Science 3.0 kernel (Python 3.10 runtime) in SageMaker Studio
Run the cells in this section to install the packages needed by this notebook.
!pip3 install boto3 --quiet
import json
import os
import sys
import boto3
import botocore
modelId = "amazon.titan-tg1-large"
region = 'us-east-1'
boto3_bedrock = boto3.client(
    service_name='bedrock-runtime',
    region_name=region,
)
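Optionally, before invoking a model, you can check which text models are visible to your account in this region. This is a quick sanity check, and note that it uses the separate Bedrock control-plane client (service name 'bedrock'), not the 'bedrock-runtime' client used for inference.
# Optional sanity check: list the text-output foundation models in this region.
bedrock_control = boto3.client(service_name='bedrock', region_name=region)
for summary in bedrock_control.list_foundation_models(byOutputModality='TEXT')['modelSummaries']:
    print(summary['modelId'])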
Generate text
Following on the use case explained above, let's prepare an input for the Amazon Bedrock service to generate an email.
# create the prompt
prompt_data = """
Command: Write an email from Bob, Customer Service Manager, to the customer "John Doe"
who provided negative feedback on the service provided by our customer support
engineer
"""
Let's start by using the Amazon Titan Text Large model. The Amazon Titan family of models supports a large context window of up to 32k tokens and accepts the following parameters:
- messages: the prompt to the LLM
- inference_config: the parameters the model will take into account while generating the output
# Base inference parameters.
inference_config = {
    "temperature": 0.1,  # lower values make the output more deterministic
    "maxTokens": 4096,   # upper bound on the number of generated tokens
    "topP": 0.95,        # nucleus sampling: restrict to the top 95% probability mass
}
messages = [
    {
        "role": "user",
        "content": [{"text": prompt_data}],
    }
]
The Amazon Bedrock Converse API provides a consistent interface that works with all models that support messages, allowing you to write code once and use it with different models. In this example, .converse accepts the following parameters:
- modelId: the identifier (or ARN) of the foundation model available under Amazon Bedrock
- inferenceConfig: inference parameters to pass to the model; Converse supports a base set of inference parameters
- messages: a list of messages containing the prompt
Check the documentation for the available text generation model IDs.
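To illustrate the "write once, use with different models" idea, here is a minimal sketch of a helper that reuses the same messages and inference configuration and only swaps the model ID. The Anthropic Claude model ID in the commented example is illustrative; substitute any text model you have been granted access to.
# Minimal sketch: the same Converse call works for any text model
# you have access to -- only the modelId changes.
def generate(model_id, messages, inference_config):
    response = boto3_bedrock.converse(
        modelId=model_id,
        messages=messages,
        inferenceConfig=inference_config,
    )
    return response['output']['message']['content'][0]['text']

# Example (assumes you have requested access to this model):
# print(generate("anthropic.claude-3-haiku-20240307-v1:0", messages, inference_config))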
Invoke the Amazon Titan Text language model
First, we explore how the model generates an output based on the prompt created earlier.
Complete Output Generation
# Send the message.
try:
    response = boto3_bedrock.converse(
        modelId=modelId,
        messages=messages,
        inferenceConfig=inference_config,
    )
    outputText = response['output']['message']['content'][0]['text']
except botocore.exceptions.ClientError as error:
    if error.response['Error']['Code'] == 'AccessDeniedException':
        print(f"\x1b[41m{error.response['Error']['Message']}\
        \nTo troubleshoot this issue please refer to the following resources.\
        \nhttps://docs.aws.amazon.com/IAM/latest/UserGuide/troubleshoot_access-denied.html\
        \nhttps://docs.aws.amazon.com/bedrock/latest/userguide/security-iam.html\x1b[0m\n")
    else:
        raise error
# The relevant portion of the response begins after the first newline character.
# Below we print the response beginning after the first occurrence of '\n'.
email = outputText[outputText.index('\n')+1:]
print(email)
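Beyond the generated text, the Converse response also reports why generation stopped and how many tokens were consumed, which is useful when tuning maxTokens. A quick look at those fields:
# Inspect why generation stopped and the token usage reported by Converse.
print("Stop reason:", response['stopReason'])
print("Token usage:", response['usage'])  # inputTokens, outputTokens, totalTokens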
Streaming Output Generation
Above is an example email generated by the Amazon Titan Text Large model based on its understanding of the input request. This request to the API is synchronous: it waits for the entire output to be generated by the model.
Bedrock also supports streaming the output as the model generates it, in the form of chunks. Below is an example of invoking the model with the streaming option. converse_stream returns an EventStream that you can read from.
You may want to enable scrolling on your output cell below:
output = []
try:
    response = boto3_bedrock.converse_stream(
        modelId=modelId,
        messages=messages,
        inferenceConfig=inference_config,
    )
    stream = response['stream']
    i = 1
    if stream:
        for event in stream:
            if 'contentBlockDelta' in event:
                streaming_text = event['contentBlockDelta']['delta']['text']
                output.append(streaming_text)
                print(f'\t\t\x1b[31m**Chunk {i}**\x1b[0m\n{streaming_text}\n')
                i += 1
except botocore.exceptions.ClientError as error:
    if error.response['Error']['Code'] == 'AccessDeniedException':
        print(f"\x1b[41m{error.response['Error']['Message']}\
        \nTo troubleshoot this issue please refer to the following resources.\
        \nhttps://docs.aws.amazon.com/IAM/latest/UserGuide/troubleshoot_access-denied.html\
        \nhttps://docs.aws.amazon.com/bedrock/latest/userguide/security-iam.html\x1b[0m\n")
    else:
        raise error
Streaming lets you start reading the model's output as soon as the first chunks arrive, while the service is still completing the response. This helps in use cases where you ask the model to generate longer pieces of text. You can later combine all the generated chunks to form the complete output, as shown below.
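Since each chunk was appended to the output list above, reassembling the complete email is a simple join:
# Combine the streamed chunks into the complete generated email.
complete_output = ''.join(output)
print(complete_output)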
Next Steps
You have now experimented with the boto3
SDK, which provides low-level access to the Amazon Bedrock API. Using this API, you have generated an email responding to a customer's negative feedback.
- Adapt this notebook to experiment with different models available through Amazon Bedrock such as Anthropic Claude and AI21 Labs Jurassic models.
- Change the prompts to your specific use case and evaluate the output of different models.
- Play with the token length to understand the latency and responsiveness of the service (see the sketch after this list).
- Apply different prompt engineering principles to get better outputs.
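As a starting point for the token-length experiment above, here is a small sketch that times the same request under different maxTokens budgets. The budgets chosen are illustrative only; actual latency depends on the model, region, and prompt.
# Rough sketch: compare response time and output length for
# different maxTokens budgets using the same prompt.
import time

for max_tokens in (128, 512, 4096):
    config = dict(inference_config, maxTokens=max_tokens)
    start = time.time()
    resp = boto3_bedrock.converse(modelId=modelId, messages=messages, inferenceConfig=config)
    text = resp['output']['message']['content'][0]['text']
    print(f"maxTokens={max_tokens}: {time.time() - start:.1f}s, {len(text)} chars")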
Cleanup
There is no clean up necessary for this notebook.