Bring your own REST Predictor (data-on-eks version)
FMBench now provides an example of bringing your own endpoint as a REST Predictor for benchmarking. See this script for an example: it is an inference file for the NousResearch/Llama-2-13b-chat-hf model deployed on an Amazon EKS cluster using Ray Serve. The model is deployed via data-on-eks, a comprehensive resource for scaling your data and machine learning workloads on Amazon EKS and unlocking the power of Gen AI. Using data-on-eks, you can harness the capabilities of AWS Trainium, AWS Inferentia and NVIDIA GPUs to scale and optimize your Gen AI workloads, then benchmark those models on FMBench.
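To make the idea of a REST-based inference file more concrete, here is a minimal sketch of how a client might send a prompt to an HTTP endpoint and record the latency. It is not the FMBench script itself: the names `PredictionResult`, `build_request`, `parse_completion` and `get_prediction`, the payload fields (`prompt`, `max_new_tokens`) and the response field (`generated_text`) are all assumptions for illustration, and should be adapted to whatever your Ray Serve deployment actually accepts and returns.

```python
import json
import time
import urllib.request
from dataclasses import dataclass


@dataclass
class PredictionResult:
    """What a benchmark might record for one request (hypothetical shape)."""
    completion: str
    latency_seconds: float


def build_request(url: str, prompt: str, max_new_tokens: int = 100) -> urllib.request.Request:
    # Hypothetical payload schema; match it to your own deployment.
    body = json.dumps({"prompt": prompt, "max_new_tokens": max_new_tokens}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_completion(raw: bytes) -> str:
    # Assumes the endpoint replies with JSON like {"generated_text": "..."}.
    return json.loads(raw.decode("utf-8"))["generated_text"]


def get_prediction(url: str, prompt: str, timeout: float = 60.0) -> PredictionResult:
    # Send one prompt and time the full round trip.
    request = build_request(url, prompt)
    start = time.perf_counter()
    with urllib.request.urlopen(request, timeout=timeout) as response:
        raw = response.read()
    latency = time.perf_counter() - start
    return PredictionResult(completion=parse_completion(raw), latency_seconds=latency)
```

Calling `get_prediction` with the URL of your own endpoint and a prompt would return the generated text together with the round-trip latency. Keeping request construction and response parsing in separate functions makes it easier to adjust either one when the endpoint's schema differs from the one assumed here.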