Bring your own REST Predictor (data-on-eks version)

FMBench now provides an example of bringing your own endpoint as a REST Predictor for benchmarking. View this script as an example. The script is an inference file for the NousResearch/Llama-2-13b-chat-hf model deployed on an Amazon EKS cluster using Ray Serve. The model is deployed via data-on-eks, a comprehensive resource for scaling data and machine learning workloads on Amazon EKS. With data-on-eks, you can use AWS Trainium, AWS Inferentia, and NVIDIA GPUs to scale and optimize your Gen AI workloads, and then benchmark those models with FMBench.
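To make the idea of a REST predictor concrete, here is a minimal sketch of an inference wrapper that sends a prompt to an HTTP endpoint and records the round-trip latency. It is not the FMBench script or the FMBench predictor interface: the class name `SimpleRestPredictor`, the method `get_prediction`, the `{"prompt": ...}` request body, and the `"text"` response field are all assumptions made for illustration, and the payload your Ray Serve deployment expects may differ.

```python
import json
import time
import urllib.request
from dataclasses import dataclass
from typing import Callable


@dataclass
class PredictionResult:
    completion: str          # text returned by the endpoint
    latency_seconds: float   # wall-clock time of the request


def http_post_json(url: str, payload: dict, timeout: float = 60.0) -> dict:
    """POST a JSON payload and decode the JSON response (standard library only)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


class SimpleRestPredictor:
    """Hypothetical wrapper around a REST inference endpoint."""

    def __init__(self, endpoint_url: str,
                 post: Callable[..., dict] = http_post_json):
        self.endpoint_url = endpoint_url
        # The transport is injectable so the class can be exercised offline.
        self._post = post

    def get_prediction(self, prompt: str) -> PredictionResult:
        # Assumed request schema: {"prompt": "..."}.
        payload = {"prompt": prompt}
        start = time.perf_counter()
        body = self._post(self.endpoint_url, payload)
        latency = time.perf_counter() - start
        # Assumed response schema: {"text": "..."}.
        return PredictionResult(
            completion=str(body.get("text", "")),
            latency_seconds=latency,
        )
```

To try it against a live deployment, construct the predictor with the URL of your own endpoint (for example, the address exposed by your Ray Serve service) and call `get_prediction("your prompt")`. Passing a stub function as `post` lets you check the wrapper without a running cluster.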