Bring your own `REST Predictor` (`data-on-eks` version)

FMBench now includes an example of bringing your own endpoint as a `REST Predictor` for benchmarking. View this script as an example: it is the inference file for the `NousResearch/Llama-2-13b-chat-hf` model deployed on an Amazon EKS cluster using Ray Serve. The model is deployed via `data-on-eks`, a resource for scaling data and machine learning workloads on Amazon EKS. With `data-on-eks`, you can run generative AI workloads on AWS Trainium, AWS Inferentia, and NVIDIA GPUs, and then benchmark those models with FMBench.
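
To make the idea of a REST predictor concrete, the following is a minimal sketch of a client that sends a prompt to an HTTP endpoint and records the latency. It is not the FMBench script itself: the class name `SimpleRestPredictor`, the `get_prediction` method, the `prompt` request field, and the `generated_text` response field are all assumptions chosen for illustration, and the actual Ray Serve endpoint may expect a different request shape.

```python
import json
import time
import urllib.request
from typing import Callable, Dict, Optional

# Hypothetical transport type: takes (url, JSON bytes) and returns raw response bytes.
Transport = Callable[[str, bytes], bytes]


def http_post(url: str, body: bytes, timeout: float = 60.0) -> bytes:
    """Send a JSON POST request using only the standard library."""
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


class SimpleRestPredictor:
    """Illustrative REST client; names and payload fields are assumptions."""

    def __init__(
        self,
        endpoint_url: str,
        inference_params: Optional[Dict] = None,
        transport: Transport = http_post,
    ) -> None:
        self.endpoint_url = endpoint_url
        self.inference_params = inference_params or {}
        # The transport is injectable so the class can be exercised offline.
        self.transport = transport

    def get_prediction(self, prompt: str) -> Dict:
        # Assumed request shape: the prompt plus any inference parameters.
        payload = {"prompt": prompt, **self.inference_params}
        body = json.dumps(payload).encode("utf-8")

        start = time.perf_counter()
        raw = self.transport(self.endpoint_url, body)
        latency = time.perf_counter() - start

        # Assumed response shape: a JSON object with a "generated_text" field.
        parsed = json.loads(raw.decode("utf-8"))
        return {
            "generated_text": parsed.get("generated_text", ""),
            "latency_seconds": latency,
        }
```

Passing the transport as a parameter keeps the request and response handling separate from the network call, so the payload logic can be checked with a stub before pointing the client at a real endpoint.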
