Releases¶
2.0.18¶
- Delete SageMaker endpoint as soon as the run finishes.
2.0.17¶
- Add support for embedding models through SageMaker JumpStart.
- Add support for Llama 3.2 11b Vision Instruct benchmarking through FMBench.
- Fix DJL inference when deploying DJL on EC2 (424 inference bug).
2.0.16¶
- Update to torch 2.4 for compatibility with SageMaker Notebooks.
2.0.15¶
2.0.14¶
- Llama3.1-70b config files and more.
- Support for fmbench-orchestrator.
2.0.13¶
- Update pricing.yml, additional config files.
2.0.11¶
- Llama3.2-1b and Llama3.2-3b support on EC2 g5.
- Llama3-8b on EC2 g6e instances.
2.0.9¶
- Triton-djl support for AWS Chips.
- Tokenizer files are now downloaded directly from Hugging Face (unless provided manually as before)
2.0.7¶
- Support Triton-TensorRT for GPU instances and Triton-vllm for AWS Chips.
- Misc. bug fixes.
2.0.6¶
- Run multiple model copies with the DJL serving container and an Nginx load balancer on Amazon EC2.
- Config files for Llama3.1-8b on g5, p4de and p5 Amazon EC2 instance types.
- Better analytics for creating internal leaderboards.
2.0.5¶
- Support for Intel CPU based instances such as c5.18xlarge and m5.16xlarge.
2.0.4¶
- Support for AMD CPU based instances such as m7a.
2.0.3¶
- Support for an EFA directory for benchmarking on EC2.
2.0.2¶
- Code cleanup, minor bug fixes and report improvements.
2.0.0¶
- 🚨 Model evaluations done by a Panel of LLM Evaluators 🚨
v1.0.52¶
- Compile for AWS Chips (Trainium, Inferentia) and deploy to SageMaker directly through FMBench.
- Llama3.1-8b and Llama3.1-70b config files for AWS Chips (Trainium, Inferentia).
- Misc. bug fixes.
v1.0.51¶
- FMBench has a website now. Rework the README file to make it lightweight.
- Llama3.1 config files for Bedrock.
v1.0.50¶
- Llama3-8b on Amazon EC2 inf2.48xlarge config file.
- Update to new version of DJL LMI (0.28.0).
v1.0.49¶
- Streaming support for Amazon SageMaker and Amazon Bedrock.
- Per-token latency metrics such as time to first token (TTFT) and mean time per output token (TPOT).
- Misc. bug fixes.
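For readers unfamiliar with the two per-token metrics named in this release, the sketch below shows one common way to derive them from the arrival times of streamed tokens. It is an illustration only: the function name `per_token_latency` and its inputs are hypothetical, and FMBench's own calculation may differ.

```python
from typing import List, Optional, Tuple


def per_token_latency(
    request_start: float, token_times: List[float]
) -> Tuple[Optional[float], Optional[float]]:
    """Return (TTFT, TPOT) in seconds for one streamed response.

    Hypothetical helper, not part of FMBench. Assumes `token_times` holds
    the arrival timestamp of each output token, in order, on the same
    clock as `request_start`.
    """
    if not token_times:
        # No tokens arrived, so neither metric is defined.
        return None, None

    # Time to first token: delay between sending the request and the first token.
    ttft = token_times[0] - request_start

    if len(token_times) < 2:
        # A single token has no inter-token gaps to average.
        return ttft, None

    # Mean time per output token: average gap between consecutive tokens
    # after the first one.
    tpot = (token_times[-1] - token_times[0]) / (len(token_times) - 1)
    return ttft, tpot


# Example with made-up timestamps (seconds since the request was sent).
ttft, tpot = per_token_latency(0.0, [0.5, 0.75, 1.0, 1.25])
print(f"TTFT={ttft:.2f}s TPOT={tpot:.2f}s")
```

Excluding the first token from the TPOT average keeps prompt-processing time out of the per-token figure, so the two metrics separate startup delay from steady-state generation speed.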
v1.0.48¶
- Faster result file download at the end of a test run.
- Phi-3-mini-4k-instruct configuration file.
- Tokenizer and misc. bug fixes.
v1.0.47¶
- Run FMBench as a Docker container.
- Bug fixes for GovCloud support.
- Updated README for EKS cluster creation.
v1.0.46¶
- Native model deployment support for EC2 and EKS (i.e. you can now deploy and benchmark models on EC2 and EKS).
- FMBench is now available in GovCloud.
- Update to latest version of several packages.
v1.0.45¶
- Analytics for results across multiple runs.
- Llama3-70b config files for g5.48xlarge instances.
v1.0.44¶
- Endpoint metrics (CPU/GPU utilization, memory utilization, model latency) and invocation metrics (including errors) for SageMaker Endpoints.
- Llama3-8b config files for g6 instances.
v1.0.42¶
- Config file for running Llama3-8b on all instance types except p5.
- Fix bug with business summary chart.
- Fix bug with deploying model using a DJL DeepSpeed container in the no S3 dependency mode.
v1.0.40¶
- Make it easy to run on Amazon EC2 in the no Amazon S3 dependency mode.
v1.0.39¶
- Add an internal FMBench website.
v1.0.38¶
- Support for running FMBench on Amazon EC2 without any dependency on Amazon S3.
- Llama3-8b-Instruct config file for ml.p5.48xlarge.
v1.0.37¶
- g5/p4d/inf2/trn1 specific config files for Llama3-8b-Instruct.
- p4d config file for both vllm and lmi-dist.
v1.0.36¶
- Fix bug at higher concurrency levels (20 and above).
- Support for instance count > 1.
v1.0.35¶
- Support for Open-Orca dataset and corresponding prompts for Llama3, Llama2 and Mistral.
v1.0.34¶
- Don't delete endpoints for the bring your own endpoint case.
- Fix bug with business summary chart.
v1.0.32¶
- Report enhancements: New business summary chart, config file embedded in the report, version numbering and others.
- Additional config files: Meta Llama3 on Inf2, Mistral instruct with lmi-dist on p4d and p5 instances.