Benchmarking NVIDIA TensorRT-LLM - Jan - discu.eu

Reddit

Benchmarking NVIDIA's TensorRT-LLM https://jan.ai/post/benchmarking-nvidia-tensorrt-llm 8 comments 30/4/2024 nvidia

Linked pages

GeForce RTX 40 Series Graphics Cards | NVIDIA https://www.nvidia.com/en-us/geforce/graphics-cards/40-series/ 675 comments
GitHub - ggerganov/llama.cpp: Port of Facebook's LLaMA model in C/C++ https://github.com/ggerganov/llama.cpp 286 comments
H100 Tensor Core GPU | NVIDIA https://www.nvidia.com/en-us/data-center/h100/ 3 comments
GitHub - janhq/jan: Jan is an open source alternative to ChatGPT that runs 100% offline on your computer. Multiple engine support (llama.cpp, TensorRT-LLM) https://github.com/janhq/jan 3 comments
GitHub - NVIDIA/FasterTransformer: Transformer related optimization, including BERT, GPT https://github.com/NVIDIA/FasterTransformer/ 1 comment
GeForce RTX 40 Series Gaming Laptops | NVIDIA https://www.nvidia.com/en-us/geforce/laptops/ 1 comment
GeForce RTX 30 Series Graphics Card Overview | NVIDIA https://www.nvidia.com/en-eu/geforce/graphics-cards/30-series/ 0 comments
GitHub - ray-project/llmperf: LLMPerf is a library for validating and benchmarking LLMs https://github.com/ray-project/llmperf 0 comments
