From 8a0bb59cb58c539bfe9acfd9de05b2286436e50b Mon Sep 17 00:00:00 2001
From: CharlesCNorton <135471798+CharlesCNorton@users.noreply.github.com>
Date: Tue, 24 Jun 2025 05:30:10 -0400
Subject: [PATCH] Update inference-providers-groq.md

- Fix "ad many more" -> "and many more"
- Fix "LLama 4" -> "Llama 4" (2 occurrences)
---
 inference-providers-groq.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/inference-providers-groq.md b/inference-providers-groq.md
index 35c3b34dfb..d9b38a8fb0 100644
--- a/inference-providers-groq.md
+++ b/inference-providers-groq.md
@@ -20,7 +20,7 @@ authors:
 
 We're thrilled to share that **Groq** is now a supported Inference Provider on the Hugging Face Hub! Groq joins our growing ecosystem, enhancing the breadth and capabilities of serverless inference directly on the Hub’s model pages. Inference Providers are also seamlessly integrated into our client SDKs (for both JS and Python), making it super easy to use a wide variety of models with your preferred providers.
 
-[Groq](https://groq.com) supports a wide variety of text and conversational models, including the latest open-source models such as [Meta's LLama 4](https://huggingface.co/meta-llama/Llama-4-Maverick-17B-128E-Instruct?inference_provider=groq), [Qwen's QWQ-32B](https://huggingface.co/Qwen/QwQ-32B?inference_provider=groq), ad many more.
+[Groq](https://groq.com) supports a wide variety of text and conversational models, including the latest open-source models such as [Meta's Llama 4](https://huggingface.co/meta-llama/Llama-4-Maverick-17B-128E-Instruct?inference_provider=groq), [Qwen's QWQ-32B](https://huggingface.co/Qwen/QwQ-32B?inference_provider=groq), and many more.
 
 At the heart of Groq's technology is the Language Processing Unit (LPU™), a new type of end-to-end processing unit system that provides the fastest inference for computationally intensive applications with a sequential component, such as Large Language Models (LLMs). LPUs are designed to overcome the limitations of GPUs for inference, offering significantly lower latency and higher throughput. This makes them ideal for real-time AI applications.
 
@@ -58,7 +58,7 @@ See the list of supported models [here](https://huggingface.co/models?inference_
 
 #### from Python, using huggingface_hub
 
-The following example shows how to use Meta's LLama 4 using Groq as the inference provider. You can use a [Hugging Face token](https://huggingface.co/settings/tokens) for automatic routing through Hugging Face, or your own Groq API key if you have one.
+The following example shows how to use Meta's Llama 4 using Groq as the inference provider. You can use a [Hugging Face token](https://huggingface.co/settings/tokens) for automatic routing through Hugging Face, or your own Groq API key if you have one.
 
 Install `huggingface_hub` from source (see [instructions](https://huggingface.co/docs/huggingface_hub/installation#install-from-source)). Official support will be released soon in version v0.33.0.
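For reference, the paragraph touched by the second hunk introduces a Python example built on `huggingface_hub`; the example itself lies outside the patch context. A minimal sketch of such a call might look like the following. It is not part of the patch and assumes a `huggingface_hub` version with Groq provider support (v0.33.0 or a source install, per the post), a Hugging Face token in the `HF_TOKEN` environment variable, and the Llama 4 model ID linked in the post.

```python
import os

from huggingface_hub import InferenceClient

# Route the request to the Groq provider. A Hugging Face token is used here
# for automatic routing through Hugging Face; a Groq API key can be passed
# instead if you have one.
client = InferenceClient(
    provider="groq",
    api_key=os.environ["HF_TOKEN"],
)

# Model ID taken from the Llama 4 link in the blog post.
completion = client.chat_completion(
    model="meta-llama/Llama-4-Maverick-17B-128E-Instruct",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    max_tokens=128,
)

print(completion.choices[0].message.content)
```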