What can you do with it?
Use Groq for high-speed AI inference with various open-source models. It is ideal for applications that require fast response times, high throughput, and efficient processing. The command provides access to multiple models with varying capabilities and context lengths for different use cases.

How to use it?
Basic Command Structure
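The original command snippet here did not survive extraction. As a minimal sketch, the request can be modeled as a payload whose keys mirror the parameters documented below; only prompt and model are named in the source, so the key names for the other two fields are assumptions:

```python
# Hedged sketch of the command's input shape. Key names beyond "prompt" and
# "model" (system_prompt, files) are assumptions, since the original example
# was lost in extraction.
request = {
    "prompt": "Your instructions or questions",                      # required
    "model": "meta-llama/llama-4-maverick-17b-128e-instruct",        # optional; default shown
    "system_prompt": "Override the default system prompt",           # optional (assumed key name)
    "files": ["https://example.com/document.pdf"],                    # optional file URLs (assumed key name)
}
```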
Parameters
Required:
- prompt - Your instructions or questions

Optional:
- model - Specific model to use (defaults to meta-llama/llama-4-maverick-17b-128e-instruct)
- system prompt - Override the default system prompt
- files - File URLs to include in the request (see LLM File Type Support for supported formats)
Response Format
The command returns a JSON response by default; the Basic Usage example below shows how the response is read.

Examples
Basic Usage
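The original basic example was lost in extraction. As a minimal sketch, assuming the command maps onto Groq's OpenAI-compatible chat completions endpoint, a prompt-only call with the default model looks like this; the endpoint and response shape come from Groq's public API, while the mapping from the command to a raw HTTP call is an assumption:

```python
# Hedged basic example: only the required prompt, default model,
# sent directly to Groq's OpenAI-compatible endpoint.
import os
import requests

resp = requests.post(
    "https://api.groq.com/openai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['GROQ_API_KEY']}"},
    json={
        "model": "meta-llama/llama-4-maverick-17b-128e-instruct",  # default model
        "messages": [{"role": "user", "content": "Explain RAG in two sentences."}],
    },
)
resp.raise_for_status()
data = resp.json()  # JSON is the default response format
print(data["choices"][0]["message"]["content"])
```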
Advanced Usage
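Again as a hedged sketch under the same assumptions: an advanced call picks an explicit model and overrides the default system prompt, which in the OpenAI-compatible schema is expressed as a "system" message:

```python
# Hedged advanced example: explicit model choice plus a system-prompt
# override, passed as a "system" role message.
import os
import requests

resp = requests.post(
    "https://api.groq.com/openai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['GROQ_API_KEY']}"},
    json={
        "model": "qwen/qwen3-32b",  # ultra-fast general-purpose model
        "messages": [
            {"role": "system", "content": "You are a terse code reviewer."},
            {"role": "user", "content": "Review this function: def add(a, b): return a - b"},
        ],
    },
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```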
Specific Use Case
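The original use-case example was also lost; as one illustrative, assumption-labeled scenario, low-latency summarization pairs naturally with the small llama-3.1-8b-instant model listed in the Notes, since throughput matters more than depth there:

```python
# Hedged use-case example: fast summarization with a small model,
# suited to high-throughput pipelines where latency dominates.
import os
import requests

article = "..."  # long input text to condense

resp = requests.post(
    "https://api.groq.com/openai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['GROQ_API_KEY']}"},
    json={
        "model": "llama-3.1-8b-instant",
        "messages": [
            {"role": "system", "content": "Summarize the input in three bullet points."},
            {"role": "user", "content": article},
        ],
    },
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```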
Notes
Available models include meta-llama/llama-4-maverick-17b-128e-instruct (131K context, default), deepseek-r1-distill-llama-70b, llama-3.3-70b-versatile, llama-3.1-8b-instant, mixtral-8x7b-32768, and gemma2-9b-it. The default response format is JSON.

Supported Models
Choose the appropriate Groq model based on your specific needs:

- meta-llama/llama-4-scout-17b-16e-instruct - Ultra-fast exploration and search tasks
- meta-llama/llama-4-maverick-17b-128e-instruct (default) - Ultra-fast creative tasks, extended context
- qwen/qwen3-32b - Ultra-fast general purpose tasks
- groq/compound - AI system with web search and code execution
- meta-llama/llama-prompt-guard-2-86m - Groq Prompt Guard 2
The default response format is JSON. Models are optimized for high-speed inference on Groq’s specialized hardware.