Groq Models
A drag-and-drop component for integrating Groq's high-performance LLM inference into your workflow. Configure model parameters and connect inputs/outputs to other components.

Groq component interface and configuration
API Key Required: This component requires a valid Groq API key. Register for a Groq account and generate an API key before configuring the component.
Component Inputs
- Input: Text input for the model
Example: "Explain the advantages of Groq's LPU architecture for inference."
- System Message: System prompt to guide model behavior
Example: "You are a helpful AI assistant specializing in hardware acceleration and machine learning."
- Stream: Toggle for streaming responses
Example: true (for real-time token streaming) or false (for complete response)
- Model: The Groq model to use
Example: "llama3-8b-8192", "mixtral-8x7b-32768", "gemma-7b-it"
- Groq API Key: Your API authentication key
Example: "gsk_abc123def456..."
- Groq API Base: API endpoint URL
Example: "https://api.groq.com/openai/v1" (Default)
Component Outputs
- Text: Generated text output
Example: "Groq's LPU (Language Processing Unit) architecture is specifically designed for LLM inference, offering several advantages..."
- Language Model: Model information and metadata
Example: { model: "llama3-8b-8192", usage: { prompt_tokens: 35, completion_tokens: 150, total_tokens: 185 } }
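Both outputs can be read from the object returned by generate. The implementation example below accesses response.text; the languageModel field name in this sketch is an assumption for illustration, not a documented API.
// Minimal sketch of consuming both outputs. The metadata field name
// (languageModel) is an assumption; adapt to the component's actual shape.
async function inspectOutputs(inputs) {
  const response = await groqComponent.generate(inputs);
  console.log(response.text);                       // Text output
  const { model, usage } = response.languageModel;  // Language Model output
  console.log(`${model} used ${usage.total_tokens} tokens`);
}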
Model Parameters
Max Output Tokens
Maximum number of tokens to generate in the response
Default: Model-dependent
Range: 1 to model maximum
Recommendation: Set based on expected response length
Temperature
Controls randomness in the output; higher values produce more varied, creative text
Default: 0.1
Range: 0.0 to 1.0
Recommendation: Lower (0.0-0.3) for factual/consistent responses, Higher (0.7-1.0) for creative tasks
N
Number of completions to generate
Default: 1
Range: 1 to 5
Recommendation: Use 1 for most applications, higher values for generating multiple response options
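As a quick illustration of these ranges, a hypothetical clampParams helper (not part of the component API) could enforce them before a request; the modelMax default of 8192 is an assumption based on the 8K-context models listed below.
// Hypothetical helper: clamp parameters to the documented ranges.
function clampParams({ maxOutputTokens, temperature, n }, modelMax = 8192) {
  const clamp = (value, lo, hi) => Math.min(Math.max(value, lo), hi);
  return {
    maxOutputTokens: clamp(maxOutputTokens ?? modelMax, 1, modelMax), // 1 to model maximum
    temperature: clamp(temperature ?? 0.1, 0.0, 1.0),                 // 0.0 to 1.0 (default 0.1)
    n: clamp(n ?? 1, 1, 5)                                            // 1 to 5 (default 1)
  };
}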
Available Models
Llama 3
Meta's Llama 3 models, served on Groq's LPU architecture for low-latency inference
Models:
- llama3-8b-8192 (8B parameters, 8K context window)
- llama3-70b-8192 (70B parameters, 8K context window)
Mixtral
Mistral AI's sparse mixture-of-experts models
Models:
- mixtral-8x7b-32768 (8x7B parameters, 32K context window)
Gemma
Google's lightweight, open models
Models:
- gemma-7b-it (7B parameters, instruction-tuned)
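The context windows above matter when choosing a model for long prompts. A small, hypothetical helper (not part of the component) might pick a model by prompt size; the token counts come from the model list above.
// Context windows from the model list above, keyed by model ID.
const CONTEXT_WINDOWS = {
  "llama3-8b-8192": 8192,
  "llama3-70b-8192": 8192,
  "mixtral-8x7b-32768": 32768
};

// Hypothetical selection helper: fall back to the 32K-context Mixtral
// model when the prompt (plus ~2000 tokens of headroom for the reply)
// would overflow an 8K window.
function pickModel(promptTokens, replyHeadroom = 2000) {
  return promptTokens + replyHeadroom > 8192
    ? "mixtral-8x7b-32768"
    : "llama3-8b-8192";
}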
Implementation Example
// Basic configuration
const groqConfig = {
  model: "llama3-8b-8192",
  groqApiKey: process.env.GROQ_API_KEY,
  systemMessage: "You are a helpful assistant."
};

// Advanced configuration
const advancedGroqConfig = {
  model: "mixtral-8x7b-32768",
  groqApiKey: process.env.GROQ_API_KEY,
  groqApiBase: "https://api.groq.com/openai/v1",
  maxOutputTokens: 2000,
  temperature: 0.3,
  n: 1,
  stream: true
};

// Usage example
async function generateResponse(input) {
  const response = await groqComponent.generate({
    input: input,
    systemMessage: "You are an AI assistant that explains complex concepts clearly.",
    model: "llama3-8b-8192",
    temperature: 0.2
  });
  return response.text;
}
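The advanced configuration above sets stream: true, but generateResponse awaits a complete response. Below is a hedged sketch of consuming a stream, assuming generate returns an async iterable of token chunks when streaming is enabled; the component's actual streaming interface may differ.
// Streaming sketch. Assumes generate() yields token chunks as an async
// iterable when stream is true; the real interface may differ.
async function streamResponse(input) {
  const stream = await groqComponent.generate({
    input: input,
    model: "llama3-8b-8192",
    stream: true
  });
  let fullText = "";
  for await (const chunk of stream) {
    process.stdout.write(chunk);   // render tokens as they arrive
    fullText += chunk;
  }
  return fullText;
}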
Use Cases
- Real-time Applications: Leverage Groq's low-latency inference for chat and interactive systems
- Content Generation: Create articles, summaries, and creative content quickly
- Customer Support: Build responsive support bots with fast response times
- Development Tools: Integrate into development workflows for code generation and documentation
- Education: Create interactive learning experiences with minimal latency
Best Practices
- Store your API key securely using environment variables
- Enable streaming for faster perceived response times
- Start with the default temperature (0.1) and adjust based on your use case
- Use a single completion (n=1) for most applications
- Monitor your API token usage through the Groq console
- Implement proper error handling for API failures (see the retry sketch after this list)
- Consider model size tradeoffs (larger models are more capable but may have higher latency)
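For the error-handling bullet above, here is a minimal retry sketch: it assumes failures surface as thrown errors carrying an HTTP status (err.status), which is an illustrative shape rather than the component's documented one.
// Illustrative retry wrapper for transient API failures (429 and 5xx).
// The err.status shape is an assumption; adapt to the component's errors.
async function generateWithRetry(params, retries = 3) {
  for (let attempt = 1; attempt <= retries; attempt++) {
    try {
      return await groqComponent.generate(params);
    } catch (err) {
      const transient = err.status === 429 || (err.status >= 500 && err.status < 600);
      if (!transient || attempt === retries) throw err;
      await new Promise(resolve => setTimeout(resolve, 500 * 2 ** attempt)); // exponential backoff
    }
  }
}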