
Groq Models

A drag-and-drop component for integrating Groq's high-performance LLM inference into your workflow. Configure model parameters and connect inputs/outputs to other components.

Screenshot: Groq component interface and configuration

API Key Required: This component requires a valid Groq API key. Register for a Groq account and generate an API key before configuring the component.

Component Inputs

  • Input: Text input for the model

    Example: "Explain the advantages of Groq's LPU architecture for inference."

  • System Message: System prompt to guide model behavior

    Example: "You are a helpful AI assistant specializing in hardware acceleration and machine learning."

  • Stream: Toggle for streaming responses

    Example: true (for real-time token streaming) or false (for complete response)

  • Model: The Groq model to use

    Example: "llama3-8b-8192", "mixtral-8x7b-32768", "gemma-7b-it"

  • Groq API Key: Your API authentication key

    Example: "gsk_abc123def456..."

  • Groq API Base: API endpoint URL

    Example: "https://api.groq.com/openai/v1" (Default)

Component Outputs

  • Text: Generated text output

    Example: "Groq's LPU (Language Processing Unit) architecture is specifically designed for LLM inference, offering several advantages..."

  • Language Model: Model information and metadata

    Example: model: llama3-8b-8192, usage: {prompt_tokens: 35, completion_tokens: 150, total_tokens: 185}
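Assuming the component returns its two outputs as a text string plus a metadata object shaped like the example above, downstream code might consume them as follows (the generate call and field names are illustrative):

// Hypothetical handling of the component's outputs.
// `result.text` and `result.languageModel` mirror the output names above.
const result = await groqComponent.generate({ input: "...", model: "llama3-8b-8192" });

console.log(result.text);                          // generated answer
const { model, usage } = result.languageModel;     // assumed metadata shape
console.log(`${model}: ${usage.total_tokens} tokens ` +
            `(${usage.prompt_tokens} prompt + ${usage.completion_tokens} completion)`);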

Model Parameters

Max Output Tokens

Maximum number of tokens to generate in the response

Default: Model-dependent
Range: 1 to model maximum
Recommendation: Set based on expected response length

Temperature

Controls randomness in the output - higher values increase creativity

Default: 0.1
Range: 0.0 to 1.0
Recommendation: Lower values (0.0-0.3) for factual, consistent responses; higher values (0.7-1.0) for creative tasks

N

Number of completions to generate

Default: 1
Range: 1 to 5
Recommendation: Use 1 for most applications; higher values generate multiple response options to choose from
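As a hedged illustration, the configurations below combine these three parameters for two common profiles. Parameter names follow the implementation example later on this page; adjust them to match your integration.

// Factual Q&A profile: low temperature, bounded output, single completion.
const factualConfig = {
  model: "llama3-8b-8192",
  maxOutputTokens: 512,   // cap response length
  temperature: 0.1,       // favor consistent, factual answers
  n: 1                    // one completion is enough for most apps
};

// Creative profile: higher temperature, more room to generate.
const creativeConfig = {
  model: "llama3-70b-8192",
  maxOutputTokens: 1500,
  temperature: 0.8,
  n: 3                    // generate alternatives to pick from
};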

Available Models

Llama 3

Meta's Llama 3 models, running on Groq's LPU architecture

Models:
  • llama3-8b-8192 (8B parameters, 8K context window)
  • llama3-70b-8192 (70B parameters, 8K context window)

Mixtral

Mistral AI's mixture-of-experts models

Models:
  • mixtral-8x7b-32768 (8x7B parameters, 32K context window)

Gemma

Google's lightweight, open models

Models:
  • gemma-7b-it (7B parameters, instruction-tuned)
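As a rough illustration of the context-window tradeoff between the models above, the helper below (hypothetical, using a crude 4-characters-per-token estimate) falls back to the 32K-context Mixtral model when a prompt is unlikely to fit in an 8K window:

// Rough, assumption-laden model picker based on context window size.
function pickModel(promptText) {
  const approxTokens = Math.ceil(promptText.length / 4); // crude estimate
  if (approxTokens > 7000) {
    return "mixtral-8x7b-32768";   // 32K context for long prompts
  }
  return "llama3-8b-8192";         // 8K context, lower latency
}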

Implementation Example

// Basic configuration
const groqConfig = {
  model: "llama3-8b-8192",
  groqApiKey: process.env.GROQ_API_KEY,
  systemMessage: "You are a helpful assistant."
};

// Advanced configuration
const advancedGroqConfig = {
  model: "mixtral-8x7b-32768",
  groqApiKey: process.env.GROQ_API_KEY,
  groqApiBase: "https://api.groq.com/openai/v1",
  maxOutputTokens: 2000,
  temperature: 0.3,
  n: 1,
  stream: true
};

// Usage example
async function generateResponse(input) {
  const response = await groqComponent.generate({
    input: input,
    systemMessage: "You are an AI assistant that explains complex concepts clearly.",
    model: "llama3-8b-8192",
    temperature: 0.2
  });
  return response.text;
}
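The advanced configuration above enables stream: true. If you call Groq directly rather than through the component, a streaming loop with the official groq-sdk package might look like the sketch below (this assumes groq-sdk is installed and that the chunk shape follows Groq's OpenAI-compatible format):

// Streaming sketch with the groq-sdk package (assumed available).
import Groq from "groq-sdk";

const groq = new Groq({ apiKey: process.env.GROQ_API_KEY });

async function streamResponse(input) {
  const stream = await groq.chat.completions.create({
    model: "llama3-8b-8192",
    stream: true,
    messages: [{ role: "user", content: input }]
  });

  let text = "";
  for await (const chunk of stream) {
    const token = chunk.choices[0]?.delta?.content ?? "";
    process.stdout.write(token);   // render tokens as they arrive
    text += token;
  }
  return text;
}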

Use Cases

  • Real-time Applications: Leverage Groq's low-latency inference for chat and interactive systems
  • Content Generation: Create articles, summaries, and creative content quickly
  • Customer Support: Build responsive support bots with fast response times
  • Development Tools: Integrate into development workflows for code generation and documentation
  • Education: Create interactive learning experiences with minimal latency

Best Practices

  • Store your API key securely using environment variables
  • Enable streaming for faster perceived response times
  • Start with the default temperature (0.1) and adjust based on your use case
  • Use a single completion (n=1) for most applications
  • Monitor your API token usage through the Groq console
  • Implement proper error handling for API failures (see the sketch after this list)
  • Consider model size tradeoffs (larger models are more capable but may have higher latency)
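
A minimal sketch of the error-handling practice above, assuming the hypothetical groqComponent.generate call from the implementation example and a simple retry with exponential backoff on transient failures:

// Hypothetical retry wrapper around the component call shown earlier.
async function generateWithRetry(input, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const response = await groqComponent.generate({
        input,
        model: "llama3-8b-8192"
      });
      return response.text;
    } catch (err) {
      if (attempt === maxAttempts) throw err;      // give up after the last attempt
      const delayMs = 500 * 2 ** (attempt - 1);    // exponential backoff
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}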