HuggingFace Embeddings
Generate embeddings using HuggingFace's extensive collection of open-source models. Supports both hosted inference endpoints and local model deployment for maximum flexibility.

HuggingFace Embeddings component interface and configuration
API Key Notice: When using HuggingFace's hosted inference API, you'll need a valid API token with appropriate permissions. Some models have rate limits that may affect high-volume embedding generation.
Component Inputs
- API Key: Your HuggingFace API token for authentication
Example: "hf_abcdefghijklmnopqrstuvwxyz"
- Model Name: The identifier of the embedding model to use
Example: "sentence-transformers/all-mpnet-base-v2", "BAAI/bge-large-en-v1.5"
- Inference Endpoint: Optional custom endpoint for hosted models
Example: "https://your-endpoint.huggingface.cloud"
- Input Text: The text to convert to embeddings
Example: "This is a sample text for embedding generation."
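The inputs above can be sketched as a typed configuration object. This is an illustrative shape only: the interface and field names below are assumptions derived from the input list, not the component's actual API.

```typescript
// Hypothetical shape of the component's inputs, named after the fields above.
interface HuggingFaceEmbeddingsConfig {
  apiKey: string;              // HuggingFace API token ("hf_...")
  modelName: string;           // e.g. "sentence-transformers/all-mpnet-base-v2"
  inferenceEndpoint?: string;  // optional custom hosted endpoint
  inputText: string;           // text to convert to embeddings
}

const exampleConfig: HuggingFaceEmbeddingsConfig = {
  apiKey: "hf_abcdefghijklmnopqrstuvwxyz",
  modelName: "sentence-transformers/all-mpnet-base-v2",
  inputText: "This is a sample text for embedding generation.",
};

console.log(exampleConfig.modelName);
```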
Component Outputs
- Embeddings: Vector representation of the input text
Example: [0.018, -0.032, 0.067, ...]
- Dimensions: Size of the embedding vector
Example: 384 for all-MiniLM-L6-v2, 768 for all-mpnet-base-v2, 1024 for bge-large-en
- Model Info: Details about the model used
Example: { name: "sentence-transformers/all-mpnet-base-v2", version: "1.0.0", type: "huggingface" }
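The embedding vectors are most often compared with cosine similarity. The helper below is not part of the component; it is a minimal sketch of how two output vectors of matching dimensions might be scored against each other:

```typescript
// Cosine similarity between two embedding vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Identical vectors score ~1.0; unrelated vectors score near 0.
console.log(cosineSimilarity([0.018, -0.032, 0.067], [0.018, -0.032, 0.067]));
```

Note that vectors from different models (e.g. a 384-dim and a 768-dim vector) cannot be compared directly, which is why the Dimensions output matters when mixing models.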
Model Comparison
sentence-transformers/all-mpnet-base-v2
High-quality general-purpose embedding model with strong performance across many tasks
Dimensions: 768
Performance: Excellent semantic understanding
Size: 420MB
Language Support: English-focused
Ideal for: High-quality general purpose embeddings
sentence-transformers/all-MiniLM-L6-v2
Lightweight and efficient model with good performance-to-size ratio
Dimensions: 384
Performance: Good semantic understanding
Size: 80MB
Language Support: English-focused
Ideal for: Resource-constrained or latency-sensitive applications
BAAI/bge-large-en-v1.5
State-of-the-art embedding model specifically optimized for retrieval tasks
Dimensions: 1024
Performance: Superior for retrieval and similarity
Size: 1.3GB
Language Support: English
Ideal for: Retrieval augmented generation and semantic search applications
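Because the models above produce vectors of different sizes, it is worth validating dimensions before writing to a vector index. The following sketch (a hypothetical helper, not part of the component) maps the compared models to their expected dimensions:

```typescript
// Expected embedding dimensions for the models compared above.
const MODEL_DIMENSIONS: Record<string, number> = {
  "sentence-transformers/all-mpnet-base-v2": 768,
  "sentence-transformers/all-MiniLM-L6-v2": 384,
  "BAAI/bge-large-en-v1.5": 1024,
};

// Guard against mixing vectors from different models in one index.
function checkDimensions(modelName: string, vector: number[]): void {
  const expected = MODEL_DIMENSIONS[modelName];
  if (expected !== undefined && vector.length !== expected) {
    throw new Error(
      `${modelName} returns ${expected}-dim vectors, got ${vector.length}`
    );
  }
}
```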
Implementation Example
// Using the hosted inference API
const embedder = new HuggingFaceEmbedder({
  apiKey: process.env.HUGGINGFACE_API_KEY,
  modelName: "sentence-transformers/all-mpnet-base-v2"
});

// Using a custom inference endpoint
const customEmbedder = new HuggingFaceEmbedder({
  apiKey: process.env.HUGGINGFACE_API_KEY,
  modelName: "BAAI/bge-large-en-v1.5",
  inferenceEndpoint: "https://your-endpoint.huggingface.cloud"
});

// Generate embeddings for a single text
const result = await embedder.embed({
  input: "Your text to embed"
});
console.log(result.embeddings);

// Batch processing for multiple texts
const batchResult = await embedder.embedBatch({
  inputs: [
    "First text to embed",
    "Second text to embed"
  ]
});
Use Cases
- Open Source Deployments: Use high-quality embeddings with open source models
- Domain-Specific Applications: Select from specialized models for particular domains
- Resource-Constrained Environments: Choose smaller models for edge devices or limited compute
- On-premise RAG Systems: Deploy HuggingFace models entirely within your infrastructure
- Multi-language Support: Select from models trained on multiple languages
Best Practices
- Choose appropriate model size based on your performance needs and resource constraints
- Consider using Inference Endpoints for production deployments with high reliability requirements
- Implement caching for frequently used embeddings to reduce API calls
- Monitor API usage limits to avoid rate limiting in high-traffic applications
- Test different models to find the best performing one for your specific use case