HuggingFace Embeddor

Open Source Embeddings

Overview

Generate embeddings with models from HuggingFace's extensive collection of open-source models. The embeddor supports both hosted inference endpoints and local model deployment.

Popular Models

  • sentence-transformers/all-mpnet-base-v2
  • sentence-transformers/all-MiniLM-L6-v2
  • BAAI/bge-large-en-v1.5

Key Features

  • Hosted inference API
  • Custom endpoints
  • Model versioning
  • Batch processing

Configuration

Required Parameters

  • apiKey: HuggingFace API token
  • modelName: Model identifier

Optional Parameters

  • inferenceEndpoint: Custom endpoint URL

Example Usage

// Using hosted inference API
const embedder = new HuggingFaceEmbeddor({
  apiKey: "your-api-key",
  modelName: "sentence-transformers/all-mpnet-base-v2"
});

// Using custom inference endpoint
const customEmbedder = new HuggingFaceEmbeddor({
  apiKey: "your-api-key",
  modelName: "BAAI/bge-large-en-v1.5",
  inferenceEndpoint: "https://your-endpoint.huggingface.cloud"
});

// Generate embeddings
const result = await embedder.embed({
  input: "Your text to embed"
});

// Batch processing
const batchResult = await embedder.embedBatch({
  inputs: [
    "First text to embed",
    "Second text to embed"
  ]
});
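
For long input lists, it can help to split the work into fixed-size batches rather than sending everything in one request. The following is a minimal sketch, not part of the embeddor's API: EmbedBatchFn, chunk, embedInChunks, and the default batch size of 32 are all hypothetical names and values chosen for illustration.

```typescript
// Hypothetical minimal shape of a batch embedding call; not the library's API.
type EmbedBatchFn = (inputs: string[]) => Promise<number[][]>;

// Split an array into consecutive groups of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const groups: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size));
  }
  return groups;
}

// Embed all texts batch by batch, preserving input order.
// Batches run sequentially to keep the request rate low.
async function embedInChunks(
  texts: string[],
  embedBatch: EmbedBatchFn,
  batchSize = 32 // assumed default; tune to your endpoint's limits
): Promise<number[][]> {
  const vectors: number[][] = [];
  for (const group of chunk(texts, batchSize)) {
    vectors.push(...(await embedBatch(group)));
  }
  return vectors;
}
```

Assuming the response shape documented under Response Format, embedBatch could wrap the embeddor as (inputs) => embedder.embedBatch({ inputs }).then((r) => r.embeddings.vectors).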

Best Practices

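  • Choose a model size that matches your latency, memory, and quality requirements
  • Monitor API usage against your account's rate limits
  • Implement error handling: check status.success and status.error on every response
  • Cache embeddings for inputs that recur frequently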
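
The error-handling and caching practices above can be sketched as two small wrappers. This is an illustration under stated assumptions rather than the embeddor's own mechanism: EmbedFn, withCache, and withRetry are hypothetical names, and the retry count, delay, and cache size are placeholder values.

```typescript
// Hypothetical minimal shape of a single-text embedding call; not the library's API.
type EmbedFn = (text: string) => Promise<number[]>;

// Retry a failing call with exponential backoff (baseDelayMs, then 2x, 4x, ...).
function withRetry(embed: EmbedFn, maxAttempts = 3, baseDelayMs = 500): EmbedFn {
  return async (text: string) => {
    let lastError: unknown;
    for (let attempt = 0; attempt < maxAttempts; attempt++) {
      try {
        return await embed(text);
      } catch (err) {
        lastError = err;
        if (attempt < maxAttempts - 1) {
          const delay = baseDelayMs * 2 ** attempt;
          await new Promise((resolve) => setTimeout(resolve, delay));
        }
      }
    }
    throw lastError;
  };
}

// Cache results in memory, keyed by the exact input text.
function withCache(embed: EmbedFn, maxEntries = 1000): EmbedFn {
  const cache = new Map<string, Promise<number[]>>();
  return (text: string) => {
    const hit = cache.get(text);
    if (hit !== undefined) return hit;
    const pending = embed(text).catch((err) => {
      cache.delete(text); // never keep a failed request in the cache
      throw err;
    });
    // Map preserves insertion order, so the first key is the oldest entry.
    if (cache.size >= maxEntries) {
      const oldest = cache.keys().next().value;
      if (oldest !== undefined) cache.delete(oldest);
    }
    cache.set(text, pending);
    return pending;
  };
}
```

Composing them as withCache(withRetry(embedFn)) retries transient failures and stores only successful results. A production setup would likely retry only on rate-limit or server errors rather than on every failure.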

Model Selection Tips

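  • Weigh model size against embedding quality and latency
  • Check that the model supports the languages in your data
  • Track model updates, and re-embed stored data when you change model or version, since vectors from different models are not comparable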
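
One concrete way to compare model sizes is the storage cost of the vectors they produce. The sketch below uses the output dimensions stated on the model cards of the three models listed under Popular Models and assumes vectors are stored as 32-bit floats; MODEL_DIMENSIONS and estimateStorageBytes are hypothetical names for illustration.

```typescript
// Output dimensions as published on each model's card.
const MODEL_DIMENSIONS: Record<string, number> = {
  "sentence-transformers/all-mpnet-base-v2": 768,
  "sentence-transformers/all-MiniLM-L6-v2": 384,
  "BAAI/bge-large-en-v1.5": 1024,
};

// Estimate raw storage for `count` vectors, assuming 4 bytes (float32) per value.
// Index overhead and metadata are not included.
function estimateStorageBytes(modelName: string, count: number): number {
  const dims = MODEL_DIMENSIONS[modelName];
  if (dims === undefined) {
    throw new Error(`Unknown model: ${modelName}`);
  }
  return count * dims * 4;
}
```

Under these assumptions, one million vectors take about 1.5 GB with all-MiniLM-L6-v2 (384 dimensions) and about 4.1 GB with bge-large-en-v1.5 (1024 dimensions).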

Response Format

{
  "embeddings": {
    "vectors": number[][],
    "dimensions": number
  },
  "model_info": {
    "name": string,
    "version": string,
    "type": "huggingface"
  },
  "metadata": {
    "processing_time": number,
    "token_count": number
  },
  "status": {
    "success": boolean,
    "error": string | null
  }
}
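
The response shape above can be written as a TypeScript interface, with a small helper that checks the status before returning the vectors. This is a sketch: unwrapVectors is a hypothetical helper, and the check that every vector's length equals dimensions is an assumption about how the two fields relate.

```typescript
// Types mirroring the documented response format.
interface EmbeddingResponse {
  embeddings: { vectors: number[][]; dimensions: number };
  model_info: { name: string; version: string; type: "huggingface" };
  metadata: { processing_time: number; token_count: number };
  status: { success: boolean; error: string | null };
}

// Hypothetical helper: return the vectors, or throw if the request failed.
function unwrapVectors(response: EmbeddingResponse): number[][] {
  if (!response.status.success) {
    throw new Error(response.status.error ?? "Embedding request failed");
  }
  const { vectors, dimensions } = response.embeddings;
  // Assumption: `dimensions` is the length of each returned vector.
  for (const vector of vectors) {
    if (vector.length !== dimensions) {
      throw new Error(`Expected ${dimensions} dimensions, got ${vector.length}`);
    }
  }
  return vectors;
}
```

With the embedder from Example Usage, this might be called as unwrapVectors(await embedder.embed({ input: "Your text to embed" })), assuming embed resolves to this response shape.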