#embeddings
24 results found
Together.ai MCP
A Node.js Model Context Protocol (MCP) server that exposes Together AI's inference endpoints (chat completions, image generation, vision, and embeddings) as tools callable from Claude Desktop, Cursor, VS Code, and any other MCP-compatible client. I created this server to work around an issue I ran into when accessing reasoning models through Together AI: its largest reasoning models (GLM-5, Qwen3.5-397B, MiniMax M2.5, Kimi K2.5) use a non-standard response format. During chain-of-thought generation, these models write their reasoning trace to choices[0].message.reasoning while leaving choices[0].message.content as an empty string. The final answer only appears in message.content once thinking is complete.
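The split between the reasoning trace and the final answer can be sketched as follows. This is a minimal illustration assuming an OpenAI-compatible JSON response shape with the extra `reasoning` field described above; the sample response dict is hypothetical, not real API output.

```python
# Sketch: reading Together AI's reasoning-model responses, where the
# chain-of-thought lives in message.reasoning and message.content stays
# empty until thinking completes (per the description above).

def extract_answer(response: dict) -> tuple[str, str]:
    """Return (reasoning_trace, final_answer) from a chat completion dict."""
    message = response["choices"][0]["message"]
    reasoning = message.get("reasoning") or ""
    answer = message.get("content") or ""
    return reasoning, answer

# Hypothetical completed response:
response = {
    "choices": [{
        "message": {
            "reasoning": "The user asks for 2+2; basic arithmetic gives 4.",
            "content": "4",
        }
    }]
}
reasoning, answer = extract_answer(response)
```

A client that only reads `message.content` would see an empty string mid-generation, which is exactly the failure mode this MCP server works around.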
🚦 Branch-Thinking MCP Tool
A TypeScript-powered MCP server for managing parallel branches of thought, semantic cross-references, and persistent tasks. Features dynamic scoring, AI-generated insights, batch operations, and visual graph navigation for advanced agentic workflows.
Milvus
A Model Context Protocol (MCP) server for agentic retrieval and semantic search over unstructured and structured data using Milvus, a high-performance vector database. This server enables large language model (LLM) applications to efficiently index, store, and retrieve vector embeddings derived from diverse data sources—such as documents, images, metadata, or logs. By leveraging Milvus, the MCP server supports similarity search and contextual retrieval, enabling intelligent access to relevant information based on natural language queries. Designed for integration with MCP-compatible clients, this solution provides a scalable, low-latency foundation for building AI-powered applications with contextual awareness and retrieval-augmented generation (RAG) capabilities.
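Milvus performs this at scale with approximate nearest-neighbor indexes, but the core similarity-search operation it provides can be sketched in pure Python. This is an illustrative brute-force version (cosine similarity over a tiny in-memory index with made-up embeddings), not the Milvus API itself.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k(query, index, k=2):
    """index: list of (doc_id, embedding). Return the k best-matching doc ids."""
    scored = sorted(index, key=lambda item: cosine_similarity(query, item[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

# Toy 3-dimensional embeddings; real systems use hundreds of dimensions.
index = [
    ("doc-cats", [0.9, 0.1, 0.0]),
    ("doc-dogs", [0.8, 0.2, 0.1]),
    ("doc-tax",  [0.0, 0.1, 0.9]),
]
result = top_k([1.0, 0.0, 0.0], index, k=2)
# doc-cats and doc-dogs lie nearest the query vector
```

A vector database replaces the linear scan with an index structure (e.g. HNSW or IVF) so retrieval stays low-latency at millions of vectors.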
Oboyu (覚ゆ)
Self-hosted MCP server for Japanese text indexing and search: chunking plus embeddings, with BM25 × vector reranking.
Rust Docs MCP Server
🦀 Prevents outdated Rust code suggestions from AI assistants. This MCP server fetches current crate docs, uses embeddings/LLMs, and provides accurate context via a tool call.
ai-memory — Persistent Memory for Any AI
Persistent memory for any AI assistant. Zero token cost until recall. Stores memories locally in SQLite, ranks by 6-factor scoring, returns results in TOON compact format (79% smaller than JSON). 17 MCP tools, 20 HTTP endpoints, 25 CLI commands. 4 tiers from keyword to autonomous with local LLMs via Ollama. 97.8% recall accuracy on ICLR 2025 LongMemEval benchmark. Works with Claude, ChatGPT, Grok, Cursor, Windsurf, Continue.dev, OpenClaw, Llama, and any MCP client. Open source, MIT licensed, written in Rust.
VectorCode
A code repository indexing tool to supercharge your LLM experience.
Cursor Chat History Vectorizer & Dockerized Search MCP
API service to search vectorized Cursor IDE chat history using LanceDB and Ollama
eShopLite 🛒
eShopLite is a set of reference .NET applications implementing an eCommerce site with features like Semantic Search, MCP, Reasoning models and more.
Socraticode
Enterprise-grade (40M+ lines) codebase intelligence in a zero-setup, private, local MCP server: managed indexing, hybrid semantic search, polyglot code dependency graphs, and DB/API/infra knowledge. Benchmark: 61% fewer tokens, 84% fewer calls, 37× faster than standard AI grep.
Strata
Strata is a self-hosted AI memory server. Your AI remembers everything across every session, on your own hardware. Key features:
- Semantic search (find memories by meaning, not keywords)
- Per-agent API keys with granular permissions
- 3D constellation viewer with live agent activity
- File vault (attach real documents to memories)
- CSV audit log (full transparency on every agent action)
- Pre-Strata history import (your memory doesn't start at install day)
- Global MCP kill switch (emergency brake, only a human can undo)
- Automatic deduplication
- 10 structured thought types
- PostgreSQL backend

Always on, always local, always yours.
deAPI MCP Server
MCP server for deAPI — 35 AI tools: transcription, image/video generation, TTS, music, OCR, embeddings via deAPI.
CodeWeave MCP
Save tokens while coding: your AI agent gets structured code context, not file dumps. CodeWeave is an MCP server that gives AI agents structured code intelligence instead of dumping entire files into context. Your agent queries local indexes (AST, call graph, type graph, and hybrid semantic search) and gets back only what it needs. The result: less token waste, more relevant context, better coding decisions.

It supports 7 languages (Python, TypeScript, JavaScript, Go, Rust, Java, C#) with a 6-stage hybrid search pipeline combining vector embeddings, full-text search, and structural density scoring. Everything runs locally with zero external dependencies beyond Ollama for embeddings. Monorepo and git worktree support is included, and a file watcher keeps indexes current as code changes, so no manual reindexing is needed.

ATTENTION: Run "npx @codeweave/mcp" in your project directory first. The setup wizard handles everything for you; the initial run may take some time.

INFO: Exhaustively tested on large Java, C#, JavaScript, TypeScript, and React codebases. Contributions and feedback are welcome.
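The general technique behind a hybrid search pipeline like the one described above is score fusion: normalizing the signals so they are comparable, then blending them. The sketch below shows a common weighted-sum scheme; the weight, the min-max normalization, and the file names are arbitrary assumptions for illustration, not CodeWeave's actual formula.

```python
# Illustrative hybrid-search score fusion: blend a vector-similarity signal
# with a full-text (keyword) signal after normalizing each to [0, 1].

def normalize(scores: dict) -> dict:
    """Min-max normalize scores so differently scaled signals are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0
    return {doc: (s - lo) / span for doc, s in scores.items()}

def hybrid_rank(vector_scores: dict, text_scores: dict, alpha: float = 0.6):
    """Rank documents by alpha * vector_score + (1 - alpha) * text_score."""
    v, t = normalize(vector_scores), normalize(text_scores)
    docs = set(v) | set(t)
    fused = {d: alpha * v.get(d, 0.0) + (1 - alpha) * t.get(d, 0.0)
             for d in docs}
    return sorted(fused, key=fused.get, reverse=True)

# Hypothetical per-file scores from two retrieval stages:
vector_scores = {"parse.ts": 0.92, "index.ts": 0.40, "util.ts": 0.10}
text_scores = {"parse.ts": 3.1, "util.ts": 7.5, "index.ts": 0.5}
ranking = hybrid_rank(vector_scores, text_scores, alpha=0.6)
```

Real pipelines add further stages (reranking, structural scoring, deduplication), but they typically reduce to some variant of this fusion step.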