Merge branch 'main' into codex/optimize-cli-server-with-langchaingo

This commit is contained in:
shenlan 2025-08-12 23:48:45 +08:00 committed by GitHub
commit 61d4dff7d3
13 changed files with 256 additions and 54 deletions

View File

@ -20,9 +20,22 @@ All UI components provide both Chinese and English interfaces.
| Framework | Go | 1.24 |
| Framework | Next.js | 14.1.0 |
| Gateway | OpenResty | 1.27.1.2 |
| Database | PostgreSQL + pgvector | 14.18 |
| Cache | Redis | 8.2.0 |
| Model | ollama/chutes.ai| baai/bge-m3, llama2:13b, moonshotai/Kimi-K2-Instruct |
| Database | PostgreSQL + pgvector | 14.18 |
| Model (Local) | HuggingFace Hub + Ollama | baai/bge-m3, llama2:13b |
| Model (Online) | Chutes.AI | baai/bge-m3, moonshotai/Kimi-K2-Instruct |
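## LangChainGo Core Features at a Glance
XControl uses LangChainGo to access multiple large language models through a single interface, and to provide chained-call capabilities for AskAI, the CLI, and the Server:
- **LLM interface layer (Model I/O)**: a unified way to call model APIs from OpenAI, Hugging Face, Ollama, Google AI, Cohere, and others.
- **Chains (chained workflows)**: combines prompts, retrieval results, and tool calls into complete workflows, supporting RAG, chat, code generation, and similar scenarios.
- **Tools and agents**: defines tools such as web search, scraper, and SQL query, and integrates them into an LLM agent to enable ReAct-style tool calls.
- **Vector retrieval and data access**: adapters for vector stores including PGVector, Weaviate, Qdrant, MongoDB Atlas Vector Search, Chroma, Pinecone, and Redis Vector.
- **Document loading and chunking**: Document Loaders and Text Splitters for processing long texts and building chunks for vector retrieval.
- **Memory and history tracking**: supports conversation memory mechanisms such as Conversation Buffer to improve the interactive experience.
## LangChainGo Core Features at a Glance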

View File

@ -3,22 +3,34 @@
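Subtask plan for optimizing the CLI, Server, and AskAI interface with the LangChainGo framework:
1. **LLM interface layer (Model I/O)**
- Provide unified access to models from OpenAI, Hugging Face, Ollama, Google AI, Cohere, and others.
- Support switching between model providers through configuration in the CLI and Server.
- [ ] Build a provider registry for models from OpenAI, Hugging Face, Ollama, Google AI, Cohere, and others.
- [ ] Expose model provider switching in the CLI and Server configuration.
- [ ] Write unit tests that verify switching between providers.
- [ ] Add documentation on configuration and environment variable usage.
2. **Chains (chained workflows)**
- Combine prompts, retrieval results, and tool calls into complete workflows to improve RAG and chat capabilities.
- Provide AskAI with a composable chain API to simplify orchestration of complex tasks.
- [ ] Combine prompts, retrieval results, and tool calls into RAG and chat chains.
- [ ] Provide AskAI with reusable chain definitions that support complex task orchestration.
- [ ] Provide chain invocation examples in the CLI.
- [ ] Write integration tests for the chained workflows.
3. **Tools and agents**
- Define common tools (web search, scraper, SQL query, and so on) and integrate them into the agent.
- Implement a ReAct-style tool-calling example in the CLI.
- [ ] Implement common tools such as web search, scraper, and SQL query.
- [ ] Register the tools with the agent framework and support dynamic invocation.
- [ ] Demonstrate ReAct-style tool calls in the CLI.
- [ ] Add test cases for tool and agent interaction.
4. **Vector retrieval and data access**
- Integrate stores such as PGVector, Weaviate, Qdrant, Chroma, Pinecone, and Redis Vector.
- Allow custom vector dimensions and retrieval parameters.
- [ ] Integrate stores such as PGVector, Weaviate, Qdrant, Chroma, Pinecone, and Redis Vector.
- [ ] Support custom vector dimensions and retrieval parameters.
- [ ] Write benchmarks that compare the different vector stores.
- [ ] Provide documented examples of retrieval parameter tuning.
5. **Document loading and chunking**
- Provide Document Loaders and Text Splitters that handle text of different formats and lengths.
- Store chunking results in a unified way and support incremental updates.
- [ ] Provide Document Loaders for multiple formats such as Markdown, code, and HTML.
- [ ] Support Text Splitters with token-based or recursive strategies.
- [ ] Store chunking results in a unified way and support an incremental update API.
- [ ] Write tests for the loaders and splitters.
6. **Memory and history tracking**
- Add conversation memory, such as a conversation buffer, to AskAI.
- Persist conversation context in the Server to improve the interactive experience.
- [ ] Add conversation memory, such as a conversation buffer, to AskAI.
- [ ] Persist session history in the Server and provide configuration options.
- [ ] Support adjusting memory length and cleanup policy.
- [ ] Write end-to-end tests that verify memory retention.
These tasks will be carried out step by step to reach the hybrid retrieval and multi-model support goals.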

View File

@ -90,7 +90,7 @@ models:
provider: "ollama"
models:
- 'llama2:13b'
endpoint: "http://127.0.0.1:11434/v1/chat/completions"
endpoint: "http://127.0.0.1:11434"
```
For online services using Chutes:
@ -106,7 +106,7 @@ For online services using Chutes:
# provider: "chutes"
# models:
# - 'moonshotai/Kimi-K2-Instruct'
# endpoint: "https://llm.chutes.ai/v1/chat/completions"
# endpoint: "https://llm.chutes.ai/v1"
# token: "cpk_xxxx"
```
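The endpoint values in these examples change from full chat-completions routes to base URLs. The sketch below restates the trimming that `loadConfig` applies in this commit, so that an old-style value still resolves to the new form. The function name `normalizeEndpoint` is illustrative and does not exist in the repository, and the diff does not say why `/v1` is also dropped for Ollama.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeEndpoint reduces an endpoint to the base URL form used by the
// updated configuration. The name and signature are assumptions made for
// this sketch; the trimming steps follow loadConfig in the diff.
func normalizeEndpoint(provider, endpoint string) string {
	endpoint = strings.TrimRight(endpoint, "/")
	// Older configs pointed at the full chat-completions route.
	endpoint = strings.TrimSuffix(endpoint, "/chat/completions")
	endpoint = strings.TrimRight(endpoint, "/")
	if strings.ToLower(provider) == "ollama" {
		// For Ollama the commit also removes a trailing /v1 segment.
		endpoint = strings.TrimSuffix(endpoint, "/v1")
		endpoint = strings.TrimRight(endpoint, "/")
	}
	return endpoint
}

func main() {
	fmt.Println(normalizeEndpoint("ollama", "http://127.0.0.1:11434/v1/chat/completions"))
	fmt.Println(normalizeEndpoint("chutes", "https://llm.chutes.ai/v1/chat/completions/"))
}
```

With these inputs the first call yields `http://127.0.0.1:11434` and the second yields `https://llm.chutes.ai/v1`, which are the values shown in the updated examples.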

View File

@ -1,22 +1,17 @@
# Changelog
## Milestone 1: MVP (Completed)
Use default Redis port (#98) and establish PostgreSQL & Redis baseline.
Stream RAG sync progress for GitHub repository synchronization (#100).
Add client-side Markdown parsing to the CLI (#104).
Refactor RAG ingestion into the CLI with a server upsert endpoint (#103).
Perform RAG API functional tests and support per-file ingestion workflow in the CLI (#115).
Allow RAG upsert to migrate embedding dimensions (#119) and document pgvector database initialization (#120).
Ingest files automatically (#123).
## Milestone 2: Hybrid Search
- Use default Redis port (#98) and establish PostgreSQL & Redis baseline.
- Stream RAG sync progress for GitHub repository synchronization (#100).
- Add client-side Markdown parsing to the CLI (#104).
- Refactor RAG ingestion into the CLI with a server upsert endpoint (#103).
- Perform RAG API functional tests and support per-file ingestion workflow in the CLI (#115).
- Allow RAG upsert to migrate embedding dimensions (#119) and document pgvector database initialization (#120).
- Ingest files automatically (#123).
## Milestone 2: Hybrid Search (In Progress)
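- Rename the RAG phase-two optimization plan to `docs/Milestone-2.md` and add a subtask list.
- Plan for the AskAI interface and CLI to use the LangChainGo framework to support multiple models and chained calls.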
- Document local and Chutes model configurations for AskAI.
- CLI and server dynamically support 1024-dimensional embeddings.

View File

@ -20,11 +20,11 @@ sync:
provider:
- name: ollama
endpoint: http://localhost:11434/v1/chat/completions
endpoint: http://localhost:11434
models:
- 'gpt-oss:20b'
- name: chutes
endpoint: https://llm.chutes.ai/v1/chat/completions
endpoint: https://llm.chutes.ai/v1
token: "cpk_xxxxxxxxxxxxxxxxxx"
models:
- 'moonshotai/Kimi-K2-Instruct'

View File

@ -118,10 +118,15 @@ type Config struct {
Models struct {
Embedder ModelCfg `yaml:"embedder"`
Generator ModelCfg `yaml:"generator"`
Reranker ModelCfg `yaml:"reranker"`
} `yaml:"models"`
Embedding EmbeddingCfg `yaml:"embedding"`
Chunking ChunkingCfg `yaml:"chunking"`
API struct {
Retrieval struct {
Alpha float64 `yaml:"alpha"`
Candidates int `yaml:"candidates"`
} `yaml:"retrieval"`
API struct {
AskAI struct {
Timeout int `yaml:"timeout"`
Retries int `yaml:"retries"`

View File

@ -64,6 +64,11 @@ type Runtime struct {
Datasources []DataSource `yaml:"datasources"`
Proxy string `yaml:"proxy"`
Embedding RuntimeEmbedding
Reranker ModelCfg
Retrieval struct {
Alpha float64 `yaml:"alpha"`
Candidates int `yaml:"candidates"`
} `yaml:"retrieval"`
}
// ServerConfigPath points to the server configuration file.
@ -82,6 +87,8 @@ func LoadServer() (*Runtime, error) {
}
rt.Redis = cfg.Global.Redis
rt.Embedding = cfg.ResolveEmbedding()
rt.Reranker = cfg.Models.Reranker
rt.Retrieval = cfg.Retrieval
return rt, nil
}
@ -101,6 +108,8 @@ func (rt *Runtime) ToConfig() *Config {
if rt.Embedding.Model != "" {
c.Models.Embedder.Models = []string{rt.Embedding.Model}
}
c.Models.Reranker = rt.Reranker
c.Retrieval = rt.Retrieval
c.Embedding.Dimension = rt.Embedding.Dimension
c.Embedding.MaxBatch = rt.Embedding.MaxBatch
c.Embedding.MaxChars = rt.Embedding.MaxChars

View File

@ -0,0 +1,58 @@
package rerank

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

// BGE implements a reranker backed by a bge-reranker service.
type BGE struct {
	endpoint string
	token    string
	client   *http.Client
}

// NewBGE returns a new BGE reranker.
func NewBGE(endpoint, token string) *BGE {
	return &BGE{
		endpoint: endpoint,
		token:    token,
		client:   &http.Client{Timeout: 30 * time.Second},
	}
}

// Rerank posts query and docs to the service and returns scores.
func (b *BGE) Rerank(ctx context.Context, query string, docs []string) ([]float32, error) {
	payload := map[string]any{"query": query, "documents": docs}
	body, _ := json.Marshal(payload)
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, b.endpoint, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	if b.token != "" {
		req.Header.Set("Authorization", "Bearer "+b.token)
	}
	resp, err := b.client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 300 {
		return nil, fmt.Errorf("rerank failed: %s", resp.Status)
	}
	var out struct {
		Scores []float32 `json:"scores"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, err
	}
	if len(out.Scores) != len(docs) {
		return nil, fmt.Errorf("unexpected scores length")
	}
	return out.Scores, nil
}

View File

@ -0,0 +1,8 @@
package rerank

import "context"

// Reranker scores a list of documents for a given query.
type Reranker interface {
	Rerank(ctx context.Context, query string, docs []string) ([]float32, error)
}

View File

@ -3,12 +3,15 @@ package rag
import (
"context"
"encoding/json"
"fmt"
"sort"
"github.com/jackc/pgx/v5"
pgvector "github.com/pgvector/pgvector-go"
"xcontrol/internal/rag/config"
"xcontrol/internal/rag/embed"
"xcontrol/internal/rag/rerank"
"xcontrol/internal/rag/store"
)
@ -89,24 +92,107 @@ func (s *Service) Query(ctx context.Context, question string, limit int) ([]Docu
}
defer conn.Close(ctx)
rows, err := conn.Query(ctx, `SELECT repo, path, chunk_id, content, metadata FROM documents ORDER BY embedding <-> $1 LIMIT $2`,
pgvector.NewVector(vecs[0]), limit)
alpha := s.cfg.Retrieval.Alpha
if alpha < 0 || alpha > 1 {
alpha = 0.5
}
cand := s.cfg.Retrieval.Candidates
if cand <= 0 {
cand = 50
}
type scored struct {
Document
vscore float64
tscore float64
score float64
}
docsMap := map[string]*scored{}
vrows, err := conn.Query(ctx, `SELECT repo,path,chunk_id,content,metadata, embedding <-> $1 AS dist FROM documents ORDER BY embedding <-> $1 LIMIT $2`,
pgvector.NewVector(vecs[0]), cand)
if err != nil {
return nil, err
}
defer rows.Close()
var docs []Document
for rows.Next() {
var d Document
for vrows.Next() {
var d scored
var metaBytes []byte
if err := rows.Scan(&d.Repo, &d.Path, &d.ChunkID, &d.Content, &metaBytes); err != nil {
var dist float64
if err := vrows.Scan(&d.Repo, &d.Path, &d.ChunkID, &d.Content, &metaBytes, &dist); err != nil {
vrows.Close()
return nil, err
}
if len(metaBytes) > 0 {
_ = json.Unmarshal(metaBytes, &d.Metadata)
}
docs = append(docs, d)
d.vscore = -dist
key := fmt.Sprintf("%s|%s|%d", d.Repo, d.Path, d.ChunkID)
docsMap[key] = &d
}
return docs, rows.Err()
vrows.Close()
trows, err := conn.Query(ctx, `SELECT repo,path,chunk_id,content,metadata, ts_rank_cd(content_tsv, websearch_to_tsquery($1)) AS rank FROM documents WHERE content_tsv @@ websearch_to_tsquery($1) ORDER BY rank DESC LIMIT $2`,
question, cand)
if err != nil {
return nil, err
}
for trows.Next() {
var metaBytes []byte
var rank float64
key := ""
d := scored{}
if err := trows.Scan(&d.Repo, &d.Path, &d.ChunkID, &d.Content, &metaBytes, &rank); err != nil {
trows.Close()
return nil, err
}
if len(metaBytes) > 0 {
_ = json.Unmarshal(metaBytes, &d.Metadata)
}
d.tscore = rank
key = fmt.Sprintf("%s|%s|%d", d.Repo, d.Path, d.ChunkID)
if exist, ok := docsMap[key]; ok {
exist.tscore = d.tscore
} else {
docsMap[key] = &d
}
}
trows.Close()
candidates := make([]*scored, 0, len(docsMap))
for _, d := range docsMap {
d.score = alpha*d.vscore + (1-alpha)*d.tscore
candidates = append(candidates, d)
}
sort.Slice(candidates, func(i, j int) bool { return candidates[i].score > candidates[j].score })
if len(candidates) > cand {
candidates = candidates[:cand]
}
// optional reranking
var rr rerank.Reranker
rCfg := s.cfg.Models.Reranker
if rCfg.Endpoint != "" {
rr = rerank.NewBGE(rCfg.Endpoint, rCfg.Token)
}
if rr != nil {
docs := make([]string, len(candidates))
for i, c := range candidates {
docs[i] = c.Content
}
if scores, err := rr.Rerank(ctx, question, docs); err == nil && len(scores) == len(candidates) {
for i := range candidates {
candidates[i].score = float64(scores[i])
}
sort.Slice(candidates, func(i, j int) bool { return candidates[i].score > candidates[j].score })
}
}
if limit > len(candidates) {
limit = len(candidates)
}
out := make([]Document, 0, limit)
for i := 0; i < limit; i++ {
out = append(out, candidates[i].Document)
}
return out, nil
}
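The fused score in `Query` is `alpha*vscore + (1-alpha)*tscore`, where `vscore` is a negated distance and `tscore` is a `ts_rank_cd` value, so the two terms sit on different scales. One common way to make `alpha` easier to reason about is to rescale each signal to [0, 1] before blending. The sketch below shows that variant. It is an alternative to what the commit does, not a description of it, and the `candidate`, `minMax`, and `fuse` names are hypothetical. It also leaves one behavior of the committed code unaddressed: a document found only by the text query keeps `vscore` 0, which is the best possible value for a negated distance.

```go
package main

import (
	"fmt"
	"sort"
)

// candidate carries the two raw signals and the fused score.
type candidate struct {
	ID     string
	VScore float64 // negated vector distance; larger is better
	TScore float64 // full-text rank; larger is better
	Score  float64
}

// minMax rescales values to [0, 1]; a constant or empty input maps to zeros.
func minMax(vals []float64) []float64 {
	out := make([]float64, len(vals))
	if len(vals) == 0 {
		return out
	}
	lo, hi := vals[0], vals[0]
	for _, v := range vals {
		if v < lo {
			lo = v
		}
		if v > hi {
			hi = v
		}
	}
	if hi == lo {
		return out
	}
	for i, v := range vals {
		out[i] = (v - lo) / (hi - lo)
	}
	return out
}

// fuse returns a sorted copy of cands, blending the normalized signals.
// An alpha outside [0, 1] falls back to 0.5, as in the diff.
func fuse(cands []candidate, alpha float64) []candidate {
	if alpha < 0 || alpha > 1 {
		alpha = 0.5
	}
	vs := make([]float64, len(cands))
	ts := make([]float64, len(cands))
	for i, c := range cands {
		vs[i], ts[i] = c.VScore, c.TScore
	}
	vn, tn := minMax(vs), minMax(ts)

	out := make([]candidate, len(cands))
	copy(out, cands)
	for i := range out {
		out[i].Score = alpha*vn[i] + (1-alpha)*tn[i]
	}
	sort.SliceStable(out, func(i, j int) bool { return out[i].Score > out[j].Score })
	return out
}

func main() {
	cands := []candidate{
		{ID: "a", VScore: -0.2, TScore: 0.0},
		{ID: "b", VScore: -0.9, TScore: 0.8},
		{ID: "c", VScore: -0.5, TScore: 0.4},
	}
	for _, c := range fuse(cands, 0.5) {
		fmt.Printf("%s %.3f\n", c.ID, c.Score)
	}
}
```

With `alpha` set to 1 the order follows the vector signal alone, and with 0 it follows the text signal alone, which makes the parameter's effect straightforward to test.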

View File

@ -64,6 +64,20 @@ func EnsureSchema(ctx context.Context, conn *pgx.Conn, dim int, migrate bool) er
return err
}
}
// ensure full-text search column
var hasTSV bool
err = conn.QueryRow(ctx, `SELECT EXISTS (
SELECT 1 FROM information_schema.columns
WHERE table_name='documents' AND column_name='content_tsv'
)`).Scan(&hasTSV)
if err != nil {
return err
}
if !hasTSV {
if _, err := conn.Exec(ctx, `ALTER TABLE documents ADD COLUMN content_tsv tsvector GENERATED ALWAYS AS (to_tsvector('english', content)) STORED`); err != nil {
return err
}
}
// check dimension
var curDim int
err = conn.QueryRow(ctx, `SELECT atttypmod-4 FROM pg_attribute a JOIN pg_type t ON a.atttypid=t.oid WHERE a.attrelid='documents'::regclass AND a.attname='embedding'`).Scan(&curDim)
@ -82,6 +96,9 @@ func EnsureSchema(ctx context.Context, conn *pgx.Conn, dim int, migrate bool) er
if _, err := conn.Exec(ctx, `CREATE INDEX IF NOT EXISTS documents_embedding_idx ON documents USING hnsw (embedding vector_cosine_ops)`); err != nil {
return err
}
if _, err := conn.Exec(ctx, `CREATE INDEX IF NOT EXISTS documents_content_tsv_idx ON documents USING GIN (content_tsv)`); err != nil {
return err
}
return nil
}
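One detail worth checking against this schema: the HNSW index is created with `vector_cosine_ops`, while the vector query in `Service.Query` orders by `<->`. In pgvector, `<->` is Euclidean distance, `<=>` is cosine distance, and `<#>` is negative inner product, and the planner only considers an index whose operator class matches the operator in the `ORDER BY`. The sketch below keeps the two in step by deriving the operator from the operator class. `distanceOperator` and `vectorQuery` are hypothetical helpers, not code from the repository.

```go
package main

import "fmt"

// distanceOperator returns the pgvector operator that an index built with
// the given operator class can serve.
func distanceOperator(opclass string) (string, error) {
	switch opclass {
	case "vector_l2_ops":
		return "<->", nil // Euclidean (L2) distance
	case "vector_cosine_ops":
		return "<=>", nil // cosine distance
	case "vector_ip_ops":
		return "<#>", nil // negative inner product
	default:
		return "", fmt.Errorf("unsupported operator class %q", opclass)
	}
}

// vectorQuery builds the candidate query for the given operator class,
// using the same columns as the query in the diff.
func vectorQuery(opclass string) (string, error) {
	op, err := distanceOperator(opclass)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf(
		"SELECT repo, path, chunk_id, content, metadata, embedding %[1]s $1 AS dist "+
			"FROM documents ORDER BY embedding %[1]s $1 LIMIT $2", op), nil
}

func main() {
	q, err := vectorQuery("vector_cosine_ops")
	fmt.Println(q, err)
}
```

Running `EXPLAIN` on the resulting statement is the direct way to confirm whether `documents_embedding_idx` is actually used.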

View File

@ -12,7 +12,6 @@ import (
"github.com/gin-gonic/gin"
"github.com/tmc/langchaingo/llms"
"github.com/tmc/langchaingo/llms/ollama"
"github.com/tmc/langchaingo/llms/openai"
"gopkg.in/yaml.v3"
)
@ -122,10 +121,14 @@ func loadConfig() (string, string, string, string, time.Duration, int) {
}
provider = strings.ToLower(provider)
endpoint = strings.TrimRight(endpoint, "/")
endpoint = strings.TrimSuffix(endpoint, "/chat/completions")
endpoint = strings.TrimRight(endpoint, "/")
switch provider {
case "ollama":
endpoint = strings.TrimSuffix(endpoint, "/v1")
endpoint = strings.TrimRight(endpoint, "/")
if endpoint == "" {
endpoint = "http://localhost:11434/v1/chat/completions"
endpoint = "http://localhost:11434"
}
if model == "" {
model = "llama2:13b"
@ -133,7 +136,7 @@ func loadConfig() (string, string, string, string, time.Duration, int) {
return provider, token, model, endpoint, timeout, retries
case "chutes":
if endpoint == "" {
endpoint = "https://llm.chutes.ai/v1/chat/completions"
endpoint = "https://llm.chutes.ai/v1"
}
if model == "" {
model = "deepseek-ai/DeepSeek-R1"
@ -141,7 +144,7 @@ func loadConfig() (string, string, string, string, time.Duration, int) {
return provider, token, model, endpoint, timeout, retries
default:
if endpoint == "" {
endpoint = "https://llm.chutes.ai/v1/chat/completions"
endpoint = "https://llm.chutes.ai/v1"
}
if model == "" {
model = "deepseek-ai/DeepSeek-R1"
@ -163,11 +166,7 @@ func callLLM(question string) (string, error) {
switch provider {
case "ollama":
llm, err = ollama.New(
ollama.WithModel(model),
ollama.WithServerURL(url),
ollama.WithHTTPClient(httpClient),
)
fallthrough
default:
llm, err = openai.New(
openai.WithToken(token),

View File

@ -29,7 +29,7 @@ models:
provider: "ollama"
models:
- 'llama2:13b'
endpoint: "http://127.0.0.1:11434/v1/chat/completions"
endpoint: "http://127.0.0.1:11434"
token: ""
# For PROD
#models:
@ -42,7 +42,7 @@ models:
#provider: "chutes"
#models:
# - 'moonshotai/Kimi-K2-Instruct'
#endpoint: "https://llm.chutes.ai/v1/chat/completions"
#endpoint: "https://llm.chutes.ai/v1"
#token: "cpk_xxxx"
embedding: