Merge branch 'main' into codex/optimize-cli-server-with-langchaingo
commit
61d4dff7d3
17
README.md
@@ -20,9 +20,22 @@ All UI components provide both Chinese and English interfaces.
 | Framework | Go | 1.24 |
 | Framework | Next.js | 14.1.0 |
 | Gateway | OpenResty | 1.27.1.2 |
-| Database | PostgreSQL + pgvector | 14.18 |
 | Cache | Redis | 8.2.0 |
-| Model | ollama/chutes.ai| baai/bge-m3, llama2:13b, moonshotai/Kimi-K2-Instruct |
+| Database | PostgreSQL + pgvector | 14.18 |
+| Model (Local) | HuggingFace Hub + Ollama | baai/bge-m3, llama2:13b |
+| Model (Online) | Chutes.AI | baai/bge-m3, moonshotai/Kimi-K2-Instruct |
+
+## LangChainGo Core Features at a Glance
+
+XControl uses LangChainGo to access multiple large language models through one interface and to provide chained invocation for AskAI, the CLI, and the Server:
+
+- **LLM interface layer (Model I/O)**: a unified way to call models from OpenAI, Hugging Face, Ollama, Google AI, Cohere, and others.
+- **Chains**: combine prompts, retrieval results, and tool calls into complete flows, supporting RAG, chat, code generation, and similar scenarios.
+- **Tools and agents**: define tools such as web search, scrapers, and SQL queries, and integrate them into LLM agents for ReAct-style tool calling.
+- **Vector retrieval and data access**: adapters for vector stores including PGVector, Weaviate, Qdrant, MongoDB Atlas Vector Search, Chroma, Pinecone, and Redis Vector.
+- **Document loading and chunking**: Document Loaders and Text Splitters for handling long text and building chunks for vector retrieval.
+- **Memory and history tracking**: conversation memory mechanisms such as Conversation Buffer, improving the interactive experience.
+

 ## LangChainGo Core Features at a Glance

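@@ -3,22 +3,34 @@
 Subtask plan for optimizing the CLI, Server, and AskAI interface with the LangChainGo framework:

 1. **LLM interface layer (Model I/O)**
-   - Provide unified access to OpenAI, Hugging Face, Ollama, Google AI, Cohere, and other models.
-   - Support switching model providers through configuration in the CLI and Server.
+   - [ ] Build a provider registry for OpenAI, Hugging Face, Ollama, Google AI, Cohere, and other models.
+   - [ ] Expose model provider switching in the CLI and Server configuration.
+   - [ ] Write unit tests that verify switching between providers.
+   - [ ] Add documentation on configuration and environment variables.
 2. **Chains**
-   - Combine prompts, retrieval results, and tool calls into complete flows to improve RAG and chat.
-   - Provide a composable chain API for AskAI to simplify orchestration of complex tasks.
+   - [ ] Combine prompts, retrieval results, and tool calls into RAG and chat chains.
+   - [ ] Provide reusable chain definitions for AskAI that support complex task orchestration.
+   - [ ] Provide chain invocation examples in the CLI.
+   - [ ] Write integration tests for the chain flows.
 3. **Tools and agents**
-   - Define common tools (web search, scraper, SQL query, etc.) and integrate them into the agent.
-   - Implement a ReAct-style tool-calling example in the CLI.
+   - [ ] Implement common tools such as web search, scraper, and SQL query.
+   - [ ] Register the tools with the agent framework, with support for dynamic invocation.
+   - [ ] Demonstrate ReAct-style tool calling in the CLI.
+   - [ ] Add test cases for tool and agent interaction.
 4. **Vector retrieval and data access**
-   - Integrate PGVector, Weaviate, Qdrant, Chroma, Pinecone, Redis Vector, and other stores.
-   - Allow custom vector dimensions and retrieval parameters.
+   - [ ] Integrate PGVector, Weaviate, Qdrant, Chroma, Pinecone, Redis Vector, and other stores.
+   - [ ] Support custom vector dimensions and retrieval parameters.
+   - [ ] Write benchmarks comparing the different vector stores.
+   - [ ] Provide documented examples of retrieval parameter tuning.
 5. **Document loading and chunking**
-   - Provide Document Loaders and Text Splitters that handle text of different formats and lengths.
-   - Store chunking results in a unified way and support incremental updates.
+   - [ ] Provide Document Loaders for multiple formats such as Markdown, code, and HTML.
+   - [ ] Support Text Splitters with token-based or recursive strategies.
+   - [ ] Store chunking results in a unified way and support an incremental update API.
+   - [ ] Write tests for the loaders and splitters.
 6. **Memory and history tracking**
-   - Add conversation memory, such as a conversation buffer, to AskAI.
-   - Persist conversation context in the Server to improve the interactive experience.
+   - [ ] Add conversation memory, such as a conversation buffer, to AskAI.
+   - [ ] Persist session history in the Server and expose configuration options.
+   - [ ] Support adjusting memory length and cleanup policy.
+   - [ ] Write end-to-end tests that verify memory retention.

 These tasks will be implemented step by step to reach the hybrid retrieval and multi-model support goals.
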
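The first subtask in the plan above calls for a provider registry with configuration-driven switching. A minimal sketch of what that could look like follows. It is not taken from this commit: the `Provider` interface, `staticProvider`, `Registry`, and `defaultRegistry` are hypothetical names used only for illustration, and a real implementation would presumably wrap LangChainGo clients rather than plain structs.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Provider is a hypothetical, minimal abstraction over a model backend.
type Provider interface {
	Name() string
	Endpoint() string
}

// staticProvider is an illustrative implementation backed by fixed values.
type staticProvider struct {
	name     string
	endpoint string
}

func (p staticProvider) Name() string     { return p.name }
func (p staticProvider) Endpoint() string { return p.endpoint }

// Registry maps lower-cased provider names to constructors.
type Registry struct {
	factories map[string]func(endpoint string) Provider
}

// NewRegistry returns an empty registry.
func NewRegistry() *Registry {
	return &Registry{factories: map[string]func(string) Provider{}}
}

// Register adds a constructor under a case-insensitive name.
func (r *Registry) Register(name string, f func(endpoint string) Provider) {
	r.factories[strings.ToLower(name)] = f
}

// Names returns the registered provider names in sorted order.
func (r *Registry) Names() []string {
	names := make([]string, 0, len(r.factories))
	for n := range r.factories {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

// Build looks up a provider by name and constructs it for the endpoint.
func (r *Registry) Build(name, endpoint string) (Provider, error) {
	f, ok := r.factories[strings.ToLower(name)]
	if !ok {
		return nil, fmt.Errorf("unknown provider %q (known: %v)", name, r.Names())
	}
	return f(endpoint), nil
}

// defaultRegistry registers the two providers that appear in this diff.
func defaultRegistry() *Registry {
	r := NewRegistry()
	for _, n := range []string{"ollama", "chutes"} {
		name := n // capture a per-iteration copy for the closure
		r.Register(name, func(ep string) Provider {
			return staticProvider{name: name, endpoint: ep}
		})
	}
	return r
}

func main() {
	r := defaultRegistry()
	p, err := r.Build("Ollama", "http://127.0.0.1:11434")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(p.Name(), p.Endpoint())
}
```

Keeping lookup case-insensitive matches the `strings.ToLower(provider)` normalization that `loadConfig` already performs later in this diff.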
@@ -90,7 +90,7 @@ models:
 provider: "ollama"
 models:
   - 'llama2:13b'
-endpoint: "http://127.0.0.1:11434/v1/chat/completions"
+endpoint: "http://127.0.0.1:11434"
 ```

 For online services using Chutes:
@@ -106,7 +106,7 @@ For online services using Chutes:
 # provider: "chutes"
 # models:
 # - 'moonshotai/Kimi-K2-Instruct'
-# endpoint: "https://llm.chutes.ai/v1/chat/completions"
+# endpoint: "https://llm.chutes.ai/v1"
 # token: "cpk_xxxx"
 ```

@@ -1,22 +1,17 @@
 # Changelog

 ## Milestone 1: MVP (Completed)
-Use default Redis port (#98) and establish PostgreSQL & Redis baseline.
-
-Stream RAG sync progress for GitHub repository synchronization (#100).
-
-Add client-side Markdown parsing to the CLI (#104).
-
-Refactor RAG ingestion into the CLI with a server upsert endpoint (#103).
-
-Perform RAG API functional tests and support per-file ingestion workflow in the CLI (#115).
-
-Allow RAG upsert to migrate embedding dimensions (#119) and document pgvector database initialization (#120).
-
-Ingest files automatically (#123).
-
-## Milestone 2: Hybrid Search
+- Use default Redis port (#98) and establish PostgreSQL & Redis baseline.
+- Stream RAG sync progress for GitHub repository synchronization (#100).
+- Add client-side Markdown parsing to the CLI (#104).
+- Refactor RAG ingestion into the CLI with a server upsert endpoint (#103).
+- Perform RAG API functional tests and support per-file ingestion workflow in the CLI (#115).
+- Allow RAG upsert to migrate embedding dimensions (#119) and document pgvector database initialization (#120).
+- Ingest files automatically (#123).
+
+## Milestone 2: Hybrid Search (In Progress)
 - Rename the RAG phase-two optimization plan to `docs/Milestone-2.md` and add a subtask list.
 - Plan for the AskAI interface and CLI to use the LangChainGo framework to support multiple models and chained calls.
 - Document local and Chutes model configurations for AskAI.
 - CLI and server dynamically support 1024-dimensional embeddings.

@@ -20,11 +20,11 @@ sync:

 provider:
   - name: ollama
-    endpoint: http://localhost:11434/v1/chat/completions
+    endpoint: http://localhost:11434
     models:
       - 'gpt-oss:20b'
   - name: chutes
-    endpoint: https://llm.chutes.ai/v1/chat/completions
+    endpoint: https://llm.chutes.ai/v1
     token: "cpk_xxxxxxxxxxxxxxxxxx"
     models:
       - 'moonshotai/Kimi-K2-Instruct'

@@ -118,10 +118,15 @@ type Config struct {
 	Models struct {
 		Embedder ModelCfg `yaml:"embedder"`
 		Generator ModelCfg `yaml:"generator"`
+		Reranker ModelCfg `yaml:"reranker"`
 	} `yaml:"models"`
 	Embedding EmbeddingCfg `yaml:"embedding"`
 	Chunking ChunkingCfg `yaml:"chunking"`
-	API struct {
+	Retrieval struct {
+		Alpha float64 `yaml:"alpha"`
+		Candidates int `yaml:"candidates"`
+	} `yaml:"retrieval"`
+	API struct {
 		AskAI struct {
 			Timeout int `yaml:"timeout"`
 			Retries int `yaml:"retries"`

@@ -64,6 +64,11 @@ type Runtime struct {
 	Datasources []DataSource `yaml:"datasources"`
 	Proxy string `yaml:"proxy"`
 	Embedding RuntimeEmbedding
+	Reranker ModelCfg
+	Retrieval struct {
+		Alpha float64 `yaml:"alpha"`
+		Candidates int `yaml:"candidates"`
+	} `yaml:"retrieval"`
 }

 // ServerConfigPath points to the server configuration file.
@@ -82,6 +87,8 @@ func LoadServer() (*Runtime, error) {
 	}
 	rt.Redis = cfg.Global.Redis
 	rt.Embedding = cfg.ResolveEmbedding()
+	rt.Reranker = cfg.Models.Reranker
+	rt.Retrieval = cfg.Retrieval
 	return rt, nil
 }

@@ -101,6 +108,8 @@ func (rt *Runtime) ToConfig() *Config {
 	if rt.Embedding.Model != "" {
 		c.Models.Embedder.Models = []string{rt.Embedding.Model}
 	}
+	c.Models.Reranker = rt.Reranker
+	c.Retrieval = rt.Retrieval
 	c.Embedding.Dimension = rt.Embedding.Dimension
 	c.Embedding.MaxBatch = rt.Embedding.MaxBatch
 	c.Embedding.MaxChars = rt.Embedding.MaxChars

58
internal/rag/rerank/bge.go
Normal file
@@ -0,0 +1,58 @@
+package rerank
+
+import (
+	"bytes"
+	"context"
+	"encoding/json"
+	"fmt"
+	"net/http"
+	"time"
+)
+
+// BGE implements a reranker backed by a bge-reranker service.
+type BGE struct {
+	endpoint string
+	token string
+	client *http.Client
+}
+
+// NewBGE returns a new BGE reranker.
+func NewBGE(endpoint, token string) *BGE {
+	return &BGE{
+		endpoint: endpoint,
+		token: token,
+		client: &http.Client{Timeout: 30 * time.Second},
+	}
+}
+
+// Rerank posts query and docs to the service and returns scores.
+func (b *BGE) Rerank(ctx context.Context, query string, docs []string) ([]float32, error) {
+	payload := map[string]any{"query": query, "documents": docs}
+	body, _ := json.Marshal(payload)
+	req, err := http.NewRequestWithContext(ctx, http.MethodPost, b.endpoint, bytes.NewReader(body))
+	if err != nil {
+		return nil, err
+	}
+	req.Header.Set("Content-Type", "application/json")
+	if b.token != "" {
+		req.Header.Set("Authorization", "Bearer "+b.token)
+	}
+	resp, err := b.client.Do(req)
+	if err != nil {
+		return nil, err
+	}
+	defer resp.Body.Close()
+	if resp.StatusCode >= 300 {
+		return nil, fmt.Errorf("rerank failed: %s", resp.Status)
+	}
+	var out struct {
+		Scores []float32 `json:"scores"`
+	}
+	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
+		return nil, err
+	}
+	if len(out.Scores) != len(docs) {
+		return nil, fmt.Errorf("unexpected scores length")
+	}
+	return out.Scores, nil
+}
8
internal/rag/rerank/rerank.go
Normal file
@@ -0,0 +1,8 @@
+package rerank
+
+import "context"
+
+// Reranker scores a list of documents for a given query.
+type Reranker interface {
+	Rerank(ctx context.Context, query string, docs []string) ([]float32, error)
+}
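Because `Reranker` is a one-method interface, callers can be exercised without a running bge-reranker service. The sketch below shows one way to do that. It is an assumption for illustration, not part of this commit: `overlapReranker` and `demoScores` are hypothetical names, the interface is redeclared locally so the example compiles on its own, and the term-overlap scoring is a stand-in rather than anything the real service computes.

```go
package main

import (
	"context"
	"fmt"
	"strings"
)

// Reranker is redeclared here with the same signature as the interface
// added in internal/rag/rerank/rerank.go, so the sketch is self-contained.
type Reranker interface {
	Rerank(ctx context.Context, query string, docs []string) ([]float32, error)
}

// overlapReranker is a hypothetical in-memory stand-in: each document is
// scored by the fraction of query terms it contains.
type overlapReranker struct{}

func (overlapReranker) Rerank(ctx context.Context, query string, docs []string) ([]float32, error) {
	if err := ctx.Err(); err != nil {
		return nil, err // honor cancellation like a network client would
	}
	terms := strings.Fields(strings.ToLower(query))
	scores := make([]float32, len(docs))
	if len(terms) == 0 {
		return scores, nil
	}
	for i, d := range docs {
		words := map[string]bool{}
		for _, w := range strings.Fields(strings.ToLower(d)) {
			words[w] = true
		}
		hits := 0
		for _, t := range terms {
			if words[t] {
				hits++
			}
		}
		scores[i] = float32(hits) / float32(len(terms))
	}
	return scores, nil
}

// demoScores runs the stand-in on two sample documents.
func demoScores() ([]float32, error) {
	var rr Reranker = overlapReranker{}
	return rr.Rerank(context.Background(), "redis port",
		[]string{"use default redis port", "pgvector setup"})
}

func main() {
	scores, err := demoScores()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(scores)
}
```

Like `BGE.Rerank`, the stand-in returns exactly one score per input document, which is the invariant the caller in `Query` checks before applying the scores.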
@@ -3,12 +3,15 @@ package rag
 import (
 	"context"
 	"encoding/json"
+	"fmt"
+	"sort"

 	"github.com/jackc/pgx/v5"
 	pgvector "github.com/pgvector/pgvector-go"

 	"xcontrol/internal/rag/config"
 	"xcontrol/internal/rag/embed"
+	"xcontrol/internal/rag/rerank"
 	"xcontrol/internal/rag/store"
 )

@@ -89,24 +92,107 @@ func (s *Service) Query(ctx context.Context, question string, limit int) ([]Docu
 	}
 	defer conn.Close(ctx)

-	rows, err := conn.Query(ctx, `SELECT repo, path, chunk_id, content, metadata FROM documents ORDER BY embedding <-> $1 LIMIT $2`,
-		pgvector.NewVector(vecs[0]), limit)
+	alpha := s.cfg.Retrieval.Alpha
+	if alpha < 0 || alpha > 1 {
+		alpha = 0.5
+	}
+	cand := s.cfg.Retrieval.Candidates
+	if cand <= 0 {
+		cand = 50
+	}
+
+	type scored struct {
+		Document
+		vscore float64
+		tscore float64
+		score float64
+	}
+	docsMap := map[string]*scored{}
+
+	vrows, err := conn.Query(ctx, `SELECT repo,path,chunk_id,content,metadata, embedding <-> $1 AS dist FROM documents ORDER BY embedding <-> $1 LIMIT $2`,
+		pgvector.NewVector(vecs[0]), cand)
 	if err != nil {
 		return nil, err
 	}
-	defer rows.Close()
-
-	var docs []Document
-	for rows.Next() {
-		var d Document
+	for vrows.Next() {
+		var d scored
 		var metaBytes []byte
-		if err := rows.Scan(&d.Repo, &d.Path, &d.ChunkID, &d.Content, &metaBytes); err != nil {
+		var dist float64
+		if err := vrows.Scan(&d.Repo, &d.Path, &d.ChunkID, &d.Content, &metaBytes, &dist); err != nil {
+			vrows.Close()
 			return nil, err
 		}
 		if len(metaBytes) > 0 {
 			_ = json.Unmarshal(metaBytes, &d.Metadata)
 		}
-		docs = append(docs, d)
+		d.vscore = -dist
+		key := fmt.Sprintf("%s|%s|%d", d.Repo, d.Path, d.ChunkID)
+		docsMap[key] = &d
 	}
-	return docs, rows.Err()
+	vrows.Close()
+
+	trows, err := conn.Query(ctx, `SELECT repo,path,chunk_id,content,metadata, ts_rank_cd(content_tsv, websearch_to_tsquery($1)) AS rank FROM documents WHERE content_tsv @@ websearch_to_tsquery($1) ORDER BY rank DESC LIMIT $2`,
+		question, cand)
+	if err != nil {
+		return nil, err
+	}
+	for trows.Next() {
+		var metaBytes []byte
+		var rank float64
+		key := ""
+		d := scored{}
+		if err := trows.Scan(&d.Repo, &d.Path, &d.ChunkID, &d.Content, &metaBytes, &rank); err != nil {
+			trows.Close()
+			return nil, err
+		}
+		if len(metaBytes) > 0 {
+			_ = json.Unmarshal(metaBytes, &d.Metadata)
+		}
+		d.tscore = rank
+		key = fmt.Sprintf("%s|%s|%d", d.Repo, d.Path, d.ChunkID)
+		if exist, ok := docsMap[key]; ok {
+			exist.tscore = d.tscore
+		} else {
+			docsMap[key] = &d
+		}
+	}
+	trows.Close()
+
+	candidates := make([]*scored, 0, len(docsMap))
+	for _, d := range docsMap {
+		d.score = alpha*d.vscore + (1-alpha)*d.tscore
+		candidates = append(candidates, d)
+	}
+	sort.Slice(candidates, func(i, j int) bool { return candidates[i].score > candidates[j].score })
+	if len(candidates) > cand {
+		candidates = candidates[:cand]
+	}
+
+	// optional reranking
+	var rr rerank.Reranker
+	rCfg := s.cfg.Models.Reranker
+	if rCfg.Endpoint != "" {
+		rr = rerank.NewBGE(rCfg.Endpoint, rCfg.Token)
+	}
+	if rr != nil {
+		docs := make([]string, len(candidates))
+		for i, c := range candidates {
+			docs[i] = c.Content
+		}
+		if scores, err := rr.Rerank(ctx, question, docs); err == nil && len(scores) == len(candidates) {
+			for i := range candidates {
+				candidates[i].score = float64(scores[i])
+			}
+			sort.Slice(candidates, func(i, j int) bool { return candidates[i].score > candidates[j].score })
+		}
+	}
+
+	if limit > len(candidates) {
+		limit = len(candidates)
+	}
+	out := make([]Document, 0, limit)
+	for i := 0; i < limit; i++ {
+		out = append(out, candidates[i].Document)
+	}
+	return out, nil
 }

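The hunk above blends the two signals as `alpha*vscore + (1-alpha)*tscore`, where `vscore` is a negated vector distance and `tscore` is a `ts_rank_cd` value. Those two quantities live on different scales, so one possible refinement is to rescale each to [0, 1] before blending. The sketch below illustrates that idea only; it is not what this commit does. The names `candidate`, `minMax`, `fuse`, and `sampleCandidates` are hypothetical, and min-max scaling is an assumed choice among several normalization schemes.

```go
package main

import (
	"fmt"
	"sort"
)

// candidate carries the two raw signals used in the diff: VScore is the
// negated vector distance and TScore is the full-text rank.
type candidate struct {
	ID     string
	VScore float64
	TScore float64
	Score  float64
}

// minMax rescales values to [0, 1]. A constant or empty input maps to zeros.
func minMax(vals []float64) []float64 {
	out := make([]float64, len(vals))
	if len(vals) == 0 {
		return out
	}
	lo, hi := vals[0], vals[0]
	for _, v := range vals {
		if v < lo {
			lo = v
		}
		if v > hi {
			hi = v
		}
	}
	if hi == lo {
		return out
	}
	for i, v := range vals {
		out[i] = (v - lo) / (hi - lo)
	}
	return out
}

// fuse blends the normalized scores with weight alpha on the vector signal
// and returns a copy sorted by descending fused score.
func fuse(cands []candidate, alpha float64) []candidate {
	if alpha < 0 || alpha > 1 {
		alpha = 0.5 // same fallback the diff applies to out-of-range values
	}
	vs := make([]float64, len(cands))
	ts := make([]float64, len(cands))
	for i, c := range cands {
		vs[i], ts[i] = c.VScore, c.TScore
	}
	nv, nt := minMax(vs), minMax(ts)
	out := make([]candidate, len(cands))
	copy(out, cands)
	for i := range out {
		out[i].Score = alpha*nv[i] + (1-alpha)*nt[i]
	}
	sort.SliceStable(out, func(i, j int) bool { return out[i].Score > out[j].Score })
	return out
}

// sampleCandidates returns made-up values purely for demonstration.
func sampleCandidates() []candidate {
	return []candidate{
		{ID: "a", VScore: -0.2, TScore: 0.0},
		{ID: "b", VScore: -0.9, TScore: 0.6},
		{ID: "c", VScore: -0.5, TScore: 0.3},
	}
}

func main() {
	for _, c := range fuse(sampleCandidates(), 0.5) {
		fmt.Printf("%s %.3f\n", c.ID, c.Score)
	}
}
```

With normalization, `alpha` expresses a relative weight between the two rankings instead of depending on the absolute magnitudes of distance and rank.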
@@ -64,6 +64,20 @@ func EnsureSchema(ctx context.Context, conn *pgx.Conn, dim int, migrate bool) er
 			return err
 		}
 	}
+	// ensure full-text search column
+	var hasTSV bool
+	err = conn.QueryRow(ctx, `SELECT EXISTS (
+		SELECT 1 FROM information_schema.columns
+		WHERE table_name='documents' AND column_name='content_tsv'
+	)`).Scan(&hasTSV)
+	if err != nil {
+		return err
+	}
+	if !hasTSV {
+		if _, err := conn.Exec(ctx, `ALTER TABLE documents ADD COLUMN content_tsv tsvector GENERATED ALWAYS AS (to_tsvector('english', content)) STORED`); err != nil {
+			return err
+		}
+	}
 	// check dimension
 	var curDim int
 	err = conn.QueryRow(ctx, `SELECT atttypmod-4 FROM pg_attribute a JOIN pg_type t ON a.atttypid=t.oid WHERE a.attrelid='documents'::regclass AND a.attname='embedding'`).Scan(&curDim)
@@ -82,6 +96,9 @@ func EnsureSchema(ctx context.Context, conn *pgx.Conn, dim int, migrate bool) er
 	if _, err := conn.Exec(ctx, `CREATE INDEX IF NOT EXISTS documents_embedding_idx ON documents USING hnsw (embedding vector_cosine_ops)`); err != nil {
 		return err
 	}
+	if _, err := conn.Exec(ctx, `CREATE INDEX IF NOT EXISTS documents_content_tsv_idx ON documents USING GIN (content_tsv)`); err != nil {
+		return err
+	}
 	return nil
 }

@@ -12,7 +12,6 @@ import (

 	"github.com/gin-gonic/gin"
 	"github.com/tmc/langchaingo/llms"
-	"github.com/tmc/langchaingo/llms/ollama"
 	"github.com/tmc/langchaingo/llms/openai"
 	"gopkg.in/yaml.v3"
 )
@@ -122,10 +121,14 @@ func loadConfig() (string, string, string, string, time.Duration, int) {
 	}
 	provider = strings.ToLower(provider)
 	endpoint = strings.TrimRight(endpoint, "/")
+	endpoint = strings.TrimSuffix(endpoint, "/chat/completions")
+	endpoint = strings.TrimRight(endpoint, "/")
 	switch provider {
 	case "ollama":
+		endpoint = strings.TrimSuffix(endpoint, "/v1")
+		endpoint = strings.TrimRight(endpoint, "/")
 		if endpoint == "" {
-			endpoint = "http://localhost:11434/v1/chat/completions"
+			endpoint = "http://localhost:11434"
 		}
 		if model == "" {
 			model = "llama2:13b"
@@ -133,7 +136,7 @@ func loadConfig() (string, string, string, string, time.Duration, int) {
 		return provider, token, model, endpoint, timeout, retries
 	case "chutes":
 		if endpoint == "" {
-			endpoint = "https://llm.chutes.ai/v1/chat/completions"
+			endpoint = "https://llm.chutes.ai/v1"
 		}
 		if model == "" {
 			model = "deepseek-ai/DeepSeek-R1"
@@ -141,7 +144,7 @@ func loadConfig() (string, string, string, string, time.Duration, int) {
 		return provider, token, model, endpoint, timeout, retries
 	default:
 		if endpoint == "" {
-			endpoint = "https://llm.chutes.ai/v1/chat/completions"
+			endpoint = "https://llm.chutes.ai/v1"
 		}
 		if model == "" {
 			model = "deepseek-ai/DeepSeek-R1"
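The three `loadConfig` hunks above change endpoints from full `/chat/completions` URLs to base URLs, while still accepting the old form. A minimal sketch of that normalization as a standalone function follows, so the rule can be unit tested in isolation. `normalizeEndpoint` is a hypothetical helper name; the commit itself keeps this logic inline in `loadConfig`, and the sketch collapses the identical `chutes` and `default` branches into one.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeEndpoint mirrors the trimming steps shown in the diff: strip
// trailing slashes and a legacy "/chat/completions" suffix, additionally
// strip "/v1" for ollama, then fall back to a per-provider default.
func normalizeEndpoint(provider, endpoint string) string {
	endpoint = strings.TrimRight(endpoint, "/")
	endpoint = strings.TrimSuffix(endpoint, "/chat/completions")
	endpoint = strings.TrimRight(endpoint, "/")
	switch strings.ToLower(provider) {
	case "ollama":
		endpoint = strings.TrimSuffix(endpoint, "/v1")
		endpoint = strings.TrimRight(endpoint, "/")
		if endpoint == "" {
			endpoint = "http://localhost:11434"
		}
	default: // covers "chutes" and any unrecognized provider, as in the diff
		if endpoint == "" {
			endpoint = "https://llm.chutes.ai/v1"
		}
	}
	return endpoint
}

func main() {
	inputs := [][2]string{
		{"ollama", "http://127.0.0.1:11434/v1/chat/completions"},
		{"chutes", "https://llm.chutes.ai/v1/chat/completions/"},
		{"ollama", ""},
	}
	for _, in := range inputs {
		fmt.Printf("%s: %q -> %q\n", in[0], in[1], normalizeEndpoint(in[0], in[1]))
	}
}
```

Accepting both the old and new forms means existing config files that still carry the `/v1/chat/completions` suffix keep working after the upgrade.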
@@ -163,11 +166,7 @@ func callLLM(question string) (string, error) {

 	switch provider {
 	case "ollama":
-		llm, err = ollama.New(
-			ollama.WithModel(model),
-			ollama.WithServerURL(url),
-			ollama.WithHTTPClient(httpClient),
-		)
+		fallthrough
 	default:
 		llm, err = openai.New(
 			openai.WithToken(token),

@@ -29,7 +29,7 @@ models:
 provider: "ollama"
 models:
   - 'llama2:13b'
-endpoint: "http://127.0.0.1:11434/v1/chat/completions"
+endpoint: "http://127.0.0.1:11434"
 token: ""
 # For PROD
 #models:
@@ -42,7 +42,7 @@ models:
 #provider: "chutes"
 #models:
 # - 'moonshotai/Kimi-K2-Instruct'
-#endpoint: "https://llm.chutes.ai/v1/chat/completions"
+#endpoint: "https://llm.chutes.ai/v1"
 #token: "cpk_xxxx"

 embedding: