release: v2.2.6 [2026-06-13 01:02:05]

### 新功能
- **翰林院向量化质量升级**:
  - **边界感知切块**:替换四个来源(聊天记录/小说/世界书/手动)的纯字符硬切——优先在段落边界断开,其次句末标点(含中文引号闭合),极端长串才硬切;句子/对话不再被拦腰截断,embedding 质量同步受益。仅影响新录入,已有向量无需重建
  - **注入时序重排**:检索结果注入提示词前按时序重排(聊天记录按楼层、小说按卷/章/节——中文数字章节号可解析),rerank 只决定"选哪些块",不再决定呈现顺序;修复"不打不相识的剧情之后紧跟关系亲密"这类因按相关度排序导致的认知时间错乱
  - **断层提示**:聊天记录相邻块楼层跳跃时自动插入"与上文相隔约 N 楼,并非连续发生"提示行,消除中间剧情缺失造成的割裂感
  - **时间标识**:新录入的聊天记录块在来源标识中带上消息发送时间(ST 向量存储不持久化元数据,时间必须写入块文本才能在检索后取回;旧格式块兼容解析)
- **记忆块工作流(memory-blocks)**:剧情优化新增"自定义记忆块"体系——占位符驱动的并发工作流框架
  - 在剧情优化面板「匹配替换 (sulv)」下方可增删自定义块:每个块定义一个占位符,执行剧情优化时主/拦截提示词中的占位符会被块的产出替换
  - **静态块**:直接输出固定内容;**AI 调用块**:用所选 API 功能槽独立请求一次,把回复(或其中指定 `<标签>` 的内容)作为替换值
  - 原有 sulv1-4 速率占位符迁入同一框架,行为与旧版逐字节一致
  - 块定义为纯 JSON、随设置持久化,为后续导入导出与战斗系统接入预留扩展点
  - 框架层新增**顺序拼接式 Chain**(`composeChain`):与占位符替换并列的第二种组合范式——同链的块并发执行后按 `order` 排序、以 `separator` 拼接并可选 `header/footer` 包裹,产出一个完整注入块;为记忆注入合成块与战斗系统"底部战报块"预留的承载结构,本版本暂无 UI 入口
- **API 连接配置**:
  - 角色世界书(cwb)与一键生卡(autoCharCard)纳入旧配置自动迁移:老用户首次加载会把旧 URL / Key / 模型自动迁移为连接配置并分配槽位(一键生卡仅在规划者与执行者配置一致或规划者为空时迁移,避免悄悄改变行为)
  - **profile 已分配时参数控件 informational 化**:主面板 / 并发剧情优化 / 角色世界书 / 术语表的温度、maxTokens 控件在槽位分配 profile 后自动禁用并显示"由连接配置控制"提示,消除"改了没效果"的用户陷阱
  - **profile 状态卡新增"本设备无 Key"警示**:API Key 仅保存在最初填写它的设备/浏览器上(安全设计,不随云端设置同步),换设备后状态卡会直接亮出警示徽标,不必等到调用报错才发现
### 修复
- **独立聊天记忆从摆设变真功能(原作遗留坑)**:此前向量数据"随卡不随聊天"——开启"独立聊天记忆"后录入仍存进角色库、查询却去查一个从未被写入过的聊天集合、计数恒为 0,整体静默失效。现已重构为聊天级分桶:
  - 独立模式下,聊天记录类向量按当前聊天隔离存储与检索,同一张卡开多个聊天(不同剧情线)的记忆互不污染
  - 小说 / 世界书 / 手动录入属于"知识",仍随角色卡跨聊天共享;全局库不受影响
  - 知识管理列表为聊天专属库显示"聊天级"徽标;聊天级库禁止移动到全局
  - 统一模式(默认关闭独立记忆)的存量数据与行为完全不变
  - 已知限制:聊天专属记忆跟随聊天文件,重命名聊天文件会使其失联(与 ST 官方向量扩展同等限制)
- **超级排序截断顺序修正**:开启"超级排序"时,时序重排发生在 top_n 截断之前,导致保留的是"时序最早"而非"最相关"的块,检索结果长期偏向最旧的聊天记录。现改为先按相关度截取 top_n、再做时序排序
- **翰林院向量化失败("向量化块数量不识别"反馈)**:
  - 一次性清洗 profile-sync 历史污染:`retrieval/rerank.apiKey` 中的掩码占位符在持久层根治(此前仅读取侧防御);`apiEndpoint` / `rerank.apiMode` 的非法值(如被旧版写入的空字符串)归一化为 `custom`
  - 修复 `apiEndpoint` 为空/非法时请求被硬定向到 `api.openai.com`、无视用户自定义 URL 的问题(CSP 拦截 / 401 的元凶)
  - 修复**本地代理(LM Studio/Ollama)模式**自始就缺少 URL 分支、同样被错误定向到 openai.com 的问题
  - API 模式下拉补全 `OpenAI 官方` / `Azure` 选项;默认 API 模式改为 `custom`(与默认 URL 配套),新用户不再因选项缺失导致首次保存写入空值
  - profile-sync 给下拉框赋不存在选项值的污染源头修复(影响所有模块面板,不止翰林院)
- **Rerank "API Key 未提供"报错升级**:当原因是"连接配置在本设备没有可用 Key"时,报错会直接说明 Key 的设备本地性并指引到 API 连接配置重新填写(向量化 Google 直连、获取模型列表同步处理)
- **旧配置迁移**:一键生卡迁移时排除掩码占位符,避免把历史污染的假 Key 迁入新连接配置
- **超级记忆稳定性专项**(针对"工作不大稳定"反馈,4 处根因一次修复):
  - **切聊天竞态污染**:CHAT_CHANGED 时超级记忆立即全量同步,而表格系统延迟 100ms 才加载新聊天的表格,导致【旧聊天】的表格内容被写进【新角色】的记忆世界书;两边表名不同时旧表条目无 GC 兜底会**永久残留**("记忆串台"元凶)。现 CHAT_CHANGED 只确保世界书存在,新状态同步交由 `loadTables()` 完成后的自动推送,单次且时序正确
  - **死代码双轨存储拆除**:`saveStateToMetadata` / `tryRestoreStateFromMetadata` 把表格状态写到 `msg.metadata`——该字段非 ST 持久化位(同 v2.2.5 二次填表修过的坑),写入即蒸发、恢复永远为空,且每次同步还白调一次 `saveChat()`。整条链路删除,表格状态唯一信源为表格系统的 `msg.extra.amily2_tables_data`
  - **`awaitSync()` 穿透**:同步队列正忙时 `pushUpdate` 会用一个立即 resolve 的空 Promise 覆盖 `_syncPromise`,Pipeline Stage 4 等待形同虚设、后续阶段在同步未完成时被放行。现忙时不覆盖,正在运行的 drain 循环自然吃掉新入队项
  - **开关打开不生效**:启动时若总开关为关,初始化早退且不注册监听器;此后在 UI 勾选开关只写设置,超级记忆直到刷新页面前都是死的。现勾选即触发初始化(幂等)
  - 附带:`forceSyncAll` 的表格角色推断改为复用 `events-schema.inferTableRole`,消除两处重复逻辑漂移风险;每次切聊天的双倍全量同步(restore 路径一次 + 显式一次)随死代码移除归一
### 重构
- 表格核心 `manager.js` 瘦身(约 1050 → 600 行):19 个 UI 突变操作拆分至 `actions/ui-mutations.js`,SuperMemory 事件分发拆分至 `events-dispatch.js`;全部经 re-export 保持兼容,外部调用路径零改动
- 角色世界书最后 2 处散乱的厂商 URL 判断迁移至 `detectVendor` 统一入口,业务路径上不再有硬编码的 URL substring 判断
This commit is contained in:
Jenkins CI
2026-06-13 01:02:05 +08:00
parent 1a4a10d42d
commit 0d7e3b799e
35 changed files with 1981 additions and 1034 deletions

View File

@@ -152,6 +152,7 @@ function initialize() {
return;
}
migrateLegacyRagSettings();
sanitizeProfilePollution();
settings = getSettings();
if (!window.hanlinyuanRagProcessor) {
window.hanlinyuanRagProcessor = {};
@@ -219,20 +220,27 @@ async function ingestTextToHanlinyuan(text, source = 'manual', metadata = {}, pr
break;
}
// 独立聊天记忆模式:聊天记录类向量按聊天分桶(剧情线隔离),
// 其余来源(小说/世界书/手动)属于"知识",仍随角色卡共享
const independentChatId = (source === 'chat_history' && settings.retrieval.independentChatMemoryEnabled)
? getChatId()
: null;
const existingKbs = Object.values(getKnowledgeBases());
const foundKb = existingKbs.find(kb => kb.name === kbName);
// 同名合并需限定在同一聊天命名空间内,避免独立模式下不同聊天的同名楼层段互相串库
const foundKb = existingKbs.find(kb => kb.name === kbName && (kb.chatId ?? null) === independentChatId);
if (foundKb) {
taskId = foundKb.id;
logCallback(`[翰林院-核心] 检测到同名知识库 "${kbName}",将数据合并入库。`, 'info');
} else {
logCallback(`[翰林院-核心] 准备为任务 "${kbName}" 创建专属知识库...`, 'info');
const newKb = addKnowledgeBase(kbName, source);
const newKb = addKnowledgeBase(kbName, source, independentChatId);
taskId = newKb.id;
}
const charId = getCharacterStableId();
const collectionId = `${charId}_${taskId}`;
const collectionId = independentChatId ? `${independentChatId}_${taskId}` : `${charId}_${taskId}`;
logCallback(`[翰林院-核心] 已创建并锁定知识库: ${kbName} (集合ID: ${collectionId})`, 'success');
logCallback(`[翰林院-核心] 已锁定忆识宝库ID: ${collectionId}`, 'info');
@@ -410,6 +418,49 @@ function migrateLegacyRagSettings() {
saveSettingsDebounced();
}
/**
* 一次性清洗 profile-sync 历史污染2.2.5 之前的版本遗留)。
*
* 旧版 saveSettingsFromUI 会把被 Profile 接管的隐藏字段值写回 settings
* - apiKey 被写成掩码 '••••••••'rag-api 已有读侧防御,这里根治持久层)
* - apiEndpoint 的 select 被 _fillLegacyFields 赋了不存在的 option 值
* profile.provider 如 'custom_oai')后 value 变 '''' 被写回 settings
* '' 在 getApiEndpointUrl 落 default 分支,请求被错误定向 → 向量化全失败
*
* 2.2.5 修复了"继续污染",本函数清理已污染的存量数据。
*/
function sanitizeProfilePollution() {
const s = getSettings();
const MASKED = '••••••••';
let cleaned = [];
if (s.retrieval?.apiKey === MASKED) {
s.retrieval.apiKey = '';
cleaned.push('retrieval.apiKey 掩码');
}
if (s.rerank?.apiKey === MASKED) {
s.rerank.apiKey = '';
cleaned.push('rerank.apiKey 掩码');
}
// 合法值与 UI select 选项及 rag-api 的 switch 分支保持一致
const validEndpoints = ['custom', 'google_direct', 'local_proxy', 'openai', 'azure'];
if (s.retrieval && !validEndpoints.includes(s.retrieval.apiEndpoint)) {
cleaned.push(`retrieval.apiEndpoint 非法值 "${s.retrieval.apiEndpoint}"`);
s.retrieval.apiEndpoint = 'custom';
}
const validRerankModes = ['custom', 'local_proxy'];
if (s.rerank && !validRerankModes.includes(s.rerank.apiMode)) {
cleaned.push(`rerank.apiMode 非法值 "${s.rerank.apiMode}"`);
s.rerank.apiMode = 'custom';
}
if (cleaned.length > 0) {
console.warn(`[翰林院] 已清洗 profile-sync 历史污染字段: ${cleaned.join('、')}`);
saveSettings();
}
}
function showNotification(message, type = 'info') {
toastr[type](message);
}
@@ -431,6 +482,71 @@ function getTagForSource(source) {
}
/**
* 边界感知切分:把 content 切成不超过 chunkSize 的片段,尽量在自然边界断开。
*
* 三级回退策略(替代旧的纯字符硬切,避免句子/对话被拦腰截断):
* 1. 段落边界(最后一个换行符)
* 2. 句末边界(。!?!?… 及其后跟随的闭合引号/括号)
* 3. 都找不到(极端长串)才硬切
* 边界切点过于靠前(< 40% 块长)时视为无效,降级到下一策略——防止
* 一个超长段落开头的短句导致块碎片化。
*
* @param {string} content
* @param {number} chunkSize - 单块最大字符数
* @param {number} overlap - 相邻块重叠字符数(语义衔接),从上一块尾部回看
* @returns {string[]}
*/
function splitBySemanticBoundary(content, chunkSize, overlap) {
const pieces = [];
if (!content || chunkSize <= 0) return pieces;
const minCut = Math.floor(chunkSize * 0.4);
const sentenceEndRegex = /[。!?!?…][”"』」))】]?/g;
let pos = 0;
while (pos < content.length) {
let end = Math.min(pos + chunkSize, content.length);
if (end < content.length) {
const slice = content.substring(pos, end);
// 1. 段落边界:最后一个换行(切点含换行符本身)
let cut = slice.lastIndexOf('\n') + 1;
// 2. 段落边界无效时找最后一个句末边界
if (cut <= minCut) {
let lastSentenceEnd = -1;
sentenceEndRegex.lastIndex = 0;
let m;
while ((m = sentenceEndRegex.exec(slice)) !== null) {
lastSentenceEnd = m.index + m[0].length;
}
if (lastSentenceEnd > minCut) cut = lastSentenceEnd;
}
// 3. 有效边界则收缩切点,否则保持硬切
if (cut > minCut) end = pos + cut;
}
const piece = content.substring(pos, end);
if (piece.trim().length > 0) pieces.push(piece);
if (end >= content.length) break;
// overlap 回看Math.max 防止 overlap >= 块长时死循环
pos = Math.max(end - overlap, pos + 1);
}
return pieces;
}
/** 把 ISO/任意时间值格式化为写入块 prefix 的紧凑标识(不含逗号,便于正则反解) */
function formatChunkTimeLabel(timestamp) {
const d = new Date(timestamp);
if (isNaN(d.getTime())) return '';
const pad = n => String(n).padStart(2, '0');
return `${d.getFullYear()}-${pad(d.getMonth() + 1)}-${pad(d.getDate())} ${pad(d.getHours())}:${pad(d.getMinutes())}`;
}
function splitIntoChunks(text, source, metadata = {}) {
switch (source) {
case 'novel':
@@ -465,30 +581,22 @@ function _chunkForNovel(text, metadata) {
function processBuffer() {
if (contentBuffer.length === 0) return;
const content = contentBuffer.join('\n');
let start = 0;
let section = 1;
while (start < content.length) {
const end = Math.min(start + chunkSize, content.length);
const chunkText = content.substring(start, end);
if (chunkText.trim().length > 0) {
const chunkMetadata = {
source: 'novel',
sourceName: sourceName,
timestamp: new Date().toISOString(),
globalIndex: globalChunkIndex++,
volume: currentVolumeTitle,
chapter: currentChapterTitle,
section: section,
};
const tagName = getTagForSource('novel');
const prefix = `[来源: ${sourceName}, ${currentVolumeTitle}, ${currentChapterTitle}, 第${section}节]`;
const wrappedText = `<${tagName}>\n${prefix}\n${chunkText}\n</${tagName}>`;
allChunks.push({ text: wrappedText, metadata: chunkMetadata });
section++;
}
start += (chunkSize - overlap);
if (start >= content.length) break;
}
const tagName = getTagForSource('novel');
splitBySemanticBoundary(content, chunkSize, overlap).forEach((chunkText, idx) => {
const section = idx + 1;
const chunkMetadata = {
source: 'novel',
sourceName: sourceName,
timestamp: new Date().toISOString(),
globalIndex: globalChunkIndex++,
volume: currentVolumeTitle,
chapter: currentChapterTitle,
section: section,
};
const prefix = `[来源: ${sourceName}, ${currentVolumeTitle}, ${currentChapterTitle}, 第${section}节]`;
const wrappedText = `<${tagName}>\n${prefix}\n${chunkText}\n</${tagName}>`;
allChunks.push({ text: wrappedText, metadata: chunkMetadata });
});
contentBuffer = [];
}
@@ -508,11 +616,9 @@ function _chunkForNovel(text, metadata) {
processBuffer();
if (allChunks.length === 0 && text.length > 0) {
let start = 0;
let section = 1;
while (start < text.length) {
const end = Math.min(start + chunkSize, text.length);
const chunkText = text.substring(start, end);
const tagName = getTagForSource('novel');
splitBySemanticBoundary(text, chunkSize, overlap).forEach((chunkText, idx) => {
const section = idx + 1;
const chunkMetadata = {
source: 'novel',
sourceName: sourceName,
@@ -522,13 +628,10 @@ function _chunkForNovel(text, metadata) {
chapter: "第1章",
section: section,
};
const tagName = getTagForSource('novel');
const prefix = `[来源: ${sourceName}, 第1卷, 第1章, 第${section}节]`;
const wrappedText = `<${tagName}>\n${prefix}\n${chunkText}\n</${tagName}>`;
allChunks.push({ text: wrappedText, metadata: chunkMetadata });
section++;
start += (chunkSize - overlap);
}
});
}
return allChunks;
}
@@ -540,15 +643,15 @@ function _chunkForChatHistory(text, metadata) {
const allChunks = [];
if (!text || chunkSize <= 0) return allChunks;
let part = 1;
let start = 0;
// 时间写进 prefix 才能在检索后被反解回来ST 向量存储不持久化 metadata
const timeLabel = formatChunkTimeLabel(timestamp);
const tagName = getTagForSource('chat_history');
while (start < text.length) {
const end = Math.min(start + chunkSize, text.length);
const chunkText = text.substring(start, end);
const prefix = `[来源: 聊天记录, 楼层: #${floor}, 第${part}部分]`;
const tagName = getTagForSource('chat_history');
splitBySemanticBoundary(text, chunkSize, overlap).forEach((chunkText, idx) => {
const part = idx + 1;
const prefix = timeLabel
? `[来源: 聊天记录, 楼层: #${floor}, 时间: ${timeLabel}, 第${part}部分]`
: `[来源: 聊天记录, 楼层: #${floor}, 第${part}部分]`;
const wrappedText = `<${tagName}>\n${prefix}\n${chunkText}\n</${tagName}>`;
allChunks.push({
@@ -562,11 +665,7 @@ function _chunkForChatHistory(text, metadata) {
timestamp: timestamp,
}
});
part++;
start += (chunkSize - overlap);
if (start >= text.length) break;
}
});
return allChunks;
}
@@ -577,15 +676,11 @@ function _chunkForLorebook(text, metadata) {
const allChunks = [];
if (!text || chunkSize <= 0) return allChunks;
let part = 1;
let start = 0;
const tagName = getTagForSource('lorebook');
while (start < text.length) {
const end = Math.min(start + chunkSize, text.length);
const chunkText = text.substring(start, end);
splitBySemanticBoundary(text, chunkSize, overlap).forEach((chunkText, idx) => {
const part = idx + 1;
const prefix = `[来源: ${bookName}, 条目: ${entryName}, 第${part}部分]`;
const tagName = getTagForSource('lorebook');
const wrappedText = `<${tagName}>\n${prefix}\n${chunkText}\n</${tagName}>`;
allChunks.push({
@@ -599,11 +694,7 @@ function _chunkForLorebook(text, metadata) {
timestamp: new Date().toISOString(),
}
});
part++;
start += (chunkSize - overlap);
if (start >= text.length) break;
}
});
return allChunks;
}
@@ -615,16 +706,12 @@ function _chunkForManual(text, metadata) {
if (!text || chunkSize <= 0) return allChunks;
const timestamp = new Date();
const readableTime = timestamp.toLocaleString('zh-CN');
let part = 1;
let start = 0;
const readableTime = formatChunkTimeLabel(timestamp);
const tagName = getTagForSource('manual');
while (start < text.length) {
const end = Math.min(start + chunkSize, text.length);
const chunkText = text.substring(start, end);
splitBySemanticBoundary(text, chunkSize, overlap).forEach((chunkText, idx) => {
const part = idx + 1;
const prefix = `[来源: ${sourceName}, 向量化录入时间: ${readableTime}, 第${part}部分]`;
const tagName = getTagForSource('manual');
const wrappedText = `<${tagName}>\n${prefix}\n${chunkText}\n</${tagName}>`;
allChunks.push({
@@ -636,11 +723,7 @@ function _chunkForManual(text, metadata) {
timestamp: timestamp.toISOString(),
}
});
part++;
start += (chunkSize - overlap);
if (start >= text.length) break;
}
});
return allChunks;
}
@@ -708,7 +791,13 @@ function getKnowledgeBases() {
return { ...globalBases, ...localBases };
}
function addKnowledgeBase(name, source = 'manual') {
/**
* @param {string} name
* @param {string} source
* @param {string|null} chatId - 非空时该库为"聊天级":向量集合按 `${chatId}_${taskId}`
* 命名空间隔离(独立聊天记忆模式下的聊天记录库),查询时只对该聊天可见
*/
function addKnowledgeBase(name, source = 'manual', chatId = null) {
if (!name || !name.trim()) {
throw new Error('知识库名称不能为空');
}
@@ -721,17 +810,28 @@ function addKnowledgeBase(name, source = 'manual') {
name: name.trim(),
enabled: true,
createdAt: new Date().toISOString(),
owner: charId,
source: source,
owner: charId,
source: source,
...(chatId ? { chatId } : {}),
};
bases[taskId] = newBase;
saveSettings();
console.log(`[翰林院-核心] 已为角色 ${charId} 添加新知识库: ${name} (ID: ${taskId})`);
console.log(`[翰林院-核心] 已为角色 ${charId} 添加新知识库: ${name} (ID: ${taskId}${chatId ? `, 聊天级: ${chatId}` : ''})`);
return newBase;
}
/**
* 计算知识库的向量集合 ID单一事实来源
* 聊天级库kb.chatId按聊天命名空间其余按 owner/角色命名空间。
*/
function getKbCollectionId(kb, scope = 'local') {
if (kb.chatId) return `${kb.chatId}_${kb.id}`;
if (scope === 'global') return `${kb.owner || GLOBAL_SCOPE_ID}_${kb.id}`;
return `${getCharacterStableId()}_${kb.id}`;
}
async function removeKnowledgeBase(taskId, scope) {
const charId = getCharacterStableId();
const bases = scope === 'global' ? getGlobalKnowledgeBases() : getLocalKnowledgeBases();
@@ -743,9 +843,8 @@ async function removeKnowledgeBase(taskId, scope) {
return;
}
const ownerId = scope === 'global' ? (base.owner || GLOBAL_SCOPE_ID) : charId;
const collectionIdToPurge = `${ownerId}_${taskId}`;
const collectionIdToPurge = getKbCollectionId(base, scope);
console.log(`[翰林院-核心] 准备删除知识库 ${taskId},将清空集合: ${collectionIdToPurge}`);
const purged = await purgeStorage(collectionIdToPurge);
@@ -792,30 +891,38 @@ async function queryVectors(queryText, options = {}) {
}
else if (settings.retrieval.independentChatMemoryEnabled) {
console.log('[翰林院-日志] 独立聊天记忆模式开启...');
const chatId = getChatId();
if (chatId) {
console.log(`[翰林院-日志] 添加当前聊天宝库: ${chatId}`);
basesToQuery.push({ id: chatId, name: `当前聊天 (${chatId})`, scope: 'chat' });
} else {
console.warn('[翰林院-日志] 无法获取当前聊天ID跳过聊天宝库。');
if (!chatId) {
console.warn('[翰林院-日志] 无法获取当前聊天ID聊天级知识库将被跳过。');
}
const globalBases = getGlobalKnowledgeBases();
const enabledGlobalBases = Object.values(globalBases).filter(b => b.enabled);
// 本地库过滤规则:知识类库(无 chatId照常可查
// 聊天级库(有 chatId只对所属聊天可见——这就是"独立"的含义
const localBases = Object.values(getLocalKnowledgeBases())
.filter(b => b.enabled && (!b.chatId || b.chatId === chatId));
if (localBases.length > 0) {
const chatScoped = localBases.filter(b => b.chatId).length;
console.log(`[翰林院-日志] 添加 ${localBases.length} 个本地知识库(其中 ${chatScoped} 个为当前聊天专属)。`);
basesToQuery.push(...localBases.map(b => ({ ...b, scope: b.chatId ? 'chat' : 'local' })));
}
const enabledGlobalBases = Object.values(getGlobalKnowledgeBases()).filter(b => b.enabled);
if (enabledGlobalBases.length > 0) {
console.log(`[翰林院-日志] 添加 ${enabledGlobalBases.length} 个已启用的全局知识库。`);
basesToQuery.push(...enabledGlobalBases.map(b => ({ ...b, scope: 'global' })));
}
}
}
else {
console.log('[翰林院-日志] 统一角色卡模式开启...');
const localBases = getLocalKnowledgeBases();
const globalBases = getGlobalKnowledgeBases();
const enabledLocalBases = Object.values(localBases).filter(b => b.enabled);
const enabledGlobalBases = Object.values(globalBases).filter(b => b.enabled);
basesToQuery.push(...enabledLocalBases.map(b => ({ ...b, scope: 'local' })));
// 聊天级库(独立模式期间产生)在统一模式下也可见,但需用 'chat' scope
// 才能拼出正确的集合 ID${chatId}_${taskId}
basesToQuery.push(...enabledLocalBases.map(b => ({ ...b, scope: b.chatId ? 'chat' : 'local' })));
basesToQuery.push(...enabledGlobalBases.map(b => ({ ...b, scope: 'global' })));
if (basesToQuery.length === 0) {
@@ -879,7 +986,9 @@ async function _executeQueryForBase(base, queryText, queryEmbedding = null) {
collectionId = await getDynamicCollectionId();
break;
case 'chat':
collectionId = base.id;
// 聊天级库:${chatId}_${taskId} 命名空间(独立聊天记忆)。
// 旧语义的裸 chatId 集合从未被任何录入路径写入过,无存量兼容负担
collectionId = base.chatId ? `${base.chatId}_${base.id}` : base.id;
break;
case 'global':
const ownerId = base.owner || GLOBAL_SCOPE_ID;
@@ -945,10 +1054,12 @@ async function _executeQueryForBase(base, queryText, queryEmbedding = null) {
switch (sourceTag) {
case '聊天记录':
newMetadata.source = 'chat_history';
const chatMatch = item.text.match(/楼层:\s*#(\d+),\s*第(\d+)部分/);
if (chatMatch && chatMatch[1] && chatMatch[2]) {
// 时间段为可选:兼容旧格式 [楼层: #X, 第Y部分] 与新格式 [楼层: #X, 时间: ..., 第Y部分]
const chatMatch = item.text.match(/楼层:\s*#(\d+)(?:,\s*时间:\s*([^,\]]+))?,\s*第(\d+)部分/);
if (chatMatch && chatMatch[1] && chatMatch[3]) {
newMetadata.floor = parseInt(chatMatch[1], 10);
newMetadata.part = parseInt(chatMatch[2], 10);
if (chatMatch[2]) newMetadata.timeLabel = chatMatch[2].trim();
newMetadata.part = parseInt(chatMatch[3], 10);
newMetadata.sourceName = `聊天记录 #${newMetadata.floor}`;
}
break;
@@ -1051,43 +1162,40 @@ async function getVectorCount(taskId = null, scope = 'local') {
console.warn(`[翰林院-计数] 在作用域 '${scope}' 中未找到ID为 ${taskId} 的知识库。`);
return 0;
}
const ownerId = scope === 'global' ? (base.owner || GLOBAL_SCOPE_ID) : charId;
const collectionId = `${ownerId}_${taskId}`;
return await countVectorsInCollection(collectionId);
// 聊天级库按 ${chatId}_${taskId} 命名空间计数getKbCollectionId 统一处理)
return await countVectorsInCollection(getKbCollectionId(base, scope));
} else {
if (settings.retrieval.independentChatMemoryEnabled) {
const chatId = getChatId();
if (!chatId) return 0;
const totalCount = await countVectorsInCollection(chatId);
console.log(`[翰林院-日志] 独立聊天记忆模式开启,聊天 ${chatId} 的向量总数: ${totalCount}`);
return totalCount;
}
// 总数统计与查询侧保持同一可见性规则:
// 独立模式 → 本地知识库 + 当前聊天的聊天级库 + 全局库
// 统一模式 → 全部本地库(含聊天级)+ 全局库 + legacy 宝库
const independent = settings.retrieval.independentChatMemoryEnabled;
const chatId = independent ? getChatId() : null;
console.log(`[翰林院-日志] 开始获取${independent ? '当前聊天可见的' : '所有'}知识库向量总数...`);
console.log('[翰林院-日志] 开始获取所有知识库的向量总数...');
const localBases = Object.values(getLocalKnowledgeBases());
const localBases = Object.values(getLocalKnowledgeBases())
.filter(base => !independent || !base.chatId || base.chatId === chatId);
const globalBases = Object.values(getGlobalKnowledgeBases());
const countPromises = [];
localBases.forEach(base => {
const collectionId = `${charId}_${base.id}`;
countPromises.push(countVectorsInCollection(collectionId));
countPromises.push(countVectorsInCollection(getKbCollectionId(base, 'local')));
});
globalBases.forEach(base => {
const ownerId = base.owner || GLOBAL_SCOPE_ID;
const collectionId = `${ownerId}_${base.id}`;
countPromises.push(countVectorsInCollection(collectionId));
countPromises.push(countVectorsInCollection(getKbCollectionId(base, 'global')));
});
const legacyCollectionId = await getDynamicCollectionId();
countPromises.push(countVectorsInCollection(legacyCollectionId));
if (!independent) {
const legacyCollectionId = await getDynamicCollectionId();
countPromises.push(countVectorsInCollection(legacyCollectionId));
}
const counts = await Promise.all(countPromises);
const totalCount = counts.reduce((total, count) => total + count, 0);
console.log(`[翰林院-日志] 所有知识库统计完成,总向量数: ${totalCount}`);
console.log(`[翰林院-日志] 知识库统计完成,总向量数: ${totalCount}`);
return totalCount;
}
}
@@ -1202,20 +1310,23 @@ async function processCondensation(messages, logCallback = () => {}, range = nul
kbName = `聊天记录: ${timestamp}`;
}
const existingKbs = Object.values(getLocalKnowledgeBases());
const foundKb = existingKbs.find(kb => kb.name === kbName);
// 独立聊天记忆模式下凝识结果按聊天分桶,与 ingestTextToHanlinyuan 的语义一致
const independentChatId = settings.retrieval.independentChatMemoryEnabled ? getChatId() : null;
const existingKbs = Object.values(getLocalKnowledgeBases());
const foundKb = existingKbs.find(kb => kb.name === kbName && (kb.chatId ?? null) === independentChatId);
if (foundKb) {
taskId = foundKb.id;
logCallback(`[翰林院-核心] 检测到同名知识库 "${kbName}",将数据合并入库。`, 'info');
} else {
logCallback(`[翰林院-核心] 准备为任务 "${kbName}" 创建专属知识库...`, 'info');
const newKb = addKnowledgeBase(kbName, 'chat_history');
const newKb = addKnowledgeBase(kbName, 'chat_history', independentChatId);
taskId = newKb.id;
}
const charId = getCharacterStableId();
const collectionId = `${charId}_${taskId}`;
const collectionId = independentChatId ? `${independentChatId}_${taskId}` : `${charId}_${taskId}`;
logCallback(`[翰林院-核心] 凝识任务已锁定知识库: ${kbName} (集合ID: ${collectionId})`, 'success');
const allChunks = [];
@@ -1478,18 +1589,102 @@ async function rerankResults(allResults, queryText, settings) {
finalScoredResults.sort((a, b) => (b.final_score || 0) - (a.final_score || 0));
console.log('[翰林院-Rerank] 元数据加权排序完成。');
let finalResults = finalScoredResults;
// 先按相关度截断 top_n再做时序排序——顺序反了会让"时序最早"而非"最相关"
// 的块占据名额超级排序把最旧楼层排最前slice 会扔掉高相关的靠后结果)
let finalResults = finalScoredResults.slice(0, settings.rerank.top_n);
if (settings.rerank.superSortEnabled) {
finalResults = superSort(finalScoredResults);
finalResults = superSort(finalResults);
}
return {
results: finalResults.slice(0, settings.rerank.top_n),
results: finalResults,
reranked: rerankedSuccessfully
};
}
/**
* 从"第十二章"/"第3卷"/"4"等字符串中解析序数,用于注入前的时序排序。
* 支持阿拉伯数字与常见中文数字(至万级);解析失败返回 MAX_SAFE_INTEGER排最后
*/
function _parseOrdinal(value) {
if (typeof value === 'number') return value;
if (!value) return Number.MAX_SAFE_INTEGER;
const str = String(value);
const arabic = str.match(/\d+/);
if (arabic) return parseInt(arabic[0], 10);
const cnDigit = { : 0, : 1, : 2, : 2, : 3, : 4, : 5, : 6, : 7, : 8, : 9 };
const m = str.match(/[零一二两三四五六七八九十百千万]+/);
if (!m) return Number.MAX_SAFE_INTEGER;
let total = 0, current = 0;
for (const ch of m[0]) {
if (cnDigit[ch] !== undefined) {
current = cnDigit[ch];
} else if (ch === '十') {
total += (current || 1) * 10;
current = 0;
} else if (ch === '百') {
total += (current || 1) * 100;
current = 0;
} else if (ch === '千') {
total += (current || 1) * 1000;
current = 0;
} else if (ch === '万') {
total = (total + current) * 10000;
current = 0;
}
}
return total + current;
}
/**
* 注入前的组内时序重排 + 断层提示。
*
* rerank/相似度负责"选哪些块",本函数负责"按什么顺序呈现"
* - chat_history 按楼层+部分升序;相邻块楼层跳跃时插入断层提示行,
* 避免 LLM 把"不打不相识"和"关系亲密"两个远隔的片段读成连续剧情
* - novel 按卷/章/节序数升序(中文数字章节号可解析)
* - lorebook / manual 按来源聚合 + part 升序,碎块归位
* 元数据缺失的块排在末尾、保持彼此原有顺序sort 稳定性)。
*/
function _composeInjectionText(source, results) {
const sorted = [...results];
const ord = (v) => (Number.isFinite(v) ? v : Number.MAX_SAFE_INTEGER);
if (source === 'chat_history') {
sorted.sort((a, b) =>
ord(a.metadata?.floor) - ord(b.metadata?.floor)
|| (a.metadata?.part ?? 0) - (b.metadata?.part ?? 0));
const parts = [];
let prevFloor = null;
for (const r of sorted) {
const floor = r.metadata?.floor;
if (prevFloor !== null && Number.isFinite(floor) && floor - prevFloor > 1) {
parts.push(`〔提示:以下内容与上文相隔约 ${floor - prevFloor} 楼,期间的剧情未被检索到,两段内容并非连续发生〕`);
}
parts.push(r.text);
if (Number.isFinite(floor)) prevFloor = floor;
}
return parts.join('\n\n');
}
if (source === 'novel') {
sorted.sort((a, b) =>
_parseOrdinal(a.metadata?.volume) - _parseOrdinal(b.metadata?.volume)
|| _parseOrdinal(a.metadata?.chapter) - _parseOrdinal(b.metadata?.chapter)
|| _parseOrdinal(a.metadata?.section) - _parseOrdinal(b.metadata?.section));
return sorted.map(r => r.text).join('\n\n');
}
// lorebook / manual同源聚合 + part 升序
sorted.sort((a, b) =>
String(a.metadata?.sourceName ?? '').localeCompare(String(b.metadata?.sourceName ?? ''), 'zh')
|| (a.metadata?.part ?? 0) - (b.metadata?.part ?? 0));
return sorted.map(r => r.text).join('\n\n');
}
async function rearrangeChat(chat, contextSize, abort, type) {
const injectionKeys = {
novel: 'HANLINYUAN_RAG_NOVEL',
@@ -1704,7 +1899,8 @@ async function rearrangeChat(chat, contextSize, abort, type) {
continue;
}
const formattedText = results.map(r => r.text).join('\n\n');
// 组内按时序重排 + 断层提示rerank 决定选哪些块,时序决定呈现顺序)
const formattedText = _composeInjectionText(source, results);
const placeholder = `{{${source.replace('_history', '')}_text}}`;
let injectionContent = injectionSettings.template.replace(placeholder, formattedText);
@@ -1751,6 +1947,13 @@ async function moveKnowledgeBase(taskId, fromScope) {
return;
}
// 聊天级库(独立聊天记忆产物)专属于单个聊天,移到全局会让所有角色
// 检索到某个特定聊天的记忆,语义矛盾,禁止
if (kbData.chatId && toScope === 'global') {
toastr.warning(`知识库【${kbData.name}】是聊天专属记忆,不能移动到全局。`);
return;
}
if (fromScope === 'local' && toScope === 'global' && !kbData.owner) {
console.log(`[翰林院-配置] 为旧版知识库 ${taskId} 补充所有者ID: ${charId}`);
kbData.owner = charId;