====== chat_llm: LLM-backed chat engine ======

''chat_llm'' is the JavaScript chat engine behind Synchronet's LLM-powered
Guru.  It turns a large language model (running locally under Ollama, or any
OpenAI-compatible endpoint) into a conversational Guru that can answer
questions, search your message and file bases for grounding (RAG), call
tools, and remember each caller across sessions.

It is **transport-agnostic** and serves three transport modes from one engine:

  * **private** — the classic 1:1 Guru page from a terminal session.
  * **multinode** — Guru participation in multi-node channel chat.
  * **irc** — the [[module:chat_llm_irc|IRC bot adapter]].

The engine lives in ''exec/chat_llm.js''.  Its companions are
[[config:chat_llm.ini]] (configuration), the [[module:llm_tools|tool
registry]], and the [[module:llm_index|RAG index builder]].  For a
start-to-finish setup walk-through, see [[howto:llm-guru|Setting up the LLM
Guru]].

===== Enabling it =====

Two steps, no recompile:

  - In [[util:SCFG]] → [[config:chat_features#artificial_gurus|Chat Features → Artificial Gurus]], edit your Guru and set the **Module** field to ''chat_llm''.  An empty Module field keeps the legacy ELIZA-style pattern engine driven by the Guru's ''.dat'' answer file.
  - Configure the backend (endpoint, model, prompts) in [[config:chat_llm.ini]].

That is all a working LLM Guru requires.  Retrieval (RAG) and the
bundled tools are optional layers configured separately.

===== Entry points =====

A chat module is a JavaScript file under ''exec/'' that exposes the functions
the BBS calls.  ''chat_llm'' exposes two **high-level** entries — these are
the contract every chat module must satisfy:

  * ''chat_session(input, ctx)'' — process one caller turn; returns the reply string (or ''null'' to stay silent).  Loads the caller's memory, dispatches to the model, persists the updated memory.
  * ''open_session(ctx)'' — produce an opening greeting at the start of a session (or ''null'' for none).

It also exposes lower-level helpers callers may use directly:

  * ''llm_chat(input, ctx, opts)'' / ''llm_open(ctx, opts)'' — pure dispatch, **no** memory load/save.  For unusual flows that manage their own history.
  * ''ctx_from_user(useron, persona_code, persona_name, supports_utf8)'' — the standard way to build a ''ctx'' (below) from a Synchronet ''User''.

===== The ctx object =====

Every entry point takes a ''ctx'' (chat context) object.  It carries who is
talking, the transport mode, the conversation so far, and a few output knobs.
A custom module receives this same object.

There are three **modes**: ''private'' (1:1 Guru page), ''multinode'' (channel
chat), and ''irc'' (IRC bot).  Most fields are shared; a few only matter in
channel modes.

==== Caller-supplied fields ====

^ Field ^ Type ^ Meaning ^
| ''persona'' | ''{code, name}'' | Bot identity.  ''code'' selects the [[config:chat_llm.ini]] section **and** the memory-file namespace; ''name'' is the display name. |
| ''speaker'' | ''{id, alias, attrs}'' | Who is talking.  ''id'' is namespaced (''%%"user:42"%%'', ''%%"irc:vert/frosty"%%'') and becomes the memory filename.  ''alias'' is the display name.  ''attrs'' holds ''{real_name, level, location, lang}'' for BBS users, or ''{}'' for speakers without a BBS account (such as IRC nicks). |
| ''mode'' | string | ''%%"private"%%'', ''%%"multinode"%%'', or ''%%"irc"%%''.  Channel modes prefix each turn with the speaker's name so the model can attribute multi-party lines. |
| ''transcript'' | array | Conversation history, newest last; each turn is ''{who, text, ts, bot}'' (''bot:true'' marks the bot's own turns).  ''chat_session''/''open_session'' auto-load this from memory, so callers normally pass ''%%[]%%''. |
| ''participants'' | array | ''{id, alias}'' roster for **multinode**; empty otherwise. |
| ''addressed'' | boolean | In channel modes, ''false'' makes the engine return ''null'' (stay silent).  Always ''true'' in private mode. |
| ''supports_utf8'' | boolean | Gates the language directive and output charset.  When ''false'' and the caller's language uses a non-Latin script, the engine replies in English instead. |
| ''channel'' | string | Channel name (channel modes); used to validate relay recipients. |
| ''channel_context'' | string | Recent room chatter, injected so the model can follow cross-conversations. |
| ''channel_members'' | object | IRC: current channel roster, used to validate relay recipients. |
| ''seen_members'' | object | IRC: nicks seen before, used for deferred ("tell them when you see them") relays. |
| ''typing_speed_factor'' | number | Per-character output speed for the terminal typing animation; ''0'' disables it (IRC uses ''0''). |
| ''simulate_typos'' | boolean | Whether the terminal typing animation includes fat-finger/transposition typos. |
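
A channel adapter has to decide ''addressed'' itself before calling the engine.  The following is a minimal sketch of one way to do that, assuming the bot counts as addressed only when a line starts with its name followed by a colon or comma.  The helper ''is_addressed'' is hypothetical and not part of ''chat_llm''; a real adapter may use different rules.

<code javascript>
// Hypothetical helper (not a chat_llm API): decide whether a channel line
// is directed at the bot.  Matches "Guru: hi", "guru, hi" and "@Guru: hi",
// case-insensitively, but not "Guru is great" or "Gurus: hi".
function is_addressed(line, bot_name) {
    // Escape regex metacharacters so names like "C++ bot" match literally.
    var escaped = bot_name.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    var re = new RegExp("^\\s*@?" + escaped + "\\s*[:,]", "i");
    return re.test(line);
}
</code>

An adapter could then set ''ctx.addressed = is_addressed(line, ctx.persona.name)'' before calling ''chat_session''.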

==== Engine-set fields (read-only) ====

The engine writes a few fields **back** onto ''ctx'' during a call.  Treat
these as outputs and do not set them:

  * ''ctx._profile'' — a one-line diagnostic summary (persona, mode, retrieval stats), suitable for a chat log.  Read it **after** ''chat_session()'' returns.
  * ''ctx._streamed'' — ''true'' when the engine already streamed its reply to the terminal itself, so the caller should not re-emit the returned string.
  * The engine also attaches retrieval diagnostics (''ctx._rag_*'') and other internal scratch fields; treat anything beginning with ''_'' as read-only engine state.
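
As an illustration of how a caller might honor ''ctx._streamed'', here is a small sketch.  The helper ''emit_reply'' and its ''write'' callback are assumptions made for this example; they stand in for whatever the transport uses to send text and are not part of the engine.

<code javascript>
// Hypothetical caller-side helper (not a chat_llm API).
// `write` is whatever function the transport uses to send text.
// Returns true when the reply was written, false when nothing was sent.
function emit_reply(reply, ctx, write) {
    if (reply === null || reply === undefined)
        return false;            // engine chose to stay silent
    if (ctx._streamed)
        return false;            // engine already streamed the reply itself
    write(reply);
    return true;
}
</code>

Call it with the value returned by ''chat_session'' and the same ''ctx'', so the flag is read after the call has finished.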

==== Examples ====

**Private (1:1 Guru page)** — the standard BBS path:

<code javascript>
var ctx = ctx_from_user(user, "guru", "The Guru",
                        console.term_supports(USER_UTF8));
var greeting = open_session(ctx);            // optional opening line, or null
var reply    = chat_session("what subs cover C programming?", ctx);
log(ctx._profile);                           // read AFTER the call
</code>

**Multinode channel turn:**

<code javascript>
var ctx = ctx_from_user(user, "guru", "The Guru", true);
ctx.mode         = "multinode";
ctx.addressed    = true;
ctx.participants = [{id:"user:42", alias:"Frosty"},
                    {id:"user:7",  alias:"Digital Man"}];
ctx.channel_context = "...recent channel lines...";
var reply = chat_session("anyone know the FidoNet zone for Europe?", ctx);
</code>

**IRC (built by hand, no User object):**

<code javascript>
var ctx = {
    persona:       { code: "guru:irc", name: "The Guru" },
    speaker:       { id: "irc:vert/frosty", alias: "Frosty", attrs: {} },
    participants:  [],
    transcript:    [],
    mode:          "irc",
    supports_utf8: true,
    addressed:     true,
    typing_speed_factor: 0,            // no terminal: no per-character animation
    simulate_typos:      false,
    channel:         "#synchronet",
    channel_context: format_channel_context("#synchronet"),
    channel_members: chan_members("#synchronet"),
    seen_members:    chan_seen("#synchronet")
};
var reply = chat_session("Frosty: what's new?", ctx);
</code>

To test from the command line without a live session, run the engine under
[[util:jsexec]] and build a ''ctx'' from a stored user — pass ''new User(N)''
to ''ctx_from_user'', or ''null'' for an anonymous tester context.

===== How a turn is processed =====

For each caller turn, ''chat_session'':

  - **Checks for a "forget me" command** (e.g. ''%%/forget me%%'', ''%%forget everything%%'') and, if matched, wipes that caller's stored memory and replies with a canned confirmation.
  - **Classifies intent** to decide cheaply whether retrieval and tools are even worth invoking for this input.
  - **Retrieves grounding (RAG)** from the configured [[module:llm_index|BM25 index]], injecting the top hits — but only when they clear a relevance gate, so off-topic questions don't pull in noise.
  - **Runs the tool loop**, offering the model the registered [[module:llm_tools|tools]] and feeding their results back until it produces a final answer.
  - **Post-processes** the reply: strips control codes that could trigger hang-up/quit, and appends a verified wiki URL when the answer was grounded in a wiki page.
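
The relevance gate in the retrieval step can be pictured with a short sketch.  This is not the engine's code: the function ''gate_hits'', the hit shape ''{title, score}'', and the ''min_score'' and ''top_k'' parameters are all assumptions chosen for illustration.

<code javascript>
// Hypothetical relevance gate (not the engine's implementation).
// Keeps at most `top_k` hits whose score is at least `min_score`,
// best first.  The input array is left unmodified.
function gate_hits(hits, min_score, top_k) {
    return hits
        .filter(function (h) { return h.score >= min_score; })  // new array
        .sort(function (a, b) { return b.score - a.score; })    // descending
        .slice(0, top_k);
}
</code>

When no hit clears the threshold the result is an empty array, which matches the idea that an off-topic question gets no injected grounding at all.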

===== Persistent memory =====

''chat_session''/''open_session'' keep a per-(speaker, persona) memory file
under ''data/chat/''.  Each file holds a rolling window of recent turns plus
an LLM-compressed long-term summary, and survives BBS recycles.  A caller can
erase their own memory at any time by saying ''%%/forget me%%''.

Memory behavior is tuned in [[config:chat_llm.ini]] (''memory_persist'',
''history_window'', ''summarize_threshold'', ''memory_max_age_days'', and
related knobs).  Set ''memory_persist = false'' for public contexts where you
don't want to retain anything about strangers.
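
The "rolling window plus summary" idea can be sketched as follows.  This is only an illustration under stated assumptions: the memory shape ''{summary, turns}'', the function ''remember'', and the ''keep'' and ''trigger'' parameters are hypothetical, and how ''history_window'' and ''summarize_threshold'' actually interact is defined by the engine and [[config:chat_llm.ini]], not by this sketch.

<code javascript>
// Hypothetical sketch (not the engine's implementation).
// memory:    { summary: string, turns: array }
// keep:      how many recent turns stay verbatim
// trigger:   turn count above which older turns are compressed
// summarize: stand-in for the LLM compression call;
//            receives (old_summary, dropped_turns), returns a new summary
function remember(memory, turn, keep, trigger, summarize) {
    memory.turns.push(turn);
    if (memory.turns.length > trigger) {
        // Remove everything except the newest `keep` turns...
        var dropped = memory.turns.splice(0, memory.turns.length - keep);
        // ...and fold the removed turns into the long-term summary.
        memory.summary = summarize(memory.summary, dropped);
    }
    return memory;
}
</code>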

===== Writing your own chat module =====

You are not limited to ''chat_llm''.  Any JavaScript file under ''exec/'' that
defines ''chat_session(input, ctx)'' and ''open_session(ctx)'' — and accepts
the ''ctx'' object above — can be named in the Guru **Module** field.  This
lets you back a Guru with a different engine, a scripted persona, or an
entirely custom integration, without touching Synchronet's C/C++ code.
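
As a starting point, a minimal custom module might look like the sketch below.  The file name ''exec/chat_echo.js'' is hypothetical, the module only uses ''ctx'' fields documented above, and it assumes that top-level function definitions are enough for the BBS to find the two entry points.

<code javascript>
// exec/chat_echo.js (hypothetical file name): a trivial echo persona.

// Opening greeting at the start of a session, built from ctx fields.
function open_session(ctx) {
    return "Hello, " + ctx.speaker.alias + ". I am " + ctx.persona.name + ".";
}

// One caller turn: returns the reply string, or null to stay silent.
function chat_session(input, ctx) {
    // In channel modes, stay silent unless the bot was addressed.
    if (ctx.mode !== "private" && !ctx.addressed)
        return null;
    var text = String(input).replace(/^\s+|\s+$/g, "");  // trim whitespace
    if (!text)
        return null;
    return "You said: " + text;
}
</code>

By analogy with ''chat_llm'', such a module would presumably be selected by entering ''chat_echo'' in the Guru **Module** field.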

===== See Also =====

  * [[config:chat_llm.ini]] — engine and chat configuration
  * [[module:llm_tools]] — the tool registry and bundled tools
  * [[module:llm_index]] — building the RAG index
  * [[module:chat_llm_irc]] — the IRC bot adapter
  * [[howto:llm-guru]] — start-to-finish setup
  * [[config:chat_features#artificial_gurus|Artificial Gurus]] — the SCFG menu

{{tag>chat guru llm chat_llm ai}}