Advanced chat: streaming cancel, long-chat compression, book-vs-cross-book memory
Once you have used the Coach for a while, a few subtle behaviours start to matter: cancelling a stream, long-chat compression, and cross-book memory. Here's how they work.

1 · Cancel a stream
AI replies stream in one chunk at a time, and you can stop a reply mid-stream.
| How | When |
|---|---|
| Esc | The AI is writing and you see it's going the wrong way |
| ⏸ button next to the message | Mouse nearby |
| Start typing a new message | Auto-cancels the current stream |
After you cancel:
- Whatever has already been produced stays (if the AI wrote 3 paragraphs and you stopped at the 4th, those 3 stay)
- You aren't charged for the remaining tokens (cancelled tokens don't count)
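The cancel behaviour above can be pictured with a small sketch. This is not Slima's implementation: `StreamResult`, `consume_stream`, and the `should_cancel` callback are hypothetical names, and the model simply assumes a cancel is checked between chunks, so everything received so far is kept and nothing further is generated.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass
class StreamResult:
    kept: List[str] = field(default_factory=list)  # chunks already shown to the user
    cancelled: bool = False


def consume_stream(chunks: Iterable[str], should_cancel: Callable[[], bool]) -> StreamResult:
    """Collect chunks until the stream ends or a cancel is requested (hypothetical model)."""
    result = StreamResult()
    iterator = iter(chunks)
    while True:
        # Check for a cancel (Esc, pause button, new message) BEFORE asking
        # for the next chunk, so nothing more is generated after the stop.
        if should_cancel():
            result.cancelled = True
            break
        chunk = next(iterator, None)
        if chunk is None:  # stream finished normally
            break
        result.kept.append(chunk)
    return result


# Usage: the reply would have 5 paragraphs, but the user stops after the third.
generated: List[str] = []


def reply():
    for paragraph in ["p1", "p2", "p3", "p4", "p5"]:
        generated.append(paragraph)  # stands in for tokens actually produced
        yield paragraph


result = consume_stream(reply(), should_cancel=lambda: len(generated) >= 3)
print(result.kept, result.cancelled)
```

In this model the fourth and fifth paragraphs are never generated, which is one way to read "cancelled tokens don't count".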
2 · Long-chat auto-compression
When a chat exceeds ~50 messages or ~150K tokens of content, Slima auto-compresses older turns:
- Early turns → the AI writes a "chat summary" pinned at the top
- Summary preserves decisions, conclusions, context
- Original messages aren't deleted, just collapsed — expandable
Trigger notice
The panel shows a banner: "Chat auto-compressed — 12 early messages summarised" + an "Expand original" button.
Avoid being compressed: pin or summarise manually
If you know a chat will be long and want certain turns preserved:
- Right-click → Pin any important message → won't be compressed (up to 5 pins)
- Or, at the end of the chat, choose "Summarize so far" → the AI writes a summary at the top
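As a rough illustration of the rules in this section, the sketch below models the trigger (more than ~50 messages or ~150K tokens), the 5-pin limit, and the fact that pinned messages are skipped. The thresholds are read literally from the text; `Message`, `needs_compression`, `plan_compression`, `pin`, and the `keep_recent` window are assumptions made for the example, not Slima's actual logic.

```python
from dataclasses import dataclass
from typing import List, Tuple

MAX_MESSAGES = 50      # "~50 messages" in the text; the exact cutoff is an assumption
MAX_TOKENS = 150_000   # "~150K tokens" in the text
MAX_PINS = 5           # "up to 5 pins"
KEEP_RECENT = 20       # hypothetical: how many recent turns stay uncompressed


@dataclass
class Message:
    text: str
    tokens: int
    pinned: bool = False


def needs_compression(messages: List[Message]) -> bool:
    """True once either the message count or the token total passes its limit."""
    total_tokens = sum(m.tokens for m in messages)
    return len(messages) > MAX_MESSAGES or total_tokens > MAX_TOKENS


def pin(messages: List[Message], index: int) -> None:
    """Pin one message, refusing a sixth pin."""
    already_pinned = sum(1 for m in messages if m.pinned)
    if not messages[index].pinned and already_pinned >= MAX_PINS:
        raise ValueError("pin limit reached")
    messages[index].pinned = True


def plan_compression(
    messages: List[Message], keep_recent: int = KEEP_RECENT
) -> Tuple[List[Message], List[Message]]:
    """Return (to_summarise, to_keep); pinned and recent messages are kept as-is."""
    if not needs_compression(messages):
        return [], list(messages)
    # Keep at most half the chat as "recent" so a short, token-heavy chat
    # still has early turns left to summarise.
    cutoff = len(messages) - min(keep_recent, len(messages) // 2)
    early, recent = messages[:cutoff], messages[cutoff:]
    to_summarise = [m for m in early if not m.pinned]
    to_keep = [m for m in early if m.pinned] + list(recent)
    return to_summarise, to_keep


# Usage: a 60-message chat with one early message pinned.
chat = [Message(f"m{i}", tokens=100) for i in range(60)]
pin(chat, 2)
to_summarise, to_keep = plan_compression(chat)
print(len(to_summarise), len(to_keep))
```

The messages in `to_summarise` would be collapsed behind the pinned summary rather than deleted, matching the "expandable" behaviour described above.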
3 · Book vs cross-book memory
By default, the Coach's memory is per-book:
- You told the Coach "the protagonist loves the sea" in Book A → the Coach in Book B doesn't know
- This keeps each book private and its context clean
Enable cross-book memory
Chat settings (gear) → "Cross-book memory" → pick source books.
With it on:
- In the current chat, the Coach can read content from the source books you picked
- Useful for: screenplay adaptation referring to source novel
- Or: writing a sequel, staying consistent with the prior book
AI Memory (account-level)
More general still, AI Memory also carries across chats: your preferences, style, and recurring phrasing. Configure it in Account → AI Memory.
See: AI Memory: the AI's impression of you
4 · Rate limits / reasoning timeouts
Occasionally you'll see:
- "Please retry in a moment" → vendor-side rate limit, wait 5–10 seconds
- "Timed out" → AI thinking exceeded 60 seconds with no response, auto-cancelled
In these cases:
- Nothing is lost (your prompt remains)
- Click "Retry" to resend
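If you think of "Retry" as resending the same prompt, the handling above looks roughly like the sketch below. It is only a model of the behaviour described here: `RateLimited`, `TimedOut`, `send_with_retry`, and the `send` callback are hypothetical names, and the 5-second wait is the lower end of the 5–10 second range mentioned above.

```python
import time
from typing import Callable, Optional


class RateLimited(Exception):
    """Stands in for the "Please retry in a moment" message."""


class TimedOut(Exception):
    """Stands in for the "Timed out" message (no response within 60 seconds)."""


def send_with_retry(
    send: Callable[[str], str],
    prompt: str,
    max_attempts: int = 3,
    wait_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Resend the same prompt after a rate limit or timeout (hypothetical helper)."""
    last_error: Optional[Exception] = None
    for attempt in range(1, max_attempts + 1):
        try:
            return send(prompt)  # the prompt is never lost, so it can be resent unchanged
        except RateLimited as error:
            last_error = error
            if attempt < max_attempts:
                sleep(wait_seconds)  # give the vendor-side limit time to clear
        except TimedOut as error:
            last_error = error  # the request was auto-cancelled; retry right away
    assert last_error is not None
    raise last_error


# Usage: a fake sender that is rate-limited twice, then succeeds.
calls = []
waits = []


def flaky_send(prompt: str) -> str:
    calls.append(prompt)
    if len(calls) < 3:
        raise RateLimited()
    return "reply to: " + prompt


answer = send_with_retry(flaky_send, "Outline chapter 3", sleep=waits.append)
print(answer, waits)
```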