During an interesting Zoom call with @andreas we explored the notion of writing for an AI. The AI’s stated preference was for consuming aTbRef as XML.
Prompt: Do you prefer to look at the Tinderbox document using the Tinderbox skill or by directly reading the XML?
Honestly, direct XML reading is faster and more reliable for most tasks:
- No dependency on Tinderbox being open
- Can search/parse the full document instantly
- Better for bulk queries and exploration
The Tinderbox skill (JXA) is better when you need to write changes back to a live, open document — creating notes, modifying attributes, etc.
So: read via XML, write via the skill. What are you looking to do?
Prompt: Why is it quicker or more effective to use the XML
A few practical reasons:
- Direct file access — I read the `.tbx` file with the `Read` or `Grep` tool instantly, no round-trips through `osascript` and the Tinderbox app.
- Bulk search — I can `Grep` across the entire XML for any attribute, note name, or value in one shot. JXA would require iterating note by note.
- No app dependency — the file can be read even if Tinderbox isn’t running or the document isn’t open.
- Full structure visible — the raw XML exposes everything: attributes, prototypes, links, agents, adornments — nothing is hidden behind an API.
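The kind of bulk search just described can be sketched in a few lines of Python. This is a minimal sketch, assuming the common Tinderbox file layout of nested `<item>` elements whose properties are child `<attribute name="…">` elements; the sample XML below is illustrative, not taken from aTbRef, and the layout should be checked against a real `.tbx` file.

```python
import xml.etree.ElementTree as ET

def find_notes(xml_text, attr_name, needle):
    """Return Name values of notes whose given attribute contains `needle`.

    Assumes notes are <item> elements with <attribute name="..."> children.
    """
    root = ET.fromstring(xml_text)
    hits = []
    for item in root.iter("item"):
        # Collect this note's attributes (direct children only).
        attrs = {a.get("name"): (a.text or "") for a in item.findall("attribute")}
        if needle in attrs.get(attr_name, ""):
            hits.append(attrs.get("Name", "(unnamed)"))
    return hits

# Tiny illustrative document in the assumed layout.
SAMPLE = """<tinderbox>
  <item><attribute name="Name">Agents</attribute>
    <item><attribute name="Name">Agent queries</attribute>
          <attribute name="URL">https://example.com/queries</attribute></item>
  </item>
</tinderbox>"""

print(find_notes(SAMPLE, "URL", "example.com"))  # ['Agent queries']
```

The whole document is scanned in one pass, which is the point being made above: no per-note round-trips through the app.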
The JXA skill is more of a necessity for writes, since you shouldn’t edit a live Tinderbox document’s XML directly while it’s open (you risk corruption or having changes overwritten when Tinderbox saves).
Prompt: Could the XML be written more plainly, to aid you in reading the structure of the document?
Now I have a sense of it. The format is fairly readable but verbose — lots of attributes with many XML properties (`editable`, `visibleInEditor`, `lines`, `type`, `default`) that are mostly boilerplate for system attributes.

To answer your question: yes, it could be simplified for my purposes. A few approaches:
- Strip boilerplate attributes — for system `<attrib>` definitions, most of the XML attributes (`editable`, `visibleInEditor`, `lines`) are noise. Only `Name`, `parent`, `type`, and `default` are semantically interesting.
- Export a summary — Tinderbox can export note outlines or tab-delimited data that is far more compact than the full XML.
- Preprocess to a plain outline — a simple script could walk the XML and produce a flat or indented text representation of just notes, their key attributes, and their values, stripping the schema/definition boilerplate entirely.
How large is the file? That would tell me how much the verbosity is actually a problem in practice.
*[checks]*
200,000 lines and ~19MB — that’s substantial. Reading it raw would consume a lot of context. So yes, simplification would genuinely help.

The most practical approach would be a Python script that parses the XML and outputs a compact representation — just note names, paths, key user-defined attributes, and text content, skipping all the system schema definitions. That could reduce it by an order of magnitude.
Would you like me to write such a script?
So we did. The plan was…
Plan: Tinderbox XML Simplifier (`simplify_tbx.py`)
A Python script to strip noise from the 19MB `.tbx` XML and produce a compact plain-text outline for LLM reading. Key decisions:
- Keep: note names, hierarchy, proto, semantic attributes (URL, Rule, Edict, AgentQuery, AgentAction, IsPrototype), text content, resolved links
- Strip: `<rtfd>` blobs, visual/positional attributes, `<preferences>`, `<windows>`, schema definitions
- Output format: indented plain-text outline (not XML or JSON)
- CLI options: `--max-text`, `--no-text`, `--output`
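As a hedged illustration of the plan, here is a stripped-down sketch of what such a simplifier’s core could look like. It is not the script produced in the session: the element names (`item`, `attribute`) follow the same assumed Tinderbox layout as above, the kept-attribute set comes from the plan, and both would need checking against a real `.tbx` file.

```python
import xml.etree.ElementTree as ET

# Attributes the plan judged semantically interesting; everything else is dropped.
KEEP = {"URL", "Rule", "Edict", "AgentQuery", "AgentAction", "IsPrototype"}

def outline(item, depth=0, max_text=200):
    """Recursively render one <item> (and its children) as an indented outline."""
    attrs = {a.get("name"): (a.text or "") for a in item.findall("attribute")}
    pad = "  " * depth
    lines = [f"{pad}{attrs.get('Name', '(unnamed)')}"]
    for key in sorted(KEEP & attrs.keys()):
        lines.append(f"{pad}  {key}: {attrs[key]}")
    text = attrs.get("Text", "")
    if text:  # truncate long note text, per the --max-text idea
        lines.append(f"{pad}  | {text[:max_text]}")
    for child in item.findall("item"):
        lines.extend(outline(child, depth + 1, max_text))
    return lines

def simplify(xml_text, max_text=200):
    root = ET.fromstring(xml_text)
    lines = []
    for top in root.findall("item"):
        lines.extend(outline(top, 0, max_text))
    return "\n".join(lines)

# Illustrative input in the assumed layout.
SAMPLE = """<tinderbox>
  <item><attribute name="Name">Agents</attribute>
        <attribute name="URL">https://example.com</attribute>
    <item><attribute name="Name">Agent queries</attribute></item>
  </item>
</tinderbox>"""

print(simplify(SAMPLE))
```

A real version would also resolve links, skip `<rtfd>` blobs, and wrap this in `argparse` to supply the `--max-text`, `--no-text` and `--output` options.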
The result: `atbref-xml-ai.txt` (2.6 MB).
Disclaimer: the TXT file is collateral reference for this thread and not intended for general use. Experiment with it (with AI) at your own risk.
I’m not skilled enough yet to gauge the context size difference of the 2.6MB TXT file vs. the 19.2MB TBX file. It’s nice that the AI version is still readable to a human.
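For a rough sense of that difference, a common rule of thumb is ~4 characters per token for English text; the exact figure depends on the tokenizer, so treat these as order-of-magnitude estimates only:

```python
def estimate_tokens(size_bytes, chars_per_token=4):
    """Order-of-magnitude token estimate from file size (ASCII-heavy text assumed)."""
    return size_bytes // chars_per_token

for label, size in [("TBX (19.2 MB)", 19_200_000), ("TXT (2.6 MB)", 2_600_000)]:
    print(f"{label}: ~{estimate_tokens(size):,} tokens")
```

At that heuristic the raw TBX is roughly 4.8M tokens against ~650K for the simplified file; either way far beyond a typical context window, so an agent would still read it selectively rather than whole.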
Given that an AI doesn’t ‘understand’ in a human sense, I suspect it has thrown out information on features it hasn’t needed or been asked to use. So as a general resource it might be limited. But this might be an interesting approach to a down-scaled reference for a (process) agent with limited implementation flexibility.
A clear takeaway is that review and iteration are required for active use, not least because aTbRef reflects constant change in Tinderbox itself, so the content and structure of the TBX are not settled.
Still, I think this is worth investigating in the context of AI+TB use, not least because smaller files mean a smaller context, which uses fewer tokens; in aggregate that is less cost, be it to our pockets or to the environment in building power plants and data centres.