After a considerable time as a fairly vocal skeptic of Large Language Models (LLMs), I have changed my mind. LLMs can be an extraordinary Tinderbox companion.
I’ve now had about two weeks’ experience with a new, experimental Tinderbox build that “talks” to Anthropic’s Claude Desktop ($20/mo; the build should also be able to communicate with many other AI models). Some observations:
-
Claude learns to use Tinderbox with remarkable facility. At first, it just guesses, and it has an unfortunate confidence that its guesses must be right! That’s annoying, but not very harmful. After I supplied Claude with a one-page “cheat sheet” to get started, it became quite good at routine Tinderbox manipulations.
-
I spent about half an hour explaining Posters to Claude, and showed it the Mermaid example from the Poster demo. Claude got it almost immediately. (It still occasionally forgets to set $PosterTemplate or confuses it with $HTMLExportTemplate. Who doesn’t?)
-
Claude is really good as a research assistant, and it comes up with excellent suggestions. I had it gather planned reading from my own very ill-sorted Book Notes and it built a nice list. I asked for further reading on a variety of topics, and its suggestions were remarkably good. (If Claude were making stuff up, it wouldn’t matter for this application as I’d catch it right away.)
-
Claude was less good at discussing a tricky text. I asked it to read Emerson’s The American Scholar, and it went straight to Cliff’s Notes and to a term paper mill. Sigh. Pushing a little bit did help.
-
A key element to all this is that I leave notes for Claude (in /Hints/AI/Claude/Readings) that I expect it to read before each session, and Claude leaves notes to itself (in /Hints/AI/Claude/Notes) for later reference.
-
Claude is also quite good at boring work that sometimes comes up in research, like reformatting tables or finding syntax errors.
A key is to think of the AI as an undergraduate research assistant whom you don’t know very well. Claude is overconfident. It takes shortcuts. It’s not always honest, and it’s terrible at introspection. Claude is a shameless flatterer. Still, Claude is extraordinarily well read.
This experimental build is currently available to backstage users. I’d love to hear from other folks using AppleScript-based approaches to integrating AIs about what works and does not work, and what they wish they had known sooner.