How to find word frequency in document

Hi, a few simple Google searches didn’t get me very far… what (if any) is the simplest way to find the frequency of words used in a TBX document?

I’ve seen the word cloud hidden away in the “Get Info” menu item. That seemed like a good place to start, but I’d like to see a list showing exactly how many times words are used, not an image scaling words up/down.

Have you looked at the Get Info repetition tab?

Thanks @mwra, the Repetition tab definitely gets me what I asked for.

It doesn’t seem like there’s any way to export or copy/paste the information in that tab, is there?

It seems not, most likely because no-one ever asked for that. Recall, Tinderbox is a toolbox rather than a utility for a specific purpose. I suggest dropping a line to Eastgate with the above as a feature request. It would be worth clarifying exactly what data you expect to end up on the clipboard and in what format (e.g. as a tab-delimited table or whatever).

I’ve updated my article on the Repetition tab to make current behaviour a bit clearer.

@mwra That’s a good change to your documentation, thanks!

What’s the best way to send feature requests to Eastgate?

Just FYI, one reason I couldn’t find this by searching was that I was using search terms like “word frequency” and “word concordance”. DEVONthink’s documentation has the exact feature I was looking for under “Concordance”.

Write to info@eastgate.com, with what you’re after. Detail helps; i.e. actually write down all the ‘obvious’ assumptions about how you imagine the feature will work as the assumptions help in figuring out if it’s a simple/difficult/impossible task.

As to the word frequency bit, I’ll look at Help (which I don’t write, but contribute to) and I can probably tweak my aTbRef page so the relevant phrase is there.

Some way to export the word frequencies should not be difficult; I’ll add it to the roadmap.

Being able to export or copy the list of words and their respective number of occurrences would be fantastic. Any other processing or sorting of the list could just be outsourced to a spreadsheet.
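
For anyone wanting that kind of list outside Tinderbox, here is a minimal Python sketch of the idea. It is not Tinderbox's Repetition tab or any Eastgate feature; it assumes the note text has already been exported to plain text by some other means, and the function names `word_frequencies` and `to_tsv` are made up for illustration. It counts case-insensitive word occurrences and renders them as a tab-delimited table that a spreadsheet could open.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count case-insensitive word occurrences in plain text.

    A "word" here is a run of letters, optionally joined by apostrophes
    (so "don't" stays one word). This is an assumption of the sketch,
    not a description of how Tinderbox tokenises text.
    """
    words = re.findall(r"[^\W\d_]+(?:['’][^\W\d_]+)*", text.lower())
    return Counter(words)


def to_tsv(counts):
    """Render counts as tab-delimited lines, most frequent word first.

    Ties are broken alphabetically so the output order is stable.
    """
    rows = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    lines = ["word\tcount"]
    lines.extend(f"{word}\t{count}" for word, count in rows)
    return "\n".join(lines)


# Hypothetical sample text standing in for exported note text.
sample = "The cat and the hat. Don't stop, don't!"
print(to_tsv(word_frequencies(sample)))
```

The tab-delimited output can be pasted straight into a spreadsheet for any further sorting or filtering.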

I think this will be in Tinderbox 8 – and probably in a backstage release in a week or so. Looks fairly straightforward.