How to grep all your Tinderbox documents


(Michael Prenez-Isbell) #1

find . -iname "*.tbx" -print0 | xargs -0 grep -Eo '.{0,40}Peter.{0,40}'

grep is the Unix command-line tool that lets you search for text in a set of documents.

Run this command in Terminal in the directory where your .tbx documents are (or, if you want a REAL sweep, use find ~ instead of find . to look for .tbx files in every subdirectory of your home directory).

The -print0 and -0 switches make find and xargs separate file names with a NUL character instead of whitespace, so file names containing spaces or other odd characters don't break the search.

The | is the Unix pipe; it sends the output of the find command (the file names and paths) to xargs.

xargs takes those file names and runs grep with them as arguments.

In grep -Eo, the -E tells grep to use extended regular expressions (not the Perl engine), and the -o prints only the part of each line that matches rather than the whole line.

.{0,40} means "up to 40 characters of anything", so with one on each side of 'Peter' you see up to 40 characters of context on either side.
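To see the pieces described above working together, here is a small sketch that can be pasted into Terminal. The temporary folder, the file name (with spaces, on purpose) and the file contents are all made up for the demo; -H is an extra grep switch, not in the original command, that prints the file name even when only one file is searched.

```shell
# Sketch only: build a throwaway folder with one fake .tbx file.
# The folder, file name and contents are invented for illustration.
demo_dir=$(mktemp -d "${TMPDIR:-/tmp}/tbxdemo.XXXXXX")
printf '<text>Lunch with Peter on Tuesday to talk about the map view.</text>\n' \
  > "$demo_dir/old notes 2016.tbx"

# -print0 / -0 pass file names NUL-separated, so the spaces in the name are safe.
# -E = extended regex, -o = print only the matching part, -H = always show the file name.
result=$(find "$demo_dir" -iname "*.tbx" -print0 |
  xargs -0 grep -H -Eo '.{0,40}Peter.{0,40}')
printf '%s\n' "$result"

# Clean up the throwaway folder.
rm -rf "$demo_dir"
```

Each output line is the path of the file, a colon, and then the match with its surrounding context.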

This is a bash shell command and you can run it on any Mac in a Terminal window. There are all kinds of places to look up the bash shell on the Internet; I strongly recommend the two books from O’Reilly, Learning the bash Shell and the bash Cookbook.

Output looks like this:

[screenshot of the Terminal output]

@mwra @JFallows thought you might enjoy this hack. The idea is, find a string in some forgotten .tbx file set aside a couple of years ago…but you KNOW it’s there somewhere…for when Spotlight and mdfind juuuuust aren’t good enough…
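For the "which file was it in?" case, a variant that prints only the names of the matching files may be handier than the context view. This is a sketch, not part of the original command: it swaps -Eo for grep's -l switch, and the sample folder and files are invented for the demo.

```shell
# Sketch only: two fake .tbx files, one of which mentions the search string.
# Folder and file names are invented for illustration.
demo_dir=$(mktemp -d "${TMPDIR:-/tmp}/tbxdemo.XXXXXX")
printf 'notes mentioning Peter\n' > "$demo_dir/2016 project.tbx"
printf 'nothing relevant here\n'  > "$demo_dir/other.tbx"

# -l prints each matching file name once instead of the matching text.
hits=$(find "$demo_dir" -iname "*.tbx" -print0 | xargs -0 grep -l 'Peter')
printf '%s\n' "$hits"

# Clean up the throwaway folder.
rm -rf "$demo_dir"
```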

and thanks to @eastgate for not zipping your files like OmniOutliner does and creating more work than should be necessary to perform this simple task.


(Mark Anderson) #3

Neat. The point about knowing a value is in a TBX, but which, rings so true.


(Michael Prenez-Isbell) #4

FoxTrot shows content with OmniOutliner, but not with Tinderbox.


(Michael Prenez-Isbell) #5

strange and wonderful how a little line of code like this can make me happy.

Remember that grep is case-sensitive by default; add -i if you want it to ignore case.
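A quick way to see the difference, using a made-up sample line rather than a real .tbx file:

```shell
# Sketch only: the sample text is invented for illustration.
sample='Met PETER and peter and Peter today'

# Default behaviour: case-sensitive, so only the exact spelling counts.
strict=$(printf '%s\n' "$sample" | grep -Eo 'Peter' | wc -l | tr -d ' ')

# With -i, grep ignores case and all three spellings count.
loose=$(printf '%s\n' "$sample" | grep -Eio 'Peter' | wc -l | tr -d ' ')

echo "case-sensitive matches: $strict, with -i: $loose"
```

In the full command that means writing grep -Eio instead of grep -Eo.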

I posted the long bash script I had to write to dump OmniOutliner on the Omni Group forum. There went two days of my life. I would have just bought FoxTrot if I’d known about it.


(Michael Prenez-Isbell) #7

One other point about grep: FoxTrot requires you to build and maintain an index for specific locations to work. grep just works, no indexing required.

Path Finder doesn’t do what I want. Unless I’m just not seeing it, it has no content searching at all.


(Paul Walters) #8

Yep. I’m going thru that now – every time FoxTrot builds an index it corrupts it the next day. Piece of junk.


(David) #9

You might try support at CTM. I’ve used FoxTrot Pro for years and years and rarely had a problem. When I’ve had one, I’ve had excellent support. It is a product with a unique feature set. The searching is incredibly fast (much faster than grep) and it has operators for complicated searches on par with DevonThink. There are a few other tools for refining searches. My favorite tool–unique as far as I know–is the ability to preview a PDF with each search term in a different highlight colour. This is useful when you have one high frequency term and one low frequency term–but you need both–because you can find the low frequency hits fast. I use this to manage a large collection of PDFs, many book-length.

FoxTrot depends on a QuickLook importer to read non-standard file types like OmniOutliner, though in this case it just uses the raw data as if it were plaintext. Since, so far as I know, there is no QuickLook importer for Tinderbox this explains the difference between hits for OmniOutliner and Tinderbox in FoxTrot.

Unfortunately, this design makes FoxTrot vulnerable to poorly written Quicklook importers–of which there are many. This can cause crashes or corruption. You can manage which importers are used by launching FoxTrot with cmd-option held down. There is a “blacklist” facility which gives clues about rogue importers too, found in the Manage Indices dialog.

Index updates can be scheduled as frequently as hourly. There is an iOS companion app that satisfies a narrow range of uses.

(Just a happy customer, no affiliation with CTM the makers of FoxTrot.)

David.


(Paul Walters) #10

Thanks. Believe me, CTM support and I are old friends.


(Michael Prenez-Isbell) #11

well, for better or worse, I just bought the professional license. I’ll keep an eye out for problems. It looks like it’s artisanal software as well, and the author is Jerome–do I have that right?

However, Foxtrot doesn’t seem to peer into Tinderbox files. It finds them just fine, but you don’t get a string preview. Is that the Quicklook functionality?


(Michael Prenez-Isbell) #13

Yeah, I just confirmed QL on Tbx, you actually get preview too (which you don’t get on OmniOutliner, although you do get it on OmniGraffle…c’mon, Omnigroup). I just dropped Jerome a note asking him to get us .tbx support, we’ll see. I think the ideal quicklook for Tbx would be text without the xml tags, rather than a graphical representation. Outline view, in other words.

https://developer.apple.com/library/archive/documentation/UserExperience/Conceptual/Quicklook_Programming_Guide/Introduction/Introduction.html

In case anyone is feeling ambitious…you know, I have code that reads and parses a .tbx file from a couple of years back. I should turn it into a library and distribute it on GitHub so people can easily write things. Note to self, make it so. It comes from my aeonxml parser, which I stopped working on as I got less interested in aeonxml :)

Oh right, it’s in Swift 1.2, groan. And we’re up to Swift 4.1 with big changes in the language, sigh.

Swift 4.2. Yikes.
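Short of a real Quick Look plugin, the "text without the xml tags" idea can be roughed out from the command line. This is only a sketch: the sample line is a hypothetical, heavily simplified stand-in for TBX content (real files carry far more markup), and the sed pattern is a crude tag stripper, not an XML parser. It will not decode entities such as &amp;amp; or handle tags that span several lines.

```shell
# Sketch only: hypothetical, simplified stand-in for a fragment of a .tbx file.
sample='<item><attribute name="Name">Meeting notes</attribute><text>Call Peter back</text></item>'

# Replace every <...> tag with a space, then squeeze runs of spaces into one.
plain=$(printf '%s\n' "$sample" | sed -e 's/<[^>]*>/ /g' | tr -s ' ')
printf '%s\n' "$plain"
```

On a real file the same sed and tr pair could be fed with the file as input and piped into grep.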


(Michael Prenez-Isbell) #15

I just read the manual. It’s not that hard to write a QL plugin for finder, and no one gives a holler if it was written by the original developer or not as long as it works.


(David) #16

Yes, interesting. There are a lot of filetypes on the Mac whose underlying format is XML. It seems a shame there is no flag that says, “I’m XML really,” so that a system-wide XML previewer (or maybe fancier parser) got a bite at it. I had thought the Uniform Type Identifier (UTI) system, which allows a filetype to declare that it conforms to several types, would permit this, but maybe not. I don’t know. A simple Quicklook importer that did nothing but pass through plaintext XML would be a generally useful asset.


(Michael Prenez-Isbell) #18

that’s interesting, never used it. I’ve seen it mentioned on the other board, and by Mark. Worth looking at? Using?


(David) #19

I believe it is true to say that if a filetype has a QuickLook extension associated with it then FoxTrot can be configured to index it. Some of these may be XML-based, e.g. docx.


(David) #21

I understand what you have written about the design and original intent of Quicklook plug-ins. However, my understanding from CTM is that Quicklook plug-ins can be used for other purposes than previewing. They can be used to retrieve a representation of the content in the file that Foxtrot can use for building its index and thus for subsequent search. In some cases, when you look at the search results in Foxtrot what you see is the “raw” output of the Quicklook plug-in that was used to add the content into the index. For this reason, some files can only be indexed in Foxtrot if a Quicklook plug-in is available. If this were not so, there would be little need for Foxtrot to provide a facility for managing its use of Quicklook plug-ins.

The tenuous, remaining link to Tinderbox is that it appears possible to create a Quicklook plug-in for Tinderbox files that might provide content from those files in a representation usable by Foxtrot for its indexing, and thus subsequent searching.

You are quite right that Foxtrot has some indexing capability that is not dependent on Quicklook plug-ins, filenames being one example.

[At a technical level, there is no reason why Foxtrot cannot make use of the Quicklook architecture to load (qlgenerator) plug-ins, since these plug-ins in ordinary use provide functions for providing content with different representations. These functions are used by the Quicklook daemon to provide the functionality invoked, e.g., by pressing space in the Finder. A cursory examination of the Quicklook developer documentation makes this clear.]


(Paul Walters) #22

As usual, you’re right. Removed my pointless opinions.


(jmm) #24

HoudahSpot is capable of limiting searches to Tinderbox files and locating them by searching inside their content. Since TB doesn’t yet have a Quicklook plugin, the HS preview shows only the same icon as Foxtrot. However, HS has an unmodifiable text view as well, in which a new search can be performed inside the raw tbx file.

I would be very interested to know what other common tools you refer to. I am not satisfied with linking to DT from TB, and have been thinking of substituting those links with fail-proof searches. Perhaps stamping a UUID in a modifiable file property that can be searched afterwards. HS would be fast at finding such files and, according to its User Guide, it has a URL scheme that allows searches (houdahspot4://search?q=*) and it is scriptable with AppleScript.


(Michael Prenez-Isbell) #25

Another neat thing you can do is use regular expression syntax to find ‘near misses’

For example

find . -iname "*.tbx" -print0 | xargs -0 grep -Eo '.{0,40}sep[ea]r[ea]te.{0,40}'

will find common misspellings of the word ‘separate’ as well as the correct spelling.
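To check what a pattern like this really catches before turning it loose on your files, it can be tried on a short word list first. The list below is made up for the test; "desperate" is in there as a word that should not match.

```shell
# Sketch only: a made-up word list, one word per line, fed to the same pattern.
matches=$(printf '%s\n' separate seperate seperete separete desperate |
  grep -Eo 'sep[ea]r[ea]te')
printf '%s\n' "$matches"

# Count how many of the five words matched.
count=$(printf '%s\n' "$matches" | wc -l | tr -d ' ')
echo "matched $count of 5 words"
```

Each [ea] accepts either an e or an a in that position, so the pattern covers four spellings in total, the correct one included.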


(Andreas Grimm) #26

And DEVONthink wouldn’t work in this regard? I’ve been using DEVONthink for such search tasks almost exclusively for the last few months. And it works flawlessly.

Of course, one should always keep the folder containing the Tinderbox files indexed. But that’s the case anyway, isn’t it?

Yet thanks for the grep code. Makes a nice TextExpander snippet.


(David) #27

In relation to Foxtrot it seems I was not right. A recent message on the Foxtrot discussion list indicates that they use “Spotlight metadata importers” to index various files. (I put it in quotes because that is their description.) So it is true they use third party plug-ins, evidently of a different kind to the Quicklook plug-ins. Perhaps one of these Spotlight metadata importers would help.

However, apparently there is a simpler solution according to CTM support. If Tinderbox files were to have a UTI declaring the file as of XML type, then their built-in xml importer would work. (I suppose this explains the OmniOutliner success, since it has an XML UTI.)

There is an immediate solution according to CTM support, using a hidden preference. If one were to enter this command at the terminal, Foxtrot should index Tinderbox files:

defaults write com.ctmdev.foxtrot Aliases -array-add "{type='tbx'; as='xml';}"

I have not tested this yet, but if it works it would be a boon for those who find Foxtrot useful.

(It was a fortuitous coincidence that someone asked just the right question on the Foxtrot discussion list to bring these answers to light.)