How do I prevent TB from fetching the contents of PDFs when dragging 'n dropping from DEVONthink

pat · January 2, 2018, 5:26pm

I was really hoping that this would somehow parse the RTF link and put that in the $URL, but it doesn’t for me. If the line of text is “tinderbox forum” and the RTF link is http://forum.eastgate.com then $URL gets set to “tinderbox forum” and is not clickable.

Am I doing something wrong?

PaulWalters · January 2, 2018, 5:43pm

No, I was doing something wrong by suggesting that stamp.

What actually works is to drag the RTF link from $Text to $URL – which loads the x-devonthink-item:// link into $URL.

I need to go back to the drawing board on the stamp. Something using RunCommand will work. More later.

pat · January 2, 2018, 6:00pm

Ah yeah using something like nokogiri to parse the URLs would work nicely.
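
As a rough illustration of the "parse the URLs" idea, here is a minimal Python sketch (not nokogiri, and not anything Tinderbox ships). It assumes the link is stored as a plain RTF HYPERLINK field and that the visible text has no nested formatting groups; the function name and sample string are made up for the example.

```python
import re

# Matches the common RTF field form:
#   {\field{\*\fldinst{HYPERLINK "url"}}{\fldrslt visible text}}
# Assumption: the visible text contains no nested groups or braces.
HYPERLINK_FIELD = re.compile(
    r'\{\\field\s*\{\\\*\\fldinst\s*\{?\s*HYPERLINK\s+"([^"]+)"\s*\}?\s*\}\s*'
    r'\{\\fldrslt\s*([^{}]*)\}\s*\}'
)


def extract_rtf_links(rtf):
    """Return (visible_text, url) pairs for each hyperlink field found."""
    return [(text.strip(), url) for url, text in HYPERLINK_FIELD.findall(rtf)]


# Hypothetical sample mirroring the example in the first post.
sample = (
    r'{\field{\*\fldinst{HYPERLINK "http://forum.eastgate.com"}}'
    r'{\fldrslt tinderbox forum}}'
)
links = extract_rtf_links(sample)
```

The URL half of each pair is what would need to end up in $URL; real RTF from DEVONthink may carry extra formatting inside the field, which this sketch does not handle.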

Tinderbox is not rendering the RTF links as HTML links when setting up an HTML export template. I thought it used to… but I also haven’t imported much RTF into Tinderbox, so maybe I’ve just used Make Web Link...

Do we expect the links from imported RTF text to be converted to HTML links on HTML export?

PaulWalters · January 2, 2018, 6:16pm

We might be wandering into another topic and I suggest posting this as a separate thread – just so other readers can locate it later.

I don’t know the answer, but I don’t know that I’ve ever tried this either.

PaulWalters · January 2, 2018, 6:19pm

For those who have TextSoap (IMO, everyone should have TextSoap): select the text of the imported TOC document, and use TextSoap’s Extract URLs by Replacing cleaner to convert RTF links (as in the TOC document) to their original URLs.

rtalexander · April 25, 2018, 10:14pm

These, of course, are all effective at getting rid of the content after it’s been imported. But they do not prevent the content from being imported in the first place. This is not much of a problem when the PDFs are small, but it is when they are large. It also seems like a waste to have the same information (i.e., the PDF content) ending up in two places (i.e., DT and TB). Would there perhaps be a way to create only a note that has the name of the DT item and its URL when, for example, dropping with the option key pressed?

eastgate · April 25, 2018, 10:27pm

You can always make a note yourself and assign the DEVONthink URL to the note’s $URL.

eastgate · April 26, 2018, 4:19pm

Upon further consideration, I’m inclined to agree:

It makes sense for Tinderbox to import the text of short pdf items.
It is not particularly useful to import the text of very long pdf items.

Two design problems now confront us:

Where do we draw the line between “short” and “long”? (My suggestion: 2000 words. Other ideas?)
What (if anything) do we do to indicate that we chose not to import a long text? (For example, might we import just the opening paragraphs?)
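
To make the two questions concrete, here is a small Python sketch of one possible policy. It is not how Tinderbox behaves: the 2000-word limit comes from the suggestion above, while the two-paragraph preview, the blank-line paragraph split, and the function name are assumptions for illustration only.

```python
WORD_LIMIT = 2000        # threshold suggested above
PREVIEW_PARAGRAPHS = 2   # assumption: how many opening paragraphs to keep


def text_to_import(text, word_limit=WORD_LIMIT,
                   preview_paragraphs=PREVIEW_PARAGRAPHS):
    """Return (imported_text, was_truncated) for a dropped document."""
    # Short documents are imported whole.
    if len(text.split()) <= word_limit:
        return text, False
    # Long documents: keep only the opening paragraphs
    # (paragraphs are assumed to be separated by blank lines).
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    preview = "\n\n".join(paragraphs[:preview_paragraphs])
    return preview, True
```

The boolean in the result is one way to record "we chose not to import everything", for example by storing it in a note attribute.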

rantanplan · April 26, 2018, 4:24pm

I am so happy to see this is being taken into consideration. It’s amazing how dragging “a few” DTP items, PDFs of approx. 5 pages or websites printed as PDFs, can render Tinderbox on my new iMac unusable for quite some time.

But to answer your question: Do you really want to go down the path of letting the user define the threshold between long and short? I would consider a simple PDF import yes/no switch to be more than I dared to ask for. Of course zero<value<unlimited is also ok, as long as zero is an option?

eastgate · April 26, 2018, 5:34pm

How long is “quite some time”?

rantanplan · April 26, 2018, 5:43pm

I took 32 PDF & Text notes and dragged and dropped them from DTP into a Tinderbox container. The size of each DTP item is between 75 and 270 kB. It took 80 seconds until Tinderbox became responsive again (beachball shown until then). And within the container, switching from one note to another takes 1-2 seconds until the text window is updated.

Model Name:	iMac
Model Identifier:	iMac18,3
Processor Name:	Intel Core i7
Processor Speed:	4,2 GHz
Number of Processors:	1
Total Number of Cores:	4
L2 Cache (per Core):	256 KB
L3 Cache:	8 MB
Memory:	32 GB

eastgate · April 26, 2018, 7:51pm

Please email a copy of these 32 notes.

rtalexander · May 17, 2018, 10:35pm

Perhaps I am missing something fundamental here, but I don’t really see the utility in having TB import the contents of a PDF. The text of the resulting import, being the raw content of the PDF, is essentially gibberish. I never care about the raw content; only the rendered document is of value to me. What am I supposed to do with something like the following?

I would much prefer to just get a note that links back to the corresponding DEVONthink item, and then use a collection of imported notes to build a map within TB, sort of a layer of abstraction that sits over the DT database.

ChemBob · May 18, 2018, 2:46pm

I used this and it worked, but it didn’t import the metadata for any but the first of the group of DT records I had it grab. Has something changed since this was posted? FWIW, I’m no scripter, so I dare not try to work on it; I’d wreck it.

ChemBob · May 19, 2018, 11:35am

I added the script to DTPro, opened a group (folder) and selected all 9 files in the folder. These were a mix of PDF+Text, RTFD, Bookmark, HTML, and RTF files. I ran the script and it saved the results as a .tsv file to my desktop. I then dragged that file into TB, making sure I didn’t add it to an adornment that assigns the prototype, etc. It created a group containing the nine TB notes. Only the topmost (first selected) file from DTPro had the metadata when checked in TB. The others had the fields, but nothing was entered into them. I don’t know if this helps at all, so don’t worry about it if it doesn’t. I’ve got to keep plugging away at my work anyway.
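
For comparison, a well-formed tab-separated file for this kind of import has one header row followed by one complete row per record. The Python sketch below only illustrates that shape; it is not the DTPro script discussed here. The column names, the placeholder item links, and the assumption that Tinderbox maps header names onto attributes are all illustrative.

```python
import csv
import io


def build_tsv(items):
    """Build tab-separated text from (name, devonthink_url) pairs."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    # Header row; assumed to be read as attribute names on import.
    writer.writerow(["Name", "URL"])
    # One full row per record, so every note gets its own values.
    for name, url in items:
        writer.writerow([name, url])
    return buffer.getvalue()


# Placeholder records; real item links would come from DEVONthink.
records = [
    ("First paper", "x-devonthink-item://EXAMPLE-UUID-1"),
    ("Second paper", "x-devonthink-item://EXAMPLE-UUID-2"),
]
tsv_text = build_tsv(records)
```

If only the first note ends up with metadata, checking whether every later row in the .tsv actually has values in each column is a reasonable first step.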

abusch · January 6, 2022, 5:12pm

It’s been a while since this was posted, but I am having the same problem: how do I drag information from DEVONthink to Tinderbox without having lots of stuff (esp. from PDFs) imported, when I don’t necessarily want that but am mainly interested in the metadata and a link to the original document?

Unless a solution has been found that I have failed to find, may I suggest answering the questions you posed (do you want the $Text to be filled or not) by using a modifier key with drag’n’drop? Say that “normal” drag’n’drop imports the $Text, while Cmd-drag’n’drop leaves it out?

That would get around the problem you mention and prevent the users from having to delete any imported $Text they don’t want in the first place…

eastgate · January 6, 2022, 7:06pm

Some users will want the text. Some won’t. It’s easier to delete unwanted text than to import text that isn’t imported.

It should be easy to write an agent or an edict that deletes the text of newly-imported DEVONthink notes…

abusch · January 6, 2022, 8:46pm

Thanks for the quick reply!

To your suggestion: I’d probably best do it with a stamp as there are several containers in my Tinderbox file that I want to drag records from DEVONthink into.

However, two remarks:
a) it seems to be difficult to delete $Text imported from DEVONthink as it is marked “read only”. Tinderbox became unresponsive for about 30 seconds (spinning beach ball) when I tried it a couple of times earlier today; I will check whether / how it works with a stamp.
b) the problem you mention (users want different things) is solved if a modifier key is used. (When importing records from Bookends, modifier keys are also used / useful). So I repeat: couldn’t that be an idea?

jjvornov · January 6, 2022, 9:15pm

The “read only” problem is easy to fix. It’s checked in the built-in prototype, so if you uncheck it in the prototype the notes will become editable. I leave it on, but keep the attribute visible so I can uncheck it on a note-by-note basis if I want to edit. It makes sense in most cases for me not to edit what I’ve brought in.

James

abusch · January 6, 2022, 11:05pm

Yup, that’s right. All I can say is that, in spite of unchecking the “Read Only” boolean, Tinderbox would still refuse to delete the $Text when I tried it manually (at least more often than not).
Can’t quite see why there shouldn’t be several distinct ways of dragging in things from DEVONthink - which is where I (like many others) keep the documents that we want to analyse in Tinderbox. Summarising them, extracting things like quotations etc. and integrating them into an analysis is best done there; but importing the metadata is most easily done through drag’n’drop.