How do I prevent TB from fetching the contents of PDFs when dragging 'n dropping from DEVONthink?

(Pat Maddox) #21

I was really hoping that this would somehow parse the RTF link and put that in the $URL, but it doesn’t for me. If the line of text is “tinderbox forum” and the RTF link is then $URL gets set to “tinderbox forum” and is not clickable.

Am I doing something wrong?

(Paul Walters) #22

No, I was doing something wrong by suggesting that stamp.

What actually works is to drag the RTF link from $Text to $URL – which loads the x-devonthink-item:// link into $URL.

I need to go back to the drawing board on the stamp. Something using RunCommand will work. More later.

(Pat Maddox) #23

Ah yeah using something like nokogiri to parse the URLs would work nicely.
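Nokogiri is aimed at HTML/XML, so for the RTF itself a rougher alternative is to pull the link targets straight out of the RTF source. The sketch below assumes the raw RTF is available as a string and that links are stored in the usual `HYPERLINK "..."` field form; the function name and the sample item identifier are made up for illustration, and how the RTF source would reach such a script from Tinderbox is left open.

```python
import re

# RTF stores a link as a field:
#   {\field{\*\fldinst{HYPERLINK "url"}}{\fldrslt visible text}}
# This pattern only captures the quoted target after HYPERLINK.
HYPERLINK_RE = re.compile(r'HYPERLINK\s+"([^"]+)"')


def extract_rtf_links(rtf_source: str) -> list[str]:
    """Return every HYPERLINK target found in raw RTF source, in document order."""
    return HYPERLINK_RE.findall(rtf_source)


# Hypothetical sample; the item identifier is invented for this example.
sample = (
    r'{\rtf1\ansi {\field{\*\fldinst{HYPERLINK "x-devonthink-item://EXAMPLE-UUID"}}'
    r'{\fldrslt tinderbox forum}}}'
)

print(extract_rtf_links(sample))
```

A regex like this ignores the visible link text entirely, which is the point here: the goal is the x-devonthink-item:// target for $URL, not the label.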

Tinderbox is not rendering the RTF links as HTML links when setting up an HTML export template. I thought it used to… but I also haven’t imported much RTF into Tinderbox, so maybe I’ve just used Make Web Link...

Do we expect the links from imported RTF text to be converted to HTML links on HTML export?

(Paul Walters) #24

We might be wandering into another topic and I suggest posting this as a separate thread – just so other readers can locate it later.

I don’t know the answer, but I don’t know that I’ve ever tried this either.

(Paul Walters) #25

For those who have TextSoap (IMO, everyone should have TextSoap :smile:) – select the text of the imported TOC document, and use TextSoap’s Extract URLs by Replacing cleaner to convert RTF links (as in the TOC document) to their original URLs.

(Roger Alexander) #26

These, of course, are all effective at getting rid of the content after it’s been imported. But they do not prevent the content from being imported in the first place. This is not much of a problem when the PDFs are small, but it is when they are large. It also seems like a waste to have the same information (i.e., the PDF content) ending up in two places (i.e., DT and TB). Would there perhaps be a way to create only a note that has the name of the DT item and its URL when, for example, dropping with the Option key pressed?

(eastgate) #27

You can always make a note yourself and assign the DEVONthink URL to the note’s $URL.
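For more than a handful of items, one way to do this in bulk might be a small tab-separated file of names and links. This is only a sketch: it assumes the item names and x-devonthink-item:// links have already been collected from DEVONthink, and that Tinderbox treats the header row of a tab-delimited file as attribute names when the file is dragged in. The function name and the sample rows are hypothetical.

```python
import csv
import io


def notes_to_tsv(items):
    """Build tab-separated text with a header row.

    items is an iterable of (name, url) pairs.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    # Header names are assumed to map onto $Name and $URL on import.
    writer.writerow(["Name", "URL"])
    for name, url in items:
        writer.writerow([name, url])
    return buffer.getvalue()


# Hypothetical rows; the identifiers are invented for this example.
items = [
    ("Example report", "x-devonthink-item://EXAMPLE-UUID-1"),
    ("Example article", "x-devonthink-item://EXAMPLE-UUID-2"),
]

print(notes_to_tsv(items), end="")
```

Because only the name and link are written, none of the PDF text ever reaches Tinderbox.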

(eastgate) #28

Upon further consideration, I’m inclined to agree:

  1. It makes sense for Tinderbox to import the text of short pdf items.
  2. It is not particularly useful to import the text of very long pdf items.

Two design problems now confront us:

  • Where do we draw the line between “short” and “long?” (My suggestion: 2000 words. Other ideas?)
  • What (if anything) do we do to indicate that we chose not to import a long text. (For example, might we import just the opening paragraphs?)
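To make the two questions concrete, here is a minimal sketch of one possible policy, not a description of how Tinderbox works: import the text whole up to a word limit, otherwise keep only the opening paragraphs and append a note saying so. The function name, the two-paragraph preview, and the wording of the marker are all assumptions; only the 2000-word figure comes from the suggestion above.

```python
def text_to_import(text: str, max_words: int = 2000, preview_paragraphs: int = 2) -> str:
    """Return the full text if it is short, else the opening paragraphs plus a note."""
    word_count = len(text.split())
    if word_count <= max_words:
        return text
    # Paragraphs are assumed to be separated by blank lines.
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    preview = "\n\n".join(paragraphs[:preview_paragraphs])
    return f"{preview}\n\n[Text truncated: {word_count} words in the original]"


short_text = "one two three"
# Three paragraphs of 1000 words each, i.e. 3000 words in total.
long_text = "\n\n".join(" ".join(["word"] * 1000) for _ in range(3))

print(text_to_import(short_text))
print(text_to_import(long_text)[-50:])
```

Keeping the limit as a parameter leaves the "where do we draw the line" question open rather than baking 2000 into the logic.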

(Mitch) #29

I am so happy to see this is being taken into consideration. It’s amazing how dragging “a few” DTP items, PDFs of approx. 5 pages or websites printed as PDFs, can render Tinderbox on my new iMac unusable for quite some time.

But to answer your question: Do you really want to go down the path of letting the user define the threshold between long and short? I would consider a simple yes/no switch for PDF import more than I dared to ask for. Of course zero<value<unlimited is also OK, as long as zero is an option.

(eastgate) #30

How long is “quite some time”?

(Mitch) #31

I dragged and dropped 32 notes (PDF+Text) from DTP into a Tinderbox container. The size of each DTP item is between 75 and 270 kB. It took 80 seconds until Tinderbox became responsive again (beachball shown until then). And within the container, switching from one note to another takes 1-2 seconds until the text window is updated.

Model Name: iMac
Model Identifier: iMac18,3
Processor Name: Intel Core i7
Processor Speed: 4,2 GHz
Number of Processors: 1
Total Number of Cores: 4
L2 Cache (per Core): 256 KB
L3 Cache: 8 MB
Memory: 32 GB

(eastgate) #32

Please email a copy of these 32 notes.

(Roger Alexander) #33

Perhaps I am missing something fundamental here, but I don’t really see the utility in having TB import the contents of a PDF. The text of the resulting import, being the raw content of the PDF, is essentially gibberish. I never care about the raw content; only the rendered document is of value to me. What am I supposed to do with something like the following?

I would much prefer to just get a note that links back to the corresponding DEVONthink item, and then use a collection of imported notes to build a map within TB, sort of a layer of abstraction that sits over the DT database.

(Robert Powell) #34

I used this and it worked, but it didn’t import the metadata for any but the first of the group of DT records I had it grab. Has something changed since this was posted? FWIW, I’m no scripter, so I dare not try to work on it; I’d wreck it.

(Robert Powell) #36

I added the script to DTPro, opened a group (folder) and selected all 9 files in the folder. These were a mix of PDF+Text, RTFD, Bookmark, HTML, and RTF files. I ran the script and it saved the results as a .tsv file to my desktop. I then dragged that file into TB, making sure I didn’t add it to an adornment that assigns the prototype, etc. It created a group containing the nine TB notes. Only the topmost (first selected) file from DTPro had the metadata when checked in TB. The others had the fields, but nothing was entered into them. I don’t know if this helps at all, so don’t worry about it if it doesn’t. I’ve got to keep plugging away at my work anyway.
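One way to narrow this down might be to check whether the .tsv file itself has values in every row, which would show whether the gap comes from the script or from the import. The sketch below assumes a tab-separated file with a header row; the function name and the sample columns are hypothetical and not taken from the actual script.

```python
import csv
import io


def rows_missing_values(tsv_text: str) -> dict[int, list[str]]:
    """Map each data row number (1-based) to the header names whose cells are empty or absent."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    problems = {}
    for row_number, row in enumerate(reader, start=1):
        # Pad short rows so that absent trailing cells count as empty.
        padded = row + [""] * (len(header) - len(row))
        missing = [name for name, cell in zip(header, padded) if not cell.strip()]
        if missing:
            problems[row_number] = missing
    return problems


# Hypothetical sample: the second data row has a name but no metadata.
sample = (
    "Name\tURL\tTags\n"
    "First\tx-devonthink-item://EXAMPLE-1\tresearch\n"
    "Second\t\t\n"
)

print(rows_missing_values(sample))
```

If every row after the first shows up here, the metadata never made it into the file and the script is the place to look; if nothing is reported, the file is complete and the problem is on the import side.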