I have a working solution based on the following approach. Within DevonThink:
- Go through the PDF document and highlight the text passages of interest.
- Optionally (I do it most of the time), add a description of the highlighted text (I use this to set the note name).
- Use the Summarize Highlights menu function and select Sheet as the format to create a tabular summary of all highlights.
- Right-click on the newly created summary file and select Convert, then to Markdown.
What you should end up with is something similar to the text below, drawn from the same example case (UN Paris Agreement 2015).
UNParisAgreement Summary.txt (2.4 KB)
The file looks something like this in BBEdit.
Copy the text into the $Text of a Tinderbox note and apply Explode to create one note per line. Delete the first two of the new notes, which are the headers, and apply the following stamp to the rest of the notes:
var offset=0;
$MyList=$Text.split("\|");
$MyNumber=$MyList.count;
if($MyNumber==5){offset=1;};
$Body=$MyList.at(3);
if($MyNumber==6){$Title=$MyList.at(4);}else{$Title="Note";};
$MyString=$MyList.at(5-offset);
$MyString2=$MyString.skipTo("(");
$ReferenceURL=substr($MyString2,0,-1);
$Name=$Title; $Text=$Body;
Note the use of offset to address cases in which I’ve highlighted text but not assigned a description: the row then splits into 5 items instead of 6, so the link sits one position earlier. Also, for lack of a counter, I just give the note the title "Note" in such cases. I realise my code is not going to win any beauty contests here, and you are welcome to suggest improvements.
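For anyone who wants to check the field logic outside Tinderbox, here is a rough Python sketch of what the stamp does to one row. It is only an illustration under assumptions: the function name `parse_highlight_row`, the sample rows, and the contents of the first three columns are made up, and I assume empty cells are dropped so that a row without a description yields 5 items rather than 6.

```python
import re


def parse_highlight_row(row, default_title="Note"):
    """Mimic the stamp: split a table row on pipes and pick out body, title and URL."""
    # Split on pipes, trim whitespace, and drop empty cells (assumption: this is
    # why a row without a description ends up with 5 items instead of 6).
    cells = [cell.strip() for cell in row.split("|")]
    cells = [cell for cell in cells if cell]
    if len(cells) not in (5, 6):
        raise ValueError("expected 5 or 6 non-empty cells, got %d" % len(cells))

    # Same idea as the offset variable in the stamp.
    offset = 1 if len(cells) == 5 else 0
    body = cells[3]
    title = cells[4] if len(cells) == 6 else default_title
    link_cell = cells[5 - offset]

    # Take whatever sits between the first "(" and the next ")".
    match = re.search(r"\(([^)]*)\)", link_cell)
    url = match.group(1) if match else ""
    return {"name": title, "text": body, "url": url}


# Hypothetical sample rows; the real column contents may differ.
with_description = (
    "| UNParisAgreement | 3 | Highlight | Holding the increase well below 2 C "
    "| Temperature goal | [Page 3](x-devonthink-item://EXAMPLE?page=2) |"
)
without_description = (
    "| UNParisAgreement | 4 | Highlight | Parties aim to reach global peaking "
    "|  | [Page 4](x-devonthink-item://EXAMPLE?page=3) |"
)

print(parse_highlight_row(with_description))
print(parse_highlight_row(without_description))
```

The second row falls back to the title "Note", just as the stamp does when no description was assigned.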
It works almost flawlessly. Below is an extract. You’ll see that the stamp did not extract the information for the final note. I think this is linked to the semicolon in the body text; I’m not sure what the best way is to deal with such cases.
Below is the full TB file for reference. Let me know if you have any ideas to improve the solution and deal with special characters such as the semicolon.
ParisAgreementExample.tbx (131.8 KB)
11 March 2022: A final update to my procedure above. If I search for the character “;” and replace it with “-” in the text before exploding it into individual highlights, I get the desired output.
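If you would rather do this clean-up before pasting the text into Tinderbox, something along these lines should work. It is a minimal Python sketch, not part of my actual workflow: the function name `prepare_rows` and the sample text are invented for illustration, and I assume the first two non-empty lines of the Markdown file are the header row and the separator row.

```python
from typing import List


def prepare_rows(summary_markdown: str) -> List[str]:
    """Replace semicolons and return one string per highlight row."""
    # Swap ";" for "-" so the character cannot interfere once the text is in Tinderbox.
    cleaned = summary_markdown.replace(";", "-")
    # Keep non-empty lines only, then drop the two header lines.
    lines = [line for line in cleaned.splitlines() if line.strip()]
    return lines[2:]


# Hypothetical sample; the real summary file has different columns and content.
sample = """| Document | Page | Kind | Text | Description | Link |
| --- | --- | --- | --- | --- | --- |
| Doc | 1 | Highlight | First point; second point | Title A | [Page 1](x-devonthink-item://EXAMPLE?page=0) |
| Doc | 2 | Highlight | Another passage | Title B | [Page 2](x-devonthink-item://EXAMPLE?page=1) |
"""

for row in prepare_rows(sample):
    print(row)
```

Each returned row corresponds to one note after Explode, with the headers already removed.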
I consider the new approach to importing DevonThink notes an improvement over the method elaborated in 2020 found in this thread, as there is no intermediate step via a spreadsheet programme, no editing of attribute names, and no exporting of the document outside DevonThink.