Reading plaintext data from file into Tinderbox

The recent investigation of re-working imported RIS data (due to weaknesses in format specifications) had me thinking more widely. I can drag/drop plaintext (text or data) files into Tinderbox and, generally, these arrive with $Name set to the filename (including the extension) and $Text set to the file contents.

But if we don’t want to (or can’t easily) drag and drop, e.g. when working on a small laptop screen on a train or plane, it might help to use the source file’s path (via $File) to get the data. By adding $File to a note’s Displayed Attributes we can use a macOS File Chooser dialog to find our target file and store its OS path, and then work on/with it using a function:

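function fReadFileTextToNote(iPath:string){

	// copy the file's contents to the clipboard, then read them back
	runCommand("cat '"+iPath+"' | pbcopy");
	$Text = runCommand("pbpaste");
	// the note title is the last item of the path, i.e. the filename
	$Name = iPath.split("/").at(-1);

}; // END FUNCTION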

Called by code (e.g. in a stamp, rule, etc.):

fReadFileTextToNote($File);

Note that as the function returns nothing, we don’t need a left-side object to hold a result. So far, so good, but whilst the actual filename is a good long-term note title ($Name), the file extension isn’t really meaningful. We can thus improve our function, even if at the ‘cost’ of needing one extra input argument:

function fReadFileTextToNote(iPath:string, iNoExt:boolean){

	// the last item of the path is the filename
	var:string vFilename = iPath.split("/").at(-1);

	if(iNoExt){
		// keep only the text before the first period
		vFilename = vFilename.split("\.").at(0);
	}

	runCommand("cat '"+iPath+"' | pbcopy");
	$Text = runCommand("pbpaste");
	$Name = vFilename;

}; // END FUNCTION

Called thus:

// true boolean to remove filename extension
fReadFileTextToNote($File, true);
// false to retain
fReadFileTextToNote($File, false);
// or use a stored boolean
fReadFileTextToNote($File,$MyBoolean);
// or variable
fReadFileTextToNote($File,vDropExt);
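
For comparison, here is a rough shell sketch of the same filename handling. The path is hypothetical and only for illustration. It also shows one thing to watch: splitting on a period and taking item 0 keeps only the text before the first period, which is not the same as removing just the final extension when a filename contains several periods.

```shell
# Hypothetical path, used only for illustration.
path="/Users/myname/Documents/notes.v2.txt"

# Last path component, comparable to iPath.split("/").at(-1)
filename=${path##*/}

# Text before the FIRST period, comparable to vFilename.split("\.").at(0)
first_dot=${filename%%.*}

# Alternative: strip only the LAST extension
last_ext=${filename%.*}

printf '%s\n' "$filename" "$first_dot" "$last_ext"
```

Which of the two behaviours is wanted depends on how the source files are named; for simple names like 'report.txt' both give the same result.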

After all that, so what? I think the possible use of this pattern is more to do with making it easier to get the data and do things with it before the $Text is written. IOW, use this pattern as a basis for grabbing the contents of a plain-text based file (TXT, XML, HTML, OPML, CSV, Tab-delim, etc.) to do something with it en route to forming a note’s $Text, or even just to populate attribute(s).

Thus we might have a function along these lines:

function fReadTextFileContents(iPath:string){

	// read the file's contents into a variable via the clipboard
	runCommand("cat '"+iPath+"' | pbcopy");
	var:string vSource = runCommand("pbpaste");
	// do whatever is needed with the imported data ...

}; // END FUNCTION

Called by code (e.g. in a stamp, rule, etc.):

fReadTextFileContents($File);

In summary, this offers a simple pattern for reading a text file into Tinderbox action code using the command line, which will likely not be familiar to non-programmers. You could do this same task via AppleScript, for instance, or other automation methods: there is no single ‘right’ way.

Here, in a few lines of action code is something you can put in a library file against the day when you might need to do this very task, and build off it. So it is part of a solution as opposed to a full solution to a closed-ended task.

Above, I’ve used code reflecting some recent changes (as recently as the current v9.5.0). Thus I’ve included some notes below covering such requirements and also some of the operators used.


Notes

Tinderbox action code
Operators used:

  • runCommand() is a Tinderbox action code operator that lets you interact with the macOS Unix shell’s command line and do all sorts of neat stuff.
  • String.split(). This allows a single string to be split into a list using a regular expression [sic] or ‘regex’ pattern: note the matched pattern’s character(s) are deleted from the list items arising. Note the regex angle: this is why, to split on a period, the pattern is \. and not . as the latter, in regex terms, means 'any character', so something quite different!
  • List.at(). A way of calling a list item by sequential order number, noting that it is ‘zero-based’, meaning it numbers from zero and not one (i.e. list item #1 is .at(0), list item #2 is .at(1), etc.). Back in early computing days, starting at zero saved storage and thus cost/complexity. Nowadays such otherwise mathematically confusing conventions aren’t needed but the methods live on.
  • function requires v9.0.0+.
    If stuck on an older version, the code could be moved into a stamp. Note also that in such a case the boolean argument in the second function would need to be saved in a boolean attribute, e.g. $MyBoolean, and read from there.
    • Data-typing function arguments, as used above, needs Tinderbox v9.5.0+. Such typing is not always needed (I was using it mainly to show new techniques/features). As otherwise everything is passed in as a string, some extra in-function coding may be needed to ensure you get what you expect when using a function argument’s value within the function.

Unix Command Line
The code makes use of three Unix commands, two of which interact with the ‘pasteboard’, an old term for what most would now call the clipboard (i.e. for copy/paste work):

  • cat. A command line utility to concatenate data (though it has wider uses, as here). See documentation (via ss64.com).
  • pbcopy. A command line utility to copy data to the Mac’s (Unix shell) clipboard. See documentation (via ss64.com).
  • pbpaste. A command line utility to paste data from the Mac’s (Unix shell) clipboard. See documentation (via ss64.com).

Ha - I missed out the cat command. Now added to the Notes section in the post above.

@mwra Nice one! A minor simplification is that you can avoid the round trip through the clipboard (and overwriting the user’s last copied item) by storing the result of the first command:

        $Text = runCommand("cat '" + $File + "'");
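
The same idea seen from the shell side, as a minimal sketch: the output of cat is captured directly, with no pbcopy/pbpaste step in between, so the clipboard is left alone. The temporary file and its contents are made up purely for the demonstration.

```shell
# Create a throwaway file to read (illustrative content only).
tmpfile=$(mktemp)
printf 'line one\nline two\n' > "$tmpfile"

# Capture the file's contents directly; the clipboard is never touched.
content=$(cat "$tmpfile")

printf '%s\n' "$content"
rm -f "$tmpfile"
```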

Yes, this is very cool.

BTW, quick question: when I drag a file from my HD to $File, rather than getting the full path to the file, e.g. /Users/myname/…, I get ~/… Is there a way to have the Mac store the full and not the relative path when dragging in, or do we need to copy and paste the path?

It’s not a relative link (as in a web relative link), just a Unix alias. Both /Users/myname/ and ~/ do the same on any given Mac. Plus it’s more portable, as ~ will open the location in the current OS user’s home directory.

So even if you, for instance, have a different user account at home vs. work (or different workplaces) a ~/ path will point to the same place on the Mac you are on. It also makes it easy to make demos for other people as a TBX with $File ~/Downloads/demo.tbx would open the file ‘Demo.tbx’ in the Downloads folder of anyone using that TBX. Plus, your username—as seen in the ‘full’ path—remains private.

That’s the upside, anyway. What problem is the ~ path-shortening causing?

You could always create an agent which would canonicalize paths. The typical way to do this is to use readlink -f "$FilePath"; you can invoke it in the same way that @mwra used cat above. For example:

/tmp » readlink -f ~/Downloads
/Users/nick/Downloads
Terminal Tip

General tip for those new to the command line: You can learn more about the readlink command, and every other Terminal command, by typing man readlink at the Terminal command prompt. Substitute readlink with the name of the command you’re curious about.


I could not get your script to work. With the path in $File it would not open the file or paste the content to the clipboard. I get an error “…: No such file or directory.” It only works when I have the complete path.

BTW, what is the “-1” in this context?

See List.at(). It basically means the last item in the list. Negative numbers count back from the end of the list.

Cool, thanks.


One other thing to bear in mind is that Unix paths with spaces need quotes. This is because in Terminal-world a space is a delimiter between successive arguments. So, file ‘My Test.txt’ needs quotes when passed via the command line whilst ‘MyTest.txt’ does not.
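
A quick way to see this in Terminal is the sketch below. count_args is a throwaway helper defined just for this demonstration (it is not a standard command); it simply reports how many arguments it received.

```shell
# Hypothetical helper: prints the number of arguments it was given.
count_args() { echo "$#"; }

# Without quotes the shell splits the name at the space into two arguments.
unquoted=$(count_args My Test.txt)

# With quotes the whole name arrives as a single argument.
quoted=$(count_args 'My Test.txt')

echo "unquoted: $unquoted, quoted: $quoted"
```

This is why the functions above wrap the path in single quotes. A side effect is that a filename which itself contains a single quote would break that wrapping and need different handling.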

What was the actual path that was failing (if it’s safe to share)?

Ah, was your file on an external drive? ~ is the current user’s home directory, which will be on the boot drive. You see, /Users/myname/File.txt is actually, on most Macs, at Macintosh HD/Users/myname/File.txt. By comparison, a file on a mounted external drive will be at /Volumes/SomeDriveName/Path/File.txt. That said, a path like the latter shouldn’t parse to give a ~ shortening. But you may have a file name (or a folder in the path) that needs escaping in Unix. So, a ‘bad’ example might help.

This path does not work
cat: ~/MyGDrive/Doctorate/DBA2023/Transcripts/NAME_Cut_otter_ai.txt: No such file or directory

This one does work
/Users/myname/MyGDrive/Doctorate/DBA2023/Transcripts/Name_Cut_otter_ai.txt

There are no spaces in the path

No, my file is NOT on an external drive.


Thanks, doesn’t look to be a character-in-path problem.

Ah, I see ‘GDrive’. I wonder if the issue is, as with Dropbox, iCloud, etc., that the file might not literally be on your h/d but pulled from the cloud when needed. IOW, it might not be quite where the supposed path implies. But this cloudy stuff is beyond my immediate expertise.

GDrive is NOT Google. It is just a name.


My bad. OK, now I’m out of ideas. :slight_smile:

Note that the filename has different case in your example above.

Yes, that is not the issue. I change the name when posting here for confidentiality reasons.

It’s because tilde expansion doesn’t occur in quoted paths. The ~ character is shorthand for the user’s home directory in most Unix shells. It’s the shell that expands it into the full path, not commands like cat, so when it’s quoted (directing the shell to use the path exactly as given) it confuses the command.

Seems like canonicalizing the path first would be a good fix!
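
As an alternative to canonicalizing, here is a minimal shell sketch of expanding a leading ~ by hand before the path is quoted. The stored value is hypothetical, standing in for what $File might hold. The same substitution could in principle be done in action code before building the command string, though that part is an assumption and not something tested in this thread.

```shell
# Hypothetical stored value, with a literal tilde as $File might hold it.
stored='~/Downloads/demo.txt'

# Replace a leading ~ with the home directory; leave other paths untouched.
case "$stored" in
  '~')   expanded=$HOME ;;
  '~/'*) expanded=$HOME/${stored#??} ;;  # drop the leading "~/" (two characters)
  *)     expanded=$stored ;;
esac

echo "$expanded"
```

Once expanded, the path can safely be wrapped in quotes for cat, because it no longer relies on the shell to interpret the ~.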


How would one do this?

See my earlier post, here: Reading plaintext data from file into Tinderbox - #6 by ndpi
