Is it possible to parse and manipulate Tinderbox's XML using Ruby?

I am curious to know if anyone here has experience accessing and manipulating notes and notes’ attributes using Ruby – in particular, using the Nokogiri gem (but not necessarily). I started playing around with this but didn’t make it far. The documentation isn’t particularly helpful, so I only managed to access some attributes, such as name and text, but not the other ones. I fear I might not be fully grasping Tbx’s XML structure and how it is supposed to work.

A minimal example:


#!/usr/bin/env ruby
# frozen_string_literal: false

Encoding.default_external = Encoding::UTF_8

require 'nokogiri'

file_path = ""
doc = Nokogiri::XML(File.open(file_path))

tbx = doc.children
items = tbx.xpath('//item')
p items[10] # just a random note, see below the result I am getting.

# attribute values
# items[0].xpath('//attribute').children
Random note
#<Nokogiri::XML::ElementElement:0x410 name="item" attributes=[
#<Nokogiri::XML::ElementAttr:0x3c name="ID" value="1624131850">, 
#<Nokogiri::XML::ElementAttr:0x50 name="Creator" value="Bernardo Vasconcelos">] children=[
#<Nokogiri::XML::ElementText:0x64 "\n">, 
#<Nokogiri::XML::ElementElement:0xa0 name="attribute" attributes=[
#<Nokogiri::XML::ElementAttr:0x78 name="name" value="Created">] children=[
#<Nokogiri::XML::ElementText:0x8c "2021-06-19T15:17:51-03:00">]>, 
#<Nokogiri::XML::ElementText:0xb4 "\n">, 
#<Nokogiri::XML::ElementElement:0xf0 name="attribute" attributes=[
#<Nokogiri::XML::ElementAttr:0xc8 name="name" value="DominantLanguage">] children=[
#<Nokogiri::XML::ElementText:0xdc "pt">]>, 
#<Nokogiri::XML::ElementText:0x104 "\n">, 
#<Nokogiri::XML::ElementElement:0x140 name="attribute" attributes=[
#<Nokogiri::XML::ElementAttr:0x118 name="name" value="Modified">] children=[
#<Nokogiri::XML::ElementText:0x12c "2021-06-19T15:17:51-03:00">]>, 
#<Nokogiri::XML::ElementText:0x154 "\n">, 
#<Nokogiri::XML::ElementElement:0x190 name="attribute" attributes=[
#<Nokogiri::XML::ElementAttr:0x168 name="name" value="MyString">] children=[
#<Nokogiri::XML::ElementText:0x17c "/DA/DA I/DA I 2 Opiniões dos predecessores/Predecessores/">]>, 
#<Nokogiri::XML::ElementText:0x1a4 "\n">, 
#<Nokogiri::XML::ElementElement:0x1e0 name="attribute" attributes=[
#<Nokogiri::XML::ElementAttr:0x1b8 name="name" value="NLOrganizations">] children=[
#<Nokogiri::XML::ElementText:0x1cc "Alma">]>, 
#<Nokogiri::XML::ElementText:0x1f4 "\n">, 
#<Nokogiri::XML::ElementElement:0x230 name="attribute" attributes=[
#<Nokogiri::XML::ElementAttr:0x208 name="name" value="NLTags">] children=[
#<Nokogiri::XML::ElementText:0x21c "κίνησις">]>, 
#<Nokogiri::XML::ElementText:0x244 "\n">, 
#<Nokogiri::XML::ElementElement:0x280 name="attribute" attributes=[
#<Nokogiri::XML::ElementAttr:0x258 name="name" value="Name">] children=[
#<Nokogiri::XML::ElementText:0x26c "Heráclito">]>, 
#<Nokogiri::XML::ElementText:0x294 "\n">, 
#<Nokogiri::XML::ElementElement:0x2d0 name="attribute" attributes=[
#<Nokogiri::XML::ElementAttr:0x2a8 name="name" value="SelectionCount">] children=[
#<Nokogiri::XML::ElementText:0x2bc "17">]>, 
#<Nokogiri::XML::ElementText:0x2e4 "\n">, 
#<Nokogiri::XML::ElementElement:0x320 name="attribute" attributes=[
#<Nokogiri::XML::ElementAttr:0x2f8 name="name" value="Xpos">] children=[
#<Nokogiri::XML::ElementText:0x30c "4">]>, 
#<Nokogiri::XML::ElementText:0x334 "\n">, 
#<Nokogiri::XML::ElementElement:0x370 name="attribute" attributes=[
#<Nokogiri::XML::ElementAttr:0x348 name="name" value="Ypos">] children=[
#<Nokogiri::XML::ElementText:0x35c "3.5575">]>, 
#<Nokogiri::XML::ElementText:0x384 "\n">, 
#<Nokogiri::XML::ElementElement:0x3ac name="text" children=[
#<Nokogiri::XML::ElementText:0x398 " Alma é o vapor. \n O que está em movimento percebe o que está em movimento. ">]>, 
#<Nokogiri::XML::ElementText:0x3c0 "\n">, 
#<Nokogiri::XML::ElementElement:0x3e8 name="rtfd" children=[
#<Nokogiri::XML::ElementText:0x3d4 "cnRmZAAAAAADAAAAAgAAAAcAAABUWFQucnRmAQAAAC4gAgAAKwAAAAEAAAAYAgAAe1xydGYxXGFu\nc2lcYW5zaWNwZzEyNTJcY29jb2FydGYyNjM2Clxjb2NvYXRleHRzY2FsaW5nMFxjb2NvYXBsYXRm\nb3JtMHtcZm9udHRibFxmMFxmbmlsXGZjaGFyc2V0MCBHZW50aXVtUGx1czt9CntcY29sb3J0Ymw7\nXHJlZDI1NVxncmVlbjI1NVxibHVlMjU1O1xyZWQxOTlcZ3JlZW4xOThcYmx1ZTE4NztccmVkMjU1\nXGdyZWVuMjU1XGJsdWUyNTU7fQp7XCpcZXhwYW5kZWRjb2xvcnRibDs7XGNzcHRocmVlXGM4MTg5\nMVxjODE1ODJcYzc4MzcxO1xjc3B0aHJlZVxjMTAwMDAwXGMxMDAwMDBcYzEwMDAwMDt9ClxwYXJk\nXHR4NTYwXHR4MTEyMFx0eDE2ODBcdHgyMjQwXHR4MjgwMFx0eDMzNjBcdHgzOTIwXHR4NDQ4MFx0\neDUwNDBcdHg1NjAwXHR4NjE2MFx0eDY3MjBccGFyZGlybmF0dXJhbFxwYXJ0aWdodGVuZmFjdG9y\nMAoKXGYwXGZzNDggXGNmMiAgQWxtYSBcJ2U5IG8gdmFwb3IuIApcZnMzMiBcY2YzIFwKClxmczQ4\nIFxjZjIgIE8gcXVlIGVzdFwnZTEgZW0gbW92aW1lbnRvIHBlcmNlYmUgbyBxdWUgZXN0XCdlMSBl\nbSBtb3ZpbWVudG8uIH0BAAAAIwAAAAEAAAAHAAAAVFhULnJ0ZhAAAADXprNhtgEAAAAAAAAAAAAA\n\n">]>, 
#<Nokogiri::XML::ElementText:0x3fc "\n">]>
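One thing that may explain the difficulty: an XPath that starts with `//` searches the whole document, even when it is called on a single node. Querying relative to the item returns only that note's own attributes. Below is a minimal sketch of the idea. It uses REXML (which ships with Ruby) instead of Nokogiri so it runs without installing a gem; with Nokogiri, the equivalent relative query would be `item.xpath('./attribute')`. The sample XML is trimmed from the dump above, and the helper name `note_attributes` is only illustrative.

```ruby
require 'rexml/document'

# Sample trimmed from the dump above; a real .tbx file has many more elements.
SAMPLE = <<~XML
  <tinderbox>
    <item ID="1624131850" Creator="Bernardo Vasconcelos">
      <attribute name="Name">Heráclito</attribute>
      <attribute name="NLTags">κίνησις</attribute>
      <attribute name="Xpos">4</attribute>
      <text>Alma é o vapor.</text>
      <item ID="2">
        <attribute name="Name">Child note</attribute>
      </item>
    </item>
  </tinderbox>
XML

# Collect the <attribute> children of one <item> into a Hash.
# The path 'attribute' is evaluated relative to the item, so the
# attributes of nested items are not included.
def note_attributes(item)
  item.get_elements('attribute').to_h do |attr|
    [attr.attributes['name'], attr.text.to_s]
  end
end

doc = REXML::Document.new(SAMPLE)
first_item = REXML::XPath.first(doc, '//item')
attrs = note_attributes(first_item)

puts attrs['Name']
puts attrs['NLTags']
```

Once the values are in a Hash, any attribute present in the XML can be looked up by name, for example `attrs['Xpos']`.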

Salvete,

I found an interesting example. Inspired by this script for R (in Zettelkasten compatibility with Markdown apps - #5 by jpl), I managed to extract the main attributes from the notes. I still could not, however, get some of them, such as $Container and $Path.

#!/usr/bin/env ruby
# frozen_string_literal: false

# bcdav 2021-12-11-23-36

Encoding.default_external = Encoding::UTF_8

require 'nokogiri'
require 'terminal-table'

doc = Nokogiri::XML(File.open('path/to/file'))

links = doc.xpath('//link')
links_rows = []

parse_links = true

if parse_links == true
  links.each do |node|
    links_rows << [node.attr('name'), node.attr('sourceid'), node.attr('destid'), node.attr('sstart'), node.attr('slen')]
  end

  headings = %w[name sourceid destid sstart slen]
  links_table = Terminal::Table.new rows: links_rows, headings: headings

  puts links_table

end

# parse_notes = false
parse_notes = true
elems = doc.xpath('//tinderbox//item')
notes_rows = []

if parse_notes == true

  elems.each do |node|
    id = node.attr('ID') || ''
    # "./" limits each lookup to this item's own children; ".//" would also
    # match nested notes, and "//" would search the whole document.
    # "&." avoids a NoMethodError when the element is missing.
    name = node.at('./attribute[@name="Name"]')&.text || ''
    nl_tags = node.at('./attribute[@name="NLTags"]')&.text || ''
    my_string = node.at('./attribute[@name="MyString"]')&.text || ''
    text = node.at('./text')&.text || ''
    prototype = node.attr('proto') || ''
    # p path = node.css_path

    notes_rows << [id, name, nl_tags, my_string, text, prototype]
  end

  # Print one semicolon-separated line per note.
  notes_rows.each { |row| puts row.join(';') }

end

Hopefully, this will be useful to someone at some point. Not that I have seen many Ruby enthusiasts here, though.
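On $Container and $Path: the dump earlier in the thread shows no such attribute elements, so they are probably derived from the outline position rather than stored. If that is right, and if nested `<item>` elements mirror the outline (an assumption), a path can be rebuilt by walking up the enclosing items. A rough sketch follows, using REXML (bundled with Ruby) so that it runs without gems. The helper names are made up for illustration. The exact format Tinderbox uses for these values may differ, and a real file may wrap everything in a root item that adds a leading segment.

```ruby
require 'rexml/document'

SAMPLE = <<~XML
  <tinderbox>
    <item ID="1">
      <attribute name="Name">DA</attribute>
      <item ID="2">
        <attribute name="Name">DA I</attribute>
        <item ID="3">
          <attribute name="Name">Predecessores</attribute>
        </item>
      </item>
    </item>
  </tinderbox>
XML

# Name of a note, read from its own <attribute name="Name"> child.
def note_name(item)
  name_attr = item.elements.find do |child|
    child.name == 'attribute' && child.attributes['name'] == 'Name'
  end
  name_attr ? name_attr.text.to_s : ''
end

# Walk up through the enclosing <item> elements, collecting names root-first.
def note_path(item)
  names = []
  node = item
  while node.is_a?(REXML::Element) && node.name == 'item'
    names.unshift(note_name(node))
    node = node.parent
  end
  "/#{names.join('/')}"
end

# The container is taken to be the path of the enclosing note ("/" at top level).
def note_container(item)
  parent = item.parent
  parent.is_a?(REXML::Element) && parent.name == 'item' ? note_path(parent) : '/'
end

doc = REXML::Document.new(SAMPLE)
REXML::XPath.each(doc, '//item') do |item|
  puts [item.attributes['ID'], note_path(item), note_container(item)].join("\t")
end
```

With Nokogiri the same walk should be possible through `node.parent`, stopping once the parent is no longer an `item` element.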


I have been working on a new Ruby class that should now be useful enough to be worth sharing.

This Ruby script will parse attributes, notes, and links belonging to a Tinderbox document. It can be used, for instance, to back up notes (with all attribute values) to a TSV file. (Wiki Links – [[link]] – can optionally be added to the text.) This could be useful to recover data from files that refuse to open.
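To give a feel for the kind of export described here, below is a small sketch. It is not the class from the gist. It uses REXML (bundled with Ruby) rather than Nokogiri so it can run without gems, and the names `tsv_cell`, `child_text`, and `notes_to_tsv` are invented for this example. The link attributes (`sourceid`, `destid`) are the ones used in the earlier script; where the `<link>` elements sit in the sample is an assumption.

```ruby
require 'rexml/document'

SAMPLE = <<~XML
  <tinderbox>
    <item ID="1">
      <attribute name="Name">Heráclito</attribute>
      <attribute name="NLTags">κίνησις</attribute>
      <text>Alma é o vapor.</text>
    </item>
    <item ID="2">
      <attribute name="Name">Alma</attribute>
    </item>
    <link name="note" sourceid="1" destid="2" />
  </tinderbox>
XML

# Characters that would break a tab-separated row, and their escaped forms.
TSV_ESCAPES = { "\t" => '\t', "\n" => '\n', "\r" => '\r', '\\' => '\\\\' }.freeze

def tsv_cell(value)
  value.to_s.gsub(/[\t\n\r\\]/, TSV_ESCAPES)
end

# Text of the first direct child with the given element name
# (and, optionally, the given name="..." attribute); '' when missing.
def child_text(item, element_name, attr_name = nil)
  match = item.elements.find do |child|
    child.name == element_name &&
      (attr_name.nil? || child.attributes['name'] == attr_name)
  end
  match ? match.text.to_s : ''
end

# Build a TSV string with one row per note: ID, the requested attributes, Text.
def notes_to_tsv(doc, columns, wiki_links: false)
  items = REXML::XPath.match(doc, '//item')
  names = items.to_h { |i| [i.attributes['ID'], child_text(i, 'attribute', 'Name')] }
  links = REXML::XPath.match(doc, '//link').group_by { |l| l.attributes['sourceid'] }

  rows = items.map do |item|
    id = item.attributes['ID']
    text = child_text(item, 'text')
    if wiki_links
      # Append one [[Name]] per outgoing link whose destination is known.
      targets = links.fetch(id, []).map { |l| names[l.attributes['destid']] }.compact
      text += targets.map { |t| " [[#{t}]]" }.join
    end
    values = columns.map { |c| child_text(item, 'attribute', c) }
    [id, *values, text].map { |v| tsv_cell(v) }.join("\t")
  end
  [['ID', *columns, 'Text'].join("\t"), *rows].join("\n")
end

doc = REXML::Document.new(SAMPLE)
puts notes_to_tsv(doc, %w[Name NLTags], wiki_links: true)
```

Tabs and newlines inside a note are written as `\t` and `\n` so that each note stays on one row; whatever reads the file back would need to reverse that.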

I might add the option to retrieve other elements, such as link types, preferences, filters, colors, macros, badges, windows, searches, and gallery. But this is for a later time. All feedback and suggestions are welcome.


Dependencies

This was developed using Ruby 3.1, and it relies on Nokogiri.

I recommend installing Ruby with rbenv via Homebrew.

Homebrew

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

Rbenv

brew install rbenv
echo 'export PATH="$HOME/.rbenv/bin:$PATH"' >> ~/.zshrc
echo 'eval "$(rbenv init - zsh)"' >> ~/.zshrc

Ruby 3.1

Restart the terminal, install version 3.1.0 and, optionally, set it as the global Ruby version for your system.

rbenv install 3.1.0
rbenv global 3.1.0

Nokogiri

Finally, install Nokogiri:

$ gem install nokogiri

Edited: Thanks, @WAKAMATSU, for spotting an error here. This was previously written as Nokogiri, whereas all the letters should be in lowercase.


Github gist


Wow! Gotta try this :slight_smile:

@bmgphd this might be a stellar option for you to fly a “spreadsheet-type” TSV file in and out of Tinderbox.

Caveat - use with care. Version control becomes critical in these scenarios - it’s possible to mess up months of work if an earlier file overwrites important recently-added content.

I’d think Tinderbox’s built-in TSV/CSV import would likely be easier. What am I missing?

Easier than what exactly? I ask because there is nothing in the script about importing notes, only exporting. The built-in import is ideal; the built-in export to TSV/CSV, on the other hand, is less so.

To be clear - I’d been referring to the export TBX>TSV aspect, combined with Tinderbox’s own TSV/CSV import.

In the meantime, I also found these - it’s along the lines of advice given to me last Sunday on creating a CSV/TSV export template - and will experiment accordingly:


Love your script, it works perfectly!


Dear Bernardo Vasconcelos,
Thank you for teaching us how to use your valuable Ruby.
I will definitely give it a try.
Please forgive my meddling.
I am a little concerned about a minor detail:
[gem install Nokogiri] is not correct.
It appears in the installation instructions (using brew, etc.).
The gem name is not capitalized.
If you do not write “nokogiri” starting with a lowercase letter, the following error message will come up:
– Message Quote from Terminal
Could not find a valid gem ‘Nokogiri’ (>= 0) in any repository
– End Quote
If anyone is going to retry or follow up on your suggestion here,
do not forget to write “gem install nokogiri”.
With kind regards, WAKAMATSU


Dear Bernardo Vasconcelos,
Thank you for making a correction.
BTW,
– qte
Installing Nokogiri - Nokogiri
MacOS
If you’re using homebrew:
brew install libxml2 libxslt
gem install nokogiri --platform=ruby -- --use-system-libraries
– unqte

There is a note about installing Nokogiri like this.
Is it OK to simply install a single package in your case?

I am not familiar with Ruby myself, so I do not know much about it.
I am learning Tinderbox-Ruby.rb as you suggested, but I do not really know how to use it.
I am wondering if the reason why it does not work is due to “lack of packages”?
I would be very grateful if you could explain in more detail how to use your Tinderbox-Ruby.rb.
Yesterday’s experiment failed, so I’ll try again today.
I had already run
brew install libxml2 libxslt and
gem install nokogiri --platform=ruby -- --use-system-libraries
and re-ran the script.
I also tried $ brew link --force libxml2 libxslt libiconv, etc.
Now I will try again this evening.
Faithfully, WAKAMATSU

Try installing rbenv first, then Ruby 3.1, then nokogiri, as I outlined in the previous post. If that does not work, just try sudo gem install nokogiri and it should work.
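When it is unclear whether the gem ended up in the Ruby that actually runs the script (the system Ruby versus the rbenv one), a quick check along these lines may help. This is only a diagnostic sketch; `gem_available?` is a made-up helper name, and it relies on RubyGems' `Gem::Specification.find_all_by_name`.

```ruby
require 'rbconfig'

# Report whether a gem can be found by the Ruby that is running this script.
# Gem names are case-sensitive, so 'nokogiri' and 'Nokogiri' are different.
def gem_available?(name)
  Gem::Specification.find_all_by_name(name).any?
end

puts "Ruby #{RUBY_VERSION} at #{RbConfig.ruby}"
puts "nokogiri found: #{gem_available?('nokogiri')}"
```

If the path printed is not the rbenv Ruby, the gem was probably installed for a different interpreter than the one running the script.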