Size (in bytes) of notes and .tbx


(Mitch) #1

Dear all,

I would like to get a better understanding of which notes contribute how much to the overall file size. Searching http://www.acrobatfaq.com/atbref7/aTbRefSiteMap.html for “size” didn’t help.

I have a file which is 2.2MB on disk. That looks reasonable, but as I tended to import a lot of .pdf files at the beginning, I would like to figure out whether there are some remnants taking up space which could be released.

Best regards
Michael


(Mark Anderson) #2

Under the hood, a TBX file is just plain-text XML. Within that, the biggest contributors to file size are whichever aspects add the most data. If you use a lot of embedded images or picture adornments, those will be adding data. For any given note or agent’s text ($Text), more text equals more data per note and (I’d assume) more, or more complex, RTF formatting of the text creates more data than text with little or no formatting.

In data terms, one note with lots of $Text won’t take more space than the same text divided across several smaller notes. However, your TBX may run more smoothly with the smaller notes. I think Tinderbox’s design started out thinking in terms of small notes, i.e. at worst it doesn’t optimise for large notes, as you can always split them into smaller, more manageable sizes.
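Because the TBX is plain-text XML, one rough way to see which notes carry the most data is to parse the file and measure each note element. The sketch below is only an illustration under stated assumptions: that notes are stored as nested `item` elements, and that a note's title sits in a child `attribute` element with `name="Name"`. Check both against your own file before relying on it. The function name `note_sizes` is made up for this example, and the sizes are approximate, since re-serialising the XML may not match the bytes on disk exactly.

```python
import xml.etree.ElementTree as ET


def note_sizes(xml_data, note_tag="item", name_attr="Name"):
    """Return (approx_bytes, label) pairs for each note element, largest first.

    Assumption: notes are <item> elements and the title is held in a child
    <attribute name="Name"> element. Adjust note_tag/name_attr if your file differs.
    """
    root = ET.fromstring(xml_data)
    results = []
    for elem in root.iter(note_tag):
        total = len(ET.tostring(elem, encoding="utf-8"))
        # Subtract directly nested notes so each note is charged only for its own data.
        nested = sum(
            len(ET.tostring(child, encoding="utf-8"))
            for child in elem.findall(note_tag)
        )
        # Fall back to the element's ID attribute (if any) when no title is found.
        label = elem.get("ID", "(unnamed)")
        for attr in elem.findall("attribute"):
            if attr.get("name") == name_attr and attr.text:
                label = attr.text
                break
        results.append((total - nested, label))
    return sorted(results, reverse=True)


# Tiny made-up document standing in for a real TBX file.
sample = (
    '<tinderbox><item ID="1"><attribute name="Name">Container</attribute>'
    '<item ID="2"><attribute name="Name">Long note</attribute>'
    "<text>" + "lorem ipsum " * 500 + "</text></item></item></tinderbox>"
)

for size, label in note_sizes(sample):
    print(f"{size:>8} bytes  {label}")
```

To try it on a real document, read a copy of the .tbx in binary mode and pass the bytes to `note_sizes`. Work on a copy, so the original file is never touched.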

Put in perspective, this is aTbRef’s source TBX:

[Image: atbref-counts1]

… is 10.8 MB on disk (1.9MB zipped). The TBX uses no images; they are stored externally, as they are used mainly with HTML (as per my original design, when embedding images was harder). The images folder of 261 items comes to 7.6MB, which zips to 7.4MB as the images are already well-optimised. If I added them to the TBX, I think it would likely add about 7-8 MB in size, noting that some images are re-used and therefore might need embedding in multiple places.
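The difference between the two zip results (text shrinking a lot, images barely at all) can be illustrated with a small sketch. It assumes random bytes are a fair stand-in for already-compressed image data, and it uses a deliberately repetitive XML-like string, so the text ratio it prints will be far better than the real 10.8 MB to 1.9 MB figure above. The helper name `compression_ratio` is made up for this example.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower means more savings)."""
    return len(zlib.compress(data, 9)) / len(data)


# Repetitive XML-like text: compresses very well.
xml_like = (
    b'<item><attribute name="Name">A note</attribute>'
    b"<text>Some text</text></item>\n" * 2000
)

# Random bytes as a stand-in for already-optimised image data: barely compresses.
image_like = os.urandom(len(xml_like))

print(f"XML-like text:   {compression_ratio(xml_like):.2f}")
print(f"image-like data: {compression_ratio(image_like):.2f}")
```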

However, please don’t misread the above as an argument against images. If you like them in your doc, please use them; that’s why they are supported.

[edit: typo]


(Mitch) #3

Hi Mark, that’s quite impressive how small aTbRef’s source TBX is. I am very satisfied with the snappiness and file size of my TBX here. I just wondered if there are some really large notes buried somewhere which I don’t need anymore. But sorting all notes by word count showed only a few larger ones in my TRASH container.