Tinderbox Forum

Size (in bytes) of notes and .tbx

Dear all,

I would like to get a better understanding of which notes contribute how much to the overall file size. Looking at http://www.acrobatfaq.com/atbref7/aTbRefSiteMap.html for “size” didn’t help.

I have a file which is 2.2MB on disk. That looks reasonable, but as I tended to import a lot of PDFs at the beginning, I would like to figure out whether some remnants are still taking up space that could be released.

Best regards
Michael

Under the hood, a TBX file is just plain-text XML, so the biggest contributors to file size are whichever aspects add the most data. If you use a lot of embedded images or picture adornments, those will be adding data. For any given note or agent’s text ($Text), more text equals more data per note and (I’d assume) more, or more complex, RTF formatting of the text creates more data than text with little or no formatting.

In data terms, one note with lots of $Text won’t take more space than the same text divided across several smaller notes. However, your TBX may run more smoothly with the smaller notes. I think Tinderbox’s design started out thinking in terms of small notes, i.e. at worst it doesn’t optimise for large notes, as you can always split them into smaller, more manageable sizes.

Put in perspective, this is aTbRef’s source TBX:

[Image: atbref-counts1]

… is 10.8 MB on disk (1.9MB zipped). The TBX uses no images; these are stored externally as they are used mainly with HTML (as per my original design, when embedding images was harder). The images folder of 261 items comes to 7.6MB, which zips to 7.4MB as the images are already well-optimised. If I added them to the TBX I think it would likely add about 7-8 MB in size, noting that some images are re-used and therefore might need embedding in multiple places.
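
To illustrate why the images folder barely shrinks when zipped while the TBX itself shrinks a lot, here is a minimal Node.js sketch. It is not taken from aTbRef: the random bytes merely stand in for already-compressed image data, and the repeated XML line is a made-up stand-in for text-heavy TBX content.

```javascript
// Sketch: compare how well repetitive text and high-entropy data gzip.
// Random bytes stand in for compressed image data (an assumption).
const zlib = require('zlib');
const crypto = require('crypto');

// Ratio of gzipped size to original size (lower means better compression).
function gzipRatio(buffer) {
    return zlib.gzipSync(buffer).length / buffer.length;
}

// XML-like text with lots of repetition, as in a text-heavy TBX.
const xmlLike = Buffer.from(
    '<attribute name="Name" >Some note</attribute>\n'.repeat(2000)
);

// High-entropy bytes of the same length: a stand-in for PNG/JPEG payloads.
const imageLike = crypto.randomBytes(xmlLike.length);

console.log('text ratio :', gzipRatio(xmlLike).toFixed(3));
console.log('image ratio:', gzipRatio(imageLike).toFixed(3));
```

The text ratio should come out far below the image-like ratio, which stays close to 1. That is the same pattern as the 10.8 MB to 1.9MB versus 7.6MB to 7.4MB figures above.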

However, please don’t misread the above as an argument against images. If you like them in your doc, please use them, that’s why they are supported.


Hi Mark, that’s quite impressive how small aTbRef’s source TBX is. I am very satisfied with the snappiness and file size of my TBX here. I just wondered whether there are some really large notes buried somewhere which I don’t need anymore. But sorting all notes by word count showed only a few larger ones in my TRASH container.

I have a similar problem to solve. As part of maintaining my main work file I would like to track down those notes contributing the most to the file size. As far as I can make out, there is no system attribute that tracks this aspect of a note, e.g. no attribute such as $NoteSize or similar?

In my case, images in the $Text field will be the number one contributor to the file size. My goal is to track down those notes with high-resolution images inside and resample these.

Any tips on how to accomplish this are welcome.

If you think notes with embedded files† are the likely issue, then test $ImageCount.

† I did a quick test and adding a text PDF to a note raised $ImageCount by 1 so I’d assume ‘image’ actually refers to any embedded file.

Sorry, in case that was a bit terse as advice… an agent query:

$ImageCount > 0

…will find all notes with one or more embedded files (images, PDFs, etc.), which you can then check and edit as appropriate.

Thanks for the feedback - for my purposes this should do fine as I don’t have too many images to sift through and can check the size individually.

The functionality to track down large individual notes (or indeed sort by note size in bytes) is still a nice to have from my perspective.

The ‘size’ of the note is rather notional. For instance, a note is stored as an <item> with varying content depending on how many attributes use non-default values (defaults are inherited and therefore aren’t stored but derived on the fly). A note’s text is stored as both a plain-text <text> tag and a rich-text <rtfd> tag, etc…
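
As a rough illustration of that storage layout, here is a minimal sketch that splits the characters of a single leaf <item> into its attribute, <text> and <rtfd> parts. The fragment is a simplified, hypothetical stand-in for real TBX XML (real items carry more markup), and the function names are my own.

```javascript
// Sketch: break down the characters of one <item> by component.
// The fragment below is a simplified, hypothetical stand-in for TBX XML.
const sampleItem = [
    '<item ID="1">',
    '<attribute name="Name" >Meeting notes</attribute>',
    '<attribute name="Color" >red</attribute>',
    '<text>Short plain text.</text>',
    '<rtfd>cnRmZAAAAAADAAAA</rtfd>',
    '</item>'
].join('\n');

// Total length of all matches of a global pattern in a string.
const totalLength = (pattern, s) =>
    (s.match(pattern) || []).reduce((n, m) => n + m.length, 0);

// Character counts per component; 'other' is the remaining markup.
// Intended for a childless item: nested items would be counted too.
function itemBreakdown(itemXml) {
    const attributes = totalLength(
        /<attribute\b[^>]*>[\s\S]*?<\/attribute>/g, itemXml
    );
    const text = totalLength(/<text\b[^>]*>[\s\S]*?<\/text>/g, itemXml);
    const rtfd = totalLength(/<rtfd\b[^>]*>[\s\S]*?<\/rtfd>/g, itemXml);
    return {
        attributes,
        text,
        rtfd,
        other: itemXml.length - attributes - text - rtfd
    };
}

console.log(itemBreakdown(sampleItem));
```

In a real note with an embedded screenshot, the rtfd figure would be expected to dwarf the others, which is why a single ‘size’ number says little about what is actually taking the space.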

TL;DR, with respect, I think the ‘size’ concept is a flawed approach. As well as $ImageCount there is $WordCount, which gives you the word count of a note and can be used as a filter for large text-only notes. You can sort notes on $WordCount. Testing either or both of the two attributes gives a fairly good test for what is making a file large. An edge case is where the TBX, as in much of my research, uses little $Text but a lot of (user) attribute values. There, a good tip applies if you use action code to collect lists of note paths (in order to act on that list) and store them in a List or Set variable, e.g. $MyList: write the action so that it resets any such ‘variable’ list on completion. That saves a big (XML text) list being saved unnecessarily in the TBX.

As @mwra mentioned, images are stored inside <rtfd></rtfd> tags within the XML contents of a .tbx document. They are binary encoded: the pixels in the image are encoded into a string of ASCII characters. The number of characters (and thus the “size” of the Tinderbox document file) depends on the pixel dimensions, color depth, and other factors in the image. You can open the file in an XML editor such as Xmplify or the free version of BBEdit and search for the <rtfd></rtfd> tags, see how many lines in the file occur between the opening and closing tag, and get an idea of the “size” of the image.

Again, the image is merely rendered by Tinderbox when you look at the file in Tinderbox, otherwise the image is not really there – just the binary encoding is there.
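
Rather than counting lines by eye, the same check can be sketched in a few lines of JavaScript. This assumes the <rtfd> payload is Base64 (4 characters for every 3 bytes), which the thread does not state explicitly. The fragment and the function names are hypothetical.

```javascript
// Sketch: list the size of each <rtfd> block in TBX source text and
// estimate the decoded byte count, ASSUMING the payload is Base64.

// Encoded length for n raw bytes under standard padded Base64.
const base64Length = nBytes => 4 * Math.ceil(nBytes / 3);

// Approximate raw bytes for an encoded payload,
// ignoring whitespace and '=' padding.
const decodedBytes = encoded => {
    const chars = encoded.replace(/[\s=]/g, '').length;
    return Math.floor(chars * 3 / 4);
};

// Returns [{ offset, chars, approxBytes }] for every <rtfd> span.
function rtfdSpans(tbxSource) {
    const pattern = /<rtfd\b[^>]*>([\s\S]*?)<\/rtfd>/g;
    return Array.from(tbxSource.matchAll(pattern), m => ({
        offset: m.index,           // where the block starts in the file
        chars: m[1].length,        // characters between the tags
        approxBytes: decodedBytes(m[1])
    }));
}

// Hypothetical two-note fragment; real payloads are vastly longer.
const fragment =
    '<item><rtfd>QUJD</rtfd></item>' +
    '<item><rtfd>QUJDREVGR0g=</rtfd></item>';

console.log(base64Length(3000000));
console.log(rtfdSpans(fragment));
```

Under that assumption the encoded text is about a third larger than the raw image data, so sorting the spans by chars gives a quick ranking of the heaviest embeds.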

Sample of Tinderbox data in a .tbx XML file:

The file itself contains one note, with an image of 1200x800 pixel dimensions, and at 7.4MB is too large to load into this forum.


Are you concerned with this because your files are extremely large? Or because you need them to be smaller? Or just in terms of general hygiene?

A bit of large-file syndrome and a bit of hygiene. The file is about 110 MB and getting bigger, with (an assumption) figures taking most of the space.

I use TB for recording notes during videoconference meetings. Sometimes when a figure is displayed during on-line presentations I take a screenshot from the screen and use it within my notes as an illustration. The figure doesn’t need to be HiRes - only sufficient to recognise the point made and jog my memory when I revise my notes. If I have enough time I resample on-the-fly to decrease size. Sometimes meetings and my notes go fast and I simply dump the hi-res screenshot. I guess I’ve reached the point where I would like to revisit my notes and resample those hi-res images that I quickly dumped to TB in the last year.

OK. I think we can perhaps devise an approximate measure of the size that would provide an indicator. Getting the exact size, on the other hand, could be quite slow.

Anyone have a suggestion for the name of this attribute?

To avoid the ‘size’ word, perhaps $DocumentWeight, a Number-type attribute indicating the relative contribution of a note towards the overall TBX document size. Besides the number/size of per-note embeds and the text’s $WordCount, my hunch is that heavily styled text (IOW lots of differently styled passages of text) is probably a factor.

I’ve always felt the paste default, given Tinderbox’s role as an app, is the wrong way round. I’d much prefer that paste matched the clipboard data to the current note style. If as a user I want to ingest lots of likely unneeded style, e.g. just because that’s how the source looked, it seems better to make me deliberately ask for that. It would help avoid me ingesting styled data because I’m in a hurry or don’t know how to plain-paste it. Still, as regards such a choice, sadly that ship has already sailed.

An individual atom in TB is a note, so why not $NoteSize, which describes in a nutshell the purpose of the attribute?

For relative sizes (string length of the innerXML of each <item> in the .tbx), one could write an XQuery expression.

The following Keyboard Maestro macro, for example, prompts us to choose a tbx file, and then copies to clipboard a tab-indented text outline of that file’s note names, with each name followed by an integer giving a measure of note size:

Text outline with innerXML note sizes.kmmacros.zip (11.9 KB)

XQuery expression
declare function local:outlineFromTbxForest(
  $indent as xs:string,
  $forest as node()*
) as xs:string {
    if (fn:empty($forest)) then '' else
       string-join(
          for $item in $forest
          return concat(
            $indent, $item/attribute[@name='Name']/text(),
            '\t' , string(string-length($item)) ,'\n',
            local:outlineFromTbxForest(
                concat('\t', $indent),
                $item/item
            )
         ),
         ''
      )
};

local:outlineFromTbxForest(
  "", /*/item
)
JXA to call NSXML method .objectsForXQuery
(() => {
    'use strict';

    // Querying Tinderbox files with XQuery 1.0
    //
    // (Name, tab, XML string length) of each node
    //
    // Rob Trew 2019

    ObjC.import('AppKit');

    const
        strTitle = 'Outline of Note sizes',
        strDefaultFolder = '~/Desktop',
        strXQuery =`declare function local:outlineFromTbxForest(
  $indent as xs:string,
  $forest as node()*
) as xs:string {
    if (fn:empty($forest)) then '' else
       string-join(
          for $item in $forest
          return concat(
            $indent, $item/attribute[@name='Name']/text(),
            '\t' , string(string-length($item)) ,'\n',
            local:outlineFromTbxForest(
                concat('\t', $indent),
                $item/item
            )
         ),
         ''
      )
};

local:outlineFromTbxForest(
  "", /*/item
)`;
    // main :: IO ()
    const main = () =>
        either(
            msg => 'User cancelled.' !== msg ? (
                alert(strTitle)(msg)
            ) : msg,
            outlineString => (
                copyText(outlineString),
                alert('Copied to clipboard')(
                    'A text outline with breakdown of ' +
                    'innerXML note sizes\n\n' +
                    '(for ' + lines(outlineString)[0] + ' bytes.)'
                ),
                outlineString
            ),
            bindLR(
                bindLR(
                    pathChoiceLR(strDefaultFolder)(
                        'Choose TBX file'
                    )('public.xml'),
                    readFileLR
                ),
                xQueryLR(strXQuery)
            )
        );

    // NSXML XQuery ---------------------------------------

    // xQueryLR :: String -> String -> Either String String
    const xQueryLR = strXQuery => strXML => {
        const
            uw = ObjC.unwrap,
            e = $(),
            node = $.NSXMLDocument.alloc
            .initWithXMLStringOptionsError(
                strXML, 0, e
            );

        return bindLR(
            undefined !== uw(node) ? (
                Right(node)
            ) : Left(uw(e.localizedDescription)),
            oNode => {
                const
                    e = $(),
                    xs = uw(oNode.objectsForXQueryError(
                        strXQuery, e
                    ));
                return undefined !== uw(xs) ? (
                    Right(unlines(map(uw, xs)))
                ) : Left(uw(e.localizedDescription));
            }
        );
    };

    // JXA ------------------------------------------------

    // alert :: String => String -> IO String
    const alert = title => s => {
        const
            sa = Object.assign(Application('System Events'), {
                includeStandardAdditions: true
            });
        return (
            sa.activate(),
            sa.displayDialog(s, {
                withTitle: title,
                buttons: ['OK'],
                defaultButton: 'OK'
            }),
            s
        );
    };

    // String copied to general pasteboard
    // copyText :: String -> IO Bool
    const copyText = s => {
        const pb = $.NSPasteboard.generalPasteboard;
        return (
            pb.clearContents,
            pb.setStringForType(
                $(s),
                $.NSPasteboardTypeString
            ),
            s
        );
    };

    // String -> String -> Either String FilePath
    const pathChoiceLR = fpDefault => strPrompt => strType => {
        const sa = Application('System Events');
        try {
            sa.activate();
            return Right(
                (sa.includeStandardAdditions = true, sa)
                .chooseFile({
                    withPrompt: strPrompt,
                    ofType: strType,
                    defaultLocation: filePath(fpDefault)
                }).toString()
            );
        } catch (e) {
            return Left(e.message)
        }
    };


    // GENERIC FUNCTIONS ----------------------------------
    // https://github.com/RobTrew/prelude-jxa

    // Left :: a -> Either a b
    const Left = x => ({
        type: 'Either',
        Left: x
    });

    // Right :: b -> Either a b
    const Right = x => ({
        type: 'Either',
        Right: x
    });

    // bindLR (>>=) :: Either a -> (a -> Either b) -> Either b
    const bindLR = (m, mf) =>
        undefined !== m.Left ? (
            m
        ) : mf(m.Right);

    // either :: (a -> c) -> (b -> c) -> Either a b -> c
    const either = (fl, fr, e) =>
        'Either' === e.type ? (
            undefined !== e.Left ? (
                fl(e.Left)
            ) : fr(e.Right)
        ) : undefined;

    // filePath :: String -> FilePath
    const filePath = s =>
        ObjC.unwrap(ObjC.wrap(s)
            .stringByStandardizingPath);

    // identity :: a -> a
    const identity = x => x;

    // lines :: String -> [String]
    const lines = s => s.split(/[\r\n]/);

    // map :: (a -> b) -> [a] -> [b]
    const map = (f, xs) =>
        (Array.isArray(xs) ? (
            xs
        ) : xs.split('')).map(f);

    // readFileLR :: FilePath -> Either String String
    const readFileLR = fp => {
        const
            e = $(),
            uw = ObjC.unwrap,
            s = uw(
                $.NSString.stringWithContentsOfFileEncodingError(
                    $(fp)
                    .stringByStandardizingPath,
                    $.NSUTF8StringEncoding,
                    e
                )
            );
        return undefined !== s ? (
            Right(s)
        ) : Left(uw(e.localizedDescription));
    };

    // Abbreviation for quick testing - any 2nd arg interpreted as indent size

    // showLog :: a -> IO ()
    const showLog = (...args) =>
        console.log(
            args
            .map(JSON.stringify)
            .join(' -> ')
        );

    // sj :: a -> String
    function sj() {
        const args = Array.from(arguments);
        return JSON.stringify.apply(
            null,
            1 < args.length && !isNaN(args[0]) ? [
                args[1], null, args[0]
            ] : [args[0], null, 2]
        );
    }

    // unlines :: [String] -> String
    const unlines = xs => xs.join('\n');

    // MAIN ---
    return main();
})();
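
For anyone without Keyboard Maestro or JXA to hand, a rough cross-check is possible in plain JavaScript. The sketch below is not Rob's method: it scans the raw file text for <item> tags with a regular expression instead of a real XML parser, and measures each item's raw source span (markup and nested children included). XQuery's string-length on an element counts only its text content, so the two sets of numbers will differ somewhat. The sample fragment is hypothetical.

```javascript
// Sketch: measure each <item> by the length of its raw source span.
// Approximation only: tags are found by regex, so unusual markup
// (e.g. '>' inside an attribute value) would confuse it.
function itemSizes(tbxSource) {
    const tagPattern = /<item\b[^>]*>|<\/item>/g;
    // Assumes an item's own Name attribute precedes any child items.
    const namePattern =
        /<attribute\s+name="Name"[^>]*>([^<]*)<\/attribute>/;
    const open = [];      // stack of currently unclosed items
    const results = [];
    for (const m of tbxSource.matchAll(tagPattern)) {
        const tag = m[0];
        if (tag === '</item>') {
            const top = open.pop();
            if (!top) continue;   // unbalanced close tag: skip it
            const span = tbxSource.slice(top.start, m.index + tag.length);
            const found = namePattern.exec(span);
            results.push({
                name: found ? found[1] : '(unnamed)',
                depth: top.depth,
                size: span.length
            });
        } else if (!tag.endsWith('/>')) {
            // Opening tag (self-closing items are ignored).
            open.push({ start: m.index, depth: open.length });
        }
    }
    return results;
}

// Hypothetical fragment: one container holding one child note.
const sample =
    '<item ID="1"><attribute name="Name" >Parent</attribute>' +
    '<item ID="2"><attribute name="Name" >Child</attribute>' +
    '<text>hello</text></item>' +
    '</item>';

itemSizes(sample)
    .sort((a, b) => b.size - a.size)
    .forEach(({ name, size }) => console.log(size + '\t' + name));
```

Because a container's span includes its children, containers always sort above their contents. Filtering on the deepest items gives a fairer picture of individual notes.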

One could also write, quite readily, a macro to return the names and innerXML sizes of the N largest notes in the tbx.

The following draft of a JavaScript for Automation script:

  • Prompts for selection of a .tbx file
  • Displays, and copies to clipboard, a list of the N largest leaf-level <item> elements in the selected file.

For example, if we select TinderBox Help.tbx, it copies the following listing (in descending note size) to the clipboard:

- (29598197)   Tinderbox Help/Help/Release Notes/Tinderbox 7/7.5.0
- (3291681)    Tinderbox Help/Help/Release Notes/Tinderbox 6/6.5.0
- (104009)     Tinderbox Help/Help/Release Notes/Tinderbox 6/6.6.0
- (89598)      Tinderbox Help/Help/Release Notes/Tinderbox 7/7.0.0
- (84391)      Tinderbox Help/Help/Release Notes/8.0.0
- (50183)      Tinderbox Help/Help/Release Notes/Tinderbox 6/6.2.0
- (47579)      Tinderbox Help/Help/Release Notes/Tinderbox 7/7.2.0
- (43420)      Tinderbox Help/Help/Basic Concepts/Attributes
- (38874)      Tinderbox Help/Help/Release Notes/Tinderbox 6/6.0.1
- (31326)      Tinderbox Help/Help/Release Notes/Tinderbox 6/6.6.2
JS Source
(() => {
    'use strict';

    // Report N largest leaf notes (childless notes)

    // Prompts for selection of a TBX file,
    // and lists the 10 largest leaf-level notes
    // in the selected file.

    // Rob Trew 2019
    // Ver 0.01

    ObjC.import('AppKit');

    const intLargest = 10;

    const
        strTitle = 'Largest leaf notes in TBX file',
        strDefaultFolder = '~/Desktop',
        strXQuery = `declare function local:outlineFromTbxForest(
  $indent as xs:string,
  $forest as node()*
) as xs:string {
    if (fn:empty($forest)) then '' else
       string-join(
          for $item in $forest
          return concat(
            $indent, $item/attribute[@name='Name']/text(),
            '\t' , string(string-length($item)) ,'\n',
            local:outlineFromTbxForest(
                concat('\t', $indent),
                $item/item
            )
         ),
         ''
      )
};

local:outlineFromTbxForest(
  "", /*/item
)`;
    // main :: IO ()
    const main = () =>
        either(
            msg => 'User cancelled.' !== msg ? (
                alert(strTitle)(msg)
            ) : msg
        )(
            report => (
                copyText(report),
                alert('Copied to clipboard')(
                    'Names of ' + str(intLargest) +
                    ' largest leaf notes, with innerXML note sizes:\n\n' +
                    report
                ),
                report
            )
        )(
            bindLR(
                bindLR(
                    pathChoiceLR(strDefaultFolder)(
                        'Choose TBX file'
                    )('public.xml'),
                )(readFileLR)
            )(
                strXML => bindLR(xQueryLR(strXQuery)(strXML))(
                    s => {
                        // A forest in which each tree node
                        // is a dictionary of type
                        // { text :: String, size :: Int }
                        const
                            largerNotes = take(intLargest)(
                                sortBy(
                                    flip(comparing(x => x.size))
                                )(leafPaths(Node('')(
                                    forestFromLineIndents(
                                        indentLevelsFromLines(lines(s))
                                    ).map(fmapTree(strLabel => {
                                        const tokens = strLabel.split(/\t/);
                                        return {
                                            text: tokens[0],
                                            size: parseInt(tokens[1])
                                        };
                                    }))
                                )))
                            ),
                            w = 4 + str(largerNotes[0].size).length;
                        return Right(unlines(map(
                            x => '- (' + justifyLeft(w)(' ')(
                                str(x.size) + ')'
                            ) + x.text
                        )(largerNotes)));
                    }
                )
            )
        );

    // NSXML XQuery ---------------------------------------

    // xQueryLR :: String -> String -> Either String String
    const xQueryLR = strXQuery => strXML => {
        const
            uw = ObjC.unwrap,
            e = $(),
            node = $.NSXMLDocument.alloc
            .initWithXMLStringOptionsError(
                strXML, 0, e
            );

        return bindLR(
            undefined !== uw(node) ? (
                Right(node)
            ) : Left(uw(e.localizedDescription))
        )(
            oNode => {
                const
                    e = $(),
                    xs = uw(oNode.objectsForXQueryError(
                        strXQuery, e
                    ));
                return undefined !== uw(xs) ? (
                    Right(unlines(map(uw)(xs)))
                ) : Left(uw(e.localizedDescription));
            }
        );
    };

    // JXA ------------------------------------------------

    // alert :: String => String -> IO String
    const alert = title => s => {
        const
            sa = Object.assign(Application('System Events'), {
                includeStandardAdditions: true
            });
        return (
            sa.activate(),
            sa.displayDialog(s, {
                withTitle: title,
                buttons: ['OK'],
                defaultButton: 'OK'
            }),
            s
        );
    };

    // String copied to general pasteboard
    // copyText :: String -> IO Bool
    const copyText = s => {
        const pb = $.NSPasteboard.generalPasteboard;
        return (
            pb.clearContents,
            pb.setStringForType(
                $(s),
                $.NSPasteboardTypeString
            ),
            s
        );
    };

    // String -> String -> Either String FilePath
    const pathChoiceLR = fpDefault => strPrompt => strType => {
        const sa = Application('System Events');
        try {
            sa.activate();
            return Right(
                (sa.includeStandardAdditions = true, sa)
                .chooseFile({
                    withPrompt: strPrompt,
                    ofType: strType,
                    defaultLocation: filePath(fpDefault)
                }).toString()
            );
        } catch (e) {
            return Left(e.message)
        }
    };


    // GENERIC FUNCTIONS ----------------------------------
    // https://github.com/RobTrew/prelude-jxa

    // Left :: a -> Either a b
    const Left = x => ({
        type: 'Either',
        Left: x
    });

    // Right :: b -> Either a b
    const Right = x => ({
        type: 'Either',
        Right: x
    });

    // Node :: a -> [Tree a] -> Tree a
    const Node = v => xs => ({
        type: 'Node',
        root: v, // any type of value (consistent across tree)
        nest: xs || []
    });

    // Tuple (,) :: a -> b -> (a, b)
    const Tuple = a => b => ({
        type: 'Tuple',
        '0': a,
        '1': b,
        length: 2
    });

    // bindLR (>>=) :: Either a -> (a -> Either b) -> Either b
    const bindLR = m => mf =>
        undefined !== m.Left ? (
            m
        ) : mf(m.Right);

    // comparing :: (a -> b) -> (a -> a -> Ordering)
    const comparing = f =>
        x => y => {
            const
                a = f(x),
                b = f(y);
            return a < b ? -1 : (a > b ? 1 : 0);
        };

    // concat :: [[a]] -> [a]
    // concat :: [String] -> String
    const concat = xs =>
        0 < xs.length ? (() => {
            const unit = 'string' !== typeof xs[0] ? (
                []
            ) : '';
            return unit.concat.apply(unit, xs);
        })() : [];

    // compose (<<<) :: (b -> c) -> (a -> b) -> a -> c
    const compose = (...fs) =>
        x => fs.reduceRight((a, f) => f(a), x);

    // div :: Int -> Int -> Int
    const div = x => y => Math.floor(x / y);

    // either :: (a -> c) -> (b -> c) -> Either a b -> c
    const either = fl => fr => e =>
        'Either' === e.type ? (
            undefined !== e.Left ? (
                fl(e.Left)
            ) : fr(e.Right)
        ) : undefined;

    // filePath :: String -> FilePath
    const filePath = s =>
        ObjC.unwrap(ObjC.wrap(s)
            .stringByStandardizingPath);

    // Lift a simple function to one which applies to a tuple,
    // transforming only the first item of the tuple

    // firstArrow :: (a -> b) -> ((a, c) -> (b, c))
    const firstArrow = f =>
        // A simple function lifted to one which applies
        // to a tuple, transforming only its first item.
        xy => Tuple(f(xy[0]))(
            xy[1]
        );

    // flip :: (a -> b -> c) -> b -> a -> c
    const flip = f =>
        x => y => f(y)(x);

    // fmapTree :: (a -> b) -> Tree a -> Tree b
    const fmapTree = f => tree => {
        const go = node => Node(f(node.root))(
            node.nest.map(go)
        );
        return go(tree);
    };

    // foldl1 :: (a -> a -> a) -> [a] -> a
    const foldl1 = f => xs =>
        1 < xs.length ? xs.slice(1)
        .reduce(uncurry(f), xs[0]) : xs[0];

    // foldTree :: (a -> [b] -> b) -> Tree a -> b
    const foldTree = f => tree => {
        const go = node => f(node.root)(
            node.nest.map(go)
        );
        return go(tree);
    };


    // forestFromLineIndents :: [(Int, String)] -> [Tree String]
    const forestFromLineIndents = tuples => {
        const go = xs =>
            0 < xs.length ? (() => {
                const [n, s] = Array.from(xs[0]);
                // Lines indented under this line,
                // tupled with all the rest.
                const [firstTreeLines, rest] = Array.from(
                    span(x => n < x[0])(xs.slice(1))
                );
                // This first tree, and then the rest.
                return [Node(s)(go(firstTreeLines))]
                    .concat(go(rest));
            })() : [];
        return go(tuples);
    };

    // fst :: (a, b) -> a
    const fst = tpl => tpl[0];

    // identity :: a -> a
    const identity = x => x;

    // indentLevelsFromLines :: [String] -> [(Int, String)]
    const indentLevelsFromLines = xs => {
        const
            indentTextPairs = xs.map(compose(
                firstArrow(length), span(isSpace)
            )),
            indentUnit = minimum(indentTextPairs.flatMap(pair => {
                const w = fst(pair);
                return 0 < w ? [w] : [];
            }));
        return indentTextPairs.map(
            firstArrow(flip(div)(indentUnit))
        );
    };

    // isSpace :: Char -> Bool
    const isSpace = c => /\s/.test(c);

    // justifyLeft :: Int -> Char -> String -> String
    const justifyLeft = n => cFiller => s =>
        n > s.length ? (
            s.padEnd(n, cFiller)
        ) : s;

    // leafList :: Tree a -> [a]
    const leafList = tree =>
        foldTree(x => xs =>
            0 < xs.length ? (
                concat(xs)
            ) : [x]
        )(tree);

    // leafPaths :: Tree a -> [a]
    const leafPaths = tree =>
        foldTree(x => xs =>
            0 < xs.length ? (
                xs.flatMap(forest => {
                    return forest.map(
                        t => Boolean(x.text) ? ({
                            text: x.text + '/' + t.text,
                            size: t.size
                        }) : t
                    )
                })
            ) : [x]
        )(tree);

    // Returns Infinity over objects without finite length.
    // This enables zip and zipWith to choose the shorter
    // argument when one is non-finite, like cycle, repeat etc

    // length :: [a] -> Int
    const length = xs =>
        (Array.isArray(xs) || 'string' === typeof xs) ? (
            xs.length
        ) : Infinity;

    // lines :: String -> [String]
    const lines = s =>
        s.split(/[\r\n]/);


    // map :: (a -> b) -> [a] -> [b]
    const map = f => xs =>
        (Array.isArray(xs) ? (
            xs
        ) : xs.split('')).map(f);

    // minimum :: Ord a => [a] -> a
    const minimum = xs =>
        0 < xs.length ? (
            foldl1(a => x => x < a ? x : a)(xs)
        ) : undefined;

    // readFileLR :: FilePath -> Either String String
    const readFileLR = fp => {
        const
            e = $(),
            uw = ObjC.unwrap,
            s = uw(
                $.NSString.stringWithContentsOfFileEncodingError(
                    $(fp)
                    .stringByStandardizingPath,
                    $.NSUTF8StringEncoding,
                    e
                )
            );
        return undefined !== s ? (
            Right(s)
        ) : Left(uw(e.localizedDescription));
    };

    // Abbreviation for quick testing - any 2nd arg interpreted as indent size

    // showLog :: a -> IO ()
    const showLog = (...args) =>
        console.log(
            args
            .map(JSON.stringify)
            .join(' -> ')
        );

    // sortBy :: (a -> a -> Ordering) -> [a] -> [a]
    const sortBy = f => xs =>
        xs.slice()
        .sort(uncurry(f));

    // str :: a -> String
    const str = x => x.toString();

    // take :: Int -> [a] -> [a]
    // take :: Int -> String -> String
    const take = n => xs =>
        'GeneratorFunction' !== xs.constructor.constructor.name ? (
            xs.slice(0, n)
        ) : [].concat.apply([], Array.from({
            length: n
        }, () => {
            const x = xs.next();
            return x.done ? [] : [x.value];
        }));

    // sj :: a -> String
    function sj() {
        const args = Array.from(arguments);
        return JSON.stringify.apply(
            null,
            1 < args.length && !isNaN(args[0]) ? [
                args[1], null, args[0]
            ] : [args[0], null, 2]
        );
    }

    // span, applied to a predicate p and a list xs, returns a tuple of xs of
    // elements that satisfy p and second element is the remainder of the list:
    //
    // > span (< 3) [1,2,3,4,1,2,3,4] == ([1,2],[3,4,1,2,3,4])
    // > span (< 9) [1,2,3] == ([1,2,3],[])
    // > span (< 0) [1,2,3] == ([],[1,2,3])
    //
    // span p xs is equivalent to (takeWhile p xs, dropWhile p xs)

    // span :: (a -> Bool) -> [a] -> ([a], [a])
    const span = p => xs => {
        const iLast = xs.length - 1;
        return splitAt(
            until(i => iLast < i || !p(xs[i]))(
                succ
            )(0)
        )(xs);
    };

    // splitAt :: Int -> [a] -> ([a], [a])
    const splitAt = n => xs =>
        Tuple(xs.slice(0, n))(
            xs.slice(n)
        );

    // succ :: Enum a => a -> a
    const succ = x =>
        1 + x;

    // uncurry :: (a -> b -> c) -> ((a, b) -> c)
    const uncurry = f =>
        function() {
            const
                args = Array.from(arguments),
                a = 1 < args.length ? (
                    args
                ) : args[0]; // Tuple object.
            return f(a[0])(a[1]);
        };

    // unlines :: [String] -> String
    const unlines = xs => xs.join('\n');

    // until :: (a -> Bool) -> (a -> a) -> a -> a
    const until = p => f => x => {
        let v = x;
        while (!p(v)) v = f(v);
        return v;
    };

    // MAIN ---
    return main();
})();
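
The core of the script above (turn the tab-indented name/size outline into full paths, keep the childless notes, take the N largest) can be condensed into a short imperative sketch. This is my own compressed reading of it, not a replacement: the outline data and sizes below are made up, and note names containing tabs would break the parsing.

```javascript
// Sketch: given a tab-indented "name<TAB>size" outline, return the
// N largest leaf notes with their full paths.
function largestLeaves(outline, n) {
    const path = [];      // names of the current ancestors, by depth
    const rows = [];      // one { depth, path, size } per outline line
    for (const line of outline.split(/\r?\n/)) {
        if (line.trim() === '') continue;
        const depth = line.match(/^\t*/)[0].length;
        const [name, size] = line.slice(depth).split('\t');
        path.length = depth;      // drop ancestors deeper than this line
        path[depth] = name;
        rows.push({
            depth,
            path: path.join('/'),
            size: parseInt(size, 10)
        });
    }
    // A row is a leaf if the next row is not nested more deeply.
    const leaves = rows.filter(
        (row, i) => i + 1 >= rows.length || rows[i + 1].depth <= row.depth
    );
    return leaves.sort((a, b) => b.size - a.size).slice(0, n);
}

// Made-up outline for illustration only.
const outline = [
    'Help\t900',
    '\tRelease Notes\t700',
    '\t\t7.5.0\t500',
    '\t\t6.5.0\t150',
    '\tBasic Concepts\t120',
    '\t\tAttributes\t100'
].join('\n');

for (const { path, size } of largestLeaves(outline, 2)) {
    console.log('- (' + size + ')\t' + path);
}
```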

$NoteKBSize?
$NoteStorageSize?

Works nicely and I identified one note taking up 60 Mbytes of space.
I plan to compare the output of the script to results from the new $EstimatedNoteSize attribute for cross-validation purposes.
Thanks for your efforts.


Just to add to this thread belatedly, in case it’s of help to others, I found this extremely helpful to reduce the size of a TBX file. Half a dozen screenshots added nearly 100 MB to a 118 MB file. Identifying them was easy with $EstimatedNoteSize (either in outline or attribute browser) and converting them to links to DEVONthink means my database is now a sprightly 18 MB. Many thanks for the responsive and excellent solution, Mark @eastgate.