Pandoc Working Group

(Starting a new thread rather than posting to the ongoing monster thread that suggested this)

How might Tinderbox integrate more easily with pandoc? Can modest Tinderbox support make pandoc something that casual Tinderbox writers can use more easily?

For example, pandoc is licensed under GPL, so we cannot integrate it in Tinderbox.

But is it permissible/legal/desirable to add a menu command to “install pandoc” that is enabled if Tinderbox doesn’t find pandoc already installed?

One aspect generally overlooked in this context of installing command-line tools for the user is how this is effected. There is more than one way, all opaque to the non-tech user, and those install methods all have foibles. For instance, the MacPorts and Homebrew install frameworks don’t necessarily co-exist happily. IOW, if you already use one you probably want to use the same one to install pandoc, meaning supporting at least two methods (and autodetecting the current one, as the user affected by this choice is likely to have no meaningful knowledge on which to base it).

So, it might be better to point the user to pandoc and then, once it is installed outside Tinderbox, give assistance to locate/call it.
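As a rough illustration of the "locate it once installed" step, here is a minimal Python sketch (not Tinderbox action code). It first asks the PATH, then checks a few common macOS install directories, since apps launched from the Finder often don't see the same PATH as a Terminal session. The function name `find_pandoc` and the list of fallback directories are my own assumptions, not anything Tinderbox or pandoc defines.

```python
import os
import shutil
from typing import Callable, Iterable, Optional

# Common install locations on macOS (assumptions, not an exhaustive list):
#   /usr/local/bin     - pandoc's own package installer; Homebrew on Intel Macs
#   /opt/homebrew/bin  - Homebrew on Apple Silicon Macs
#   /opt/local/bin     - MacPorts
FALLBACK_DIRS = ("/usr/local/bin", "/opt/homebrew/bin", "/opt/local/bin")


def _is_executable(path: str) -> bool:
    """True if path is an existing file the current user may execute."""
    return os.path.isfile(path) and os.access(path, os.X_OK)


def find_pandoc(
    which: Callable[[str], Optional[str]] = shutil.which,
    is_executable: Callable[[str], bool] = _is_executable,
    fallback_dirs: Iterable[str] = FALLBACK_DIRS,
) -> Optional[str]:
    """Return a path to a pandoc executable, or None if none is found."""
    found = which("pandoc")  # search the directories on PATH first
    if found:
        return found
    for directory in fallback_dirs:  # then try the well-known locations
        candidate = f"{directory}/pandoc"
        if is_executable(candidate):
            return candidate
    return None


if __name__ == "__main__":
    path = find_pandoc()
    print(path if path else "pandoc not found; see the pandoc website for install instructions")
```

The `which` and `is_executable` parameters exist only so the lookup can be exercised without pandoc actually being present; a real integration would just call `find_pandoc()` and store the result.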

To be clear, this isn’t pushback about the overall idea, but is just pragmatism over some unsurfaced assumptions that might bite.

That’s what Scrivener (which has some compile defaults set up for Pandoc when it’s installed) does. From the Manual:

Scrivener also supports a few Pandoc export options. However given the size of Pandoc, we are unable to embed a copy of it within Scrivener, and you will not see any compile options for it, unless you install it yourself on the system. You will find download and installation instructions on the Pandoc website. Once installed, and Scrivener has been restarted, you should see the Pandoc entries at the bottom of the compile file type list. (Section 3.3.3).

Scrivener does provide a full version of MultiMarkdown in its download, though.

I would be very much in favor of some additional pandoc-related conveniences. I can see where getting a proper installation might be a reach for some users not comfortable with the command line (although homebrew is pretty slick these days), but the idea of some options/templates being available if it is installed and then some pointer to a resource about how to install it might work.

Would it violate the letter or spirit of the GPL, for example, if a dialog had a button (Install pandoc) that would be enabled if pandoc was not at the expected location, and that performed runCommand("brew install pandoc","")?

I yield to those with more proximate experience, but if MacPorts is detected it might not be ideal to offer a Homebrew install (reading around, it seems the two (unintentionally) don’t co-exist well).

Again, others may have better knowledge, but I think (at present) that Homebrew is the more popular of the two installers. So the less likely edge case is a Mac with MacPorts and no pandoc.
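To make the autodetection idea concrete, here is a hedged Python sketch that only suggests an install step and never runs one. It assumes the package managers are discoverable as `brew` and `port` on the PATH, and that `brew install pandoc` and `sudo port install pandoc` are the relevant commands. The function name `suggest_install_step` is hypothetical.

```python
import shutil
from typing import Callable, Optional


def suggest_install_step(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Suggest how to install pandoc, matching the package manager already in use.

    Returns None when pandoc is already on the PATH.
    """
    if which("pandoc"):
        return None  # already installed, nothing to suggest
    has_brew = which("brew") is not None  # Homebrew's command-line tool
    has_port = which("port") is not None  # MacPorts' command-line tool
    if has_brew and not has_port:
        return "brew install pandoc"
    if has_port and not has_brew:
        return "sudo port install pandoc"
    # Neither manager, or both: don't guess on the user's behalf.
    return "download the package installer from the pandoc website"


if __name__ == "__main__":
    step = suggest_install_step()
    print(step if step else "pandoc is already installed")
```

When neither manager (or both) is found, the sketch deliberately falls back to pandoc's own package installer rather than picking one, which sidesteps the co-existence problem.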

The complexity of pandoc:

  1. Learning the installation steps and setting the root path as part of the installation
  2. Understanding that pandoc is a headless application controlled from the command line, using flags to tell it what to do (i.e., control inputs, outputs, and any configurations)
  3. Passing values from attributes to the command line

Honestly, however, it feels like people are still struggling with the concepts of templates in Tinderbox (how to do it, how they work, etc.).

To address this, I’d recommend the following:

Create a Pandoc TBX Installer, which would

  1. Have a basic “PandocMSWord” Prototype with basic configuration defaults, e.g. $InputExtension, $OutputExtension, $FileName, $PandocConversionStr, $PandocMSWordTemplate, $HTMLPreviewCommand
  2. Pandoc function: in this function, there is some explanatory text (e.g., create a directory on your HD, configure attributes, etc.) and action code that gives an example of how to string together the pandoc command-line expression from the attributes, and of the use of the pbcopy, touch, and pbpaste commands. Integrating voice could be fun, too.

For the first pass, you might want to limit the bells and whistles, e.g., conversion of citations, as this takes a few extra steps. Also, you may want to start with an HTML to DOCX setup first. The Installer could also include some instructional notes.
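To show what "stringing the command line from attributes" might look like for an HTML to DOCX first pass, here is a small Python sketch (the real thing would be Tinderbox action code). The dictionary keys are modelled on the attribute names suggested above and are purely illustrative; `build_pandoc_command` is a hypothetical helper. The pandoc options used, `-o` and `--reference-doc`, are real pandoc flags.

```python
import shlex
from typing import Dict, List


def build_pandoc_command(attrs: Dict[str, str], pandoc: str = "pandoc") -> List[str]:
    """Build a pandoc argument list from note attributes (names are illustrative)."""
    source = f"{attrs['FileName']}.{attrs['InputExtension']}"
    target = f"{attrs['FileName']}.{attrs['OutputExtension']}"
    # pandoc infers the input and output formats from the file extensions
    command = [pandoc, source, "-o", target]
    template = attrs.get("PandocMSWordTemplate")
    if template:
        # --reference-doc makes pandoc take the docx styles from an existing document
        command.append(f"--reference-doc={template}")
    return command


if __name__ == "__main__":
    example = {
        "FileName": "chapter one",
        "InputExtension": "html",
        "OutputExtension": "docx",
        "PandocMSWordTemplate": "template.docx",
    }
    # shlex.join quotes each argument so the string is safe to paste into a shell
    print(shlex.join(build_pandoc_command(example)))
```

Keeping the command as a list of arguments until the last moment avoids quoting problems with file names that contain spaces, which is one of the easier ways for a first attempt to fail.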

FWIW, I’ve written a suite of portable Tinderbox functions for leveraging Pandoc to create Word, PDF, PowerPoint, and Google Docs. I’d be happy to post them here if you all would find that helpful.

This would be very helpful. To my knowledge, a working setup for PDF has not been shared here before. Thank you.

That is a generous offer and I’m sure the community would find it useful.

If you do so, please start a new thread and cross-link, so the eventual (different) conversations (use of the functions vs. the wider pandoc issue) can stay separate.

Not least, it will be interesting in the context of @satikusala’s point re understanding templates (I’d tend to agree there) and whether the functions route around that gap in some users’ knowledge of/interest in templates. I understand the root of such uninterest, so I’m not making a negative point in that last sentence.

I would agree with Mark: let the user install, and point to the pandoc location.

FWIW, on the Pandoc install page they say "There is a package installer at pandoc’s download page." I used this package to install successfully, and it was a very simple experience.

My previous encounters over the years with MacPorts and Homebrew did not end in success, even though I have been using open source software on Mac, Linux, and Windows for years… well, I guess decades, now. After all this time, I have lost patience with monkeying around trying to force things to install; I want to create rather than maintain.

And one more comment on integrating with other, possibly proprietary software. Plain text is the pinnacle of portability. When MSWord promised what it couldn’t deliver (anyone remember master documents from Word 6?) you can always extract plain text and reassemble it in something that respects your choices. The horrors of trying to get Word 6 to number sections in my master’s thesis are fresh in my mind. It took plain text and JPEGs assembled in Pagemaker to save me when the time came to create a properly formatted final copy fit to submit to the Graduate School (my committee had to settle for photocopies of a master with the page numbers pasted in old school style).

When Adobe killed Pagemaker, I learned LaTeX and never looked back at Word for my own projects, except with a raised finger. (However, I am forced to use it by my employer, and by colleagues who don’t know other tools. I choose not to fix problems with their parts of the document, merely appending comments for them to repair themselves.) [/rant]

For the LaTeX-curious, this might be of interest: Working with LaTeX. I wrote it a while back so it might need revisiting. Another thing, again re LaTeX, is Overleaf†. It allows you to use LaTeX without needing to understand/install/maintain a local copy. At the least, it might allow experimentation up to the stage where one wants a local install.

Disclaimer: I do use Overleaf for academic papers as it’s just easier, but I get a free account via my affiliated university. Still, the above is info rather than an intentional endorsement.

If people start talking about using Word, I tend to back away as there are so many better text tools. But, sadly, publishing does seem to be an area where folk seem real happy with a '90s copy of Word and a bare minimum of expertise, leaving everyone else to figure out how to make documents they can use.

†. Overleaf’s free tier allows one free project (though you could rotate docs in/out of the online project).

RStudio is for me the best of two worlds: LaTeX and Markdown. You write your text in Tbx using Markdown, you compile (or « knit ») it with RStudio, and what you get is both a sublime PDF document, as if you had used LaTeX, and a LaTeX file in case you need it. There is also an alternative: knit your file as a Word document, without having to remove all the LaTeX markup you usually use when writing in LaTeX. For instance, if you want to send a beautiful LaTeX « manuscript » to an editor, but don’t at all want to rewrite your work entirely for Word, that solution is invaluable.

I love this. Are you using R Bookdown in this context? I love the prospect of being able to write from one source and potentially use R Bookdown for a LaTeX based blog, R Blogdown for a web-based native and even R Shiny to make interactive web views based on the data in the source project. There is also R Notebook if an interactive Jupyter Notebook-like document is desired.

<looks at in-tray> This has been on my to-try spike for ages so I’d love to know more about your method/workflow.

Like @satikusala’s impressive pandoc/MS Office pathway, the above is not everybody’s need (especially as a first TBX task!) but for those with the need it appears to offer a lot.

Discussion of Tinderbox & R, if of interest to the community, might warrant starting a discrete thread, so it doesn’t get mixed up with the useful discussion of pandoc with Tinderbox.

Yes, please!!! I so want to understand and learn RStudio—to pull it into my workflow.

Responding here to @mwra, although I also responded on the new thread: bookdown has been replaced by Quarto manuscripts, which is the new way to do this and more. I also make extensive use of a workflow like @dominiquerenauld’s for all my scientific manuscripts and it’s what I teach to students in my undergrad and grad courses. Once you get over the learning curve, it’s phenomenal for technical documents. Here’s an example of the qmd (Quarto markdown) file for a manuscript of ours that is currently in review:

which renders to

Note the inclusion of markdown prose, pandoc citations, TeX for equations, dynamic figure/table numbering, html, and live R code. And it’s all plain text. For the journal we just submit the TeX file, which is generated when rendering via pandoc to pdf, but it can also render to Word docx without much extra hassle.

Hey, is that Ed Ayres, the historian from The Valley of the Shadow?

Me gusta! Thank you for troubling to share this. I shall learn from it. Deep joy.

Don’t think so? The one we’re working with is in charge of environmental sensors at the National Ecological Observatory Network.