Pulling attributes from text

(Matthew Williams) #1

I want to bring highlights from iBooks into Tinderbox and automatically extract parts of the text into attributes. Here is some original text from the iBooks output:

May 20, 2018

Chapter One: A COMPUTER WANTED, p. 12

punched cards, which separated pattern from process for the first time in history, would eventually find their way into the earliest computers. Patterns encoded on paper, which computer scientists later called “programs,” could meaningfully entangle numbers as easily as thread. The Jacquard loom

May 20, 2018

Chapter One: A COMPUTER WANTED, p. 10

Indeed, computing was the grunt labor of organized science; before they were made obsolete, human computers prepared ballistics trajectories for the United States Army, cracked Nazi codes at Bletchley Park, crunched astronomical data at Harvard, and assisted numerical studies of nuclear fission on the Manhattan Project. Despite the diversity of their work, human computers had one thing in common. They were women.

So the text in the note should just be the highlighted text. $Chapter should be the chapter text up to but not including the page, and $page should be the page number.

I am generally comfortable with regex, but I am not seeing any examples of its use in an action in Tinderbox.

(eastgate) #2

I believe there’s an example of this in the Agents and Actions section of Getting Started.

The general pattern is to use an agent to find a pattern, and extract the substring:

Query: $Text.contains("Chapter.+:(.+),")
Action: $Chapter=$1;
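
To see what that pattern captures on the sample heading, here is a minimal sketch in Python rather than Tinderbox action code. It assumes Tinderbox's regex engine treats this simple pattern the same way Python's `re` module does, which is likely but not confirmed in this thread. The second pattern, and the names `chapter` and `page`, are hypothetical additions for illustration, aimed at the "chapter text up to the page" and "page number" parts of the original request.

```python
import re

# Sample heading line taken from the iBooks export quoted in post #1.
heading = "Chapter One: A COMPUTER WANTED, p. 12"

# The pattern from the post above. Group 1 is everything between the
# colon and the comma, which includes the space after the colon.
posted = re.search(r"Chapter.+:(.+),", heading)
chapter_title = posted.group(1)

# Hypothetical extension (not from the thread): group 1 is the whole
# chapter label up to the comma, group 2 is the page number digits.
extended = re.search(r"(Chapter.+?),\s*p\.\s*(\d+)", heading)
chapter = extended.group(1)
page = int(extended.group(2))

print(repr(chapter_title))
print(repr(chapter), page)
```

Note that the posted pattern keeps the leading space in group 1, so the captured value may need trimming before it is stored. In an agent, the numbered groups would correspond to $1, $2, and so on, assuming the back-references behave as described above.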

(Matthew Williams) #3

Ahhh, I think I was missing it because I wasn’t sure how to search for what I wanted, and I didn’t realize that $1 would be the result of the query.