Building Academic DB for events

lfriedla · October 14, 2018, 7:52pm

I am designing a TBX database in which I want to track ten years of legislation in Wisconsin on multiple issues (e.g. taxation, health, labor law). The most important criteria are that I want to a) generate individual timelines for each topic area (e.g. tax acts from 2010-2018); b) generate a combined timeline of all legislation so it can be viewed as a whole; and c) export to an Excel file so research collaborators can integrate it with larger analyses.

At one level the design of the note seems pretty simple: fields for topic, specific bill, sponsors, date passed. But I haven’t worked with timelines before and want to make sure it meets my export criteria.
James Fallows’ use of the Attribute browser seems relevant here, but just wanted to ask before I dive in. Thanks.

PaulWalters · October 14, 2018, 8:23pm

It appears the Wisconsin legislature considered 1,100 bills in the 2017 session, so ten years’ data will be significant, even if you are narrowing down to a slice of that.

It’s hard to answer the general question, but it’s easy to take a sample of the data you have or will gather, create a sample Tinderbox file, and experiment for yourself. The short answer is yes, Tinderbox can do (a), (b) and (c). For (c), use CSV export from Tinderbox for import into Excel.

Timeline views can be quite dense, and a large monitor or two will be helpful.

Timeline is useful for visualizing, but for in-depth analysis of legislative history I think Attribute Browser is the better choice.

Since you’ll be dealing with hundreds (perhaps thousands) of notes, I suggest it’s best in this case to not attempt too much incremental formalization when gathering data. Say, you decide not to track all the dates in a bill’s history (introduction, committee reports, committee action, floor votes, etc.) and later realize that the history is important to your analysis. Does your project allow for time to go back and capture data you didn’t capture initially? Did you not capture links to the legislative documents in URL attributes and then later need to do it?

These questions are another reason to start with a small sample project. Pick 5 health bills for each year of your 10 year period (50 master notes) and try a file. Does the timeline work for you or is AB better? Are you capturing the right attributes?

mwra · October 14, 2018, 9:14pm

I can’t agree enough with the last post. Given the implicit size of the dataset there is a lot of sense in using a small sample of data from across the dataset (i.e. both early and late in date) to work out the necessary user attributes you may need to capture the key metadata.

By default, timelines look at the children (and descendants) of the given container, using $StartDate (and $EndDate). However, timelines can use any Date-type attributes to plot data, and an agent can act as the source container for a timeline.

As a general point, real world data tends to be distributed inconsistently across the timeline so it’s difficult to get a nice looking timeline visualisation. One’s impulse is to blame the tool, but having tried a lot of them, I don’t think anyone has yet found a way to work around the ‘lumpiness’ of real world data. Most tools show their wares with a deliberate, but false, evenness of distribution. IOW, getting an aesthetically pleasing timeline layout is hard.

There are specialist timeline tools but, having used those too, I find Tinderbox a much better place to start analysing the data you’ll eventually visualise, even if you end up using a single-purpose timeline app for final visualisation.

lfriedla · October 15, 2018, 2:48pm

Thanks to both of you for your generous and quick replies.
I realized, Paul, that I omitted a key fact that you picked up on immediately. We are analyzing all of the bills in a large quant DB, using large computational methods, but this is the “qualitative” part of the analysis so I am focusing primarily on key events: major tax bills (not all), etc. My goal is to pick up the high points, with the most impact, but my judgement guides the decision. So, Mark, the lumpiness of the potential timelines is an issue, but that’s ok. (Actually, it’s a theoretical issue in this analysis: large meaningful events are uneven, as is their effect on public opinion).
Mark, it sounds as if simply including a date field or two (e.g. date introduced, date passed or signed into law) is more than sufficient to generate the timelines. Once I am playing around with them I may seek more guidance on timelines, since I’ve never had much luck (skill) with them.

PaulWalters · October 15, 2018, 2:59pm

It would be interesting to know what qualitative info a timeline will provide – vs., say, Attribute Browser. Obviously a timeline indicates “this occurred before that” or “this and that were simultaneous or nearly simultaneous”. Those nuggets are available by inspection in Outline and AB too. So how do you expect the Timeline to add value to the analysis?

Not saying it can’t but knowing expectations may be helpful.

mwra · October 15, 2018, 3:00pm

Absolutely. For any given timeline view (i.e. any container with children/descendants), the Timeline properties pop-up, accessed via the ‘i’ logo on the tab’s label, will let you swap out different Date-type attributes. Regardless, if you capture important dates for a bill, each in a discrete Date-type attribute, it’s also easy to use an action to copy pertinent dates (which might vary in type per note) into $StartDate and $EndDate. IOW, at this early point, I’d capture all the dates of interest for a bill into $Text (at least) but also/instead into as many user Date attributes as you need. Then you’ve got all the data to hand whilst you experiment with its best fit into a timeline.
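
As a rough illustration of the "copy pertinent dates" step, an action along these lines could be applied as a stamp or an agent action. This is only a sketch: $DateIntroduced and $DateSigned are hypothetical user Date attributes (not from this thread) that you would need to create first, and the `//` comment line assumes a recent Tinderbox version; delete it if your version rejects it.

```
// $DateIntroduced and $DateSigned are hypothetical user Date attributes
$StartDate=$DateIntroduced;
$EndDate=$DateSigned;
```

Keeping the original dates in their own attributes means you can later re-run a variant of the action (say, using a committee date instead) without losing anything.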

Don’t overlook timeline bands as they can help tease out different themes within the overall data.

lfriedla · October 15, 2018, 3:08pm

Paul: a good question. My intuition is that it will help me visualize a) the flow of significant legislation over multiple years and b) the clustering of legislation (e.g. the “lumpiness” that is a given of the process). May not be useful, but being able to switch easily between timeline and AB helps.

Mark, thanks for the advice. I’ve already set up a prototype based on both of your comments. It has a startdate (introduced) and enddate (passed or signed) attribute. Others are pretty simple: author (legislator), district, url of leg, source of additional info (e.g. newspaper article), a box for whether the source of legislation was an outside organization (e.g. in our case ALEC, a conservative national organization), an abstract, and keywords. More than enough to start.

PaulWalters · October 15, 2018, 3:19pm

Thanks Lew - I think Mark’s suggestion of Timeline Bands is useful. You might also consider using agents to tease out relationships and create timeline views for the agent containers too.

As you develop results, if there is a non-proprietary view of how you wind up using Tinderbox / Timelines / AB, etc., it would be interesting to see screen shots. This sort of socio-political analysis might be an instructive use case for other researchers looking to use Tinderbox.

lfriedla · October 15, 2018, 3:37pm

Paul, absolutely. This is all public record and I work in a public university. If anyone can benefit from this, would be pleased.
I also may come back once I’ve built this up a bit for guidance on Timeline Bands.
And now, I will ask my customary dumb question when I am starting a new project with a layoff from TBX:
I created a LegPrototype (as described). I want each category (e.g. Taxes) to a) take that prototype; b) have distinct keywords (e.g. taxes) that all notes inherit. I inserted this code in OnAdd to the LegPrototype: $RefKeywords=$RefKeywords(parent);
But the keywords from the parent don’t show. Is there something obvious here?

mwra · October 15, 2018, 4:04pm

The LegPrototype OnAdd action is fired once when a note is moved to be a child of the prototype or a new note is created as a child of the prototype. Most likely this isn’t where you’re placing notes using the prototype and thus the lack of expected results.

Simply by making a note use a prototype, it should inherit the prototype’s $RefKeywords. Note, though, if you had edited a note’s $RefKeywords before applying the prototype, this breaks inheritance unless you reset it. If that all doesn’t make sense, see my tutorials on Inheritance and prototypes. More tutorials.

lfriedla · October 15, 2018, 4:26pm

Mark, all makes sense. Unfortunately, I know about the firing rule, and did some experiments. Just did again to make sure; deleted old note, added new one, parent has keywords, did not show. Will fiddle a bit more and read tutorials again.

mwra · October 15, 2018, 6:39pm

Well, using your description, I can’t replicate the OnAdd problem. Can you post a small example file that replicates the problem?

lfriedla · October 16, 2018, 2:56pm

Mark, thank you, it was a classic mistake. I had the RefKeywords code right, but had forgotten to include: $Prototype=$Prototype(parent).
Guess I need to get out more
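
For anyone reading later, the two pieces mentioned in this thread would combine into an OnAdd action roughly like the one below. This is a sketch rather than the exact code used: the order of the two assignments is an assumption, as is placing it in the $OnAdd of LegPrototype so that each category container (e.g. Taxes) inherits it. The `//` comment line assumes a recent Tinderbox version; delete it if your version rejects it.

```
// Assumed order: set the prototype first, then copy the parent's keywords
$Prototype=$Prototype(parent);
$RefKeywords=$RefKeywords(parent);
```

With this in place, a note added to a category container should take the container's prototype and pick up that container's keywords.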

mwra · October 16, 2018, 3:15pm

No worries. Easily done.