Pluralization and singularization of nouns

The nounList function returns the singular and plural forms of a noun as two separate entries.
$MyList = "let us try school and many schools".nounList; returns try;school;schools

Is there a way to singularize the nouns? I want only a single entry per noun in the list, with all plural nouns reduced to their singular form.

A quick test with some English examples (I recall your original request was about German) suggests that singular and plural forms are not normalised, at least at present! I'm not sure what underlying framework is in use, and thus whether it can offer such a choice.

What would be really good would be lemmatization of the text. Something like this (Swift):

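import NaturalLanguage

let text = "This is text with plurals such as geese, people, and millennia."

// Ask the tagger for lemmas (dictionary base forms).
let tagger = NLTagger(tagSchemes: [.lemma])
tagger.string = text

// Walk the text one word-level token at a time.
tagger.enumerateTags(in: text.startIndex..<text.endIndex, unit: .word, scheme: .lemma) { tag, range in
    // Fall back to the original token when no lemma is available.
    let stemForm = tag?.rawValue ?? String(text[range])
    print(stemForm, terminator: "")
    return true // keep enumerating
}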

(see: How to lemmatize text using NLTagger - free Swift 5.4 example code and tips)
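To get closer to the single-entry noun list asked about above, the same tagger could be asked for both the lexical class and the lemma of each word. The following is only a rough sketch under that assumption: singularNounList is a made-up name, it says nothing about how Tinderbox's nounList actually works, and the results depend on Apple's language models, so irregular plurals or other languages may behave differently.

```swift
import NaturalLanguage

// Hypothetical helper (not a Tinderbox function): returns each noun once,
// reduced to its lemma where the tagger can provide one.
func singularNounList(_ text: String) -> [String] {
    let tagger = NLTagger(tagSchemes: [.lexicalClass, .lemma])
    tagger.string = text

    let options: NLTagger.Options = [.omitWhitespace, .omitPunctuation]
    var seen = Set<String>()      // lemmas already collected
    var result: [String] = []     // keeps first-seen order

    tagger.enumerateTags(in: text.startIndex..<text.endIndex,
                         unit: .word,
                         scheme: .lexicalClass,
                         options: options) { tag, range in
        // Only nouns are of interest here.
        guard tag == .noun else { return true }

        // Look up the lemma for the same word; fall back to the word itself.
        let (lemmaTag, _) = tagger.tag(at: range.lowerBound, unit: .word, scheme: .lemma)
        let lemma = (lemmaTag?.rawValue ?? String(text[range])).lowercased()

        // Record each lemma only once.
        if seen.insert(lemma).inserted {
            result.append(lemma)
        }
        return true
    }
    return result
}

let nouns = singularNounList("let us try school and many schools")
print(nouns.joined(separator: ";"))
```

If the lemma lookup works for a given word, "school" and "schools" should collapse into one entry. Where no lemma is available, the original word is kept, so some plurals may still slip through.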

When I tried this, a few years back, the API wasn’t good enough in English. It might be better now!

I’m away from the office anyway. Perhaps you could write a concrete example of how this facility could help a specific person with a specific problem?

Hello Mark,

If you look into this thread: importing annotations and then auto-connecting them using natural language toolkits would help a lot when working with academic papers.
Setting the links manually is one way to go and gives a high-quality structure. But automatic linking offers an alternative way to look into your data, and it is very, very quick. Maybe we could call it "suggested linking". It would also be a feature to promote Tinderbox (Obsidian needs a plugin for this).

I was using MarginNote to create my annotations in a PDF. MarginNote puts those notes on a map and links them into a kind of mind map. This was helpful when extracting the ideas found in the paper. I left MarginNote because of its closed architecture and because there is no way to export the annotations together with the PDF. But the idea of using a map view to structure the notes is what I miss now.

The nounList function isn't ideal for this as long as the different forms of a noun are not merged. Singular and plural forms should make no difference here. NLTagger seems to offer an easy implementation, but this is just a wild guess.
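As a rough illustration of what "suggested linking" could mean once the nouns are normalised: two notes get a suggested link when they share at least one noun. This is a sketch only. Note, SuggestedLink, and suggestLinks are hypothetical names for illustration, not Tinderbox features, and the sample nouns are invented.

```swift
// Hypothetical note type for illustration; not a Tinderbox API.
struct Note {
    let name: String
    let nouns: Set<String>   // assumed to be lowercased and already singularized
}

// A suggested connection between two notes, with the nouns they share.
struct SuggestedLink {
    let from: String
    let to: String
    let shared: [String]
}

// Compare every pair of notes once and suggest a link when they
// share at least `minimumShared` nouns.
func suggestLinks(among notes: [Note], minimumShared: Int = 1) -> [SuggestedLink] {
    var links: [SuggestedLink] = []
    for i in notes.indices {
        for j in notes.indices where j > i {
            let shared = notes[i].nouns.intersection(notes[j].nouns).sorted()
            if shared.count >= minimumShared {
                links.append(SuggestedLink(from: notes[i].name,
                                           to: notes[j].name,
                                           shared: shared))
            }
        }
    }
    return links
}

// Invented sample data.
let notes = [
    Note(name: "A", nouns: ["school", "teacher"]),
    Note(name: "B", nouns: ["school", "budget"]),
    Note(name: "C", nouns: ["garden"]),
]

let links = suggestLinks(among: notes)
for link in links {
    print("\(link.from) <-> \(link.to): \(link.shared.joined(separator: ", "))")
}
```

Without normalisation, a note mentioning "schools" and another mentioning "school" would not match here, which is why the singular and plural forms matter for this kind of linking.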

Thanks and have a good time in London.

Detlef

You know that the Map View is pretty good, right?

Also, there’s a LOT more to linking than linking among synonymous nouns.

Yes - I agree with you 100%. Many of the map views, tag clouds or 3D clusters are primarily nice marketing gimmicks. But in a narrowly defined area, automatically generated links can be very helpful - a kind of starting point for my own, intelligent connections between the information units.
For me, the exported highlights are only a preliminary stage for my own thoughts. I create new notes on this basis and store them in a structure that I think about very carefully and in which I no longer use these automatic links. But this extension to TB would really help me on the way to my own associations.

@webline - you may not realize this but you are talking about things Mark was writing about 30 years ago and publishing before he ever came out with Tinderbox.

He had a hash table algorithm to speed up the analysis of language by computer, to find links in the text and stack links into tours. Back in the days of HyperCard, that was far out! He also had a number of exploratory methods to suggest tour routes through notes.

Back in those days, I was playing with semantic heuristics to find conceptual links in tech papers. That was before we had machine learning and AI neural nets. I was using giant vector arrays on a Cray-3 and realizing that there just wasn't the memory or bandwidth yet. Using Monte Carlo and game theory to simulate helped, but it was a problem waiting for another day. Ultimately, I went other ways, and subsequent law school damaged that part of my brain that used to solve boundary equations and fourth order equations for fun. I can still remember the best parts of stat thermo and p-chem though.

Mark didn't relent. He has persisted in the space of links, linking, the meaning of links, how they are depicted, how the user perceives them, and so on, and turned it into so many features of Tinderbox. It's easy to overlook how much he's put towards all that, and how far he has advanced what we have available to us in a consumer app, because it is not hidden but is so non-intrusive.

Mark’s general approach is to listen and suggest but when I saw you talking about how you’d like to go about linking, I had to say something!!! Pick his brain!!! He’s been researching and exploring this very idea for decades!!!

I can remember a good deal of statistical mechanics, and even more physical chemistry! Great to run into you again!