The New York Times homepage Taxonomy DNA.
12/14/18, 3% threshold, 7-day rolling learning.

AI-Operated Taxonomy.
As the FAQ puts it, “TrustedOut makes Intelligence trustworthy with its AI-operated media profiling, TrustedOut brings Corpus Intelligence to feed analytics tools, so business managers can make strategic decisions on sources they do trust”.
Collection and Classification.
TrustedOut media profiling uses two methods, collection and classification, to continuously feed and update its media database and, through it, our users’ corpuses.
Today, let’s focus on the latter, AI-Operated Classification:
Our taxonomy (the hierarchy and list of all classifications) is now stable. AI continues to learn and classify media and sources at all times.
At this very moment, our taxonomy has 3 levels of categories:
- 4 level1 (trunks)
- 17 level2 (branches)
- 291 level3 (leaves)
Comparing New York’s NYT homepage and San Francisco’s SFGate Bay Area News.
Below are the top 10 categories for both sources.
It is interesting to compare them and see, among other things, Transportation (Bus in particular) appearing for SF, and Politics and Law for NY.
Reminder: we are not talking about article topics here, but about the categories in which we classify sources (feeds), and thus media.
SF Gate – Bay Area News | New York Times – Homepage |
People 48.1% | General 44.9% |
Industries 27.5% | People 39.0% |
Industries › Transportation 20.1% | General › Politics 26.1% |
Sciences 18.1% | General › Law 11.5% |
People › Society 13.3% | People › Society 11.4% |
People › Public Services 10.0% | General › Politics › Government 8.7% |
Industries › Transportation › Bus 8.1% | Sciences 8.1% |
People › Public Services › Emergency Services 8.0% | Industries 8.0% |
People › Sports 7.0% | General › Politics › Political Party 6.6% |
People › Society › Misc News 6.7% | People › Culture And Arts 6.0% |
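To make the comparison concrete, here is a minimal Python sketch that lists the categories appearing in one source's top 10 but not in the other's. The percentages are copied from the table above; the dictionary layout, the `only_in` helper and the use of " > " as a path separator are illustrative assumptions, not a description of how TrustedOut stores or compares profiles.

```python
# Top-10 category shares (percent), copied from the table above.
# Paths use " > " as a separator; this layout is an assumption for illustration.
SFGATE = {
    "People": 48.1,
    "Industries": 27.5,
    "Industries > Transportation": 20.1,
    "Sciences": 18.1,
    "People > Society": 13.3,
    "People > Public Services": 10.0,
    "Industries > Transportation > Bus": 8.1,
    "People > Public Services > Emergency Services": 8.0,
    "People > Sports": 7.0,
    "People > Society > Misc News": 6.7,
}
NYT = {
    "General": 44.9,
    "People": 39.0,
    "General > Politics": 26.1,
    "General > Law": 11.5,
    "People > Society": 11.4,
    "General > Politics > Government": 8.7,
    "Sciences": 8.1,
    "Industries": 8.0,
    "General > Politics > Political Party": 6.6,
    "People > Culture And Arts": 6.0,
}


def only_in(a, b):
    """Hypothetical helper: categories in a's top 10 that are absent
    from b's top 10, ordered by a's share, highest first."""
    return sorted((c for c in a if c not in b), key=lambda c: -a[c])


def shared(a, b):
    """Categories present in both top-10 lists, in a's order."""
    return [c for c in a if c in b]


if __name__ == "__main__":
    print("Only in SFGate top 10:", only_in(SFGATE, NYT))
    print("Only in NYT top 10:", only_in(NYT, SFGATE))
    print("In both:", shared(SFGATE, NYT))
```

Note that absence from a top 10 only means the category ranks lower for that source, not that its share is zero.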
The 2 Taxonomy DNA views
New York Times – Homepage

SFGate – San Francisco Bay Area News

Corpus Intelligence in action
In our example above, if a TrustedOut user wants insights on bus transportation in America, she/he will use the condition “Taxonomy is Transportation>Bus” AND “Country is USA” in the Corpus definition (UI below). In this case, with this taxonomy setup (threshold and sensitivity will be covered in another post), only the SFGate source will be part of the Corpus.
The screenshot above comes from the “Country comparisons” Business Case.
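The Corpus condition described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Source` class, the `matches` function and the full-path category keys are hypothetical, the 3% threshold is read as a minimum share for a category to count, and a category missing from a source's profile is treated as 0%. How TrustedOut actually applies threshold and sensitivity is not covered here.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    """Hypothetical record for a profiled source (feed)."""
    name: str
    country: str
    # Category path -> share in percent; only the values needed here are filled in.
    shares: dict = field(default_factory=dict)


def matches(source, taxonomy, country, threshold=3.0):
    """Return True when the source is in the given country and its share for
    the taxonomy path reaches the threshold (assumed to be a minimum share).
    A category missing from the profile is treated as 0% (an assumption)."""
    return (
        source.country == country
        and source.shares.get(taxonomy, 0.0) >= threshold
    )


sources = [
    Source("SFGate - Bay Area News", "USA",
           {"Industries > Transportation > Bus": 8.1}),
    Source("New York Times - Homepage", "USA",
           {"General > Politics": 26.1, "General > Law": 11.5}),
]

# "Taxonomy is Transportation>Bus" AND "Country is USA"
corpus = [
    s.name
    for s in sources
    if matches(s, "Industries > Transportation > Bus", "USA")
]

if __name__ == "__main__":
    print(corpus)
```

With the values taken from the table, only the SFGate source satisfies both conditions, which matches the outcome described above.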