This document is archival. The new document is Credibility Signals.

This document was never reviewed by the Credible Web group.

This version includes all documented indicators. You may want Release 1 Candidates.

This document aims to increase interoperability among systems cooperating to make the web more trustworthy. The approach taken here is to exchange "indicators" which potentially show which items of web content (initially news articles) are worthy of greater trust.

This is a dynamic draft, with content which refreshes from master data at every reload.

Intro

@@@ TODO

Article Structure

Subject Area

Data Type
Multiple Choice (though I think one is default)
Definition
Indicates the subject of an item, e.g. Sports, Entertainment
Frame as a Question
What is the article's genre?
Good Toy Example
Technology
Interoperability
NewsCodes (IPTC)
Interoperability Example
http://show.newscodes.org/index.html?newscodes=subj&lang=en-GB&startTo=Show
Potential Markup Effort
Medium
Problematic Toy Example
Sports and Entertainment (just highlighting varying classifications)

Source Language (if translation)

Data Type
Multiple Choice
Frame as a Question
If the article is a translation, what is the source language?
Potential Markup Effort
Low
Proposed Priority for WebConf Paper
Low

Shows versions and changes

Data Type
Multiple Choice (how rigorous the change tracking is, e.g. whether there is a Newsdiff-type record)
Definition
Shows versions, changes, diffs of an article.
Frame as a Question
Does this article show revisions/diffs? (Most places do not)
Potential Markup Effort
Low

Publication (site)

Definition
Parent of the Article.
Potential Markup Effort
Very Low
Proposed Priority for WebConf Paper
Low

Length

Data Type
Integer
Dynamic; Change over Time?
true
Frame as a Question
How many words does the article contain?
Machine Generation from Article
true
Potential Markup Effort
Very Low
Proposed Priority for WebConf Paper
Low
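
Since this indicator is marked as machine-generable, a word count can be sketched in a few lines. This is a minimal sketch, assuming the article body has already been extracted as plain text and that "words" means whitespace-separated tokens; the function name `count_words` is hypothetical and not part of any indicator definition.

```python
def count_words(article_text: str) -> int:
    """Count whitespace-separated tokens in already-extracted article text.

    Assumption: boilerplate (navigation, captions, comments) was removed
    before this call, so only the article body is counted.
    """
    return len(article_text.split())


if __name__ == "__main__":
    print(count_words("Bitcoin mining uses more energy than Ecuador"))
```

Because the value can change over time (see "Dynamic; Change over Time?"), a consumer would probably want to store the retrieval time alongside the count.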

Language

Data Type
Multiple Choice
Frame as a Question
What language is the article written in?
Machine Generation from Article
true
Potential Markup Effort
Medium
Proposed Priority for WebConf Paper
Low

Is Translation

Data Type
Boolean
Frame as a Question
Is the article a translation?
Potential Markup Effort
Low
Proposed Priority for WebConf Paper
Low

Is Original

Data Type
Multiple Choice
Definition
(A) Most likely original (B) Appears to be a copy of one or more articles, with some portions different or remixed (C) Extensive quoting from another source, with some original content (D) A wholesale duplicate of another article
Frame as a Question
Has the text of this article appeared in exactly the same words or very similar words in another publication?
Location
Article Context
Potential Markup Effort
Medium
Priority Rationale
Like doing a reverse image search, knowing that the article is original does not by itself guarantee credibility. However, knowing if it's *not* original can be a flag that the site is either duplicating content or running content from a newswire service. We might want to set this as scalar, as there might be slight modifications to original content.
Proposed Priority for WebConf Paper
Very High
Rationale for Inclusion
Sometimes article texts get repurposed in new publications. This can be due to licensing agreements from a wire service or the article can simply be stolen without crediting the original article. Sometimes the article is copied wholesale while other times some words are changed or a new article is created that copies from multiple articles. Finally, some articles will quote extensively from one or more other articles with only a small amount of original reporting or writing.
Release Target
1
Source Team
Credibility Coalition Study: Web Conference 2018
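
The Priority Rationale above suggests this indicator might be better expressed as a scalar. One possible way to get such a scalar is Jaccard similarity over word shingles, sketched below. This is an illustration only, not the method used in the study; the function names, the shingle size of 3, and the choice of Jaccard similarity are all assumptions.

```python
from typing import Set, Tuple


def shingles(text: str, size: int = 3) -> Set[Tuple[str, ...]]:
    """Return the set of overlapping word n-grams ("shingles") in text."""
    words = text.lower().split()
    if not words:
        return set()
    if len(words) < size:
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def overlap_score(article: str, other: str, size: int = 3) -> float:
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0.

    1.0 means every shingle is shared (a wholesale duplicate);
    0.0 means no shingle is shared.
    """
    a, b = shingles(article, size), shingles(other, size)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)
```

Mapping a score onto the (A) to (D) choices in the Definition would need thresholds, which this sketch deliberately does not propose.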

Headline

Dynamic; Change over Time?
true
Frame as a Question
What title or headline does this article offer for itself?
Machine Generation from Article
true
Potential Markup Effort
Very Low
Proposed Priority for WebConf Paper
Low

Genre

Data Type
Multiple Choice (should include Unknown/Unspecified)
Definition
Opinion, Feature, Biography -- but it will not always be labeled, and I am not sure this will be defined consistently
Frame as a Question
What is the stated genre, if available?
Good Toy Example
Opinion
Interoperability
NewsCodes (IPTC)
Interoperability Example
http://show.newscodes.org/index.html?newscodes=genre&lang=en-GB&startTo=Show
Potential Markup Effort
Medium
Problematic Toy Example
Quote (see IPTC genre), or unknown
Proposed Priority for WebConf Paper
Medium

Factual assertions

Data Type
Boolean or multiple choice
Definition
Complements the Genre (is the content actually an opinion piece, or verifiable information?)
Frame as a Question
Does the article contain factual, verifiable assertions, or is it entirely opinion-based?
Potential Markup Effort
High
Proposed Priority for WebConf Paper
Medium

Dateline (Location)

Data Type
Geolocation
Frame as a Question
Where does the article claim it was published?
Machine Generation from Article
true
Potential Markup Effort
Low
Proposed Priority for WebConf Paper
Low

Dateline (Date)

Data Type
Date and Time
Definition
From An: "I think technically this states both the date and location of the article’s publication, in which case it needs two forms of operationalization."
Frame as a Question
When does the article claim it was published?
Machine Generation from Article
true
Potential Markup Effort
Low
Proposed Priority for WebConf Paper
Low

Correction/Redaction

Data Type
Boolean or Multiple Choice
Dynamic; Change over Time?
true
Frame as a Question
Does the article contain a stated correction or redaction?
Machine Generation from Article
true
Potential Markup Effort
High
Proposed Priority for WebConf Paper
Medium

Author

Article Example URL
https://www.newscientist.com/article/mg23631503-200-is-modern-life-making-todays-teenagers-more-depressed/
Data Type
Text
Definition
Articles have 1:M authorship. Note that not all articles have bylines, even in traditional news sources. Bylines also don't always start with 'by' (https://medium.com/@rchang/advice-for-new-and-junior-data-scientists-2ab02396cf5b)
Frame as a Question
Who is the author of the article?
Good Toy Example
"By Abigail Beall" (https://www.newscientist.com/article/2151823-bitcoin-mining-uses-more-energy-than-ecuador-but-theres-a-fix/)
Interoperability
Schema.org, NewsML-G2 (IPTC)
Interoperability Example
http://schema.org/author, http://www.iptc.org/std/NewsML-G2/2.25/specification/XML-Schema-Doc-Power/NewsItem.html#Link60
Machine Generation from Article
true
Potential Markup Effort
Low
Problematic Toy Example
No known author [author is 'the publication']
Proposed Priority for WebConf Paper
Low
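
Where a page carries schema.org markup, the 1:M authorship noted in the Definition can be read from the `author` property. The sketch below is a minimal illustration, assuming the markup is available as a single JSON-LD object (not an array or `@graph`); the function name `author_names` and the sample input are hypothetical.

```python
import json
from typing import List


def author_names(json_ld: str) -> List[str]:
    """Return the author names found in one schema.org JSON-LD object.

    schema.org allows `author` to be a single value or a list, and each
    value may be a Person/Organization object or a plain string, so all
    of these shapes are normalised to a list of names.
    """
    data = json.loads(json_ld)
    authors = data.get("author", [])
    if not isinstance(authors, list):
        authors = [authors]
    names = []
    for author in authors:
        name = author.get("name") if isinstance(author, dict) else author
        if name:
            names.append(name)
    return names


if __name__ == "__main__":
    sample = '{"@type": "NewsArticle", "author": {"@type": "Person", "name": "Abigail Beall"}}'
    print(author_names(sample))
```

An empty list corresponds to the Problematic Toy Example above (no known author), which a consumer may want to treat differently from a missing markup block.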

Attribution of Non-Original Content

Data Type
Multiple Choice
Definition
(A) Attribution was not given (B) Attribution was given but was inaccurate (C) Attribution was given and was accurate (D) Unclear which is the original
Frame as a Question
If the content of the article is not original, was attribution given and if so, was the attribution accurate?
Location
Article Context
Potential Impact
Medium
Potential Markup Effort
Medium
Proposed Priority for WebConf Paper
Medium
Rationale for Inclusion
Originality or non-originality is not by itself a sufficient condition for measuring credibility. Understanding the attribution of non-original content is going to be key.
Release Target
1
Source Team
Credibility Coalition Study: Web Conference 2018

Article Locator

Definition
URL, DOI,
Potential Markup Effort
High
Proposed Priority for WebConf Paper
Low

Article Awards

Definition
Awards are also assigned to specific Articles (but rarely)
Potential Markup Effort
High
Proposed Priority for WebConf Paper
Medium

Article/Site Metadata

Subhead/Dek

Machine Generation from Article
true
Potential Markup Effort
Low
Proposed Priority for WebConf Paper
Medium
Rationale for Inclusion
Just a note that this can be critical: people don't always read the articles they forward, and they depend a lot on this dek/summary hook.

Publication Domain Registration Location

Data Type
Date
Machine Generation from Article
true
Potential Markup Effort
Medium
Proposed Priority for WebConf Paper
Low

Publication Domain Registration Date

Data Type
Date
Machine Generation from Article
true
Potential Markup Effort
Medium
Proposed Priority for WebConf Paper
Low

Photo/Video Geotags

Data Type
Geolocation
Frame as a Question
What are the geolocations of the photos and videos in the article?
Machine Generation from Article
true
Potential Markup Effort
Very High
Proposed Priority for WebConf Paper
Low

Article Rights

Data Type
Multiple Choice
Definition
Explicit Copyright, Creative Commons, Unstated, etc.
Frame as a Question
What are the rights for this article?
Machine Generation from Article
true
Potential Markup Effort
High
Proposed Priority for WebConf Paper
Low

Ads.txt Exists

Data Type
Boolean
Definition
ads.txt (Authorized Digital Sellers) is an Interactive Advertising Bureau initiative. It specifies a text file that companies can host on their web servers, listing the other companies authorized to sell their products or services. This is designed to allow online buyers to check the validity of the sellers from whom they buy, for the purposes of internet fraud prevention. (from https://en.wikipedia.org/wiki/Ads.txt)
Dynamic; Change over Time?
true
Frame as a Question
Does an ads.txt file exist on the domain?
Good Toy Example
https://www.huffingtonpost.com/ads.txt
Known Weakness
Standards still under development
Machine Generation from Article
true
Notes
Status: DRAFTING
Potential Markup Effort
Very Low
Problematic Toy Example
https://www.reddit.com/r/adops/comments/703k4s/is_there_a_legitimate_reason_to_accept_an_adstxt/
Rationale for Inclusion
Evolving standard. Some initial brainstorming on other potential features:
- the number of sellers in a given file (could consider SSPs and seller accounts separately)
- the balance of DIRECT vs. RESELLER flags
- which SSPs are included (some are more quality/stringent than others)
- weighting SSPs included by the number of accounts per SSP
I’m not sure that any of those would be strong misinfo signals, especially since it seems like publishers are still figuring out how to handle ads.txt, but they could be things to explore.
Release Target
June 2018
Source Team
Credibility Coalition Study: June 2018
Supporting Evidence
https://wiki.appnexus.com/display/industry/AppNexus+Support+for+Ads.txt
Tags
Revenue Model
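
The features brainstormed under Rationale for Inclusion (seller counts, DIRECT vs. RESELLER balance, accounts per SSP) could be derived from the file contents roughly as follows. This is a minimal sketch, assuming the usual ads.txt record layout of comma-separated fields (ad system domain, seller account ID, relationship, optional certification authority ID) with `#` comments. The function name `summarize_ads_txt` and the sample domains are hypothetical, and fetching the file is left out.

```python
from collections import Counter
from typing import Dict


def summarize_ads_txt(content: str) -> Dict[str, object]:
    """Summarise the seller records in the text of an ads.txt file."""
    relationships: Counter = Counter()
    systems: Counter = Counter()
    for raw_line in content.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        fields = [field.strip() for field in line.split(",")]
        if len(fields) < 3:
            continue  # variable lines such as contact=..., or malformed
        relationships[fields[2].upper()] += 1
        systems[fields[0].lower()] += 1
    return {
        "seller_records": sum(relationships.values()),
        "direct": relationships["DIRECT"],
        "reseller": relationships["RESELLER"],
        "accounts_per_system": dict(systems),
    }


SAMPLE = """\
# hypothetical ads.txt content
adsystem-one.example, 1001, DIRECT
adsystem-one.example, 1002, RESELLER, abc123
adsystem-two.example, 77, reseller  # case varies in practice
contact=adops@publisher.example
"""

if __name__ == "__main__":
    print(summarize_ads_txt(SAMPLE))
```

The Boolean indicator itself only needs to know whether the file exists; the summary is relevant only if the extra features are explored.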

Author Reputation

Track record

Article Example URL
http://www.lse.ac.uk/GranthamInstitute/news/uk-newspaper-regulator-acts-against-fake-news-story-about-climate-change-which-fooled-us-congressman/, https://climatefeedback.org/authors/david-rose/
Data Type
scale
Dynamic; Change over Time?
true
Frame as a Question
Has the author already published articles containing misleading or credible information?
Potential Markup Effort
Medium
Priority Rationale
A track record of credible or misleading reporting can be a very strong indicator of the credibility of the author's current article. It may also be very difficult to reproduce at scale.
Problematic Toy Example
Daily Mail journalist David Rose has published a number of misleading articles on the topic of climate change as shown by IPSO, the UK MetOffice, Climate Feedback...
Proposed Priority for WebConf Paper
High

Public Accessibility

Data Type
scale, or set of booleans
Definition
How accessible is this author to the public? Do they have a website? Does this author have a publicly available email address? Are they on Twitter or Facebook? Do they often respond to readers?
Dynamic; Change over Time?
true
Frame as a Question
How accessible/responsive is this author to the public?
Potential Markup Effort
High
Priority Rationale
@Amy Zhang Can you fill in some details here?
Proposed Priority for WebConf Paper
High

Occupation

Data Type
Category
Definition
What is the occupation of the author? Are they a full-time journalist or something else?
Dynamic; Change over Time?
true
Frame as a Question
What is the occupation of the author?
Potential Markup Effort
Very High
Proposed Priority for WebConf Paper
Medium

Number of publications

Data Type
integers tied with categories (publication names)
Definition
How many publications does this author have, and in which venues?
Dynamic; Change over Time?
true
Frame as a Question
How many publications does this author have and in which venues?
Notes
I feel like "which publications" rather than number is a stronger indicator (you can have both, but if I had one, I'd go with which). Also like where would you get "number?" That's hard to track. J8L
Potential Markup Effort
Medium
Proposed Priority for WebConf Paper
Medium

Has Author Bio

Notes
Has author bio, or author bio?

Followers/Listeners

Data Type
scale
Definition
How many people follow this author (on social media)? How many other journalists follow this author? What speaking engagements does this author command?
Dynamic; Change over Time?
true
Frame as a Question
What degree of attention does this author command from other individuals?
Known Weakness
Clearly can be gamed: raw Twitter follower counts can be inflated by bots or by other strategies such as state-sponsored ecosystems.
Machine Generation from Article
true
Potential Markup Effort
Medium
Priority Rationale
@Amy Zhang can you fill in rationale here?
Proposed Priority for WebConf Paper
High
Rationale for Inclusion
Kind of vague and could be broken down. I don't know if raw Twitter follower count is quite right but some idea of who listens to or associates with this author.

Education Credentials

Data Type
category or series of booleans?
Definition
What is the educational background of the author? Do they have a degree in journalism? Do they have any post-grad education?
Dynamic; Change over Time?
true
Frame as a Question
What is the educational background of the author?
Notes
I don't know that we should narrowly frame it as "education" as a top-level indicator, rather than as "expertise", of which education can be a key subpart.
Potential Markup Effort
Very High
Proposed Priority for WebConf Paper
Medium

Claim

Fact-check results of claim

Data Type
Single Choice
Definition
A) Most likely not fact-checked B) Fact-checked and determined false C) Fact-checked and determined true D) Fact-checked with unclear results E) Fact-checked with mixed results
Frame as a Question
What are the results of the fact check of the central claim in this article?
Notes
One of two indicators replacing https://credweb.org/cciv/#claim-has-been-fact-checked
Rationale for Inclusion
Modification of https://credweb.org/cciv/#claim-has-been-fact-checked used in Web Conference 2018 study above.
Source Team
Credibility Coalition Study: Summer 2018

Claim has been fact checked

Data Type
Single Choice
Definition
A) Most likely not fact-checked by an IFCN Verified Signatory B) Most likely not fact-checked by an approved source C) Fact-checked and determined false D) Fact-checked and determined true E) Fact-checked with unclear results F) Fact-checked with mixed results
Frame as a Question
Has the central claim in this article been fact-checked by an IFCN Verified Signatory?
Location
Article Context
Notes
Deprecated for Credibility Coalition Study: Summer 2018
Potential Impact
Very High
Potential Markup Effort
Very High
Proposed Priority for WebConf Paper
Very High
Rationale for Inclusion
As many IFCN signatories use the ClaimReview schema, understanding correlations between fact checks and credibility can help us understand how best to utilize the schema or improve it.
Release Target
1
Source Team
Credibility Coalition Study: Web Conference 2018

Claim contains misleading assertion

Claim contains logical fallacy

Claim contains false assertion

Claim contains false and misleading assertion

Claim contains bad data

Inbound References

Verdict from fact-checking websites

Definition
If the page (or one with the same claim) has been fact-checked by one of the signatory organisations, show this inlink with the fact-checking verdict.
Dynamic; Change over Time?
true
Frame as a Question
Has a trusted fact-checking organisation verified it?
Known Weakness
N/A
Potential Markup Effort
Medium
Priority Rationale
Most reliable 3rd party validation, should be included for sure.
Proposed Priority for WebConf Paper
Very High
Rationale for Inclusion
We should show any existing fact-checks from trusted organisations. This is a form of 3rd party validation.

Ratio of comments to likes <10%

Definition
Ratio of comments to likes on Facebook. Indicative of bot-promoted content.
Dynamic; Change over Time?
true
Frame as a Question
Does it have many more likes than comments?
Known Weakness
Can be easily gamed by posting dummy comments.
Potential Markup Effort
Low
Priority Rationale
@Cameron Hickey
Proposed Priority for WebConf Paper
High
Rationale for Inclusion
Indicative of bot networks? (NY meeting)
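
The 10% threshold in this indicator's name can be expressed as a simple check. This is a minimal sketch, assuming the comment and like counts have already been obtained from some source; the function names are hypothetical, and treating "no likes" as "not flagged" is an assumption of this sketch.

```python
from typing import Optional


def comment_like_ratio(comments: int, likes: int) -> Optional[float]:
    """Return comments / likes, or None when there are no likes."""
    if likes <= 0:
        return None
    return comments / likes


def is_low_comment_ratio(comments: int, likes: int,
                         threshold: float = 0.10) -> bool:
    """True when the ratio of comments to likes is below the threshold."""
    ratio = comment_like_ratio(comments, likes)
    return ratio is not None and ratio < threshold
```

As the Known Weakness notes, dummy comments would raise the ratio, so the flag is only suggestive when read alongside other indicators.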

Number of links in Wikipedia

Data Type
Integer
Definition
Number of inbound links from Wikipedia main namespace.
Dynamic; Change over Time?
true
Frame as a Question
How many links on Wikipedia articles point to the article?
Known Weakness
Context-dependent. E.g. the link might be coming from a list of fake news websites, would require some contextual filtering. Also, in principle anyone can add a link albeit in all likelihood for only a brief period of time before it is reverted.
Machine Generation from Article
true
Potential Markup Effort
Low
Priority Rationale
Links to the article in Wikipedia indicate a level of social relevance that is more enduring than inlinks from social media in a given moment. This can easily be gamed, but this also relies on the immune systems that Wikipedia has set up.
Proposed Priority for WebConf Paper
Medium
Rationale for Inclusion
Wikipedia is the highest quality reference material on the internet, and links to articles (outside of writing about fake news, or lists of fake news websites) would be a form of third party validation.
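
One way this count might be generated by machine is through the MediaWiki Action API's `exturlusage` list, restricted to the main namespace (namespace 0). The sketch below only builds the request and counts results in an already-decoded response; it makes no network call. The helper names are hypothetical, the choice of English Wikipedia is an assumption, and result paging (the API's `continue` mechanism) is not handled.

```python
from typing import Dict
from urllib.parse import urlencode, urlsplit

# Assumption: English Wikipedia; other language editions have their own host.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def exturlusage_params(article_url: str, limit: int = 500) -> Dict[str, object]:
    """Build query parameters asking which main-namespace pages link to a URL."""
    parts = urlsplit(article_url)
    return {
        "action": "query",
        "list": "exturlusage",
        "euprotocol": parts.scheme,            # e.g. "https"
        "euquery": parts.netloc + parts.path,  # search string without protocol
        "eunamespace": 0,                      # main (article) namespace only
        "eulimit": limit,
        "format": "json",
    }


def exturlusage_request_url(article_url: str, limit: int = 500) -> str:
    """Return the full GET URL for the query."""
    return API_ENDPOINT + "?" + urlencode(exturlusage_params(article_url, limit))


def count_links(response: Dict) -> int:
    """Count the entries in one decoded JSON response page."""
    return len(response.get("query", {}).get("exturlusage", []))
```

A raw count does not address the contextual filtering described under Known Weakness; the titles of the linking pages would still need to be inspected.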

Links from social media

Definition
What social media accounts have shared the URL? Do they include credible accounts?
Dynamic; Change over Time?
true
Frame as a Question
What are the 5 most liked (Twitter, Facebook) or upvoted (Reddit) social accounts that shared this link?
Known Weakness
Depends on a catalogue of "trusted accounts"
Potential Markup Effort
Low
Priority Rationale
@Cameron Hickey
Proposed Priority for WebConf Paper
High
Rationale for Inclusion
Assumes that low-quality social accounts link to low-quality content.

Links from other news sites

Definition
What other news sites are linking to this URL? Does it include credible sites or not?
Dynamic; Change over Time?
true
Frame as a Question
What are the 5 most popular (Alexa rating) news sites linking to this article?
Known Weakness
Depends on a catalogue of "trusted websites".
Potential Markup Effort
Low
Priority Rationale
@Cameron Hickey
Proposed Priority for WebConf Paper
High
Rationale for Inclusion
Assumes that low-quality content links to other low-quality content.

Facebook shares

Definition
Number of times shared on Facebook. Important to estimate the reach of the news article.
Dynamic; Change over Time?
true
Frame as a Question
How many Facebook shares?
Known Weakness
Not directly related to credibility
Potential Markup Effort
Low
Priority Rationale
@Cameron Hickey
Proposed Priority for WebConf Paper
High
Rationale for Inclusion
Not directly related to credibility, but to reach.

Facebook engagement

Definition
Number of Facebook engagements. Important to estimate the reach of articles.
Dynamic; Change over Time?
true
Frame as a Question
How many people engaged on Facebook?
Known Weakness
Not directly related to credibility
Potential Markup Effort
Medium
Priority Rationale
Engagement is a composite of comments, likes and shares. As a total number, it potentially reflects the popularity of a given article on social media. This value, or any of the elements that comprise it, could be used in conjunction with other indicators to try to extract the credibility (or lack thereof) of the source or article. For example: if an article has a high value for engagement but the domain
Proposed Priority for WebConf Paper
High
Rationale for Inclusion
Not directly related to credibility, but to reach.
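
The Priority Rationale describes engagement as a composite of comments, likes and shares. A minimal sketch of that composite is shown below, assuming an unweighted sum and counts that were collected elsewhere; the class name `FacebookCounts` and its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FacebookCounts:
    """Per-article counts, as collected at one point in time."""
    comments: int
    likes: int
    shares: int

    @property
    def engagement(self) -> int:
        # Assumption: a plain sum, with no weighting between components.
        return self.comments + self.likes + self.shares


if __name__ == "__main__":
    print(FacebookCounts(comments=12, likes=300, shares=45).engagement)
```

Keeping the three components separate leaves room to use any one of them on its own, as the rationale suggests.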

Facebook comments

Definition
Number and content of Facebook comments. Important to estimate the reach and sentiment of articles.
Dynamic; Change over Time?
true
Frame as a Question
How many Facebook comments?
Known Weakness
Can be gamed with communities with a strong agenda or bots.
Potential Markup Effort
Medium
Priority Rationale
@Cameron Hickey
Proposed Priority for WebConf Paper
High
Rationale for Inclusion
Not directly related to credibility, but to reach.

Facebook comment sentiment

Definition
Sentiment analysis on Facebook comments. Important to gauge the attitudes towards the news, and types of reactions they invoke.
Dynamic; Change over Time?
true
Frame as a Question
What is the sentiment of the Facebook comments?
Known Weakness
Can be gamed with communities with a strong agenda or bots.
Potential Markup Effort
High
Priority Rationale
@Cameron Hickey
Proposed Priority for WebConf Paper
High
Rationale for Inclusion
Not directly related to credibility, but to reach.

Journalistic Rigor

Shares any documents cited in piece

Data Type
Multiple Choice or Boolean
Frame as a Question
Does this article publish any important documents cited in the article?
Potential Markup Effort
Low
Rationale for Inclusion
"Show your work." Also allows readers to judge for themselves.

Represents scientific process

Data Type
Boolean
Frame as a Question
Does the content rigorously represent the scientific process?
Potential Markup Effort
High
Priority Rationale
Not simply regurgitating scientific findings but setting them in the context of the scientific method might be the mark of a paid science writer, who offers a critical perspective for the reader. This may be difficult to reproduce by misinformation agents?
Proposed Priority for WebConf Paper
Very High

Represents scientific literature

Data Type
Scale
Frame as a Question
Does the content fairly represent scientific literature on the issue at hand?
Potential Markup Effort
Very High
Priority Rationale
Accurate representation of scientific literature reflects a certain level of knowledge of the literature that can be difficult to replicate. Presumably, only scientists and full-time science writers with some subject matter knowledge can do this effectively?
Proposed Priority for WebConf Paper
Very High

Primary subjects of article have opportunity to respond

Data Type
Multiple Choice or Boolean
Frame as a Question
Do the main subjects of the article have a chance to respond?
Known Weakness
Can be made up.
Potential Impact
High
Potential Markup Effort
Low
Problematic Toy Example
https://thehill.com/homenews/administration/332132-fcc-probing-colberts-trump-putin-joke
Rationale for Inclusion
Giving subjects a chance to respond or say "no comment" is basic journalistic rigor. Many articles spiral because the premise is taken out of context.

Presents multiple perspectives

Data Type
Boolean?
Frame as a Question
Does the content fairly represent multiple perspectives on an issue?
Known Weakness
False balance. Enforcing a "multiple perspectives" approach to science topics can lead to pathological results that over-represent fringes, e.g. vaccines and climate change.
Potential Markup Effort
Medium
Proposed Priority for WebConf Paper
Medium

Premise of article is disputed by primary subjects

Data Type
Multiple Choice or Boolean
Known Weakness
Dispute could possibly be in the article, or afterwards in "fallout", which would not be captured in the original article.
Problematic Toy Example
https://www.bloomberg.com/news/features/2018-10-04/the-big-hack-how-china-used-a-tiny-chip-to-infiltrate-america-s-top-companies
Rationale for Inclusion
Disputed does not mean not credible. In fact may be more credible. Again interplay. But worth noting. Dispute can also be internal or external to the article.

Gives motivation of anonymous sources for revealing information

Data Type
Multiple Choice or Boolean
Frame as a Question
Does this article explain the motivation of any of the anonymous sources for revealing information?
Potential Markup Effort
Medium
Rationale for Inclusion
Knowing the motivation of the anonymous source is important to assessing the credibility of the article. Was this a purposeful leak? Was it only confirmed after reporters approached them? Examples: "Sexual assault victim," "fears for safety," "government official revealing secret information."

Dependency on anonymous sources

Data Type
Multiple Choice (ranging from the premise of the article resting on anonymous sources to milder dependence) or Boolean
Definition
Does the premise of this article rest on anonymous sources? This can be made more nuanced depending on how many sources there are and how critical they are to the main premise of the article.
Frame as a Question
Does this article cite anonymous sources for the premise of the article?
Potential Impact
High
Potential Markup Effort
Low
Rationale for Inclusion
Critical factor in "showing your work" or "replicability." This will likely interplay with the institution that publishes it. Should likely have different statuses for how critical the anonymous source is: "Rape victim" vs. "government official." Can also be multiple choice: "Premise based entirely on a single anonymous source." "Premise based on two anonymous sources." "Premise based on two or more anonymous sources."

Corroboration?

Cites wire services

Data Type
Boolean?
Frame as a Question
Does the article clearly state if the content comes from a known wire service?
Potential Markup Effort
Low
Proposed Priority for WebConf Paper
Medium

Logic/Reasoning

Use of conspiratorial thinking

Article Example URL
https://www.naturalnews.com/2017-10-02-lone-gunman-theory-of-las-vegas-shooter-is-complete-nonsense-stephen-paddock.html
Data Type
boolean
Frame as a Question
Does the article suggest a conspiracy?
Known Weakness
Corruption does exist, not all conspiracy theories are BS
Potential Markup Effort
Very High
Priority Rationale
@ADITYA RANGANATHAN @Emlen Metz
Problematic Toy Example
"It’s all hogwash. The “official” narrative of how things went down..."
Proposed Priority for WebConf Paper
High

Supporting Claim Types

Data Type
Multiple Choice
Definition
Types of evidence for a claim: Correlation; Cause precedes effect; The correlation appears across multiple independent contexts; A plausible mechanism is proposed; An experimental study was conducted (natural experiments OK); Experts are cited; Other kind of evidence; No evidence given
Frame as a Question
What evidence is given for the primary claim? Select all that apply.

Straw Man Argument

Data Type
Likert
Definition
Presentation of a counterargument as a weaker, more foolish version of the real counterargument
Frame as a Question
Does the author present the counterargument as a weaker, more foolish version of the real counterargument (use a Straw Man Argument)? If so, highlight the relevant section(s).
Location
Article Content
Potential Markup Effort
Very High
Problematic Toy Example
"Proponents of nuclear energy have argued that it is safer than it used to be. But we don’t want energy that is barely safer than Chernobyl."
Proposed Priority for WebConf Paper
High
Release Target
1
Source Team
Public Editor

Slippery Slope Argument

Data Type
Likert
Definition
Argument that one small change will lead to a major change
Frame as a Question
Does the author say that one small change will lead to a major change (use a slippery slope argument)? Highlight the relevant section(s).
Location
Article Content
Potential Markup Effort
Very High
Problematic Toy Example
"If we allow human cloning, we will soon be overrun by armies of human clones."
Proposed Priority for WebConf Paper
High
Release Target
1
Source Team
Public Editor

Orders of Understanding

Data Type
Scale
Good Toy Example
"We need to do a better job exploring the complex set of causes that produced 9/11, among them the rise of Islamic fundamentalism and postcolonial nationalism, as well as the spread of globalism and regional conflicts."
Potential Markup Effort
Very High
Priority Rationale
@Emlen Metz @ADITYA RANGANATHAN
Problematic Toy Example
"The Hurricane Harvey disaster was caused entirely by society creating and perpetuating vulnerability to these natural events."
Proposed Priority for WebConf Paper
Very High
Source Team
Public Editor

Numbers of Argument Components

Data Type
Integer
Frame as a Question
How complex is the argumentation and logic in the article?
Good Toy Example
N/A - has to include example URLs
Known Weakness
Argument mining is relatively recent as an ML technique
Machine Generation from Article
true
Potential Impact
Low
Potential Markup Effort
Very High
Priority Rationale
Not sure about the priority here but I would say very difficult for WebConf
Problematic Toy Example
N/A - has to include example URLs
Proposed Priority for WebConf Paper
High
Rationale for Inclusion
From argument mining literature
Source Team
Factmata

Number of Supporting Premises

Data Type
Integer
Frame as a Question
How far does the article back up arguments with clear hypotheses?
Good Toy Example
N/A - has to include example URLs
Known Weakness
Argument mining is relatively recent as an ML technique
Machine Generation from Article
true
Potential Impact
Medium
Potential Markup Effort
Very High
Priority Rationale
Not sure about the priority here but I would say very difficult for WebConf
Problematic Toy Example
N/A - has to include example URLs
Proposed Priority for WebConf Paper
High
Rationale for Inclusion
From argument mining literature
Source Team
Factmata

Number of Enthymemes (Arguments with Missing Premises)

Data Type
Integer
Good Toy Example
N/A - has to include example URLs
Known Weakness
Argument mining is relatively recent as an ML technique
Machine Generation from Article
true
Potential Impact
Medium
Potential Markup Effort
Very High
Priority Rationale
Not sure about the priority here but I would say very difficult for WebConf
Problematic Toy Example
N/A - has to include example URLs
Proposed Priority for WebConf Paper
High
Rationale for Inclusion
From argument mining literature: http://aclweb.org/anthology/W/W16/W16-2804.pdf
Source Team
Factmata

Number of Claims

Data Type
Integer
Frame as a Question
How far could this article go wrong?
Good Toy Example
N/A - has to include example URLs
Known Weakness
Argument mining is relatively recent as an ML technique
Machine Generation from Article
true
Potential Impact
Low
Potential Markup Effort
High
Priority Rationale
Not sure about the priority here but I would say very difficult for WebConf
Problematic Toy Example
N/A - has to include example URLs
Proposed Priority for WebConf Paper
High
Rationale for Inclusion
From argument mining literature
Source Team
Factmata

Number of Attacking Premises

Data Type
Integer
Frame as a Question
How far does the article contradict itself?
Good Toy Example
N/A - has to include example URLs
Known Weakness
Argument mining is relatively recent as an ML technique
Machine Generation from Article
true
Potential Impact
Medium
Potential Markup Effort
Very High
Priority Rationale
Not sure about the priority here but I would say very difficult for WebConf
Problematic Toy Example
N/A - has to include example URLs
Proposed Priority for WebConf Paper
Medium
Rationale for Inclusion
From argument mining literature
Source Team
Factmata

Number of Arguments For

Data Type
Integer
Frame as a Question
How biased is the article towards one set of premises?
Good Toy Example
N/A - has to include example URLs
Known Weakness
Argument mining is relatively recent as an ML technique
Machine Generation from Article
true
Potential Impact
Medium
Potential Markup Effort
High
Priority Rationale
Not sure about the priority here but I would say very difficult for WebConf
Problematic Toy Example
N/A - has to include example URLs
Proposed Priority for WebConf Paper
Medium
Rationale for Inclusion
From argument mining literature
Source Team
Factmata

Number of Arguments Against

Data Type
Integer
Frame as a Question
How biased is the article towards one set of premises?
Good Toy Example
N/A - has to include example URLs
Known Weakness
Argument mining is relatively recent as an ML technique
Machine Generation from Article
true
Potential Impact
Medium
Potential Markup Effort
High
Priority Rationale
Not sure about the priority here but I would say very difficult for WebConf
Problematic Toy Example
N/A - has to include example URLs
Proposed Priority for WebConf Paper
High
Rationale for Inclusion
From argument mining literature
Source Team
Factmata

Naturalistic Fallacy

Data Type
Likert
Definition
Suggestion that something is good because it is natural or bad because it is not natural
Frame as a Question
Does the author suggest that something is good because it is natural, or bad because it is not natural (the naturalistic fallacy)?
Location
Article Content
Potential Markup Effort
Medium
Problematic Toy Example
"Vaccines aren't natural; they're full of all kinds of lab-manufactured chemicals. That stuff can't possibly be good for you."
Proposed Priority for WebConf Paper
High
Release Target
1
Source Team
Public Editor

Mistaking Noise for Signal

Article Example URL
https://www.sbnation.com/lookit/2016/11/2/13506174/2014-tweet-predicts-world-series-ninth-inning-game-7-cubs-indians
Data Type
Scale
Potential Markup Effort
Very High
Priority Rationale
@Emlen Metz @ADITYA RANGANATHAN
Problematic Toy Example
"The seemingly miraculous Twitter prediction isn’t new, but normally we see it from some never-used account that is kept private and meticulously stocked with every possible permutation — only to be made public and get attention at a later date. That’s not the case for @RaysFanGio. He uses his account all the time — this was a one-off prediction and we should all be scared."
Proposed Priority for WebConf Paper
Very High
Source Team
Public Editor

Just World Fallacy

Identifiable Victim Effect

False Dilemma

Data Type
Likert
Definition
Presentation of a complicated choice as if it is binary
Frame as a Question
Does the author present a complicated choice as if it were binary (construct a false dilemma)? If so, highlight the relevant section(s).
Location
Article Content
Potential Markup Effort
Very High
Problematic Toy Example
"America; love it or leave it! If you're not happy with U.S. policies, you should just move somewhere else."
Proposed Priority for WebConf Paper
Very High
Release Target
1
Source Team
Public Editor

Draws sound conclusions from available evidence

Potential Markup Effort
Very High
Priority Rationale
@Emlen Metz @ADITYA RANGANATHAN
Proposed Priority for WebConf Paper
High

Confirmation Bias

Data Type
Scale
Good Toy Example
“Lots of bad things have happened in the last month. Since bad things have happened under every president, maybe we shouldn’t count that as additional evidence that the current president is awful.”
Known Weakness
Difficult to assess
Potential Markup Effort
Very High
Priority Rationale
@Emlen Metz  @ADITYA RANGANATHAN 
Problematic Toy Example
“I think the world is getting worse. So many bad things have happened in the last few years; just think about it! Famine, mass murder, hurricanes, wildfires. Everything is going wrong.”
Proposed Priority for WebConf Paper
High
Source Team
Public Editor

Causal Claim Types

Data Type
Multiple Choice
Definition
One of the following: General Causal Claim, Singular Causal Claim, No Causal Claim
Frame as a Question
Is a general or singular causal claim made? Highlight the section(s) that supports your answer.
Location
Article Content
Potential Impact
High
Potential Markup Effort
Very High
Release Target
1
Source Team
Public Editor

Calibrating Confidence - Level of Confidence

Article Example URL
http://www.theblaze.com/news/2017/09/02/scientist-shuts-down-climate-change-alarmists-with-new-report-about-hurricane-harvey/
Data Type
Likert
Definition
Expressed confidence in a claim
Frame as a Question
Do they acknowledge uncertainty or the possibility that things might be otherwise? If so, highlight the relevant section(s).
Location
Article Content
Potential Markup Effort
Very High
Problematic Toy Example
"According to Mass, the idea that human-caused climate change had any effect on Harvey is more than far-fetched — it’s downright not true."
Proposed Priority for WebConf Paper
Very High
Release Target
1
Source Team
Public Editor

Calibrating Confidence - Justification

Article Example URL
http://www.theblaze.com/news/2017/09/02/scientist-shuts-down-climate-change-alarmists-with-new-report-about-hurricane-harvey/
Data Type
Likert
Definition
Expressed confidence in a claim, with justification
Frame as a Question
To what extent does their confidence in their claims seem justified?
Location
Article Content
Potential Markup Effort
Very High
Problematic Toy Example
"According to Mass, the idea that human-caused climate change had any effect on Harvey is more than far-fetched — it’s downright not true."
Proposed Priority for WebConf Paper
Very High
Release Target
1
Source Team
Public Editor

Begging the Question

Data Type
Scale
Potential Markup Effort
Very High
Problematic Toy Example
"Capital punishment is wrong because it is immoral to inflict death as a punishment for a crime."
Proposed Priority for WebConf Paper
Medium
Source Team
Public Editor

Average Number of Premises Per Claim

Data Type
Integer
Good Toy Example
N/A - has to include example URLs
Known Weakness
Argument mining is relatively recent as an ML technique
Machine Generation from Article
true
Potential Impact
Low
Potential Markup Effort
Very High
Priority Rationale
Not sure about the priority here but I would say very difficult for WebConf
Problematic Toy Example
N/A - has to include example URLs
Proposed Priority for WebConf Paper
Medium
Rationale for Inclusion
From argument mining literature
Source Team
Factmata
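
As a rough illustration of how this indicator could be derived once an argument-mining step has produced claims and their premises, here is a minimal Python sketch. The input mapping and the function name are hypothetical; the source does not specify an output format for argument mining. Note that an average is generally fractional, so the sketch returns a float even though the Data Type above is listed as Integer.

```python
from typing import Mapping, Sequence


def average_premises_per_claim(premises_by_claim: Mapping[str, Sequence[str]]) -> float:
    """Return the mean number of premises per claim (0.0 when there are no claims)."""
    if not premises_by_claim:
        return 0.0
    total_premises = sum(len(premises) for premises in premises_by_claim.values())
    return total_premises / len(premises_by_claim)


# Hypothetical argument-mining output: each claim mapped to its supporting premises.
example = {
    "Claim A": ["premise 1", "premise 2", "premise 3"],
    "Claim B": ["premise 4"],
    "Claim C": [],  # a claim with no stated premises still counts as a claim
}
print(average_premises_per_claim(example))
```

Counting claims with zero premises in the denominator is a design choice made here for illustration; excluding them would raise the average.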

Appeal to Fear Fallacy

Data Type
Likert
Definition
Exaggeration of the dangers of a situation and use of scare tactics in an attempt to persuade
Frame as a Question
Does the author exaggerate the dangers of a situation and use scare tactics to persuade (the appeal to fear fallacy)?
Location
Article Content
Potential Markup Effort
Very High
Release Target
1
Source Team
Public Editor

Outbound References

Source Types

Data Type
Multiple Choice
Definition
Any of the following: None, Experts, Studies, Organizations, Other
Frame as a Question
Which of the following types of sources are cited in the article? Check all that apply. If Other, please highlight.
Location
Article Context
Potential Impact
Medium
Potential Markup Effort
Medium
Release Target
1
Source Team
Credibility Coalition Study: Web Conference 2018

Quotes reputable scientists

Potential Markup Effort
Medium
Proposed Priority for WebConf Paper
Medium

Quotes outside experts

Data Type
Boolean
Frame as a Question
Does the article quote experts who are not part of the study but are part of the field?
Known Weakness
Quotes or experts could be fabricated
Potential Markup Effort
Medium
Priority Rationale
A good science writer should make sure to quote experts in the field who are not the original authors. This is a miniature reflection of the peer review system. Quotes could be fabricated, of course.
Proposed Priority for WebConf Paper
Very High

Number of quoted sources

Data Type
Integer
Frame as a Question
How many sources does the article quote?
Potential Markup Effort
Medium
Proposed Priority for WebConf Paper
Low

Number of links

Data Type
Integer
Frame as a Question
How many URLs does the article link out to?
Potential Markup Effort
Very Low
Proposed Priority for WebConf Paper
Low
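
Because this indicator is a simple count, it could plausibly be computed from the article HTML. The sketch below uses only the Python standard library. It assumes that "link out to" means distinct http(s) URLs on a host other than the article's own, which is an interpretation and not something the source states; the class and function names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect absolute http(s) URLs found in <a href="..."> tags."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the article URL.
        absolute = urljoin(self.base_url, href)
        if urlparse(absolute).scheme in ("http", "https"):
            self.urls.append(absolute)


def count_outbound_links(html: str, page_url: str) -> int:
    """Count distinct linked URLs whose host differs from the article's host."""
    collector = LinkCollector(page_url)
    collector.feed(html)
    own_host = urlparse(page_url).netloc.lower()
    external = {u for u in collector.urls if urlparse(u).netloc.lower() != own_host}
    return len(external)


sample_html = """
<p>See <a href="https://example.org/study">the study</a>,
<a href="/about">our about page</a>, and
<a href="https://example.org/study">the study again</a>.
<a href="mailto:editor@example.com">Email us</a></p>
"""
print(count_outbound_links(sample_html, "https://news.example.com/story"))
```

In the sample, the relative link and the mailto link are ignored and the repeated study URL is counted once. A study that wants total rather than distinct links would return the length of the filtered list instead of the set.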

Databases

Definition
Databases as a 'Publication' owned by a Publisher
Potential Markup Effort
Medium
Proposed Priority for WebConf Paper
Medium

Contains original quotes

Data Type
Boolean
Frame as a Question
Does the article contain original quotes that appear to be sourced directly by the reporter?
Potential Markup Effort
Medium
Priority Rationale
Original quotes could be fabricated, but if they are not, they show an attention to getting opinions from subject matter experts. However, they could also just indicate a competent mid-level blogger who doesn't have the resources to seek outside perspectives.
Proposed Priority for WebConf Paper
Very High

Contains Video Embeds

Data Type
Multiple Choice
Definition
List of embeddable video sites (YouTube, Vimeo, etc.)
Frame as a Question
Does the article embed content from video sites?
Machine Generation from Article
true
Potential Markup Effort
Very Low
Proposed Priority for WebConf Paper
Low
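
Since this indicator is marked as machine-generatable, one possible approach is to scan the article HTML for embedded players from known video hosts. The sketch below is an assumption-laden illustration: the host list is hypothetical and far from exhaustive (the source names only "YouTube, Vimeo, etc."), and it inspects only the `src` attribute of a few tags.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical, non-exhaustive mapping of embed hosts to site names.
VIDEO_HOSTS = {
    "youtube.com": "YouTube",
    "youtube-nocookie.com": "YouTube",
    "youtu.be": "YouTube",
    "vimeo.com": "Vimeo",
}


class EmbedFinder(HTMLParser):
    """Record which known video sites are embedded in a page."""

    def __init__(self):
        super().__init__()
        self.sites = set()

    def handle_starttag(self, tag, attrs):
        if tag not in ("iframe", "embed", "video", "source"):
            return
        src = dict(attrs).get("src") or ""
        host = (urlparse(src).hostname or "").lower()
        for domain, site in VIDEO_HOSTS.items():
            # Match the domain itself or any subdomain (e.g. player.vimeo.com).
            if host == domain or host.endswith("." + domain):
                self.sites.add(site)


def embedded_video_sites(html: str) -> list:
    """Return the sorted names of known video sites embedded in the HTML."""
    finder = EmbedFinder()
    finder.feed(html)
    return sorted(finder.sites)


sample_html = (
    '<iframe src="https://www.youtube.com/embed/abc"></iframe>'
    '<iframe src="https://player.vimeo.com/video/1"></iframe>'
    '<iframe src="https://example.com/widget"></iframe>'
)
print(embedded_video_sites(sample_html))
```

Returning the list of sites rather than a single yes/no matches the Multiple Choice data type above; an empty list corresponds to "no embeds found".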

Contains Link to Scientific Journals

Data Type
Boolean
Definition
A simple link to a scientific journal article that backs up the assertion made. This may also be paired with a URL to the specific article.
Frame as a Question
Is a link provided in the article to where the original content came from?
Location
Article Context
Potential Impact
Medium
Potential Markup Effort
Low
Proposed Priority for WebConf Paper
Medium
Release Target
1
Source Team
Credibility Coalition Study: Web Conference 2018

Contains Image Macros

Data Type
Boolean?
Definition
Image Macro
Frame as a Question
Does the article contain content with image macros?
Potential Markup Effort
Low
Priority Rationale
Image macros reflect that the site might be focused more on driving clicks. That said, they might also be an effort to make difficult concepts more accessible to a general interest audience.
Proposed Priority for WebConf Paper
High

Contains Attributed images

Data Type
Boolean?
Frame as a Question
Does the article contain images attributed to a photographer or other source?
Potential Markup Effort
Low
Proposed Priority for WebConf Paper
Low

Contains Original Images

Data Type
Boolean?
Frame as a Question
Does the article contain original images?
Potential Markup Effort
Low
Proposed Priority for WebConf Paper
Low

Agencies for Authority

Definition
I think that Agencies could also be considered Publishers, but I think Publisher captures the relationship to the Article.
Potential Markup Effort
Medium
Proposed Priority for WebConf Paper
Medium

Accuracy of representation of source article

Article Example URL
https://climatefeedback.org/evaluation/breitbart-misrepresents-research-58-scientific-papers-falsely-claim-disprove-human-caused-global-warming-james-delingpole/
Data Type
Likert, with Multiple Choice
Definition
This article properly characterizes the methods and conclusions of the cited or quoted source. In addition to a Likert measure, two other options are possible: (A) Unable to find source, (B) Source is behind a paywall
Frame as a Question
Does this article properly characterize the methods and conclusions of the original source?
Location
Article Context
Potential Impact
High
Potential Markup Effort
Very High
Priority Rationale
Ability to accurately reflect the source articles in outbound references requires a good deal of knowledge of the field and of the scientific method in general and may be difficult to replicate.
Problematic Toy Example
"It’s sad that the blogger did not understand what this study is about..."
Proposed Priority for WebConf Paper
Very High
Rationale for Inclusion
This ideally would be done per source article
Release Target
1
Source Team
Credibility Coalition Study: Web Conference 2018

Academic Journal Impact Factor

Data Type
Integer
Definition
From Wikipedia: The impact factor (IF) or journal impact factor (JIF) of an academic journal is a measure reflecting the yearly average number of citations to recent articles published in that journal. It is frequently used as a proxy for the relative importance of a journal within its field; journals with higher impact factors are often deemed to be more important than those with lower ones. https://en.wikipedia.org/wiki/Impact_factor
Frame as a Question
What is the impact factor of the journal or conference cited?
Location
Article Context
Potential Impact
Medium
Potential Markup Effort
Medium
Proposed Priority for WebConf Paper
Medium
Release Target
1

Publication

Working Phone Number

Data Type
Boolean
Frame as a Question
Is there a phone number and does calling the phone number lead someone to a representative of the publication?
Known Weakness
Over time, this could be gamed easily with VOIP services, such that someone outside the region can still operate a number.
Potential Markup Effort
High
Priority Rationale
Having a working phone number is de rigueur for a genuine news outlet. Simply calling the number and seeing if someone representing the outlet responds can indicate if it at least has the resources to operate a phone line. A few questions of the person on the line can also yield insights.
Proposed Priority for WebConf Paper
Very High
Rationale for Inclusion
Context is for something representing itself as a news publisher
Tags
phone number

Trust Project Metadata

Potential Markup Effort
Medium
Proposed Priority for WebConf Paper
Low

Site Analytics

Data Type
Multiple Choice
Frame as a Question
Does the publication use an analytics platform? Which one?
Potential Markup Effort
Low
Proposed Priority for WebConf Paper
Low

Publisher (Publication Owner)

Data Type
Free Text
Definition
Publishers: Individuals or group entities -- see Publisher Table for more related attributes
Dynamic; Change over Time?
true
Frame as a Question
What is the name of the person or organization that published the publication under review?
Good Toy Example
Arthur Ochs Sulzberger, Jr.
Potential Markup Effort
Low
Proposed Priority for WebConf Paper
Low

Publication Type

Data Type
Multiple Choice
Definition
One of the following: Media Outlet, Governmental Agency, Non-profit, Private Corporation
Potential Markup Effort
Medium
Proposed Priority for WebConf Paper
Low

Publication Start Date

Data Type
Date
Potential Markup Effort
Low
Proposed Priority for WebConf Paper
Low

Publication Name

Article Example URL
https://www.nytimes.com/section/t-magazine
Data Type
Free Text
Definition
Publications are parents of articles. Publishers may have 1:M Publications. Publications include magazines, series, newspapers, etc.
Frame as a Question
What is the name of the magazine, newspaper, journal, or book in which the article appeared?
Good Toy Example
New York Times Style Magazine Online
Machine Generation from Article
true
Potential Markup Effort
Low
Problematic Toy Example
New York Times -- the NYT has several versions: a print version that's Metro (different sections, US general, international) and one that's digital, for starters. The NYT is also available via podcast. So I think that perhaps Publication Name must leave room for the media channel (which also might be an attribute to add to this table).
Proposed Priority for WebConf Paper
Low

Publication Identifier

Definition
Publishers may have 1:M Publications
Potential Markup Effort
Low
Proposed Priority for WebConf Paper
Low

Publication End Date

Data Type
Date
Potential Markup Effort
Low
Proposed Priority for WebConf Paper
Low

Publication Domain

Data Type
URL
Potential Markup Effort
Very Low
Proposed Priority for WebConf Paper
Low

Publication CMS

Data Type
Multiple Choice (with option for Other)
Frame as a Question
What is the CMS that the publication uses?
Potential Markup Effort
Medium
Proposed Priority for WebConf Paper
Low

Niche Topic

Data Type
Free Text?
Frame as a Question
Is the publication focused on a niche topic? If so, what is the topic?
Potential Markup Effort
Medium
Proposed Priority for WebConf Paper
Medium

Masthead (Nameplate)

Data Type
Boolean
Frame as a Question
Does the publication have a masthead/nameplate?
Potential Markup Effort
Medium
Proposed Priority for WebConf Paper
Low

Masthead (Imprint)

Data Type
Boolean
Frame as a Question
Does the publication have a clear and logical masthead/imprint?
Potential Markup Effort
Medium
Proposed Priority for WebConf Paper
Low

Links to Relevant Articles

Data Type
Boolean?
Frame as a Question
Does the publication regularly link to other relevant articles on its own site?
Potential Markup Effort
Low
Proposed Priority for WebConf Paper
Low

Language

Data Type
Multiple Choice
Frame as a Question
What language(s) does the publication publish in?
Potential Markup Effort
Low
Proposed Priority for WebConf Paper
Low

Is Part of Press Corps

Data Type
Boolean
Frame as a Question
Is the publication part of the press corps?
Potential Markup Effort
Medium
Proposed Priority for WebConf Paper
High

Has a Wikipedia Entry

Data Type
Boolean
Frame as a Question
Does the publication have an entry in Wikipedia?
Potential Markup Effort
Low
Proposed Priority for WebConf Paper
Low

Code of Principles of fact checking organization

Data Type
Single Choice
Definition
A) IFCN Signatory B) Not IFCN signatory but organization/institution with similar standards and commitments C) Unknown, not discernable
Dynamic; Change over Time?
true
Frame as a Question
If the publication is from a fact-checking organization, what are its commitments to accuracy and other standards?
Good Toy Example
Africa Check (https://africacheck.org/)
Notes
Replaces https://credweb.org/cciv/#claim-has-been-fact-checked
Priority Rationale
One of two indicators replacing https://credweb.org/cciv/#claim-has-been-fact-checked
Rationale for Inclusion
One of two indicators replacing https://credweb.org/cciv/#claim-has-been-fact-checked
Source Team
Credibility Coalition Study: Summer 2018

Clear Editorial Policy

Data Type
Free Text?
Frame as a Question
Where on the website, if anywhere, is a clear editorial policy stated?
Potential Markup Effort
Medium
Proposed Priority for WebConf Paper
Low

Awards

Potential Markup Effort
High
Proposed Priority for WebConf Paper
Medium

Reader Behavior

Volume of readership over time

Definition
What is the nature of the volume of readership over time? Is it spiky? How many spikes are there? Is there a long tail?
Dynamic; Change over Time?
true
Frame as a Question
What is the nature of the volume of readership over time?
Potential Markup Effort
High
Proposed Priority for WebConf Paper
Low

Emotional comment shared in response

Article Example URL
http://www.independent.co.uk/news/uk/politics/manfred-weber-angela-merkel-theresa-may-brexit-will-cause-damage-a8057311.html see comments
Data Type
boolean or integer with valence(+/-)
Definition
The direction and weight of reader interaction with this article.
Dynamic; Change over Time?
true
Frame as a Question
Did the article provoke emotional -- positive or negative -- comments and response (in the discussion space around its content)?
Good Toy Example
"It would be good if brexiteers would be the first to be let go when jobs go.and the last in the NHS waiting lists due to loss of skilled people. Of course the would deny being so stupid to vote brexit then."
Potential Markup Effort
Medium
Proposed Priority for WebConf Paper
Low

Common Referrers

Data Type
category
Definition
Where did readers come from to read this article? Social media, unknown, news org page, or search result page.
Dynamic; Change over Time?
true
Frame as a Question
What is driving traffic to this article?
Potential Markup Effort
High
Proposed Priority for WebConf Paper
Low

Average time spent on page

Data Type
decimal number
Definition
Amount of time people spend reading the article on average
Dynamic; Change over Time?
true
Frame as a Question
How much time do people spend on this page?
Potential Impact
High
Potential Markup Effort
Medium
Priority Rationale
A very low value could be a signal of clickbait. This needs to be balanced against article length, i.e., we should expect a certain amount of time on a page relative to its length. This requires resources we will not have access to for the WebConf paper.
Proposed Priority for WebConf Paper
Low
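
The balancing against article length mentioned in the Priority Rationale could be expressed as a ratio of observed time to expected reading time. The sketch below is only an illustration: the function name is hypothetical, and the default reading speed of 200 words per minute is an assumed placeholder that a real study would need to calibrate.

```python
def reading_time_ratio(avg_seconds_on_page: float, word_count: int,
                       words_per_minute: float = 200.0) -> float:
    """Observed time on page divided by the time needed to read the whole article.

    words_per_minute is an assumed reading speed, not a value from the source.
    """
    if word_count <= 0 or words_per_minute <= 0:
        raise ValueError("word_count and words_per_minute must be positive")
    expected_seconds = word_count / words_per_minute * 60.0
    return avg_seconds_on_page / expected_seconds


# A 1000-word article is expected to take 300 seconds at the assumed speed,
# so an average visit of 30 seconds covers about a tenth of it.
print(reading_time_ratio(30, 1000))
```

A ratio far below 1 would be the "very low value" the rationale describes; where to draw that line is left open here.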

Revenue Model

Top Call to Action for Donations

Frame as a Question
Does the site or article have a topline call to action for donations?
Potential Markup Effort
Low
Rationale for Inclusion
Sometimes this is a flag

Spam or Clickbait Advertisements

Data Type
Likert
Definition
The page of the article has spammy or clickbaity advertisements. This is limited to a subjective assessment at this time.
Frame as a Question
How strongly do you agree or disagree that the page of the article has spammy or clickbaity advertisements?
Location
Article Context
Potential Impact
Medium
Potential Markup Effort
Medium
Priority Rationale
Advertisements that the reader feels, subjectively, are spammy or clickbaity could indicate a site that prioritizes clicks and shares over useful content and therefore represents financially-motivated misinformation.
Proposed Priority for WebConf Paper
High
Release Target
1
Source Team
Credibility Coalition Study: Web Conference 2018

Presence of Sponsors

Frame as a Question
Are the sponsors clearly stated and reputable?
Potential Markup Effort
Low

Presence of Paywall or Subscription

Frame as a Question
Does the site contain a paywall or subscription type? If so, what?
Potential Markup Effort
Low
Rationale for Inclusion
This is actually at the Publication Level, right? not on the Article level

Presence of Freemium Content

Frame as a Question
Does the site contain selectively free content outside of its paywall?
Potential Markup Effort
Medium

Presence of Donors

Frame as a Question
Are the donors clearly stated and reputable?
Potential Markup Effort
Low
Rationale for Inclusion
At the publication level

Number of Advertisements

Data Type
Integer
Definition
There are multiple types of ads to look for. (1) Display ads. These are boxes that are clearly advertisements, typically in the form of a graphic image or, in the case of Google Adwords, a box with text. (2) Content recommendation engines, specifically, Taboola, Outbrain, Tivo, RevContent. A box of content recommendations on a page counts as one. (3) Sponsored content. This is content recommended on the site with a clear label: “Sponsored.” (4) Call for social sharing. (5) Call to subscribe to a mailing list
Frame as a Question
How many ads appear on the article page?
Location
Article Context
Notes
Credibility Coalition Study note: intention to split post Summer 2018
Potential Impact
Medium
Potential Markup Effort
Low
Priority Rationale
The number of ads can be an indicator of a site that is focused on financially-motivated misinformation. A simple tally of ads can reveal a lot, if there’s an unusual number of ads on the page.
Proposed Priority for WebConf Paper
High
Release Target
1
Source Team
Credibility Coalition Study: Web Conference 2018

Aggressive Social Shares

Definition
The page of the article has aggressive social shares, which may include calls to share the article within the text. This is limited to a subjective assessment at this time.
Dynamic; Change over Time?
true
Frame as a Question
How strongly do you agree or disagree that the page of the article has aggressive social shares?
Good Toy Example
A call to share in the content of the article text itself.
Notes
One of two replacing https://credweb.org/cciv/#aggressive-advertisements-or-social-shares
Rationale for Inclusion
Replaces https://credweb.org/cciv/#aggressive-advertisements-or-social-shares which is double barreled.
Release Target
June 2018
Source Team
Credibility Coalition Study: Summer 2018
Supporting Evidence
Credibility Coalition Study: Web Conference 2018 examples

Aggressive Advertisements or Social Shares

Data Type
Likert
Definition
The page of the article has aggressive advertisements. This is limited to a subjective assessment at this time.
Frame as a Question
How strongly do you agree or disagree that the page of the article has aggressive advertisements?
Location
Article Context
Notes
Deprecated for Credibility Coalition Study: Summer 2018
Potential Impact
Medium
Potential Markup Effort
Medium
Priority Rationale
Advertisements that the reader feels, subjectively, are aggressive could indicate a site that prioritizes clicks and shares over useful content and therefore represents financially-motivated misinformation.
Proposed Priority for WebConf Paper
High
Release Target
1
Source Team
Credibility Coalition Study: Web Conference 2018

Aggressive Advertisements

Definition
The page of the article has aggressive advertisements. This is limited to a subjective assessment at this time.
Dynamic; Change over Time?
true
Frame as a Question
How strongly do you agree or disagree that the page of the article has aggressive advertisements, including calls to join a mailing list?
Good Toy Example
An ad that follows you around as you scroll.
Notes
One of two replacing https://credweb.org/cciv/#aggressive-advertisements-or-social-shares
Rationale for Inclusion
Replaces https://credweb.org/cciv/#aggressive-advertisements-or-social-shares which is double barreled.
Release Target
June 2018
Source Team
Credibility Coalition Study: Summer 2018
Supporting Evidence
Credibility Coalition Study: Web Conference 2018 examples

Rhetoric

Title Representativeness Types

Data Type
Multiple Choice
Definition
Types of title representativeness (A) Title is on a different topic than the body (B) Title emphasizes different information than the body (C) Title carries little information about the body (D) Title takes a different stance than the body (E) Title overstates claims or conclusions in the body (F) Title understates claims or conclusions in the body
Frame as a Question
How is the title unrepresentative of the content of the article? (Select all that apply).
Location
Article Content
Potential Impact
Very High
Potential Markup Effort
Medium
Proposed Priority for WebConf Paper
Very High
Release Target
1
Source Team
Credibility Coalition Study: Web Conference 2018

Title Representativeness

Data Type
Likert
Definition
A measure of how representative the content of the title is of the content of the body copy.
Frame as a Question
Does the title of the article accurately reflect the content of the article?
Location
Article Content
Potential Impact
Very High
Potential Markup Effort
Medium
Proposed Priority for WebConf Paper
Very High
Release Target
1
Source Team
Credibility Coalition Study: Web Conference 2018

Spelling Errors

Data Type
Integer
Frame as a Question
How many spelling errors does the article have?
Potential Markup Effort
Low
Proposed Priority for WebConf Paper
Medium

Reading Level

Data Type
Integer
Potential Markup Effort
Low
Proposed Priority for WebConf Paper
Low

Proportion (Exaggeration - Minimization Spectrum)

Article Example URL
Examples taken from https://en.wikipedia.org/wiki/Exaggeration, https://en.wikipedia.org/wiki/Minimisation_(psychology)
Data Type
Scale
Definition
The extent to which language in the text is proportional to the situation, or exaggerates or minimizes events. Exaggeration as defined in Webster 1913, "...the act of doing or representing in an excessive manner;..." and Minimization as defined in Oxford Living Dictionary accessed June 2018, "1.1 Represent or estimate at less than the true value or importance"
Frame as a Question
Is the description an extreme exaggeration, an extreme minimization, or proportional to the event or situation described?
Good Toy Example
"[Certain people group] are either all good or all bad," or "It's just a flesh wound" (_Monty Python and the Holy Grail_ Black Knight's response to his having his left arm severed)
Location
Article Content
Notes
DRAFT. Replacing part of https://credweb.org/cciv/#exaggeration for Credibility Coalition study
Rationale for Inclusion
Refining 'overly emotional language' indicator into multiple
Release Target
June 2018
Source Team
Credibility Coalition Study: Summer 2018

Politicizing Tone

Data Type
Scale
Frame as a Question
Does the content of the article politicize an issue unrelated to politics?
Potential Markup Effort
Medium
Proposed Priority for WebConf Paper
Medium

Overly Emotional Language

Article Example URL
http://www.thedailybeast.com/gop-senators-to-sick-americans-drop-dead
Data Type
Scale
Potential Markup Effort
Medium
Problematic Toy Example
"Ted Cruz emitted a string of meaningless slime."
Proposed Priority for WebConf Paper
Medium
Source Team
Public Editor

Number of Exclamation Points

Data Type
Integer
Frame as a Question
How many exclamation marks appear in the article?
Potential Markup Effort
Very Low
Proposed Priority for WebConf Paper
Low

Hyperpartisanship / Political bias

Data Type
Scale
Definition
Extreme political bias, e.g. unconstructive political discourse
Potential Markup Effort
High
Priority Rationale
This needs to be operationalized further, but evidence of hyperpartisanship can indicate a political agenda. That said, it's important to consider this in the context of genre, e.g., in sports and opinion sections we might allow for a certain degree of partisan language.
Proposed Priority for WebConf Paper
High

Hate speech

Data Type
Boolean, possibly categorical
Definition
Usage of abusive or toxic language, e.g. racism, sexism, etc.
Known Weakness
People seem able to get around the existing algorithms to some extent, with some thought
Machine Generation from Article
true
Potential Markup Effort
High
Proposed Priority for WebConf Paper
Medium
Rationale for Inclusion
Discussion of the challenges in doing this via machines: https://www.technologyreview.com/s/603735/its-easy-to-slip-toxic-language-past-alphabets-toxic-comment-detector/

Grammatical Rules

Data Type
Integer
Frame as a Question
Does the article follow rules of US or UK English grammar?
Potential Markup Effort
Low
Proposed Priority for WebConf Paper
Medium
Rationale for Inclusion
We have limited scope to US and UK English; will need to expand this with other forms of English. For this and spelling errors you could use a ratio to the number of words.
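
The ratio suggested in the rationale above could be sketched as errors per 100 words, so that long and short articles are comparable. This is a minimal illustration: the error count is assumed to come from some separate spelling or grammar checker that the source does not name, and the word-splitting rule is a simplification.

```python
import re


def errors_per_100_words(error_count: int, text: str) -> float:
    """Normalize a spelling or grammar error count by article length."""
    # Simplified tokenization: runs of letters, digits, and apostrophes count as words.
    words = re.findall(r"[A-Za-z0-9']+", text)
    if not words:
        return 0.0
    return 100.0 * error_count / len(words)


# Three errors in a 200-word text.
sample_text = "word " * 200
print(errors_per_100_words(3, sample_text))
```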

Exaggeration

Article Example URL
http://www.thedailybeast.com/gop-senators-to-sick-americans-drop-dead
Data Type
Scale
Notes
Deprecated for Credibility Coalition Study: Summer 2018
Potential Markup Effort
Very High
Problematic Toy Example
"Republicans in Congress are pushing a bill that literally no one wants."
Proposed Priority for WebConf Paper
Medium
Source Team
Public Editor

Exaggerated Claims

Data Type
Likert
Definition
Claims are exaggerated, as indicated by the tone
Frame as a Question
Does the author exaggerate any claims? If so, highlight the relevant section(s).
Location
Article Content
Potential Impact
High
Potential Markup Effort
High
Release Target
1

Emotionally Charged Tone

Data Type
Likert
Definition
Article has an emotionally charged tone
Frame as a Question
Does the article have an emotionally charged tone (e.g., outrage, snark, celebration, horror)? If so, highlight the relevant section(s).
Location
Article Content
Potential Markup Effort
High
Proposed Priority for WebConf Paper
High
Release Target
1
Source Team
Public Editor

Emotional Valence

Data Type
Scale
Frame as a Question
Is the language extremely negative, extremely positive, or somewhere in the middle?
Good Toy Example
"VADER is VERY SMART, handsome, and FUNNY!!!"
Location
Article Content
Machine Generation from Article
true
Rationale for Inclusion
Refining the 'overly emotional language' indicator into multiple indicators
Release Target
June 2018
Source Team
Credibility Coalition
Supporting Evidence
VADER (Valence Aware Dictionary and sEntiment Reasoner) Natural Language Processing library, available at https://github.com/cjhutto/vaderSentiment
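To illustrate what a machine-generated valence score on a negative-to-positive scale might look like, here is a minimal lexicon-based sketch. It is a toy stand-in and not VADER: the lexicon `TOY_LEXICON`, the functions `toy_valence` and `valence_label`, and the 0.5 cutoff are all hypothetical choices made for illustration only.

```python
import re

# Hypothetical toy lexicon. Real tools such as VADER ship far larger,
# human-validated lexicons with graded intensities.
TOY_LEXICON = {
    "smart": 1.0,
    "handsome": 1.0,
    "funny": 1.0,
    "awful": -1.0,
    "terrible": -1.0,
    "horrible": -1.0,
}


def toy_valence(text: str) -> float:
    """Mean valence of lexicon words found in the text, in [-1, 1].

    Returns 0.0 when no lexicon word is present.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    scores = [TOY_LEXICON[t] for t in tokens if t in TOY_LEXICON]
    return sum(scores) / len(scores) if scores else 0.0


def valence_label(score: float, cutoff: float = 0.5) -> str:
    """Map a score onto the three answers in the framing question.

    The cutoff is an assumed value, not one taken from the study.
    """
    if score >= cutoff:
        return "extremely positive"
    if score <= -cutoff:
        return "extremely negative"
    return "somewhere in the middle"


label = valence_label(toy_valence("VADER is VERY SMART, handsome, and FUNNY!!!"))
```

Unlike VADER, this sketch ignores capitalisation and punctuation emphasis, so "VERY SMART" and "smart" score the same here.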

Contains Profanity

Data Type
Boolean
Frame as a Question
Does the article contain profanity?
Machine Generation from Article
true
Potential Markup Effort
Low
Priority Rationale
In the context of breaking news from citizen reporters, this can sometimes indicate credibility: http://www.bbc.com/news/magazine-33365782
Proposed Priority for WebConf Paper
High
Rationale for Inclusion
This can sometimes indicate credibility: http://www.bbc.com/news/magazine-33365782
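Because this indicator is marked as machine-generable, a word-list check is one plausible starting point. The sketch below is an assumption about how that might be done, not a documented method: the function `contains_profanity` and the placeholder list `PROFANITY` are hypothetical, and a real deployment would need a maintained list for each language.

```python
import re

# Hypothetical placeholder list; a real list would be much longer.
PROFANITY = frozenset({"damn", "crap"})


def contains_profanity(text: str, wordlist: frozenset = PROFANITY) -> bool:
    """Return True if any whole word in the text is on the word list."""
    # Matching whole tokens avoids flagging innocent words that merely
    # contain a listed word as a substring (e.g. "scrapyard").
    tokens = re.findall(r"[a-z']+", text.lower())
    return any(token in wordlist for token in tokens)


flagged = contains_profanity("Well, damn.")
```

A plain word list will miss deliberately obfuscated spellings, so the Boolean result is best read as a lower bound.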

Contains Hyperbolic Language

Data Type
Boolean
Frame as a Question
Does the article contain hyperbolic language?
Potential Markup Effort
Medium
Priority Rationale
Hyperbolic language could indicate a political bias or overt attempts to sway audiences without substantive evidence. This should always be considered in the context of genre, as we might expect more hyperbolic language in, for instance, the Sports and Style sections.
Proposed Priority for WebConf Paper
High

Cognitive Distortion

Data Type
Scale
Definition
The representation of something in an excessive manner.
Potential Markup Effort
Very High
Proposed Priority for WebConf Paper
Medium

Clickbait Headline

Data Type
Likert
Definition
A measure of how much the title of the article conforms to a predetermined set of clickbait genres.
Frame as a Question
Is the headline clickbaity?
Known Weakness
Major news outlets are starting to borrow clickbait conventions to attract readership
Location
Article Content
Potential Impact
Very High
Potential Markup Effort
Medium
Priority Rationale
A highly clickbaity headline can indicate that the article is engineered for surface-level virality. This may be particularly instructive when seen in the context of reader behavior.
Proposed Priority for WebConf Paper
Very High
Release Target
1
Source Team
Credibility Coalition Study: Web Conference 2018

Clickbait Genres

Data Type
Multiple Choice
Definition
A typology of clickbait headlines: Listicle (“6 Tips on …”); Cliffhanger to a story (“You Won’t Believe What Happens Next”, “Man Divorces His Wife After Overhearing This Conversation”); Provoking emotions, such as shock or surprise (“...Shocking Result”, “...Leave You in Tears”); Hidden secret or trick (“Fitness Companies Hate Him...”, “Experts are Dying to Know Their Secret”); Challenges to the ego (“Only People with IQ Above 160 Can Solve This”); Defying convention (“Think Orange Juice is Good for you? Think Again!”, “Here are 5 Foods You Never Thought Would Kill You”); Inducing fear (“Is Your Boyfriend Cheating on You?”); Other.
Frame as a Question
What clickbait techniques does this headline employ (select all that apply)?
Location
Article Content
Potential Impact
Very High
Proposed Priority for WebConf Paper
Very High
Release Target
1
Source Team
Credibility Coalition Study: Web Conference 2018
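The typology above is intended for human annotators, but a rough pattern-based pre-tagger can illustrate how its "select all that apply" structure might be encoded. This is a sketch under stated assumptions: the regular expressions in `GENRE_PATTERNS` are hypothetical, derived only from the example headlines in the definition, and the function `clickbait_genres` is not part of any study tooling.

```python
import re

# Hypothetical patterns, keyed by genre name from the typology.
# They are built from the example headlines only and will miss most
# real-world variants.
GENRE_PATTERNS = {
    "Listicle": r"^\s*\d+\s+\w+",
    "Cliffhanger": r"you won[’']t believe|what happens next|after overhearing",
    "Provoking emotions": r"shocking|in tears",
    "Hidden secret or trick": r"\bsecret\b|\btrick\b|hate him",
    "Challenges to the ego": r"only people with",
    "Defying convention": r"think again|never thought",
    "Inducing fear": r"cheating on you",
}


def clickbait_genres(headline: str) -> list:
    """Return every genre whose pattern matches the headline.

    An empty list means no pattern matched; deciding between "Other"
    and "not clickbait" is left to a human annotator.
    """
    return [
        genre
        for genre, pattern in GENRE_PATTERNS.items()
        if re.search(pattern, headline, re.IGNORECASE)
    ]


genres = clickbait_genres("6 Tips on Saving Money")
```

Returning a list rather than a single label mirrors the multiple-choice data type, since one headline can combine several techniques.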

Astroturfing

Data Type
Boolean
Definition
The practice of masking the sponsors of a message.
Potential Markup Effort
Very High
Proposed Priority for WebConf Paper
Medium
Rationale for Inclusion
This seems miscategorised, to be honest. It's an attribute of the author/publication, not the rhetoric.

Apophasis

Data Type
Scale
Definition
A rhetorical device wherein the writer brings up a subject by either denying it, or denying it should be brought up
Potential Markup Effort
High
Proposed Priority for WebConf Paper
Very Low