Status of This Document
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.
This document was published by the Credible Web Community Group as an
Editor's Draft.
GitHub Issues are preferred for
discussion of this specification.
Publication as an Editor's Draft does not imply endorsement by the
W3C Membership. This is a draft document and may be updated, replaced or
obsoleted by other documents at any time. It is inappropriate to cite this
document as other than work in progress.
This document was produced by a group
operating under the
W3C Patent Policy.
The group does not expect this document to become a W3C Recommendation.
W3C maintains a
public list of any patent disclosures
made in connection with the deliverables of
the group; that page also includes
instructions for disclosing a patent. An individual who has actual
knowledge of a patent which the individual believes contains
Essential Claim(s)
must disclose the information in accordance with
section 6 of the W3C Patent Policy.
This document is governed by the
1 March 2019 W3C Process Document.
1. Introduction
1.1 Purpose
This document is intended to support an ecosystem of interoperable credibility tools. These software tools, which may be components of familiar existing systems, will gather, process, and use relevant data to help people more accurately decide what information they can trust online and protect themselves from being misled. We expect that an open data-sharing architecture will facilitate efficient research and development, as well as an overall system which is more visibly trustworthy.
The document has three primary audiences:
- Software developers and computer science researchers wanting to build systems which work with credibility data. For them, the document aims to be a precise technical specification, stating what they need for their software to interoperate with any other software which conforms to this specification.
- People who work in journalism and want to review and contribute to this technology sphere, to help make sure it is beneficial and practical.
- Non-computer-science researchers, interested in helping develop and improve the science behind this work.
In general, we intend for this document to be:
- Welcoming for implementers of systems using credibility data
- Easy for non-tech folks to understand the proposed signals & contribute
- Practical to maintain by the editors
- Practical to contribute to, for a wide audience
- A source of accurate guidance about signal quality and adoption
1.2 Credibility Data
The document builds on concepts and terminology explained in Technological Approaches to Improving Credibility Assessment on the Web. Our basic model is that an entity (human and/or machine) is attempting to make a credibility assessment — to predict whether something will mislead them or others — by carefully examining many different observable features of that thing and things connected with it, as well as information provided by various related or trusted sources.
To simplify and unify this complex situation, with its many different roles, we model the situation as a set of observers, each using imperfect instruments to learn about the situation and then recording their observations using simple declarative statements agreed upon in advance. Because those statements are inputs to a credibility assessment process, we call them credibility signals. (The term credibility indicators is sometimes also used.)
This document, then, is a guide to these signals. It states what each observer might say and exactly how to say it, along with other relevant information to help people choose among the possible signals and understand what it means when they are used.
Because this is a new and constantly-changing field, we do not simply state which signals should be used. Instead, we list possible signals that one might reasonably consider using, along with information we expect to be helpful in making the decision.
1.3 Example
[explain]
Assessing credibility of https://news.example/article-1
Looking at title
I consider it to be clickbait
It's clickbait because it's a cliffhanger
Looking at article
It cites scientific research
Looking at provider
Established in 1974
Owned domain since 2006
1.4 Factors in Selecting Signals
When building systems which use credibility signals and trying to decide which signals to use, there are different factors to weigh. This section is aspirational; we hope this document will in time provide guidance on all these factors.
1.4.1 Measurement Challenges
There are factors about how difficult it is to get an accurate value for the signal:
- Do people independently observing it get approximately the same value?
- Do observations vary with the culture, location, language, age, beliefs, etc, of the people doing the observation?
- Would the same people make the same observation in future months or years?
- How much time and effort does it take people to make the observation?
- Do people need to be trained to make this specific observation?
- What kind of general training do people need (eg a journalism degree) to do it?
- How do machines compare to humans in making this observation, in terms of cost, quality, types of errors, and susceptibility to being tricked?
Many of these factors can be measured using inter-rater reliability (IRR) techniques. When studies have made such measurements, our intent is to include that data in this document.
Here is a table of the data we have. Excerpts are listed with the relevant signals.
Quotes from Outside Experts
Citation of Organizations
Confidence - Extent Claims Justified
Confidence - Acknowledge Uncertainty
Logical Fallacies - Straw Man
Logical Fallacies - False Dilemma
Logical Fallacies - Slippery Slope
Logical Fallacies - Appeal to Fear
Logical Fallacies - Naturalistic
Tone - Emotionally Charged
Tone - Exaggerated Claims
Inference - Type of Claims
Inference - Convincing Evidence
1.4.2 Value in Credibility Assessment
Another important set of factors relates to how useful the measurement is in assessing credibility, assuming the observation itself is accurate.
- Does the signal have a strong correlation to content accuracy, itself determined by consensus among experts?
- Is it particularly indicative of credibility when used in combination with other signals? (For example, as part of computing the value of a latent variable.)
- Is it conceptually easy for people to understand?
- Do professionals in the field think it's likely to be a useful signal?
- How dependent are these characteristics on the culture or time period being considered?
- How dependent are these characteristics on the subject matter of the information being assessed for credibility?
1.4.3 Feedback Risks (“Gameability”)
One should also consider how the overall ecosystem of content producers and consumers might be changed by credibility tools adopting the signal. Once attackers see it’s being used, a signal that works well today might stop working, or even be used to make things worse. See Feedback Risks.
- Is it disproportionately useful for attackers (eg viral call to action)? If so, making this a negative credibility signal should generally be beneficial.
- Is it disproportionately expensive for attackers (eg journalistic language)? If so, making this a positive credibility signal should generally be beneficial.
- Who might get impacted by “friendly fire”? Even if adopting a signal might — on average — harm attackers more than everyone else, certain individuals or communities who have done nothing wrong might be penalized. Tradeoffs must be carefully made, ideally in a consensus process with the impacted people.
1.4.4 Interoperability
The value of sharing signal data depends on how that signal is used by other systems.
- Are others producing data using this signal?
- Are there useful data sets available?
- Are others consuming data, paying attention to reported observations of this signal?
- Are there tools which work with it, eg running statistics?
- Is the definition clear and unambiguous, so people using it mean the same thing?
- Are there clear examples?
- Is there an open history of commentary, with questions and answers, and issues being addressed by various implementers?
- Is documentation available in multiple languages?
- If the definition is under development, how can one participate?
- If the definition could possibly change, who might change it, and under what circumstances?
- Are there any intellectual property considerations? See W3C Patent Policy.
- Is there a test suite / validation system for helping confirm that an implementation is working properly?
- Are there implementation reports, confirming that tools are functioning properly, according to the testing system? (For an example, see ActivityPub).
1.5 Publishing Credibility Data
TBD, basically follow schema.org technique using JSON-LD.
1.6 Consuming Credibility Data
TBD, point to some tools and the relevant specs. Basically JSON-LD.
1.7 Organization of this document
Section 1 (“Introduction”) provides instructions for how to use and help maintain this document, along with general background information.
The rest of this document, after the introduction, is a list of signals and information about them, as discussed in the introduction. The signals are organized into related groups, in hierarchical sections. At the lower levels of the hierarchy are the signals themselves, while the higher levels provide grouping of the signals, to help people understand them.
One important level of the hierarchy identifies the subject type of the signal. This is the conceptual entity being examined, considered, or inspected, when one makes the observation being recorded in the signal data. This could be imagined in different ways: when you are observing a claim made in the 3rd paragraph of an article published in some newspaper, are you observing the claim, the paragraph, the article, the newspaper, or even the author of the article? In general, we aim for the smallest granularity that makes sense, which in this case would probably be the claim.
At times, it may not be obvious to which subject type a signal belongs, or it could sensibly belong with several different ones. In this case, it might be moved to a different section in the document as people come to understand it better. When it’s not clear, there should be links from the places a signal could reasonably be to the place it actually is.
This may require discussion, and might remain open for debate. When a signal or group of signals makes sense in two places, consider linking it from the places it isn’t, to help people find it.
In many cases, a signal could be seen as a set of similar signals which are not strictly identical. This can be handled by adding additional signal headings with the finer distinction, when necessary. In this case, template statements might appear under more than one signal.
Note that sections may be moved and renumbered. Do not rely on section numbers remaining the same. For linking to a part of the document, consider using the gdocs h.xxxxxx fragment ids, provided by the Table of Contents; those should remain stable. Also, whenever changing a heading, especially a signal heading, if someone might be referring to it by name, please move the old text into a paragraph starting “Also called:”.
1.8 Template Statements
The most important thing about a signal definition is to be clear what observation the signal data is recording. If the signal heading is “Article length”, does that mean length in words or bytes or characters or some other metric? Does it include the title? For each signal, we want an easy way to communicate its definition that is short but clear, while being as detailed as necessary.
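To make the ambiguity concrete, here is a minimal sketch that measures one short text three ways. The function name and the sample text are made up for illustration and are not defined by this document.

```python
def length_metrics(text: str) -> dict:
    """Return several plausible readings of "length" for the same text."""
    return {
        "words": len(text.split()),               # whitespace-separated tokens
        "characters": len(text),                  # Unicode code points
        "bytes_utf8": len(text.encode("utf-8")),  # size when encoded as UTF-8
    }


# Hypothetical article body; "\u00e9" is a single code point that takes 2 bytes in UTF-8.
sample = "Caf\u00e9 prices rise"
metrics = length_metrics(sample)
print(metrics)
```

The three numbers differ even for this tiny input, which is why a signal definition needs to say which one it means.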
The technique we use here is to express the semantics of the signal using plain and simple sentences in natural language which convey the same knowledge as the signal data. If you imagine people using credibility software exchanging these statements (perhaps in text messages or on Twitter), you should get the right semantics. You can assume metadata, like who sent it and when it was sent, is available, so the statements can include terms like “I” and “now”.
For machine-to-machine data interoperability, these template sentences and the signal heading are turned into a data schema, after which the JSON-LD/schema.org/semantic web/linked data technology stack can be used.
The statements we use are templates because they abstract over a variety of similar sentences which differ in specific limited ways. For example, these statements:
- I have examined the article at https://example.com/alice and find it highly credible
- I have examined the article at https://example.com/brian and find it highly credible
- I have examined the article at https://example.com/casey and find it highly credible
are all the same, except in the URL. We convey this using a template statement, which has a variable portion in square brackets, like:
- I have examined the article at [subject] and find it highly credible
Tech note | If we (automatically or manually) map this template to a property with the pname :iHaveExaminedHighlyCredible, then sentence number 2 above would be encoded in Turtle as { <https://example.com/brian> :iHaveExaminedHighlyCredible true }. Alternatively, we could make it a class, but boolean-valued properties may be better, so that all signals remain properties. |
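As a rough illustration of the boolean-property encoding, the sketch below writes the three example statements as standard Turtle triples. The namespace IRI and the function names are assumptions made for this example. This document does not define them, and the pname is simply taken as given rather than derived from the template.

```python
# Hypothetical namespace for signal properties; not defined by this document.
SIGNAL_NS = "https://example.org/signals#"


def encode_boolean_signal(subject_url: str, pname: str, value: bool = True) -> str:
    """Encode one observation as a Turtle triple with a boolean object.

    Assumes subject_url contains no characters that need escaping in an IRI.
    """
    literal = "true" if value else "false"  # Turtle boolean shorthand
    return f"<{subject_url}> :{pname} {literal} ."


def to_turtle(subject_urls, pname: str) -> str:
    """Build a small Turtle document: one prefix line, then one triple per subject."""
    lines = [f"@prefix : <{SIGNAL_NS}> ."]
    lines += [encode_boolean_signal(url, pname) for url in subject_urls]
    return "\n".join(lines)


doc = to_turtle(
    [
        "https://example.com/alice",
        "https://example.com/brian",
        "https://example.com/casey",
    ],
    "iHaveExaminedHighlyCredible",
)
print(doc)
```

Each subject becomes one triple, so the only thing that varies between lines is the URL, mirroring the template.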
The bracketed template expression “[subject]” is required in every template, to indicate what entity is being observed. Additional bracket expressions can be used when there are other elements of the statement to make variable. In particular, [string] (for text in quotes) and [number].
(For now, try to just use those three. Software and documentation is being developed to allow more features. If you find this too restrictive, go ahead and write something else inside the square brackets and we'll deal with it later, but include a question mark so it's clear you knew you were making it up.)
An example needing multiple variables:
- https://example.com/alice took 4.75 seconds to load, just now.
- https://example.com/brian took 5.9 seconds to load, just now.
could be matched by:
- [subject] took [number] seconds to load, just now.
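One possible way to check a statement against a template is to turn the template into a regular expression. The sketch below supports only the three bracket expressions named above. The slot patterns (what counts as a subject, number, or string) are assumptions for illustration, not part of this specification, and each slot is assumed to appear at most once per template.

```python
import re

# Assumed patterns for each slot; a real implementation may be stricter.
SLOT_PATTERNS = {
    "subject": r"(?P<subject>\S+)",             # a URL with no whitespace
    "number": r"(?P<number>-?\d+(?:\.\d+)?)",   # integer or decimal
    "string": r'"(?P<string>[^"]*)"',           # text in straight double quotes
}


def compile_template(template: str) -> re.Pattern:
    """Turn a template statement into a regular expression."""
    # With one capturing group, re.split alternates literal text and slot names.
    parts = re.split(r"\[(subject|number|string)\]", template)
    pieces = []
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pieces.append(re.escape(part))      # literal text must match exactly
        else:
            pieces.append(SLOT_PATTERNS[part])  # slot becomes a named group
    return re.compile("".join(pieces))


def match_statement(template: str, statement: str):
    """Return the slot values if the statement fits the template, else None."""
    found = compile_template(template).fullmatch(statement)
    return found.groupdict() if found else None


template = "[subject] took [number] seconds to load, just now."
result = match_statement(
    template, "https://example.com/alice took 4.75 seconds to load, just now."
)
print(result)
```

Values come back as strings; converting the number slot to a numeric type is left to the caller.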
1.9 Instructions for editing this document
As an experiment, this document is currently set so everyone can edit it, like Wikipedia. It is the Google docs version that is editable. We suggest you change the “Editing Mode” to “Suggesting” (using the pencil icon in the upper-right) until you are quite familiar with this document. You may also comment using the usual Google Docs commenting features.
If you make or suggest any edits to this document, you are agreeing to the W3C Community Contributor License Agreement which has significant copyright and patent implications.
The subsections below give some advice for how to make edits which are helpful.
1.9.1 Expand discussion
Each section should begin with a short introduction written with a neutral point of view, reflecting consensus about why the signal might be useful and what the risks might be. To enable consensus among a broad community, the intent is for this text to be developed iteratively, with each contributor adding their perspective while respecting what is already present.
Questions and minor concerns should generally be added as annotations using the “Add a Comment” function, without editing the document. If they become issues requiring back-and-forth discussion, they should be turned into GitHub issues and linked from the most relevant place in this document with a paragraph starting “Issue:”.
These discussion sections are intended to be nonnormative. That is, they do not say how software using the signal is required to behave for interoperability. The normative content of this specification is the template statements and the mapping of the statements to RDF.
1.9.2 Add new template statements
If you are confident you understand what a signal is intended to measure, and think you can provide a template statement which expresses it more clearly and simply, with little ambiguity, please add a new row to the bottom of the “Proposed template statements” table and add your entry. Please also put the next higher number in the Key field for reference, and your name in the By field. This “by” field is optional; it is intended to help simplify discussion, telling people who to talk to, and to give some credit. Listing the name of a large group in this field is not particularly useful.
After adding an entry, for a short time (perhaps a few hours, guided by any comments on it) it’s okay to edit it if you change your mind. After that, please leave it, and just add a new row for the new version. You can put new versions in the middle of the table and use keys like 1a.
1.9.3 Add new signals
Once you’re familiar with the structure of this document and all the signals in your area of interest, you may add new signal sections (with a title starting “Signal:”) or even new group sections. (For heading numbering, you can use the “Table of contents” add-on from LumApps to number the headers. Or just leave the numbering for someone else using the add-on.)
When you add a new signal, please copy this table to the new section, and then fill in at least one row to clarify what the signal data conveys.
Key | Proposed Template Statement | By |
| | |
1.10 Contributors
Folks who add content to this document are encouraged to add themselves in this section, potentially with some affiliation & credential information. This also allows the “By” column to stay short, as people can use short forms of names (eg only first or last name, if unique in this doc).
- Entries marked as by “Credibility Coalition” are prior work by members of the Credibility Coalition. At the time, individual authorship information was not maintained. Moving forward, specific authorship detail is welcome.
- (add yourself here...)
1.11 Sources
This document is assembled from multiple data sources. They provide both the overall structure of this document and the details about each signal, including definitions, example data, and implementation status. These sources are fetched starting with a source list, which appears as the first entry below. In general, text in this document links back to its source with a link-out icon.
The sources used for this current view were:
If you want to privately experiment with bookmarkable alternative views generated using a different source list, try Custom View of Credibility Signals.
2. Subject type: Claim
This section is for signals about claims.
A claim is “an assertion that is open to disagreement; equivalently, a meaningful declarative sentence which is logically either true or false (to some degree); equivalently, a proposition in propositional logic.” [credweb report]
Claims can be stated (with varying degrees of clarity) in some content or implied by the content (even non-textual content, like a photograph).
Claims are usually the smallest practical granularity. Credibility data about claims is largely focussed on what other sources have said about that claim, as in fact checking, but could also involve relationships between claims and textual analysis of claim text.
2.1 Claim Review
The “ClaimReview” model developed at schema.org grows out of the tradition of independent, external fact-checking, as in PolitiFact. With this model, a fact-checker reviews a claim, typically made by a public figure, and then publishes a review of that claim, a “claim review”. Within schema.org, this parallels other reviews, like restaurant reviews.
[ Can we fit claimreview neatly into this observer/signal model? It’s a bit of a stretch. TBD. ]
2.1.1 Signal: Fact-check status of claim
From Section 7.7.1. Signal: Article has a central claim. According to the Credibility Coalition WebConf2018 study and more recent studies, claims in articles can have the following fact-check results at the time of the study: false, true, unclear, mixed. Not finding a fact-check is equivalent to an empty statement.
Interoperability with ClaimReview: This signal seems to relate to https://schema.org/reviewRating and bestValue of https://schema.org/ClaimReview, with bestValue in this case equal to VERIFIED; further discussion is needed with members of schema.org to confirm.
2.1.2 Signal: Fact-check status of claim — VERIFIED
Ref | Definition (Template) | Tags |
0bd70468 | An IFCN signatory did a fact-check and verified claim [Claim]. | |
cbd33df5 | The fact-check result by [Venue] of claim [Claim] is that it is TRUE. | |
2.1.3 Signal: Fact-check status of claim — REFUTED
Ref | Definition (Template) | Tags |
16c7ee48 | An IFCN signatory did a fact-check and refuted claim [Claim]. | |
97e6d1a0 | The fact-check result by [Venue] of claim [Claim] is that it is FALSE. | |
2.1.4 Signal: Fact-check status of claim — UNCLEAR
Ref | Definition (Template) | Tags |
c9434942 | The fact-check result by [Venue] of [Claim] is UNCLEAR. | |
2.1.5 Signal: Fact-check status of claim — MIXED
Ref | Definition (Template) | Tags |
9a0da6b7 | The fact-check result by [Venue] of [Claim] is that the claim contains elements that are TRUE and FALSE. | |
2.1.7 Signal: Claim - Coded Meaning
Ref | Definition (Template) | Tags |
9f86e8b2 | [ClaimA] is a claim that equals another claim, [ClaimB]. | |
Developed for CredCo Political Indicators Study 2018-19. Original example question: “Are there claims that contain phrases, words, or coded language that have taken on a special loaded meaning, in the understanding of the speaker and audience?”, with an example of "go to work," used as code for killing during the Rwandan genocide.
Can be used in connection with 7.2.3. Signal: Generalization/Characterization of Group.
2.2 Fact-checking Organization
Signals below [2.2.1. Signal: Fact-checking Organization commitments — member of the IFCN, 2.2.2. Signal: Fact-checking Organization commitments — accuracy and professionalism, and 2.2.3. Signal: Fact-checking Organization commitments — unknown ] were developed in combination with those under 7.8. Claims in Articles, and originally expressed as a question:
If the publication is from a fact-checking organization, what are its commitments to accuracy and other standards?
A) IFCN Signatory
B) Not IFCN signatory but organization/institution with similar standards and commitments
C) Unknown, not discernable
2.2.1 Signal: Fact-checking Organization commitments — member of the IFCN
Ref | Definition (Template) | Tags |
177559e3 | [Organization] which published fact-check [Webpage] is a member of the International Fact-Checking Network at Poynter (IFCN) on [date]. | |
2.2.2 Signal: Fact-checking Organization commitments — accuracy and professionalism
Ref | Definition (Template) | Tags |
ec6b8ebf | [Organization] has expressed commitments to accuracy and other fact-checking professional standards similar to IFCN organizations. | |
35a66989 | [Organization] has expressed commitments to accuracy and other fact-checking professional standards. | |
2.2.3 Signal: Fact-checking Organization commitments — unknown
Ref | Definition (Template) | Tags |
c0632f51 | [Organization]’s commitments to accuracy and other professional standards are unknown. | |
2.3 Explicitly Unverified Claims
From CredCo Political Indicators Study 2018-2019: in some cases, articles may reference claims or pieces of information that do not contain citations or references. In some cases, within an article, an author can make explicit reference to a claim that has not been verified, using language that specifies that the claim has not been validated or proven to be true. This includes language in an article explicitly referencing that a claim has not yet been verified to date - but the claim is being mentioned in the article nonetheless.
This is used in connection with 7.7.2. Signal: Article has a claim.
Key | Proposed Template Statement | By |
1 | [Claim] is explicitly unverified, containing language such as “charges have not been proven true.” | CredCo |
3. Subject type: Text
Includes: phrase, sentence, paragraph, document, document fragment
A text, in this sense, is a sequence of words, with the usual punctuation, and sometimes embedded multimedia content or meaningful layout, like tables. That is, it’s a document or portion of a document. As examples, a phrase, sentence, paragraph, document section, book chapter, book, and complete book series would typically each count as a text.
Signals here concern properties of the text itself, separate from how it might be published (eg on a Web Page, on a billboard, spoken at a rally) or where it might be published (in some Venue). The text should be considered immutable: a text (in this sense) doesn't change. If you take a text and change it, you are making a new text, which needs to be reexamined to see which observations (and thus which signal data) apply to this new text.
Issue: (tech) How to represent texts in RDF? Options include annotation URL with secure hash, annotation object URL with secure hash, data: URI, etc.
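To give the options in this issue some shape, the sketch below derives three candidate identifiers from a text: a hex SHA-256 digest, a hash-based URI, and a data: URI. The use of the "ni" URI scheme (RFC 6920) for the hash-based form is an assumption made for this example. The issue text does not name a scheme, and none of these is a decision of this document.

```python
import base64
import hashlib
from urllib.parse import quote


def sha256_hex(text: str) -> str:
    """Hex SHA-256 digest of the UTF-8 encoding of the text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def ni_uri(text: str) -> str:
    """Hash-based name in the RFC 6920 "ni" form (an assumed choice of scheme)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    # RFC 6920 uses base64url without "=" padding.
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"ni:///sha-256;{encoded}"


def data_uri(text: str) -> str:
    """Embed the whole text in a data: URI, percent-encoding reserved characters."""
    return "data:text/plain;charset=utf-8," + quote(text, safe="")


text = "Hello, world"  # hypothetical text being identified
print(sha256_hex(text))
print(ni_uri(text))
print(data_uri(text))
```

A hash-based identifier fits the immutability rule above, since any change to the text yields a different identifier. A data: URI carries the text itself, which is convenient for short phrases but grows with the text.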
3.2 References or citations
3.2.1 Signal: Uses standardized references or citations
These standards are required and enforced by professions that demand accuracy, and are typically found in highly researched, and therefore more authoritative, texts. Examples: Legal, academic, or scientific citations, e.g., MLA, APA.
Ref | Definition (Template) | Tags |
c3b7d174 | Text of [subject article] uses standardized references or citations. | |
Example sentence: "Changes in body temperature have long been used as an indicator of injury, inflammation or infection in veterinary medicine (George et al., 2014), however, the use of temperature devices such as rectal thermometers and thermal microchips can be both invasive and time consuming (Johnson et al., 2011)." (Source)
3.2.3 Signal: Few to zero references or citations
A text with no references to other materials is original content, which often means it is opinion, personal experience, or even fiction. These tend to be less authoritative than texts with references.
Ref | Definition (Template) | Tags |
baa89d32 | Text of [subject article] has few or no references or citations. | |
One exception is a first-hand account, which can become a primary document for later research. These personal accounts, however, should be vetted and cross-referenced with other sources to evaluate their accuracy.
Example sentence: "The shrine is the work of SUNY Purchase sophomore Phillip Hosang, who, like a lot of students at the school, had long heard rumors about a secret room in a men's bathroom somewhere in the visual arts building." (Source)
3.3 Pronouns
3.3.1 Signal: Many or multiple instances of the pronouns "I" or "you"
Texts that use the pronouns "I" or "you" are typically opinion, correspondence or personal account. These texts are usually not trying to be authoritative or explanatory; however, they sometimes form a primary document that is used in secondary research.
Ref | Definition (Template) | Tags |
43280bee | Text of [subject article] has many instances of the words "I" or "you." | |
Example sentence: "After paying close attention to many of your campaigns, I believe you are united by a desire to get things done to help a lot of people who’ve been left behind." (Source)
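A machine observer for this signal (and for its counterpart, few or no instances) might start from a simple count. The sketch below is one possible approach under stated assumptions: "I" is matched case-sensitively as a whole word, "you" also matches a sentence-initial "You", and related forms such as "your" are not counted. The document does not define how many instances count as "many", so the sketch reports counts and a rate and leaves any threshold to the caller.

```python
import re

# Whole-word "I" (case-sensitive) or "you"/"You"; "your" and "III" do not match.
PRONOUN_RE = re.compile(r"\bI\b|\b[Yy]ou\b")
WORD_RE = re.compile(r"\b\w+\b")


def pronoun_stats(text: str) -> dict:
    """Count instances of "I" or "you" and express them as a share of all words."""
    hits = len(PRONOUN_RE.findall(text))
    words = len(WORD_RE.findall(text))  # rough word count; contractions split in two
    rate = hits / words if words else 0.0
    return {"hits": hits, "words": words, "rate": rate}


example = (
    "After paying close attention to many of your campaigns, I believe you are "
    "united by a desire to get things done to help a lot of people who\u2019ve "
    "been left behind."
)
stats = pronoun_stats(example)
print(stats["hits"], round(stats["rate"], 3))
```

On the example sentence above this finds one "I" and one "you".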
3.3.2 Signal: Few or no instances of "I" or "you"
Texts that do not use first or second person are less likely to be opinion content. However, this is no indication of credibility.
Ref | Definition (Template) | Tags |
dcbf79f6 | Text of [subject article] has few or no instances of the words "I" or "you." | |
"President Trump said he would not overrule his acting attorney general, Matthew G. Whitaker, if he decides to curtail the special counsel probe being led by Robert S. Mueller III into Russian interference in the 2016 election campaign." (Source)
3.4 Signal: Vocabulary or reading level
A wide and varied vocabulary, which may include jargon or uncommon words, is an indicator of formal tone.
3.5 Incivility and impoliteness
3.5.1 Signal: Incivility
Ref | Definition (Template) | Tags |
fb2e81c3 | Text of [subject article] contains stereotypes, such as calling a person a “faggot,” “terrorist,” or “backward” (e.g. “Muslims are terrorist sympathizers”) | |
ca34c8ce | Text of [subject article] contains threats to people’s individual rights, such as freedom of speech or personal freedom (e.g. “You foolish Republicans better shut up”) | |
929fa567 | Text of [subject article] contains a verbalized threat to democracy, such as a proposal to overthrow a democratic government by force or in an undemocratic way (e.g. “Obama is a Muslim Agent with Brotherhood Ties. American people must take him down.”) | |
Source: Oz, M., Zheng, P., Chen, G. M., & Park, R. H. (2018). Twitter versus Facebook: Comparing incivility, impoliteness, and deliberative attributes. New Media & Society, 20(9), 3400–3419. http://doi.org/10.1177/1461444817749516
3.5.2 Signal: Impoliteness
Ref | Definition (Template) | Tags |
700e99be | Text of [subject article] contains insults or name-calling (e.g. “stupid” or “moron”) | |
5491ed97 | Text of [subject article] contains profanity (e.g. “hell” and “damn”) | |
ec40d14f | Text of [subject article] contains words in all capital letters (e.g. “Who flew the planes into the towers on 9/11? ILLEGAL IMMIGRANTS!”) | |
Source: Oz, M., Zheng, P., Chen, G. M., & Park, R. H. (2018). Twitter versus Facebook: Comparing incivility, impoliteness, and deliberative attributes. New Media & Society, 20(9), 3400–3419. http://doi.org/10.1177/1461444817749516
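Of the impoliteness templates, the all-capitals one is the most mechanical to observe. The following is a minimal sketch, not the method used in the cited study. The function name and the minimum length of two letters (chosen so that "I" and "A" are not flagged) are assumptions, and acronyms are flagged along with shouted words.

```python
import re


def all_caps_words(text: str, min_letters: int = 2) -> list:
    """Return alphabetic words written entirely in capital letters.

    min_letters is an assumed cut-off that skips one-letter words such as "I".
    """
    words = re.findall(r"[A-Za-z]+", text)
    return [word for word in words if len(word) >= min_letters and word.isupper()]


example = "Who flew the planes into the towers on 9/11? ILLEGAL IMMIGRANTS!"
shouted = all_caps_words(example)
print(shouted)
```

Because acronyms such as "UFO" would also be returned, a tool using this count would likely need an allow-list or a human check before recording the observation.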
3.6 Text type
Editor note: This should probably be abstracted to all different types of contents.
3.6.1 Signal: Text type is news
Ref | Definition (Template) | Tags |
18da5ac5 | [Text] appears to be news. | |
3.6.2 Signal: Text type is opinion
Ref | Definition (Template) | Tags |
e1f7bea2 | [Text] appears to be an opinion piece | |
76519eb9 | [subject article] URL contains directory name or file name indicating opinion | |
639cf2d6 | [subject article] is self-labeled opinion | |
Examples: #2: Opinion, Perspective, Editorial, Commentary, etc.
#3: https://www.nytimes.com/2019/02/28/opinion/alexandria-ocasio-cortez-cohen-hearing.html
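The URL-based template lends itself to a simple automated check. The sketch below looks for a directory or file name in the URL path that equals one of a set of marker words. The marker set reuses the labels listed in the examples above. Treating those labels as URL markers, and the exact-match rule, are assumptions made for illustration.

```python
from urllib.parse import urlparse

# Marker words borrowed from the example labels above; any real list is a design choice.
OPINION_MARKERS = {"opinion", "perspective", "editorial", "commentary"}


def path_markers(url: str, markers: set) -> list:
    """Return the directory or file names in the URL path that equal a marker."""
    segments = [part for part in urlparse(url).path.lower().split("/") if part]
    # Drop a file extension such as ".html" before comparing.
    names = [part.rsplit(".", 1)[0] for part in segments]
    return [name for name in names if name in markers]


url = "https://www.nytimes.com/2019/02/28/opinion/alexandria-ocasio-cortez-cohen-hearing.html"
found = path_markers(url, OPINION_MARKERS)
print(found)
```

The same helper could be called with a different marker set, for example {"satire", "humor"}, to observe the corresponding satire template.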
3.6.3 Signal: Text type is satire
Ref | Definition (Template) | Tags |
0cc29b00 | [Text] appears to be a satire piece | |
a151c9c3 | [source] is self-described satire site | |
f6fe3e31 | [subject article] URL contains directory name or file name indicating satire | |
4cc17948 | [subject article] is self-labeled satire | |
Examples: #2: Satire, humor, etc.
#3: https://www.newyorker.com/humor/borowitz-report/mueller-says-he-has-obtained-trumps-sat-scores
#4: http://www.thedailyrash.com/about “The Daily Rash is satire! Merely a parody of the life that we watch around us daily. We spoof the famous and not so famous people who fill our lives with beauty and who bring us so much joy. Any similarities between our stories and real life are coincidental. Nothing here is very true.”
3.6.4 (Section with no title?)
7. Subject type: Article
Includes: News Story, News Article, Scientific Paper, Blog Post
An article is a collection of content intended to convey some information, usually factual, usually created by one or more identified people, and usually released at a specific point in time in some venue. It consists of elements like a body, a title, a publication date, and an author list. Unlike Texts, where any change makes it a different Text, an Article may be revised over time and still be considered the same Article (albeit a different version). Usually only minor changes are socially appropriate, however. Consumers of credibility data may need to be careful about which version an observation applies to.
If an article appears on a web page, or in a portion of a web page, we can use its URL to identify the article.
Differentiation between Article and Text. Consider whether the signal data would be the same if the text were moved to a different article, perhaps published in a different venue, with a different title, at a different time, and with other text before or after it in some article. If the observation would be the same, then the signal is a property of the text, not the article. In that case it belongs in 3. Subject type: Text, not here.
7.1 Originality
7.1.1 Originality Types
7.1.1.1 Signal: Most Likely Original
Ref |
Definition (Template) |
Tags |
e595c4b2 |
Text of [subject article] is most likely original. |
|
71a19757 |
[Photo] is most likely original. |
|
7.1.1.2 Signal: Appears to be a Copy, with Some Different Portions
Ref |
Definition (Template) |
Tags |
6acabe52 |
Text of [subject article] appears to be a copy of one or more articles, with some portions different or remixed |
|
7.1.1.3 Signal: Quotes Extensively From Another Source
Ref |
Definition (Template) |
Tags |
8001f76c |
Text of [subject article] quotes extensively from another source, with some original content |
|
7.1.1.4 Signal: Wholesale Duplicate
Ref |
Definition (Template) |
Tags |
69857b93 |
Text of [subject article] is a wholesale duplicate of another article |
|
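As a rough illustration of how a tool might choose among three of the originality signals above when it has one candidate earlier text to compare against, the following sketch uses word-level similarity from Python's standard `difflib`. The thresholds and the function name are assumptions made for this example, and the approach does not attempt to detect the "quotes extensively" case (8001f76c), which would need quotation-aware analysis.

```python
from difflib import SequenceMatcher

# Hypothetical thresholds (assumptions); real tools would need to calibrate them.
DUPLICATE_THRESHOLD = 0.95
COPY_THRESHOLD = 0.60


def originality_signal(subject_text, other_text):
    """Pick an originality signal ref by comparing two texts word by word."""
    ratio = SequenceMatcher(None, subject_text.split(), other_text.split()).ratio()
    if ratio >= DUPLICATE_THRESHOLD:
        return "69857b93"  # wholesale duplicate of another article
    if ratio >= COPY_THRESHOLD:
        return "6acabe52"  # copy, with some portions different or remixed
    return "e595c4b2"  # no evidence of copying from this particular text
```

A low similarity to one comparison text is only weak evidence of originality, so a result of e595c4b2 from this sketch should be read as "not a copy of this text" rather than as a general finding.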
7.1.2 Attribution of Non-Original Content
These signals assume that the content has already been flagged as not original.
7.1.2.1 Signal: Attribution Given and Accurate
Ref |
Definition (Template) |
Tags |
9ce28cf4 |
[subject article] includes accurate attribution, pointing to the original. |
|
7.1.2.2 Signal: Attribution Given and Inaccurate
Ref |
Definition (Template) |
Tags |
334e1ec7 |
[subject article] includes inaccurate attribution. |
|
7.1.2.3 Signal: Attribution Not Given
Ref |
Definition (Template) |
Tags |
31755cbb |
[subject article] does not include attribution. |
|
7.1.2.4 Signal: Unclear Which is Original
Ref |
Definition (Template) |
Tags |
b1794319 |
[subject article] is a copy, but it is unclear which is the original. |
|
7.1.3 (Section with no title?)
7.1.4 Personal Perspective
These signals help parse author perspective on the content of the article.
7.1.4.1 Signal: Article contains personal perspective on lived experience
Ref |
Definition (Template) |
Tags |
9f948295 |
[subject article] includes “I” statements AND recounts personal lived experience |
|
f275cebe |
[subject article] includes “I” statements and does NOT recount personal lived experience |
|
7.2 Language and Rhetoric
To-do: Move Rhetoric to a different bucket, not Article.
7.2.1 Rhetorical Proportionality
7.2.1.1 Signal: Proportional Rhetoric
Ref |
Definition (Template) |
Tags |
fe3b53d1 |
The rhetoric used in [Text] is proportional to the event or situation described. |
|
7.2.1.2 Signal: Extreme Exaggerating Rhetoric
Ref |
Definition (Template) |
Tags |
853a3706 |
The rhetoric used in [Text] is an extreme exaggeration of the event or situation described. |
|
853a3706 |
The rhetoric used in [audio] is an extreme exaggeration of the event or situation described. |
|
7.2.1.3 Signal: Extreme Minimizing Rhetoric
Ref |
Definition (Template) |
Tags |
853a3706 |
The rhetoric used in [Text] is an extreme minimization of the event or situation described. |
|
7.2.2 Signal: Emotional Valence
This could be measured with VADER (Valence Aware Dictionary and sEntiment Reasoner), a natural language processing library.
Ref |
Definition (Template) |
Tags |
d3f90e88 |
Is the language extremely negative, extremely positive, or somewhere in the middle? *** |
idea(cciv) |
7.2.2.1 Signal: Extremely Negative Valence
Ref |
Definition (Template) |
Tags |
7a192974 |
The language in [Text] is extremely negative. |
|
7.2.2.2 Signal: Extremely Positive Valence
Ref |
Definition (Template) |
Tags |
c07b1b5d |
The language in [Text] is extremely positive. |
|
7.2.2.3 Signal: Neutral Valence
Ref |
Definition (Template) |
Tags |
d8c8293c |
The language in [Text] is neutral. |
|
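To illustrate how an automated sentiment score might be turned into the three valence signals above, here is a minimal sketch. It assumes the score has already been computed and normalized to the range -1 to 1, as is the case for the compound score that VADER reports. The thresholds and function name are hypothetical and are not defined by this document.

```python
# Hypothetical thresholds (assumptions, not defined by this specification).
EXTREME_THRESHOLD = 0.8  # at or beyond this magnitude counts as "extremely"
NEUTRAL_BAND = 0.05      # scores this close to zero count as neutral


def valence_signal(compound):
    """Map a sentiment score in [-1, 1] to a valence signal ref, or None."""
    if not -1.0 <= compound <= 1.0:
        raise ValueError("score must be between -1 and 1")
    if compound <= -EXTREME_THRESHOLD:
        return "7a192974"  # extremely negative valence
    if compound >= EXTREME_THRESHOLD:
        return "c07b1b5d"  # extremely positive valence
    if abs(compound) <= NEUTRAL_BAND:
        return "d8c8293c"  # neutral valence
    return None  # moderately positive or negative: none of the three signals
```

Returning `None` for moderate scores reflects that the three signals do not cover every case; a score of 0.4, for example, is neither extreme nor neutral under these assumed thresholds.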
7.2.3 Signal: Polarizing Language
Ref |
Definition (Template) |
Tags |
d8e3739c |
[Text] uses language such as “pro” and “anti,” signaling a division into two sharply contrasting groups or sets of opinions or beliefs. |
|
Developed for CredCo Political Indicators Study 2018-19. Taken from the Oxford Living Dictionary’s definition of polarization as the “division into two sharply contrasting groups or sets of opinions or beliefs.” Can be used in combination with 7.6. Claims in Articles.
7.2.4 Signal: Generalization/Characterization of Group
Ref |
Definition (Template) |
Tags |
1a6fe948 |
[Text] in [Content-Object] characterizes a group or groups of people along lines that explicitly differentiate them from others. |
|
Developed for CredCo Political Indicators Study 2018-19. This can apply to situations in which the author is associated with the defined group or is defining an external group. Can be used in combination with 7.6. Claims in Articles and other “Content-Objects.”
7.2.5 Signal: Dehumanization
Ref |
Definition (Template) |
Tags |
7fcf4da7 |
[Text] equates a human individual or group(s) with insects, bacteria, despised animals, or cancer, that is, with something less than human beings. |
|
Developed for CredCo Political Indicators Study 2018-19. See https://dangerousspeech.org/about-dangerous-speech/.
7.2.7 Signal: Call to Violence
This signal is meant to capture a call to violence. It may also be expressed as part of ‘Dangerous Speech’: “any form of expression (speech, text, or images) that can increase the risk that its audience will condone or participate in violence against members of another group” (see https://dangerousspeech.org/about-dangerous-speech/).
Ref |
Definition (Template) |
Tags |
b051c9b2 |
[Text] contains language that can be understood as a call to violence or seems harmful. |
|
Developed for CredCo Political Indicators Study 2018-19.
7.2.8 Signal: Call to Action (Political)
This signal is meant to capture a textual call to action, not to be confused with a marketing call to action (see https://en.wikipedia.org/wiki/Call_to_action_(marketing)). Sometimes, these calls to action are also associated with requests to carry out an action as an expression of one’s loyalty, identity, or affiliation.
Ref |
Definition (Template) |
Tags |
70dd369e |
[Text] contains language that can be understood as a political call to action, which asks readers to follow through with a particular task or tells readers what to do, such as signing online petitions, joining a mailing list, giving donations, voting, protesting, or boycotting. |
|
Developed for CredCo Political Indicators Study 2018-19.
7.3 Logic/Reasoning
7.3.1 Types of Bias
7.3.1.1 Signal: Confirmation Bias
Ref |
Definition (Template) |
Tags |
596f4162 |
*** |
idea(cciv) |
5b19acfa |
Text of [article title] contains examples of ….. |
|
7.4 Outbound References
7.4.1 Source Types
7.4.1.1 Signal: No Source Type Cited
Ref |
Definition (Template) |
Tags |
bf2e6507 |
There is no source cited in [subject article]. |
|
7.4.1.2 Signal: Domain Expert Cited
Ref |
Definition (Template) |
Tags |
af7ab75e |
There is an expert cited in [subject article]. |
|
7.4.1.3 Signal: Study Cited
Ref |
Definition (Template) |
Tags |
bfe3b411 |
There is a study cited in [subject article]. |
|
7.4.1.4 Signal: Unaffiliated Expert Cited about Study
Ref |
Definition (Template) |
Tags |
a6ee73ba |
There is an unaffiliated expert cited about [study] in [subject article]. |
|
This is, I believe, best practice in reporting on scientific studies.
7.4.1.5 Signal: Organization Cited
Ref |
Definition (Template) |
Tags |
56ecd8ab |
There is an organization cited in [subject article]. |
|
7.4.1.6 Signal: Other Type of Source Cited
Ref |
Definition (Template) |
Tags |
ccb6072c |
There is another type of source cited in [subject article]. |
|
7.4.1.7 Signal: Anonymous Sources Cited
Ref |
Definition (Template) |
Tags |
5ab55dfc |
One or more anonymous sources are cited in [subject article]. |
|
7.4.1.8 Signal: Single Anonymous Source Materially Cited
Ref |
Definition (Template) |
Tags |
db18b042 |
A single anonymous source is materially cited in [subject article]. |
|
Would the interpretation of the article be substantively different without the single anonymous source?
7.4.1.9 Signal: Anonymous Sources Materially Cited
Ref |
Definition (Template) |
Tags |
27096b57 |
One or more anonymous sources are materially cited in [subject article]. |
|
7.4.1.10 Signal: Multiple Anonymous Sources Materially Cited
Ref |
Definition (Template) |
Tags |
70fff801 |
More than one anonymous source is materially cited in [subject article]. |
|
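The four anonymous-source signals above overlap, and a short sketch may help show how they relate. It assumes an annotator or tool has already counted the anonymous sources in an article and how many of them are materially cited. Reading "single" as "exactly one" is an assumption of this sketch, and the function and parameter names are illustrative.

```python
def anonymous_source_signals(anonymous_count, material_count):
    """Return refs of the anonymous-source signals implied by two counts.

    anonymous_count: number of anonymous sources cited in the article.
    material_count: how many of those are materially cited.
    """
    if anonymous_count < 0 or material_count < 0:
        raise ValueError("counts must not be negative")
    if material_count > anonymous_count:
        raise ValueError("material_count cannot exceed anonymous_count")
    signals = []
    if anonymous_count >= 1:
        signals.append("5ab55dfc")  # one or more anonymous sources cited
    if material_count >= 1:
        signals.append("27096b57")  # one or more materially cited
    if material_count == 1:
        signals.append("db18b042")  # a single one materially cited
    if material_count > 1:
        signals.append("70fff801")  # more than one materially cited
    return signals
```

For example, an article with three anonymous sources, two of them material, would yield 5ab55dfc, 27096b57, and 70fff801 under these assumptions.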
7.4.1.12 Signal: Motivation of Anonymous Source for Wanting Anonymity is Given
Ref |
Definition (Template) |
Tags |
69a1b877 |
The anonymous sources’ motivation for remaining anonymous is given in [subject article]. |
|
7.4.1.13 Signal: Documents are Cited in the Article
Ref |
Definition (Template) |
Tags |
457405d2 |
Documents are cited in [subject article]. |
|
7.4.1.14 Signal: Documents Cited in the Article Are Made Available in Publication
Ref |
Definition (Template) |
Tags |
5da755f8 |
Documents cited in [subject article] are also made available in publication. |
|
7.4.2 Signal: Contains Link to Scientific Journals
Ref |
Definition (Template) |
Tags |
fd96f7bf |
The text includes a link to original content |
idea(cciv) st(cciv) |
Notes (not normative):
- cciv: A simple link to a scientific journal article that backs up the assertion made. This may also be paired with a URL to the specific article.
|
2a85d78f |
There is a link provided in [subject article] to where the original content came from. |
|
7.4.3 Signal: Accuracy of representation of source article
Also called: Representative Citations
Ref |
Definition (Template) |
Tags |
268b0193 |
The text properly characterizes the methods and conclusions of its sources |
idea(cciv) st(cciv) |
Notes (not normative):
- cciv: This article properly characterizes the methods and conclusions of the cited or quoted source. In addition to a Likert measure, two other options are possible: (A) Unable to find source, (B) Source is behind a paywall
|
890a77ac |
This article properly characterizes the methods and conclusions of the cited or quoted source (Source 1). |
|
4bce9ba4 |
This article properly characterizes the methods and conclusions of the cited or quoted source (Source 2). |
|
e60e0259 |
This article properly characterizes the methods and conclusions of the cited or quoted source (Source 3). |
|
f2e110d3 |
[subject article] properly characterizes the methods and conclusions of the original source. |
|
7.4.4 Signal: Academic Journal Impact Factor
Ref |
Definition (Template) |
Tags |
18db0556 |
The impact factor of the journal or conference cited is [number]. |
|
059e6c34 |
What is the impact factor of the journal or conference cited? *** From Wikipedia: The impact factor (IF) or journal impact factor (JIF) of an academic journal is a measure reflecting the yearly average number of citations to recent articles published in that journal. It is frequently used as a proxy for the relative importance of a journal within its field; journals with higher impact factors are often deemed to be more important than those with lower ones. https://en.wikipedia.org/wiki/Impact_factor |
idea(cciv) |
7.4.4.1 Signal: Academic Journal Impact Factor Cannot Be Found
Ref |
Definition (Template) |
Tags |
8c312a70 |
The impact factor of the journal or conference cited cannot be found. |
|
7.5 Article/Site Metadata
7.5.1 Signal: Subhed/Dek
Ref |
Definition (Template) |
Tags |
a675ee6e |
A “dek” is a subhed in journalism that appears below the headline of an article, usually in a smaller font (but in a larger font than the main body of the article). It typically summarizes the article or highlights a main point from the article. |
|
7.6 Claims in Articles
Although there is a separate section for Claims [2. Subject type: Claim], this section deals with the case when the analysis of one or more claims within an article is made to signify something about the article itself. [Probably could use an introductory paragraph on different levels/objects once those are clarified, since this translation is taking place for a number of projects, consider articles to domains/publishers.]
The following signals assume that the article has a recognizable central claim.
7.6.1 Signal: Article has a central claim
Ref |
Definition (Template) |
Tags |
c871d2dc |
The central claim in [Article] is [Claim]. |
|
001c97cb |
There is a central claim in [Article]. |
|
The first version of this signal was used in Credibility Coalition’s WebConf2018 study in which it was expressed as a question with multiple choice answers as follows:
Has the central claim in this article been fact-checked by an IFCN Verified Signatory?
A) Most likely not fact-checked by an IFCN Verified Signatory
B) Most likely not fact-checked by an approved source
C) Fact-checked and determined false
D) Fact-checked and determined true
E) Fact-checked with unclear results
F) Fact-checked with mixed results
It was initially deprecated in recognition of a number of valuable fact-checking efforts that are not IFCN Signatories, but it was then retained with revised options, as follows:
Does the article rely on a claim that has been fact-checked by a member of the International Fact Checking Network (IFCN)? If so, has it been debunked?
A) The article was fact-checked and determined false
B) The article was fact-checked and determined true
C) The article was fact-checked with unclear results
D) The article was fact-checked with mixed results
E) The article was most likely not fact-checked by an IFCN member
To express these questions as signals, combine them with the signals related to fact-checking organizations; see sections 2.2. Fact-checking Organization and 2.1. Claim Review above.
7.6.2 Signal: Article has a claim
Ref |
Definition (Template) |
Tags |
b7f2d683 |
[Claim] is a claim in [Article]. |
|
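The bracketed placeholders in a definition template, such as [Claim] and [Article] above, can be filled in to produce a concrete statement. The sketch below shows one possible way to do that. It is not a mechanism defined by this document, and the function name and the example values in the usage note are hypothetical.

```python
import re

# Matches a bracketed placeholder such as [Claim] or [subject article].
PLACEHOLDER = re.compile(r"\[([^\]]+)\]")


def fill_template(template, values):
    """Replace each [name] in the template with values[name]."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for placeholder [{name}]")
        return values[name]

    return PLACEHOLDER.sub(substitute, template)
```

For instance, calling `fill_template("[Claim] is a claim in [Article].", {"Claim": '"The bridge is closed"', "Article": "https://example.org/story"})` would give `"The bridge is closed" is a claim in https://example.org/story.`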
9. Subject type: Web Page
9.1 Layout
Issue: Should this be a Heading1 like Title? Probably no, because the statements naturally get phrased with the subject of the statements being a web page. Most people wouldn't conceptualize the page layout as its own entity.
9.1.1 Signal: Framed with navigation
Also called: topnav, sidenav, framenav
Ref |
Definition (Template) |
Tags |
7b21549b |
[subject] has a prominent top or side menu structure or buttons or links, taking user to other parts of site |
|
b0b323b3 |
[subject] has obvious navigation elements at one or more edges of the content, providing a way to reach other content on the same website |
|
(Consensus discussion including benefits and risks goes here)
(External data from studies and implementation reports gets inserted here, matched by heading text, “also called” text, and the template text.)
9.1.2 Signal: Number of images accompanying story
Ref |
Definition (Template) |
Tags |
66a577e0 |
Article page contains [number] images to illustrate the story |
|
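A tool could approximate the [number] in this signal by counting image elements in the page's HTML. The sketch below uses Python's standard `html.parser` and counts every `img` start tag. That is an assumption which overcounts, since logos, icons, and advertisements are included along with images that illustrate the story.

```python
from html.parser import HTMLParser


class ImageCounter(HTMLParser):
    """Counts <img> elements seen while parsing an HTML document."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # Tag names are lowercased by HTMLParser; self-closing <img/> tags
        # also reach this method through the default handle_startendtag.
        if tag == "img":
            self.count += 1


def count_images(html):
    """Return the number of <img> elements in an HTML string."""
    counter = ImageCounter()
    counter.feed(html)
    counter.close()
    return counter.count
```

A more faithful implementation would first isolate the article body and ignore decorative images, which this sketch does not attempt.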
9.4 Metadata in page head
9.5 Metadata inline in body