Statistical Aspects Of Claims Substantiation


Claims on overall performance, liking, preference, and efficacy can be challenged through the Advertising Standards Council of India (ASCI), a self-regulatory system for the industry. Stated claims are supported by data that are statistically evaluated to judge whether the experimental results are due to a real effect or to random variation. Therefore, before a claim can be substantiated, it must be translated into a statistical research hypothesis. If a claim cannot be translated into a research hypothesis, it cannot be substantiated.

Claims may be disputed for one or more of the following reasons:

  • The experiment is poorly designed to address the claim.
  • The experiment is poorly executed, e.g., the protocol was not strictly followed, personnel were unqualified, or instrumentation was faulty.
  • The claim has no technical merit in relation to the product. Can the stated claim be related to product ingredients?
  • The methods for gathering the data are poorly chosen, e.g., instrumental methods or the use of a consumer and/or trained panel. Does the method follow established guidelines, such as those provided by the Bureau of Indian Standards, the Colipa guidelines for cosmetics, the American Society for Testing and Materials, or ISO standards?
  • The statistical analysis is faulty, i.e., the wrong methodology was used.
  • The stated claim is misleading.

Null And Alternative Hypotheses

A statistical hypothesis is a statement about the quantifiable aspects of products, which can be estimated from experimental results but are not otherwise directly observed. In statistical terminology, a research hypothesis (Claim) is called the alternative hypothesis. A complementary statement to the alternative hypothesis is called the null hypothesis. Statistical tests of significance are rules for judging whether the experimental results support the claim formulated as an alternative hypothesis.

Types Of Errors And The Power Of Statistical Test

Through experimental designs, data are collected and relevant sample statistics are computed, such as the mean and the standard deviation. Since these statistics are subject to sampling and experimental errors, the statistical tests may lead to an incorrect decision. Suppose the decision is made to reject the null hypothesis and accept the alternative hypothesis. This decision, if it turns out to be wrong, is said to result in a type I error. The probability of a type I error is denoted by α and is known as the significance level of the statistical test. A probability of α = 0.05 indicates that, when the null hypothesis is true, the test is liable to wrongly reject it 5 times in 100 cases. Significance levels α = 0.05 and 0.01 are often used in scientific applications and are generally the accepted levels for claims substantiation. On the other hand, a type II error results if the decision is made not to reject the null hypothesis when it is in fact false. The probability of a type II error is denoted by β. The power of the test, 1 − β, is the probability of correctly rejecting a false null hypothesis. In the planning of a claim substantiation study, both types of errors should be controlled.
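The link between α, β, and sample size can be illustrated with a rough planning calculation. The sketch below is not taken from the source: it assumes a one-sided test on paired differences, uses a normal approximation, and treats the smallest difference worth detecting (`delta`) and the standard deviation of the differences (`sigma`) as values the planner must supply. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_one_sided(delta, sigma, alpha=0.05, beta=0.20):
    """Approximate number of panelists for a one-sided test (normal approximation).

    delta: smallest mean difference worth detecting (assumed by the planner)
    sigma: standard deviation of the paired differences (assumed by the planner)
    alpha: probability of a type I error
    beta:  probability of a type II error (power = 1 - beta)
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha)
    z_beta = z.inv_cdf(1 - beta)
    n = ((z_alpha + z_beta) * sigma / delta) ** 2
    return math.ceil(n)


def power_one_sided(n, delta, sigma, alpha=0.05):
    """Approximate power (1 - beta) of the same test for a given sample size n."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha)
    return z.cdf(delta * math.sqrt(n) / sigma - z_alpha)


# Hypothetical planning values: detect a 0.5-point difference when the
# differences have a standard deviation of 1.5 points.
n_needed = sample_size_one_sided(delta=0.5, sigma=1.5)
achieved_power = power_one_sided(n_needed, delta=0.5, sigma=1.5)
```

Lowering either α or β, or shrinking the difference to be detected, increases the required sample size. For small panels a calculation based on the t distribution would be more appropriate than this approximation.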

Statistical Significance | Experimental Significance

Statistical analysis is probabilistic. A statistically significant result may not be of practical significance to consumers. For example, the colour of a cosmetic product may have changed over time from its original colour. The change may be statistically significant, but not necessarily noticeable in the eyes of consumers. Thus, instead of merely establishing a statistically significant change, one may need to determine the amount of change that will be perceived as significant by consumers. This amount of change must be determined by correlating trained panel results with consumer test results.

Types Of Claims

Claims may be classified by two properties: style and competitive focus. Style refers to the statement being made about the advertised brand, the most common being a “distinction” claim, in which a brand claims to be preferred, more efficacious, safer, etc. Another style is a “similarity” claim, which conveys that the advertised product is like the competitor’s product in one or more attributes. All products in this category must be tested against the advertised product.

A competitive focus claim is a statement made about the competition, directed against one or more explicitly identified or implied brands. For example, the claim may be targeted against an implied brand, e.g., “preferred over the leading brand,” or more broadly against a brand set, e.g., “No leading oil is more absorbent.”

In both style and competitive focus, a claim statement can be monadic, making no comparison with other products, i.e., a statement of quality, an invitation to try the product, or an untargeted claim. An untargeted claim is considered puffery and requires no formal substantiation.

Models For Analysis Of Product Performance Data

A useful model for the evaluation of a proposed claim must address the following aspects:

  • Rationale,
  • Objective evidence,
  • Subjective evidence, and
  • Safety

A model incorporating these aspects becomes increasingly important in disputed claims.


Consumer products contain ingredients that affect how desirable the products are perceived to be. By linking ingredients in a product to experimental results, one can provide a rationale for the claim. Experimental support from allied sciences, such as in-vitro studies and model systems, can also provide additional rationale for the claim.


A claim becomes stronger if its usefulness can be objectively and subjectively determined. An objective measure of product performance is desirable. It can be obtained by clinical studies in real-life settings on human subjects drawn from the target population for each product. Responses from such clinical studies can be measured by bioinstruments or obtained from trained or expert panels. Indeed, data from trained panels are recognized as objective measures. Descriptive analysis and the spectrum method, which use a trained panel, can provide objective measures of the sensory properties of personal care products.

A descriptive panel undergoes rigid training and validation/calibration as specified by each method.


When properly carried out, subjective measures obtained from home-use tests, among others, may provide useful and acceptable data for claims substantiation.


Obviously, cosmetic products must be safe and without adverse side effects, so the model must address the safety aspects. Safety-related data can be obtained from research guidance panel tests, central location consumer tests, and the various types of laboratory model systems, i.e., in vitro and in vivo tests.

As indicated, a model incorporating these aspects provides a way to deal with conflicts, permits more efficient use of data for the development of truthful claims, and promotes effective communications between parties in disputed situations.

A conceptual model for assessing perception data measuring interdependent attributes is postulated. This model defines the following:

  1. Whether the product performance claim is based on a product attribute or not, i.e., whether the emphasis is instead on overall product performance with no focus on a product attribute or benefit.
  2. Whether the product performance claim focuses on a specific attribute or on a set of attributes, e.g., drag, stickiness, residue, spreadability.
  3. Whether the claim is for a specific attribute or a set of attributes and whether it focuses on a feature or a benefit. For example, in antibacterial personal care products, the active ingredient provides a product dimension and the benefit of this dimension is cleanliness and safety to the users against bacterial infection.
  4. Whether the claim is merely stating the presence of the attribute or also its benefit and whether it suggests a parity or superiority against a specific competitor or class of competitors.
  5. Finally, whether the parity or superiority claim is restricted to an attribute and its advertised benefits or to an overall parity or superiority. For example, in consumer tests, an overall preference or overall liking when used as a claim suggests that all product attributes contribute, incorporating interdependence among sensory attributes during product evaluation by consumers.


The experimental designs suited for obtaining data for parity and superiority claims are discussed further on.

Superiority Claims

A superiority claim simply indicates that the product advertised is the best in the market. It is essential that direct product-to-product comparative testing be used for substantiating a superiority claim. An appropriate design for comparing two products at a time is the paired-comparison design, which is discussed further on. An example of a superiority claim is “Compared to the leading brands, Tropical Isles is unsurpassed as a skin moisturizer and conditioner.” As stated before, a claim must be translated into a statistical hypothesis. To do this, we must have a well-defined scale on which the products can be scored for comparison. Suppose product A is being compared with a leading brand for claim substantiation. If, on a scale for comparing such products, high scores correspond to superior products, we can formulate two statistical hypotheses such as the following:

H0: Average score of leading brand ≥ average score of product A

H1: Average score of leading brand < average score of product A

To be able to claim superiority for product A, the null hypothesis must be rejected at, say, the 95% confidence level (5% significance level) in favour of the alternative hypothesis, which states that product A is superior to the leading brand.
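As a rough illustration of how such a one-sided test might be carried out, the sketch below assumes that each panelist scores both products, so the analysis works on paired differences. It uses a normal approximation for the p-value to stay within the Python standard library; for small panels a Student t reference distribution would be preferred. The function name and the scores are hypothetical and are not data from the source.

```python
import math
from statistics import NormalDist, mean, stdev


def superiority_test(scores_a, scores_leading, alpha=0.05):
    """One-sided test of H1: mean(product A) > mean(leading brand).

    Assumes paired scores (the same panelist rates both products) and
    approximates the p-value with the standard normal distribution.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_leading)]
    n = len(diffs)
    d_bar = mean(diffs)                 # mean difference, A minus leading brand
    std_err = stdev(diffs) / math.sqrt(n)
    t_stat = d_bar / std_err
    p_value = 1 - NormalDist().cdf(t_stat)   # upper-tail probability
    return t_stat, p_value, p_value < alpha


# Hypothetical 9-point scale scores from ten panelists (illustration only).
product_a = [7, 8, 6, 7, 9, 8, 7, 6, 8, 7]
leading = [6, 7, 6, 5, 8, 7, 7, 5, 7, 6]
t_stat, p_value, reject_h0 = superiority_test(product_a, leading)
```

Only when `reject_h0` is true would the data support the superiority claim at the chosen significance level. A positive mean difference alone is not enough.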

Parity Claims

Parity claims are difficult to establish by means of hypothesis-testing methodology because, for parity claims, the research hypothesis essentially states that the products are equivalent. On a rating scale, equivalence translates into equality of two average scores, and in the conventional formulation equality of average scores can only be stated as the null hypothesis. A statistical test will either reject the null hypothesis when there is sufficient evidence in support of the alternative or will not reject it. If the null hypothesis is not rejected, it should not be concluded that the products are equivalent. Intentionally or otherwise, one can design an experiment that collects too little data, and the resulting lack of information leads to a decision not to reject H0. This decision means only that there is insufficient information to reject the parity claim; it does not mean that a parity claim is established with any degree of confidence.

In disputed parity claims, if a proper formulation of hypotheses and a sound design are not used, differences may arise among the parties involved that will be difficult to resolve. It is a waste of time to argue about the validity of a claim if the methodology and the design were not carefully chosen. As stated above, an experiment with an insufficient sample can mask real differences between products simply because the study fails to reject the null hypothesis. As R. A. Fisher wrote, “Every experiment may be said to exist only in order to give the facts a chance of disproving the null hypothesis.”

Therefore, the formulation of hypotheses for a parity claim and their statistical testing must be done in such a way that the decision to reject the null hypothesis amounts to the parity of products.
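One common way to set up hypotheses of this kind is the two one-sided tests (TOST) procedure, named here as an illustration rather than as the method the source prescribes. The null hypothesis states that the products differ by at least some margin, and rejecting it supports parity. The sketch below assumes paired scores, a normal approximation for the p-values, and a parity margin (`margin`) fixed in advance, for example the smallest difference consumers would notice. The function name, margin, and data are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def parity_tost(scores_a, scores_b, margin, alpha=0.05):
    """Two one-sided tests (TOST) on paired differences.

    H0: |mean difference| >= margin   (products are not at parity)
    H1: |mean difference| <  margin   (products are at parity)
    Rejecting H0 is what supports the parity claim.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    d_bar = mean(diffs)
    std_err = stdev(diffs) / math.sqrt(n)
    z = NormalDist()
    # Test 1: H0 says the difference is at or below -margin.
    p_lower = 1 - z.cdf((d_bar + margin) / std_err)
    # Test 2: H0 says the difference is at or above +margin.
    p_upper = z.cdf((d_bar - margin) / std_err)
    p_value = max(p_lower, p_upper)     # both one-sided tests must reject
    return p_value, p_value < alpha


# Hypothetical scores from twelve panelists (illustration only).
product_a = [7, 6, 8, 7, 6, 7, 8, 6, 7, 7, 6, 8]
product_b = [7, 7, 7, 7, 6, 8, 7, 6, 7, 7, 7, 7]
p_value, parity_supported = parity_tost(product_a, product_b, margin=0.5)
```

With this formulation, a study with too few panelists produces a large standard error and therefore fails to reject H0, so an underpowered experiment can no longer be used to suggest parity.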

Experimental Designs For Claim Support

There are three important elements in the development of a strong product claim:

  • A clearly stated claim,
  • A good experimental design to address the claim, and
  • A properly executed study following the experimental design.

A critical part of the first element is the specification of the target population, because once this is done, the remaining steps follow from it: questionnaire development, sample size, and test execution.

Target Population

A product is developed to meet either the needs of the general population or those of a specific user group in the population. Depending on the stated claim, the general population or a specific group defines the target population. In particular, the target could be the purchaser of the product and not necessarily the user. For instance, if the wife is the purchaser of baby powder and the husband is the purchaser of after-shave skin conditioners, then wives would be the target population in the first case and husbands in the latter. If the claim is for the general population, then the participants in the test should be a random sample of that population. Similarly, if the claim is for a specific user group, a random sample of that group should be used in the study.

Questionnaire Design

In gathering consumer data for claim substantiation, it is important that the product attributes related to the claim be included in the questionnaire. For example, if “soft” and “smooth” are sensory attributes claimed for the product, then these attributes must be included in the questionnaire in the form of intensity and/or hedonic (like/dislike) questions.

How many attribute questions the questionnaire should include is often a difficult decision in questionnaire development. If a product has undergone a series of descriptive sensory analyses, these should indicate the appropriate attributes for inclusion. Briefly, descriptive analysis is a sensory methodology that provides quantitative descriptions of products based on the perceptions of a group of qualified subjects. It is a total sensory description, taking into account all sensations perceived (visual, auditory, olfactory, kinesthetic, and so on) when the product is evaluated. In practice, the desirable number of attributes has ranged from 10 to 15.

Another aspect of questionnaire development is the choice of the rating scale. A widely used choice is the 9-point hedonic scale (1 = dislike extremely, 5 = neither like nor dislike, 9 = like extremely), developed in 1947 at the Quartermaster Food and Container Institute for the U.S. Armed Forces. This is the most extensively studied of the rating scales and, as a result, is considered the most reliable one for acceptance/preference measurement. Information on questionnaire development is widely available.

Paired Comparison

The paired comparison is the most powerful design to support almost all types of product claims. The statistical analysis of the paired-comparison design is simple and meets all the essential statistical assumptions; the test is simple to execute for both the experimenter and the panelist; and the evaluation of two products by a single panelist fits nicely into the classic paired-comparison situation (e.g., right/left sides of biological materials).

The general idea of the paired-comparison design is to form homogeneous pairs of like units so that comparisons between units of a pair measure differences due to treatments rather than units. This arrangement leads to dependency between observations on units of the same pair. This situation can be extended to sensory and consumer testing. The statistical assumption in the analysis is that the differences are independent and normally distributed; in most cases this assumption is satisfied in practice. Furthermore, the common problem of correlation of ratings among panelists becomes irrelevant, since one is now dealing with the differences $d_i$.
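For reference, the standard paired t statistic built from these differences can be written as follows. The symbols $n$ (number of pairs), $\bar{d}$ (mean difference), and $s_d$ (standard deviation of the differences) are notation assumed here, not taken from the source.

```latex
\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i, \qquad
s_d^2 = \frac{1}{n-1}\sum_{i=1}^{n} \left(d_i - \bar{d}\right)^2, \qquad
t = \frac{\bar{d}}{s_d / \sqrt{n}}
```

Under the null hypothesis of no mean difference, and given the normality assumption above, $t$ is referred to a Student t distribution with $n - 1$ degrees of freedom.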

Randomized Complete Block Design

For reasons of cost, time, and other business constraints, one must often conduct a consumer test in which panelists evaluate more than two products at the same time. In this situation, the randomized complete block design (RCBD) is used for claim substantiation. The statistical model for describing an observation is


$$X_{ij} = \mu + T_i + P_j + \varepsilon_{ij}$$

where $X_{ij}$ = the observed rating for the $i$th product given by the $j$th panelist; $\mu$ = the grand mean; $T_i$ = the effect of the $i$th product; $P_j$ = the effect of the $j$th panelist; and $\varepsilon_{ij}$ = random errors assumed to be independently and normally distributed, with mean zero and variance $\sigma^2$. In this model, the effect of panelist-to-panelist variation is removed from the random errors $\varepsilon_{ij}$, making the test of significance more sensitive.
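The way panelist-to-panelist variation is separated from the error term can be seen in the analysis-of-variance arithmetic for this design. The sketch below is a minimal standard-library illustration, assuming every panelist rates every product exactly once. The function name and the ratings are hypothetical and are not data from the source.

```python
def rcbd_anova(ratings):
    """F statistic for product differences in a randomized complete block design.

    ratings[j][i] is the rating given by panelist j (block) to product i.
    Every panelist is assumed to rate every product exactly once.
    """
    n_panelists = len(ratings)
    n_products = len(ratings[0])
    n_total = n_panelists * n_products
    grand_mean = sum(sum(row) for row in ratings) / n_total

    product_means = [sum(row[i] for row in ratings) / n_panelists
                     for i in range(n_products)]
    panelist_means = [sum(row) / n_products for row in ratings]

    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_product = n_panelists * sum((m - grand_mean) ** 2 for m in product_means)
    ss_panelist = n_products * sum((m - grand_mean) ** 2 for m in panelist_means)
    # Panelist variation is subtracted out, leaving a smaller error term.
    ss_error = ss_total - ss_product - ss_panelist

    df_product = n_products - 1
    df_error = (n_products - 1) * (n_panelists - 1)
    f_stat = (ss_product / df_product) / (ss_error / df_error)
    return {"ss_product": ss_product, "ss_panelist": ss_panelist,
            "ss_error": ss_error, "df_product": df_product,
            "df_error": df_error, "f_stat": f_stat}


# Hypothetical ratings: four panelists (rows) by three products (columns).
example = [[7, 5, 6],
           [8, 6, 6],
           [6, 5, 4],
           [7, 4, 6]]
result = rcbd_anova(example)
```

The resulting F statistic is compared with the critical value of the F distribution for `df_product` and `df_error` degrees of freedom at the chosen significance level.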

In most consumer testing claim studies, the statistical analysis from the RCBD or the single-factor repeated-measures design is sufficient. Also, the SAS code in Table 5 can easily be expanded to include demographics, product usage information, and so on.

Concluding Remarks

We have covered the importance of statistical experimental design in consumer tests for claim substantiation. In particular, the formulation of statistical research hypotheses was discussed and its importance in parity claims reviewed. The use of a paired-comparison design is recommended for claims substantiation. The importance of understanding the power of a statistical test and its relationship to sample size, in order to provide a claim that can withstand rigorous scrutiny, was emphasized.


Gacula, M. C., Jr., & Singh, J. (1998). Consumer Testing Statistics and Claims Substantiation. In L. B. Aust (Ed.), Handbook of Cosmetic Claims Substantiations (pp. 235-258). New York: Marcel Dekker, Inc.