Authors of publications should provide sufficient details of the research conducted. This information should describe the study and its execution in enough detail that interested reviewers can repeat the research. Sensory data are only as good as the test that produced them.
The following is meant to provide guidance for conducting high-quality sensory science research, with a focus on tips for its publication. By encouraging consideration of this content throughout the research and publication process, the Society of Sensory Professionals hopes to maximize the acceptance and future reapplication of the final research in the sensory science community.
These guidelines are not intended to replace the knowledge of sensory techniques found in sensory books and manuals.
Determining the Test
The test objective should help define the test and the parameters (samples, respondents, location, etc.) that are chosen. When more than one objective is present, it may be necessary to conduct more than one type of sensory test.
- The determination of liking or acceptance requires the use of a consumer test; trained descriptive panelists should never be used for this purpose.
- Determining intensity information about attributes can be done either by trained descriptive panelists or by consumers depending on the level of detail that is needed. Trained panelists typically provide more detailed information than untrained consumers.
- To determine if samples are different overall from each other, discrimination tests often are used. When using discrimination tests, such as triangle or duo-trio tests, it is important to conduct them with panelists who have been screened to ensure they can find differences in similar product categories.
- Regardless of the test structure that is chosen, the test must have enough statistical power to determine whether key differences are or are not present.
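As an illustration of the power requirement above, the following is a minimal sketch of an exact binomial power calculation for a triangle test. It assumes the common "proportion of discriminators" model, in which a fraction `pd` of panelists truly perceive the difference and the rest guess correctly one time in three. The function names `binom_tail` and `triangle_power`, and the example values of `pd` and `alpha`, are hypothetical choices for this sketch and do not come from these guidelines.

```python
from math import comb


def binom_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def triangle_power(n: int, pd: float, alpha: float = 0.05) -> tuple[int, float]:
    """Critical number of correct responses and power for a triangle test.

    n     -- number of panelists (one triangle each)
    pd    -- assumed proportion of true discriminators (hypothetical input)
    alpha -- one-sided significance level
    """
    guess = 1 / 3  # chance of a correct answer by guessing in a triangle test
    # Smallest count of correct answers that is significant at alpha.
    critical = next(k for k in range(n + 2) if binom_tail(n, k, guess) <= alpha)
    # Probability of a correct answer when a fraction pd truly discriminates.
    p_correct = pd + (1 - pd) * guess
    return critical, binom_tail(n, critical, p_correct)


if __name__ == "__main__":
    for n in (30, 60, 90):
        critical, power = triangle_power(n, pd=0.3)
        print(f"n={n}: need >= {critical} correct, power = {power:.2f}")
```

Running the loop for several panel sizes shows how power grows with the number of panelists, which is one way to justify the panel size reported in a paper.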
Methodology (identify the specific method, evaluation instructions and scale used)
- Discrimination/Difference testing – duo-trio, triangle, tetrad, n-AFC (n-alternative forced choice), threshold testing, etc.
- Affective testing - acceptance and/or preference tests in a laboratory or real-world setting, Central Location Tests (CLTs), In-Home Use Tests (IHUTs), internet testing, etc.
- Descriptive Analysis - Quantitative Descriptive Analysis (QDA or Tragon QDA™), Spectrum™ Analysis, Time Intensity, Flavor Profile, Texture Profile, Hybrid Method, etc.
- New methodology – when a new and/or modified method is being utilized, a description with specific details should be provided.
- A reference(s) to the method from reliable sources (e.g. sensory journals or sensory textbooks) should be provided. New or modified methods should be referenced to show their validity or the validity of the method should be shown another way within the paper. Simply describing a new or modified method without an explanation of how it was developed and validated is insufficient.
- Explain the serving order of samples (if appropriate). Serving orders or rotations are determined by the experimental blocking scheme, which helps mitigate bias and supports randomization. Some examples of blocking designs include balanced incomplete block, randomized complete block, split-plot, and Latin square designs.
- Mention how bias was minimized by describing how the samples were blinded or masked so that respondents could not identify samples unintentionally. Describe the use of blinding codes (e.g. three-digit numerical codes) or serving in/on a container/vessel with no suggestive information.
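The serving-order and blinding points above can be sketched in a few lines. This example assigns unique three-digit blinding codes and builds serving orders from a cyclic Latin square, so each sample appears once in every serving position within each block of panelists. The function names, sample labels, and seed are hypothetical. Note that a cyclic Latin square balances position only; balancing carryover effects would need a different design, which this sketch does not attempt.

```python
import random


def blinding_codes(n_samples: int, rng: random.Random) -> list[str]:
    """Draw unique three-digit blinding codes (100-999)."""
    return [str(code) for code in rng.sample(range(100, 1000), n_samples)]


def latin_square_orders(samples: list[str]) -> list[list[str]]:
    """Cyclic Latin square: every sample appears once in each position."""
    k = len(samples)
    return [[samples[(row + pos) % k] for pos in range(k)] for row in range(k)]


def serving_plan(samples: list[str], n_panelists: int, seed: int = 0):
    """Return (sample -> code mapping, list of coded serving orders)."""
    rng = random.Random(seed)  # fixed seed so the plan can be reported exactly
    codes = dict(zip(samples, blinding_codes(len(samples), rng)))
    orders = latin_square_orders(samples)
    rng.shuffle(orders)  # randomize which panelist receives which row
    plan = []
    for panelist in range(n_panelists):
        order = orders[panelist % len(orders)]
        plan.append([codes[sample] for sample in order])
    return codes, plan


if __name__ == "__main__":
    codes, plan = serving_plan(["A", "B", "C", "D"], n_panelists=8, seed=1)
    print("Blinding codes:", codes)
    for number, order in enumerate(plan, start=1):
        print(f"Panelist {number}: {' -> '.join(order)}")
```

Reporting the design type, the code format, and how rows were assigned to panelists gives readers what they need to reproduce the serving scheme.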
Recruiting (Target Population)
- The target population is critical to success and must be determined based on the test objectives and the type of test chosen (e.g. threshold determination, affective testing, hedonic testing, profiling, difference testing, etc.). The appropriate population also depends on whether the testing is qualitative or quantitative, on the stage of the product in its lifecycle, and on whether the project is a renovation or an innovation.
- It is important to select the right respondents when conducting sensory testing.
a. Naïve consumers are typically a subset of the larger user base and represent the typical consumer who would purchase or use the product. Broad demographic considerations for tests with naïve consumers include some of the following: age, gender, occupation, ethnicity, race, income, usage or target group, consumption sensitivities, psychographics, values, and willingness to participate. The respondent pool should not include employees of the company manufacturing the product or anyone with additional knowledge about the product or category, because it is important to eliminate any bias toward the product or the category. These tests are typically subjective and based on the naïve opinion of the consumer.
b. Prescreening for sensory capabilities typically is not used for consumer studies but may be needed, depending on the test objectives, when consumers are used. The type and rigor of the prescreening depend on the test: difference testing (e.g. matching tests, detection/discrimination testing, ranking/rating for intensity, etc.) or descriptive testing (e.g. prescreening questionnaire, acuity of detection/description, ranking/rating, personal interview, etc.). Prescreening requirements often are used to determine a respondent’s ability to conduct the test effectively and to minimize noise that could enter a given study.
c. Trained panelists usually have extensive pre-screening and training on the attributes being tested and can identify the intensity of each attribute consistently. They also are trained to be objective with their scores and strive to make sure their scores are not affected by any underlying preference for a sample. It is unnecessary to balance trained panelists demographically. They are recruited, trained, and maintained on panels based on individual ability, not demographics. In addition, trained panels must be validated at regular intervals to ensure they have maintained their acuity.
Sensory evaluations should occur in a well-defined area that is easily accessible and conducive to conducting sensory tests. Laboratory space should have appropriate ventilation, neutral color/background, proper lighting, and minimal traffic, and should be free from distractions, noise, and odors. In some cases, internet connectivity may be important; where it is unavailable, modifications to the test structure will have to be considered.
Test locations may be internal (e.g. for descriptive or discrimination testing) or external to the researcher’s company/institution, provided the location will not bias the respondents in the test.
- Sensory Booths or Partitioned Areas are often used for sensory tests (e.g. descriptive analysis, discrimination tests, consumer acceptance studies).
- Conference Rooms / Classrooms are typically utilized for Descriptive Analysis (e.g. consensus profiling), Focus Groups, One-On-One interviews, and some central location consumer tests, if it is possible to separate respondents as needed.
- Individual homes typically are used for In-Home Use Tests (IHUTs).
- Other areas are used when specialty tests are conducted. For example, observational studies may need to be conducted while a consumer is shopping, or in a laboratory or home set up so that researchers can watch consumers prepare food, shave, or go about other aspects of daily life. Personal care items can be tested in facilities with dedicated areas for these evaluations and/or in rented hotel rooms. Online testing may be used for both qualitative (chats) and quantitative data collection.
The design of the study reflects many different decisions that impact the results. The design must be planned carefully to ensure that the objectives can be met with data collected in an accurate and efficient way.
1. Samples/Number of Samples
a. Samples should be prescreened to ensure that they represent the important parameters being tested and do not contribute to unnecessary bias. This includes considering whether the differences among the samples are appropriate for the intended evaluator. For example, the average consumer likely cannot detect small differences, and including samples that are only slightly distinct does not optimize testing resources.
b. The required number of samples varies depending on test objectives and the resources available for the study. For example, only one sample is needed if a gold standard profile is being developed, but 15+ samples may be necessary for a robust design of experiments (DOE) study. More samples often are recommended, but samples should be chosen carefully because testing too many samples in a single day can result in fatigue, adaptation, contrast effects, and other unintentional biases. The number of samples and the number of test days should therefore be chosen deliberately so that the study can answer its objectives, describe or discriminate the products, and control bias.
a. Number of samples included in the study
b. Sample description (e.g. manufacturing information)
c. Sample preparation (e.g. serving size, serving temperature, holding time, timed breaks, size of serving containers, preparation information, etc.)
d. Palate Cleansers
e. Carriers (if applicable)
3. Panelists—Identify the type of panelists used
i. Recruitment criteria
ii. Screening methods
i. Details of the training
ii. Methods used to determine that panelists are reproducible
c. Identify and justify the number of panelists used.
i. Typically, for consumer studies, 100 or more targeted consumers are recommended. However, for screening studies, fewer might be appropriate. For studies purporting to represent large demographic segments or testing for advertisement claims substantiation, much larger numbers of 300-600+ may be necessary.
ii. For descriptive studies, the number of panelists recommended depends on the type of test and the degree of training. Numbers from 4–18 have been used but should be justified based on the level of training and the difficulty of the task.
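One way to justify the number of consumers is a simple power-based estimate. The sketch below uses the standard normal approximation for a paired comparison of mean liking scores, n = ((z_alpha + z_power) * sd_diff / delta)^2. The function name `consumers_needed` and the example inputs (a 0.5-point difference on a hedonic scale and a standard deviation of 2.0 for the paired differences) are assumptions chosen only for illustration; real values should come from pilot data or prior studies.

```python
from math import ceil
from statistics import NormalDist


def consumers_needed(
    delta: float, sd_diff: float, alpha: float = 0.05, power: float = 0.80
) -> int:
    """Approximate consumers needed for a two-sided paired comparison.

    delta   -- smallest mean difference in liking worth detecting (assumed)
    sd_diff -- standard deviation of within-consumer differences (assumed)
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = z.inv_cdf(power)  # quantile for the target power
    return ceil(((z_alpha + z_power) * sd_diff / delta) ** 2)


if __name__ == "__main__":
    # Illustrative inputs only; prints 126 for these values.
    print(consumers_needed(delta=0.5, sd_diff=2.0))
```

Because the required number scales with the inverse square of `delta`, halving the difference of interest roughly quadruples the panel size, which helps explain why claims substantiation studies need far more respondents than screening studies.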
Attributes help describe an experience and fall into the categories of appearance, aroma, flavor, sound, and sensation/feel (texture). Within these broad categories, the specific attributes and their nuances differ from product to product.
Proper communication of the statistical analyses used in a body of work builds reader confidence in the results and allows the method to be re-created as the research is extended. Here are some common questions to cover when communicating statistical methods.
- Sampling Procedure: How were the data obtained, and what/who does the sampling frame represent? Were replicates (i.e. the sample was made multiple times) or duplicates (i.e. the same lot of sample was used multiple times) used? What are the final sample sizes for the analyses performed?
- Data Treatment: Were any transformations of the data needed? Were missing data or outliers present, and how were they handled?
- Analysis Type: Which specific statistical analyses were performed on the collected data, and what is the final test statistic? Which statistical software was used for the computations?
- Statistical Results and Conclusions: What was the statistical test result? What confidence level was used? In the case of small sample sizes, was there sufficient power to detect differences of interest? What does this test result allow you to infer about your research question (and how are your inference and the test statistic consistent)?
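The reporting questions above can be made concrete with a small example. The sketch below summarizes a triangle test with an exact one-sided binomial test and returns the items a reader would expect to see: responses collected, responses excluded as missing, the final sample size, the test statistic (number correct), the p-value, and the significance level. The function name `triangle_report`, the use of `None` to mark a missing response, and the choice to drop missing responses are assumptions made for illustration; other treatments of missing data may be more appropriate for a given study.

```python
from math import comb


def triangle_report(responses: list, alpha: float = 0.05) -> dict:
    """Summarize a triangle test for reporting.

    responses -- True (correct), False (incorrect), or None (missing)
    alpha     -- one-sided significance level
    """
    # Data treatment: missing responses are dropped, and the count is reported.
    valid = [r for r in responses if r is not None]
    n = len(valid)
    correct = sum(1 for r in valid if r)
    guess = 1 / 3  # chance of a correct answer by guessing
    # Exact one-sided p-value: P(X >= correct) under guessing.
    p_value = sum(
        comb(n, i) * guess**i * (1 - guess) ** (n - i) for i in range(correct, n + 1)
    )
    return {
        "n_collected": len(responses),
        "n_missing": len(responses) - n,
        "n_analyzed": n,
        "n_correct": correct,
        "p_value": p_value,
        "alpha": alpha,
        "significant": p_value <= alpha,
    }


if __name__ == "__main__":
    data = [True] * 14 + [False] * 10 + [None] * 2  # made-up responses
    for key, value in triangle_report(data).items():
        print(f"{key}: {value}")
```

Stating each of these items in the text or a table, along with the software used, lets readers check that the inference matches the test statistic.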
The limitations of the information should be stated clearly. Limitations should explain to the reader what can and cannot be concluded from the study. Pilot studies are not encouraged as stand-alone publications.
State any limitations in the population of samples, the respondents, or the analysis that need to be noted so that the reader can understand whether the information gained in the study can be projected onto other products or situations or is specific to this project only. To what products or to whom can the data be projected? What do the samples represent?
For example, the sensory data may be limited in the following cases:
- If the set of products is small and not representative of the category/market, or if the products have a specific ingredient, processing design, or particular features that make them different from other products in the category, the limitations should state that the results can be applied to this set of products only.
- If the consumers represent a subgroup of potential users, the limitations should make it clear that other populations may respond differently and that the results cannot be projected broadly.
- If the ability of the statistics to detect differences (also known as the “power” of the test) is low, this must be noted in the limitations because the likelihood of finding important differences is reduced.