Tobacco products were historically less regulated than strawberry jam in the US1, despite being one of the leading, if not the leading, preventable causes of early death and disability2. Marketing literature suggests that when a brand introduces a new product to market, its packaging should be reminiscent of existing brand packaging, drawing on design elements from its brand line; in doing so, it signals to the consumer that the new product aligns with the existing branding3-5. More specifically, this research suggests that new product packaging should: 1) feature unique packaging characteristics, 2) align with existing brand strengths, and 3) be noticeably different to consumers3-5. The last function is vital, as a consumer’s ability or inability to notice newness can trigger a change in product purchasing patterns and perceptions3-6. Particular to perceptions of tobacco products, previous research has shown that changes to product packaging can change consumers’ perceptions of healthfulness and harmfulness, even in instances when the physical product (e.g. the actual cigarette) has not changed7.

Changes to cigarette packaging, like other consumable goods, sway consumers’ purchasing behaviors and product perceptions8-10. Substantial evidence indicates that cigarette pack design matters to consumer perceptions of product harm7,11. Internal tobacco industry research revealed that cigarette product packaging changes are designed to influence smokers’ purchasing behaviors, crafting products that are more appealing to specific demographic and identity groups12. For instance, changes in smoking by women have been shown to be related to a carefully calibrated approach to create packages with ‘feminine’ design features13. Packaging color has been identified as a particularly important characteristic in marketing cigarettes7,14 and natural imagery represents a strategy that has been used to give a healthy ‘halo’ effect to products15. Supporting this research further, there has been a myriad of studies that have provided compelling evidence that plain packaging with its absence of brand features influences consumer behaviors and perceptions16-21, although plain packaging is likely not a viable policy option in the US given constitutional protections for marketing as corporate speech22.

Until 2009, there was no meaningful regulation of packaging in the US, and thus there was a limited need for research on the impacts of changes to cigarette packaging23. Globally, a substantial focus has been on the evidence for plain packaging16. While there is substantial literature on product packaging and the visual design of products in the fields of marketing and consumer behavior, tobacco regulatory science research in the US still has gaps that can hinder the ability of the U.S. Food and Drug Administration (FDA) to create and defend science-based regulations24. First, tobacco industry document research and observational studies7,11 offer courts evidence but with less weight than experimental research. Second, much of the scientific literature on cigarette product labeling changes was conducted outside of the US, often where plain packaging and large graphic warning labels are the design standard enforced by regulating bodies18,25. Third, research has not focused on small changes to cigarette packaging, leaving gaps in knowledge regarding how substantive a change would be required to indicate a new product. Combined, these limitations of the literature impede the development and defense of science-based regulations even when there is a strong scientific consensus that design changes to cigarette packs substantially impact population health.

A theoretical framework, the Context of Consumption Framework26, supports the importance of understanding the role of visual product design in cigarette packaging. Visual design researchers have used this framework to document and investigate affective, cognitive, and behavioral consumer responses to visual design26. Briefly, the Context of Consumption Framework suggests cognitive responses to product design involve the processing of how the design influences individuals’ perceptions of the product’s characteristics, the aesthetics of the packaging, and their own identity. Affective responses include emotional responses to how the product will satisfy a need or desire, feelings about the social positioning of the product, and interest. Cognitive and affective responses influence behavioral responses, leading to product approach or avoidance. Research suggests this framework is appropriate for studies of cigarettes27. This framework informs our study. To address these gaps and guided by theory, we conducted two separate experiments. In both, we used an experimental design in combination with a discrete-choice task, where research participants selected one pack per randomized set of choices.

First, we aimed to examine the role of subtle changes to a product on consumer choices. Specifically, we examined subtle changes in logo size, pack dimensions, and color saturation. Our first experiment’s stimuli were intentionally selected to test the role of subtle changes to assess if FDA was fully leveraging its powers to protect the public. At the time we designed our research, subtle changes such as logo size and color saturation were already being identified by FDA as potentially not worthy of regulation – both logo size and color saturation were specifically referenced by the FDA in its second edition draft guidance for industry as ‘examples that may not result in a distinct product’28. Pack dimensions were noted to be of interest by FDA, but the evidence cited by FDA has the limitations noted above and focused largely on very noticeable changes to packaging29. Thus, experiment one was designed to provide evidence regarding consumer choices in relation to small changes to packaging.

Second, we aimed to link changes in pack design to adult smokers’ perceptions of harm, appeal, and match with one’s own style. Specifically, we examined color hue, designs with natural/organic imagery, and color saturation. Our second experiment’s stimuli were intentionally selected to test the role of changes that FDA’s early guidance suggests would yield a new product, listed as ‘examples that may result in a distinct product’28. Given the importance of color in tobacco packaging, we also sought to further explore the role of color saturation. Thus, experiment two was designed to provide evidence regarding consumer choices in relation to large changes to packaging.


Study design

To address our study aims, for each experiment we utilized a within-subjects design; specifically, we used a balanced lattice design with 64 different pack designs (Plan 10.5 in Cochran and Cox)30 in combination with discrete-choice tasks. Discrete-choice tasks31 ask participants to choose between presented options. As such, they provide a closer proxy for behavioral decision making than do ratings, and they can be used to disentangle what factors contribute to choices. Following a power calculation and based on a prior study32, we planned for 275 respondents in each experiment. Using the advanced block randomization feature of Qualtrics, each participant was randomized to view one of nine repetitions containing eight blocks of eight different packs. In each block, the participant completed a discrete-choice task. We also randomized the order in which the blocks were presented. Thus, each participant made eight choices in total for each dependent variable. Supplementary file Figure S1 shows the design of the study. Balanced lattice designs are ideal for experiments with many different variations30.
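
The structure of a balanced lattice for 64 treatments can be illustrated with a short sketch. The code below builds one such design from the lines of the affine plane over the finite field GF(8); it is an assumption-laden illustration, not a reproduction of Plan 10.5 in Cochran and Cox, and the pack numbering (0 to 63), the function names, and the participant-assignment helper are all hypothetical. It checks the defining property that every pair of packs shares a block exactly once across the nine repetitions.

```python
import random
from collections import Counter
from itertools import combinations

K = 8  # packs per block; K * K = 64 pack designs


def gf8_mul(a, b):
    """Multiply two GF(8) elements (bit patterns 0..7) modulo x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:      # degree reached 3: reduce by the field polynomial
            a ^= 0b1011
    return result


def balanced_lattice():
    """Return 9 repetitions, each a list of 8 blocks of 8 pack indices (0..63)."""
    def pack(x, y):
        return x * K + y

    repetitions = []
    # One repetition per slope m: block c holds the points with y = m*x + c.
    for m in range(K):
        repetitions.append(
            [[pack(x, gf8_mul(m, x) ^ c) for x in range(K)] for c in range(K)]
        )
    # Final repetition: the "vertical" lines x = c.
    repetitions.append([[pack(c, y) for y in range(K)] for c in range(K)])
    return repetitions


def assign_participant(design, rng):
    """Hypothetical helper: pick one repetition and shuffle its block order."""
    blocks = [list(block) for block in rng.choice(design)]
    rng.shuffle(blocks)
    return blocks


design = balanced_lattice()
pair_counts = Counter(
    pair
    for repetition in design
    for block in repetition
    for pair in combinations(sorted(block), 2)
)
print(len(design), len(pair_counts), set(pair_counts.values()))
# 9 2016 {1}
```

The count of 2016 equals 64 × 63 / 2, so all possible pairs of packs are compared within a block exactly once, which is what makes the lattice balanced.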

Design of stimuli

To develop our stimuli, we drew upon focus groups with US adult smokers about cigarette pack designs33. A professional graphic designer with training in product packaging design then iteratively developed a unique brand for our study, Glacier, after reviewing existing packs on the market. Figure 1 shows the pack’s design.

Figure 1

Study 1 reference pack

For the first study on newness, we created four packs of different dimensions, four logo sizes, and four color saturation levels. We also created an ‘average’ pack (Figure 1). The graphic designer developed the variations to produce the smallest changes that made a distinguishable difference. For example, color saturation was developed with a light blue and a dark blue, and the designer used a gradient tool to transition between them. In Cyan, Magenta, Yellow, and Key (CMYK) values, the color saturation ranged from: Light Blue (20,0,0,20) and Dark Blue (80,60,0,20); to Light Blue (100,10,0,32) and Dark Blue (100,94,0,90); with the average pack (Figure 1) having values of Light Blue (53,10,0,32) and Dark Blue (100,94,0,50). Logo sizes ranged from: 0.4421 × 0.2892 inches to 1.0315 × 0.6749 inches; with the average pack having values of 0.7368 × 0.4821 inches.

For the second study, we created eight color hues (blue, gold, green, light blue, light green, orange, purple, and red), four levels of natural imagery (standard pack from the first study, a leaf logo, a field of tobacco leaves and organic logo, and a pack with the appearance of unbleached recycled paper and an organic symbol), and two levels of color saturation. Figure 2 shows the imagery. All stimuli are available in our institutional repository [].

Figure 2

Study 2 imagery examples

Sample and recruitment

To recruit participants, we utilized panel provider Qualtrics Research Services, which does not maintain its own survey panel but contracts with other panel providers and, using a proprietary algorithm, selects participants from across multiple panels. Qualtrics’s survey panel service has pre-screened and enrolled research participants, conducts recruitment, and pays incentives to its survey panel participants. Specifically, we used the following quota sampling requirements to ensure a diverse study population: a 50–50 gender split based on sex assigned at birth, ≥33% with <4 years of higher education, ≥33% identifying as a sexual or gender minority, ≥15% identifying as Black/African American, and ≥15% identifying as Hispanic/Latino. We oversampled sexual or gender minority adults due to the higher prevalence of smoking among lesbian, gay, and bisexual adults. Participants could participate in only one of the two surveys, i.e. we used non-overlapping samples in the two experiments. Beyond the quota sampling, we did not require that the two surveys have the same mix of participants. Participants were only eligible if: 1) they were using a computer (i.e. participants using mobile devices were not eligible), 2) they were aged ≥18 years, 3) they reported not being red-green color blind, 4) they lived in the US and spoke English, and 5) they had smoked at least 100 cigarettes in their lifetime and currently smoked every day or some days. To improve data quality, Qualtrics also excluded participants who failed attention checks or who completed the survey in less than half of the median completion time observed during its soft launch. Qualtrics fielded the experiments concurrently from 14 November to 18 December 2018. Table 1 shows the participant characteristics.

Table 1

Participant characteristics by study, 2018

| Characteristics | Study 1 (n=285), n (%) | Study 2 (n=284), n (%) |
|---|---|---|
| Smoking frequency | | |
| Every day | 210 (73.7) | 232 (81.7) |
| Some days | 75 (26.3) | 52 (18.3) |
| Usually smoke menthol | 142 (49.8) | 144 (50.7) |
| Time to first cigarette after waking (minutes) | | |
| >60 | 79 (27.7) | 63 (22.2) |
| 31–60 | 52 (18.2) | 49 (17.3) |
| 6–30 | 92 (32.3) | 90 (31.7) |
| ≤5 | 62 (21.8) | 81 (28.5) |
| Age (years), mean ± SD | 46.9 ± 14.3 | 48.4 ± 16.3 |
| Sex assigned at birth | | |
| Female | 140 (49.1) | 143 (50.4) |
| Male | 145 (50.9) | 141 (49.6) |
| Race* | | |
| American Indian or Alaska Native | 10 (3.5) | 6 (2.1) |
| Asian | 8 (2.8) | 8 (2.8) |
| Black or African American | 43 (15.1) | 44 (15.5) |
| White | 226 (79.3) | 209 (73.6) |
| Other | 8 (2.8) | 24 (8.5) |
| Hispanic, Latino, or Spanish origin | 43 (15.1) | 43 (15.1) |
| Sexual orientation | | |
| Straight or heterosexual | 193 (67.7) | 186 (65.5) |
| Gay or lesbian | 45 (15.8) | 52 (18.3) |
| Bisexual | 47 (16.5) | 34 (12.0) |
| Educational level | | |
| <4 years of college | 111 (38.9) | 186 (65.5) |
| ≥4 years of college | 174 (61.1) | 98 (34.5) |

* Multiple choices. Percentages do not total to 100 due to sporadic missing values.


Demographic variables

We assessed age with ‘What is your age?’ and provided a text response box. We assessed smoking status by asking if participants had smoked at least 100 cigarettes in their life. We then asked: ‘Do you now smoke cigarettes every day, some days, or not at all?’. We used one item from the Fagerström Test for Nicotine Dependence34: ‘How soon after you wake up do you smoke your first cigarette?’. To have a binary variable for quota sampling, we asked: ‘What sex were you assigned at birth, on your original birth certificate?’ with options of male and female, as no US state provided other options on birth certificates at the time of our participants’ births. We asked participants to identify ‘Which one or more of the following would you say is your race?’ and ‘Are you Hispanic, Latino, or Spanish origin?’. To assess educational attainment, we asked: ‘What is the highest grade or year of school you completed?’ with five response options (Grades 1–8, 9–11, 12 or GED, some college or technical school, and college graduate).

Study 1: Dependent variable

After consent and screening questions, we gave participants the following prompt: ‘On the following pages, we would like you to imagine that you have gone into a store to buy this pack of cigarettes (Figure 1 was displayed). If it was not available, but there were similar packs, we would like you to tell us which one you would buy instead. Please assume the pack styles you will see are the only ones available and they all sell for the same price. There are 8 sets of packs for you to compare’. After this screen, participants made the eight choices, one per screen. Each had the following prompt: ‘Imagine you went to the store to buy this pack of cigarettes (Figure 1 was displayed). If it was not available, but there were some very similar packs, if these were the packs you had to choose from, which one would you be most likely to buy? They are all the same price’. Participants then selected a pack and moved to the next screen.

Study 2: Dependent variables

We used the following instructions prior to the discrete-choice task: ‘We are going to show you 8 sets of cigarette packs. In each set of packs, we will ask you to choose a pack that seems the least harmful to your health, that seems like it would best match your style, and that seems most appealing to you. They are all the same price’. For each choice, we reiterated the instructions: ‘Please imagine you went to the store to buy cigarettes. The only options available were the packs below. They are all the same price’. We then requested: 1) ‘Select the one that seems the least harmful to your health’, 2) ‘Select the one that seems most appealing to you’; and 3) ‘Select the one that seems like it would best match your style’. Participants could select the same or different packs for each request.

Statistical analysis

For data management and descriptive statistics, we used SPSS v. 26 (IBM, Chicago, IL). We restructured data so each choice was a row. To analyze the choice experiments, we used Latent Gold Software v 5.1 (Statistical Innovations, Arlington, MA), which was designed for the analysis of discrete-choice experiments. We report choice parameters, 95% confidence intervals calculated with robust sandwich standard errors, and associated Wald tests, which allow us to present comparisons of levels (e.g. different colors) within attributes (e.g. color hue) of the design. We conducted pairwise comparisons between each package attribute’s levels, which are presented in our institutional repository []. We also report a measure of relative importance. The estimated parameters from the discrete-choice model are utilities from the field of economics (i.e. values established based on preferences). The parameter is not inherently interpretable in itself; however, the estimates indicate the relative influence of the level of the attribute on choices. The importance measure of an attribute is the difference between the maximum utility and minimum utility for its levels. It can also be expressed as a relative measure to other attributes35, which can be interpreted as the weight of the given attribute in choices made by participants.
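
The importance calculation described above can be sketched in a few lines. The example uses the Study 1 estimates reported in Table 2 and assumes the relative measure is each attribute's utility range divided by the sum of the ranges across attributes, which is a common convention but is not spelled out in the text; the function and variable names are illustrative only.

```python
# Study 1 choice-model estimates (utilities) copied from Table 2.
ESTIMATES = {
    "package dimensions": [-0.92, 0.22, 0.28, 0.42],
    "color saturation": [-0.57, 0.19, 0.23, 0.16],
    "logo size": [-0.07, -0.13, 0.16, 0.04],
}


def relative_importance(estimates):
    """Return each attribute's utility range as a share of the summed ranges.

    Assumption: relative importance = (max - min utility of the attribute)
    divided by the total of those ranges over all attributes.
    """
    ranges = {attr: max(levels) - min(levels) for attr, levels in estimates.items()}
    total = sum(ranges.values())
    return {attr: rng / total for attr, rng in ranges.items()}


importance = relative_importance(ESTIMATES)
for attr, share in importance.items():
    print(f"{attr}: {share:.0%}")
# package dimensions: 55%
# color saturation: 33%
# logo size: 12%
```

Under this assumption the sketch reproduces the relative importance values reported for Study 1.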


Study 1

When participants were tasked with picking the package most similar to an average package, package dimensions, color saturation, and logo size significantly predicted choices with Wald tests of 100.0 and p<0.001, 52.5 and p<0.001, and 18.3 and p<0.001, respectively. The relative importance of each characteristic indicated that subtle changes to pack dimensions had the greatest relative importance (55% of participant choice was driven by dimensions), followed by color saturation (33%). Logo size had the lowest relative importance (12%). Table 2 shows model estimates, which indicate greater or lower preference in participant choices, and whether their 95% confidence intervals cross zero. Participants were unlikely to think pack 1 (3.3124 × 2.5 inches) was the most similar package, and participants were most likely to select pack 4 (4 × 2.125 inches) as most similar. Pairwise comparisons (Supplementary file Table S1) indicate that, for each attribute type, the two packs with characteristics most similar to the average pack were selected with no significant difference between them, indicating that pack characteristics most like the average pack were selected similarly by participants. Thus, participants noticed and distinguished between the more extreme of our subtle design changes; changes in dimensions and color saturation had the biggest influence on the discrete-choice task.

Table 2

Study 1 package attribute and pairwise comparisons for discrete-choice task of picking most similar package to an ‘average’ package, 2018 (n=285 cases, n=2253 replications)

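| Level | Attributes | Estimate | 95% CI |
|---|---|---|---|
| | Package dimensions (inches) | | |
| 1 | 3.3124 × 2.5 | -0.92 | -1.10 – -0.73 |
| 2 | 2.5 × 2 | 0.22 | 0.09 – 0.34 |
| 3 | 2.3125 × 2.125 | 0.28 | 0.14 – 0.41 |
| 4 | 4 × 2.125 | 0.42 | 0.28 – 0.56 |
| | Color saturation | | |
| 1 | Lightest | -0.57 | -0.73 – -0.41 |
| 2 | | 0.19 | 0.09 – 0.28 |
| 3 | | 0.23 | 0.13 – 0.33 |
| 4 | Darkest | 0.16 | 0.04 – 0.28 |
| | Logo size | | |
| 1 | Smallest | -0.07 | -0.17 – 0.03 |
| 2 | | -0.13 | -0.22 – -0.03 |
| 3 | | 0.16 | 0.07 – 0.24 |
| 4 | Largest | 0.04 | -0.05 – 0.14 |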
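
To give a sense of how such estimates translate into choices, the sketch below converts the Study 1 utilities into choice probabilities. It assumes a standard additive conditional (multinomial) logit form, in which a pack's utility is the sum of its attribute-level estimates; the text does not state the model form explicitly, so this is an illustration rather than the authors' specification. The two-pack choice set and all names are hypothetical.

```python
import math

# Study 1 estimates from Table 2, keyed by attribute level (1-4).
UTILITIES = {
    "dimensions": {1: -0.92, 2: 0.22, 3: 0.28, 4: 0.42},
    "saturation": {1: -0.57, 2: 0.19, 3: 0.23, 4: 0.16},
    "logo_size": {1: -0.07, 2: -0.13, 3: 0.16, 4: 0.04},
}


def pack_utility(pack):
    """Sum the level estimates of a pack (assumed additive main effects)."""
    return sum(UTILITIES[attr][level] for attr, level in pack.items())


def choice_probabilities(choice_set):
    """Assumed conditional logit: P(pack) = exp(U) / sum of exp(U) in the set."""
    exp_u = [math.exp(pack_utility(pack)) for pack in choice_set]
    total = sum(exp_u)
    return [value / total for value in exp_u]


# Hypothetical choice set: two packs that differ only in dimensions.
pack_a = {"dimensions": 1, "saturation": 2, "logo_size": 3}
pack_b = {"dimensions": 4, "saturation": 2, "logo_size": 3}
probs = choice_probabilities([pack_a, pack_b])
print([round(p, 2) for p in probs])
```

Because the two packs share saturation and logo levels, only the difference between the two dimension estimates affects the result, which is why the estimates are interpretable only relative to one another.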

Study 2


Appeal

For the most appealing choice task, color hue, design, and color saturation were each significant predictors of choices with Wald tests of 120.81 and p<0.001, 49.30 and p<0.001, and 17.30 and p<0.001, respectively. The relative importance of each characteristic indicated that color hue was the most important characteristic (50%), followed by the design (38%), and color saturation (12%). Table 3 shows a preference for green packages and for the pack 1 (bear) and pack 2 (leaf) designs. In pairwise comparisons (Supplementary file Table S2), most differences between color hues were significant, and there were significant differences between pack 1 (bear) and packs 3 (field + organic) and 4 (recycled + organic) as well as between pack 2 (leaf) and pack 4.

Table 3

Study 2 model estimates of attributes by discrete-choice task, 2018 (n=284 cases, n=2263 replications)

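| Level | Attributes | Appeal: Estimate | Appeal: 95% CI | Least harm: Estimate | Least harm: 95% CI | Style: Estimate | Style: 95% CI |
|---|---|---|---|---|---|---|---|
| | Color hue | | | | | | |
| 1 | Blue | -0.67 | -0.83 – -0.51 | -0.60 | -0.73 – -0.47 | -0.75 | -0.90 – -0.59 |
| 2 | Gold | -0.37 | -0.54 – -0.19 | -0.10 | -0.25 – 0.06 | -0.18 | -0.35 – -0.01 |
| 3 | Green | 0.49 | 0.34 – 0.65 | 0.17 | 0.02 – 0.31 | 0.56 | 0.40 – 0.71 |
| 4 | Light blue | 0.34 | 0.20 – 0.49 | 0.30 | 0.16 – 0.43 | 0.28 | 0.13 – 0.42 |
| 5 | Light green | 0.10 | -0.03 – 0.23 | 0.34 | 0.19 – 0.48 | 0.10 | -0.03 – 0.23 |
| 6 | Orange | -0.10 | -0.25 – 0.04 | -0.05 | -0.17 – 0.08 | -0.09 | -0.23 – 0.06 |
| 7 | Purple | 0.21 | 0.07 – 0.36 | 0.13 | 0.00 – 0.25 | 0.05 | -0.10 – 0.20 |
| 8 | Red | -0.01 | -0.16 – 0.13 | -0.19 | -0.33 – -0.04 | 0.02 | -0.13 – 0.17 |
| | Design* | | | | | | |
| 1 | None (bear) | 0.34 | 0.21 – 0.48 | -0.26 | -0.42 – -… | … | … – 0.42 |
| 2 | Limited (leaf) | 0.24 | 0.11 – 0.37 | -0.15 | -0.31 – 0.00 | 0.32 | 0.18 – 0.46 |
| 3 | Field + organic | -0.02 | -0.18 – 0.13 | 0.81 | 0.67 – 0.95 | -0.02 | -0.18 – 0.13 |
| 4 | Recycled + organic | -0.56 | -0.74 – -0.37 | -0.39 | -0.58 – -0.21 | -0.58 | -0.77 – -0.39 |
| | Color saturation | | | | | | |
| 1 | Saturated | -0.14 | -0.21 – -0.07 | -0.05 | -0.12 – 0.02 | -0.14 | -0.21 – -0.07 |
| 2 | Low saturation | 0.14 | 0.07 – 0.21 | 0.05 | -0.02 – … | … | … – 0.21 |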

* Type of natural imagery.

Least harm

Design and color hue were statistically significant predictors of choices with Wald tests of 122.8 and p<0.001, and 101.5 and p<0.001, respectively. Color saturation was not statistically significant, with a Wald test of 1.8 and p=0.18. For the least harmful choice task, participant choices indicated that the relative importance of design (54%) was greatest, followed by color hue (42%) and color saturation (4%). As shown in Table 3, the pack with the field of tobacco leaves and an organic certification was most commonly chosen in this choice task, as indicated by the estimate. Indeed, this was the largest estimate in the model for any choice. Pairwise comparisons (Supplementary file Table S3) show that pack 3 (field + organic) was significantly more likely to be chosen than the standard bear pack (pack 1) and the recycled pack (pack 4), but it was not significantly less likely to be chosen than the leaf pack (pack 2) as least harmful.


Style

For the discrete-choice task on the best match with one’s style, design, color hue, and color saturation were significant predictors of participant choices with Wald tests of 49.53 and p<0.001, 118.01 and p<0.001, and 16.57 and p<0.001, respectively. Participant choices showed that color hue was most important (52%) followed by design (36%) and color saturation (11%). In pairwise comparisons (Supplementary file Table S4), most differences between color hues were significant, and pack 1 (bear) and pack 2 (leaf) were chosen over packs 3 (field + organic) and 4 (recycled + organic), with significant differences between pack 1 and packs 3 and 4 as well as between pack 2 and pack 4.

Across the different discrete-choice tasks in experiment 2, participants’ choices were influenced by the different packaging attributes. Color hue and design exerted the most influence over choices, as expected. There was a preference for the green color hue across the three choice tasks. Regarding design, there were similar patterns of choices between the different designs across the three discrete-choice tasks, except where the task was to select the pack with the least harm, which prompted a clear choice for the pack featuring a field of tobacco leaves and an organic certification.


Principal findings

In our first discrete-choice experiment looking at subtle changes to a cigarette pack, we found that the relative importance of subtle changes to a tobacco product was highest for package dimensions, followed by color saturation. Logo size had the lowest importance. Thus, regulators should be aware that small but noticeable variations in a package are indeed noticed by consumers. More extreme changes to packaging dimensions should be monitored by regulators, given more extreme changes to package dimensions can indicate a new product29, and package dimensions were a design feature that most influenced our experiment’s discrete-choice task. Our results suggest FDA’s early guidance28 not to focus on color saturation and, especially, on logo size was likely warranted. Resources could be directed to assess and monitor other areas of product packaging.

Our second discrete-choice experiment looked at variations by hue, natural/organic imagery, and color saturation across three choice tasks to select: 1) the most appealing, 2) the least harmful, and 3) the best match with one’s style. In each of these, hue and natural/organic imagery had the highest relative importance for choices, and color saturation ranked last. Thus, regulators should consider the color hue of packages and the imagery used on them, as these influence consumer choices. FDA’s early guidance28 that such changes could constitute a distinct product was likely warranted. Regulators would do well to monitor these types of changes to product packaging when legally authorized to do so, given their influence on choices relating to appeal, harm, and style.

Study results in context

In Study 1, we found that subtle changes to a product package are noticeable to adult smokers and that dimensions and color saturation were more important to the choices made by participants than logo size. Given the existing literature showing that manufacturers of new products may want to extend an existing product’s branding rather than replace it with something new3-5, these findings provide early evidence that regulators should consider subtle changes in product packaging – especially regarding dimensions.

Our finding that changes to cigarette packages influenced consumers’ choices is consistent with theory6,26. It is also consistent with prior research into tobacco industry documents, which show the tobacco industry carefully calibrates packaging to reach specific profiles of consumers and to intimate differences in product characteristics8,9. Color is also used to convey information about the product inside the pack10; indeed, previous research shows color has been used to evade bans on text descriptors such as ‘light’ and ‘mild’ on packages as well as communicate flavor7. Packaging can clearly communicate the harms of the product, its appeal, and its relevance for one’s own identity. Finally, our findings match the ample evidence for plain packaging regulation16-21; however, our findings are useful in the US context given plain packaging is unlikely to survive a court challenge in the US22. Our findings, however, are not consistent with the current implementation of regulations, which provide substantial leeway for design changes by the industry36.

Strengths and limitations

The study strengths of an experimental design, behavioral discrete-choice task, and use of theory-informed measures must be balanced against its limitations. First, the sample we used was based on quota sampling from an online survey panel and may not be generalizable to the population of smokers in the US. However, prior research shows that similar experiments in tobacco research tend to be generalizable from such panels37. Second, our research included only adult smokers. Future work should consider the impact of designs on youth and non-smokers. Third, our experimental discrete-choice task has strong internal validity; however, real-world choices take place in a much more complex environment with marketing, word of mouth, brand identity, price promotions, and other influences on choice tasks. Thus, the internal validity of our study must be balanced against its more limited ecological validity. Fourth, we designed our stimuli with a professionally trained graphic designer. However, packaging designs used by the tobacco industry are carefully calibrated by extensive formative research8; our approach may have attenuated the influence of designs. We did not test an exhaustive number of packaging features; future work should expand to other potential packaging characteristics. Finally, since our study was designed and funded, the U.S. FDA has lost a court case that limits its ability to assess changes in labeling for the purpose of determining a distinct tobacco product38. We report the results of our experiments so they will be available if future court decisions open the possibility of greater regulation of packaging changes.


Among adult smokers in the US, changes to cigarette packaging design are associated with choices about the selection of cigarette packages. Our results show that even small changes to cigarette packages are noticeable to adult smokers. Packaging designs can influence choices related to health, product appeal, and individual identity. These findings are consistent with the theoretical literature, tobacco industry documents, and prior studies. Marketing researchers and behavioral scientists will likely agree that our findings are consistent with the scientific evidence base and are, effectively, unsurprising. Yet, our findings provide valuable evidence to regulators, who require specific evidence relevant to tobacco products for their work. The scientific evidence needed for regulatory science goes beyond scientific consensus39. Our findings add to the available tools of regulators by: 1) being specific to the US, 2) being directly about tobacco products, 3) utilizing a strong experimental design, and 4) leveraging theory-informed measures. In summation, we provide evidence to suggest the importance of regulating the visual design of cigarettes and, we would argue, of other tobacco products.