Part two of this paper estimated flourishing prevalence rates among 10,009 adult New Zealanders, according to replications of each of the four frequently used operationalizations of flourishing identified in part one, using the SWI variables and dataset. Results indicated a substantial difference in prevalence rates of flourishing depending upon the operationalization employed, from 24% (Huppert & So), to 39% (Keyes), 41% (Diener et al.), and 47% (Seligman et al.). The low prevalence rate of flourishing from the SWI replication of Huppert and So’s conceptualisation (24%) most likely reflects their more stringent theoretical and conceptual criteria for flourishing: to be categorised as flourishing, participants are required to endorse the one item representing positive emotion (which only 41% of the sample did), plus three out of four components of ‘positive functioning’ and four out of five components of ‘positive characteristics’, thereby allowing participants to score below the thresholds on only two out of ten items. In contrast, participants could score below the thresholds on six out of 13 components in the SWI replication of Keyes’ operationalization, or seven out of 15 items in the SWI replication of Seligman et al.’s operationalization, and still be categorised as flourishing. In requiring only a total score of 48 or above, our interpretation of Diener et al.’s operationalization also allowed greater flexibility across components than our interpretation of Huppert and So’s operationalization. (This is the most striking difference between these four operationalizations, and the cause of the variation in prevalence rates.) It is important to note that the use of different response formats in the SWI survey meant that some of the variation in prevalence rates between our study and previous studies might be due to the use of different thresholds, making for potentially inaccurate international comparisons.
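The component-counting rule described above for Huppert and So’s operationalization can be sketched in code. This is a hypothetical illustration of the counting logic only, not the published scoring algorithm: the function names, the binary endorsement inputs, and the example data are all assumptions.

```python
# Sketch of a Huppert & So-style categorical rule: flourishing requires
# endorsing the single positive-emotion item, at least 3 of 4 'positive
# functioning' components, and at least 4 of 5 'positive characteristics'
# components. Inputs are hypothetical binary endorsement indicators.

def flourishing_huppert_so(positive_emotion, functioning, characteristics):
    """Return True when the respondent meets all three counting criteria."""
    return (
        positive_emotion
        and sum(functioning) >= 3       # 3 out of 4 functioning components
        and sum(characteristics) >= 4   # 4 out of 5 characteristics
    )

# Endorses positive emotion, 3/4 functioning, 4/5 characteristics:
print(flourishing_huppert_so(True, [1, 1, 1, 0], [1, 1, 1, 1, 0]))  # True
# Falls short on characteristics (3/5), so not categorised as flourishing:
print(flourishing_huppert_so(True, [1, 1, 1, 0], [1, 1, 1, 0, 0]))  # False
```

Under this rule a respondent can miss at most two of the ten items, which is what drives the comparatively low 24% prevalence rate noted above.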
For example, New Zealand’s 24% flourishing according to our replication of Huppert and So’s model may not be directly comparable to the Danes’ 41% flourishing or Portugal’s 10% flourishing diagnosed using the same model (Huppert & So, 2013). However, by applying a consistent methodology for selecting thresholds across all four models in our study, we are confident that the flourishing prevalence rates according to the four different models are comparable with each other within our study. While related-samples Cochran’s Q tests indicated all four operationalizations were significantly different from one another, cross-tabulation analysis revealed strong agreement between our replications of Keyes’ and Seligman et al.’s operationalizations (81%), and between Diener et al.’s and Seligman et al.’s (80%). Even the least comparable operationalizations (Huppert and So’s and Seligman et al.’s) indicated moderate agreement (74%). In the absence of an established empirical benchmark stating what degree of agreement is meaningful, or indeed any criterion for interpreting what these levels of agreement mean, it is hard to draw any concrete conclusions from these findings. The strengths and unique contributions of this study include the application of the four operational definitions to a very large, nationally representative sample of adults, which allows our results to be compared to other population samples; the prospective nature of the SWI, with two more longitudinal rounds scheduled over the next four years, allowing us to monitor the prevalence of flourishing among New Zealand adults over time using all four operationalizations; and the use of cross-tabulation and pairwise Cochran’s Q tests, allowing us to calculate, for the first time, the degree of agreement between the SWI replications of the different measures commonly employed to assess flourishing.
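The two statistics described above can be sketched as follows. The data here are simulated, and the four columns merely stand in for binary flourishing classifications under four operationalizations; this is an illustrative computation of percentage agreement and Cochran’s Q, not the study’s actual analysis.

```python
import numpy as np
from scipy.stats import chi2

# Simulated binary flourishing classifications (1 = flourishing) for 1,000
# hypothetical respondents under four models; thresholds are arbitrary.
rng = np.random.default_rng(0)
base = rng.random(1000)
X = np.column_stack([(base + rng.normal(0, 0.15, 1000)) > t
                     for t in (0.76, 0.61, 0.59, 0.53)]).astype(int)

def pairwise_agreement(a, b):
    """Proportion of respondents classified identically by two models."""
    return np.mean(a == b)

def cochrans_q(X):
    """Cochran's Q for k related binary samples (rows = respondents)."""
    n, k = X.shape
    col = X.sum(axis=0)   # flourishing counts per operationalization
    row = X.sum(axis=1)   # endorsement counts per respondent
    N = X.sum()
    Q = (k - 1) * (k * (col ** 2).sum() - N ** 2) / (k * N - (row ** 2).sum())
    return Q, chi2.sf(Q, k - 1)   # chi-squared p-value on k-1 df

Q, p = cochrans_q(X)
print(f"agreement(models 0, 1) = {pairwise_agreement(X[:, 0], X[:, 1]):.2f}")
print(f"Cochran's Q = {Q:.1f}, p = {p:.3g}")
```

A significant Q indicates the models’ flourishing rates differ somewhere among the four, which is then typically followed up with the pairwise comparisons reported above.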
In terms of limitations, we experienced challenges in accurately replicating three of the four operationalizations of flourishing using the available dataset (the FS was replicated exactly). While the SWI’s large number of wellbeing variables (n = 87) presented us with a compelling opportunity to compare these operationalizations, we acknowledge that the fit was not perfect. Differences in questionnaire items and response formats required us to make subjective decisions regarding the best way to replicate the original models. The challenge was to stay true to the theory and conceptualisation of the original models, while also remaining consistent in our methodology across models. We offer the following four examples of the types of challenges we faced, and our methods for overcoming them. Firstly, the absence of any categorical diagnosis of flourishing for the Flourishing Scale or PERMA-P required us to devise our own methods, guided by Keyes, and by Huppert and So. This meant selecting a threshold for flourishing on the FS that allowed endorsement of most, but not necessarily all, of the scale's eight components (scores ≥ 48, range 8-56, meaning respondents had to score an average of six on the 7-point Likert scale). To be categorised as flourishing in the SWI replication of the PERMA-P, participants were required to score above a threshold on two of the three items of each component, and on four out of the five components overall. While we acknowledge the limitations of our approach, and acknowledge the PERMA-P research team’s preference for dashboard reporting, categorical diagnoses of flourishing provide vital information for decision makers. Secondly, the items selected and response formats used in the SWI frequently differed from those in the original scales.
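The two categorical rules we devised can be sketched as follows. The FS rule follows the thresholds stated above; for the PERMA-P sketch, the 0-10 item scale and the per-item cut-off of 8 are assumptions for illustration, not the SWI’s actual response format or thresholds.

```python
def flourishing_fs(item_scores):
    """FS rule: eight items scored 1-7; flourishing when the total is
    >= 48, i.e. an average of 6 per item on the 7-point scale."""
    assert len(item_scores) == 8
    return sum(item_scores) >= 48

def flourishing_perma(component_items, item_threshold=8):
    """PERMA-P rule sketch: a component counts as endorsed when at least
    2 of its 3 items meet the item threshold; flourishing requires at
    least 4 of the 5 components endorsed. The 0-10 scale and threshold
    of 8 are illustrative assumptions."""
    endorsed = sum(
        sum(score >= item_threshold for score in items) >= 2
        for items in component_items
    )
    return endorsed >= 4

print(flourishing_fs([6, 6, 6, 6, 6, 6, 6, 6]))  # True (total = 48)
print(flourishing_fs([6, 6, 6, 6, 6, 6, 6, 5]))  # False (total = 47)
# Five PERMA components, three hypothetical item scores each;
# four components clear the 2-of-3 rule, so the respondent flourishes:
print(flourishing_perma([[9, 9, 2], [8, 8, 8], [9, 2, 2],
                         [8, 9, 8], [9, 8, 1]]))  # True
```

Note that both rules deliberately tolerate some low scores, mirroring the "most, but not necessarily all" logic described above.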
For instance, while the MHC-SF’s response options measured the frequency with which respondents experienced each component over the past month, several items in the SWI asked respondents “how much of the time during the past week” or “how much of the time would you generally say…”. Where possible we used the same items as the original scale, but some could not be matched to an SWI variable (such as ‘social coherence’), which meant this component had to be excluded from our analysis. Others were matched, but not perfectly so, leaving us to choose the item that came closest to representing the original construct. Some of these were far from ideal. For instance, the MHC-SF item for ‘social growth’ (“during the past month, how often did you feel our society is a good place, or is becoming a better place for all people?”) was operationalized using the reverse-scored SWI item “For most people in New Zealand life is getting worse rather than better”. Similarly, Keyes’ ‘social contribution’ item assesses respondents’ contribution at a societal level, while the SWI item has a greater focus on the individual. The MHC-SF’s ‘social integration’ item concerning belonging to a community could be interpreted to refer to any type of group or community, in contrast to the SWI item we were forced to use, which reflects respondents’ perceptions of people in their local area. In this sense we cannot claim to have replicated Keyes’ validated scale completely. The SWI items selected to match the PERMA-P were also not a perfect replication, but we were at least able to include three different items for each PERMA construct, allowing us to represent the original scale well in this regard. Despite these obvious limitations, we maintain that having such a large number of wellbeing variables in the SWI, a large representative sample, and the FS and ESS models represented in their entirety made comparison of the four models a worthwhile exercise.
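Reverse-scoring of negatively worded items, such as the “life is getting worse” item mentioned above, follows a standard transformation: a response is mapped to its mirror position on the scale. The 1-5 response range below is an assumption for illustration; the SWI’s actual response format may differ.

```python
def reverse_score(x, low=1, high=5):
    """Reverse-score a Likert response so that strong agreement with a
    negatively worded item (e.g. 'life is getting worse rather than
    better') maps to a low score on the positively framed construct.
    The 1-5 range is an assumed example."""
    return low + high - x

print(reverse_score(5))  # 1: strongly agree 'getting worse' -> low wellbeing
print(reverse_score(2))  # 4
print(reverse_score(3))  # 3: the scale midpoint is unchanged
```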
Thirdly, the greatest single challenge involved the selection of thresholds differentiating participants who endorse a component of flourishing from those who do not. Recently published OECD guidelines on measuring wellbeing suggest the use of thresholds as “one way to manage a large number of scale responses” (OECD, 2013, p. 187). Thresholds provide a useful way of conveying aspects of the data’s distribution with a single figure, and are compatible with the SWI’s ordinal data. However, the OECD guidelines also caution that great care must be taken when selecting thresholds: “there is considerable risk that a threshold positioned in the wrong part of the scale could mask important changes in the distribution of the data” (2013, p. 188). The OECD recommends examining the data’s distribution (particularly watching for the strong negative skew common to subjective wellbeing responses), using median and mean statistics to help identify tipping points, and selecting scale values above which empirical evidence suggests positive outcomes are associated. The OECD also acknowledges that a key challenge lies in combining a data-driven approach with the identification of thresholds that are meaningful and have real-world utility. With this in mind, and considering the purpose of this study was to examine measurement equivalence across four different operationalizations, we needed to find a methodology we could apply consistently both within each definition and across all four operationalizations. Concerned that Huppert and So’s approach of selecting thresholds based upon the distribution of data made (potentially erroneous) assumptions about the prevalence of flourishing, and influenced the reported prevalence rates substantially, we instead selected thresholds
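The distributional checks the OECD guidance recommends before fixing a threshold can be sketched on simulated ordinal data: inspect skew, compare mean and median, and see what share of respondents each candidate cut-off classifies as endorsing the item. The 0-10 scale, the simulated distribution, and the candidate thresholds below are all illustrative assumptions.

```python
import numpy as np
from scipy.stats import skew

# Simulated 0-10 ordinal responses with the upper end of the scale
# compressed, producing the negative skew typical of subjective
# wellbeing data; the distribution is illustrative only.
rng = np.random.default_rng(1)
responses = np.clip(np.round(rng.normal(7.5, 1.8, 5000)), 0, 10)

print(f"mean = {responses.mean():.2f}, median = {np.median(responses):.1f}")
print(f"skew = {skew(responses):.2f}")   # negative for this simulated data
for threshold in (6, 7, 8, 9):
    share = np.mean(responses >= threshold)
    print(f"threshold >= {threshold}: {share:.0%} endorse")
```

The loop makes the OECD’s warning concrete: shifting the cut-off by a single scale point can move the implied "flourishing" share substantially, which is why a threshold methodology applied consistently across models matters more than any single cut-off.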
