Data and discrimination: A research note on sexual orientation in the Canadian labour market

Growing interest in the labour market outcomes of sexual minorities presents novel methodological and theoretical challenges. In this note, we outline important challenges in the study of wage inequality between sexual minorities and heterosexuals in Canada. We discuss the current state of available data on sexual orientation and economic outcomes in Canada, and further evaluate how estimates of sexual orientation wage gaps differ across earnings definition and sample composition. Our analysis of the 2006 Census shows considerable heterogeneity in point estimates of wage disadvantage across definitions of earnings and sample selections; however, all estimates show that gay men suffer labour market penalties and lesbians experience wage premiums.

Interest in the labour market outcomes of gay men and lesbian women has grown considerably over the past decade. While previous research was limited to small convenience samples, new population data at last includes information on sexual minorities in Canada. With this data, researchers have begun enumerating previously undocumented aspects of labour market stratification by sexual orientation, including the presence of wage disparities between gay men, lesbians, and heterosexual men and women. With few exceptions, the growing international literature has generally found that gay men earn less than heterosexual men and lesbians earn more than heterosexual women, but still less than all men (see Klawitter 2015 for a review and meta-analysis of this research). This expanding field presents novel methodological and theoretical challenges that pertain to studying the populations at hand.
In this research note, we outline important challenges in the study of wage inequality between sexual minorities and heterosexuals in Canada. We first outline the current state of available data on sexual orientation and economic outcomes in Canada. We then turn to evaluating how estimates of sexual orientation wage gaps differ across definitions of earnings that are consistent with differences in earnings variables provided in available data sources. We discuss what divergent results across definitions of earnings and sample selection criteria indicate for the performance of sexual minorities in the Canadian labour market and how they relate to the conclusions reached in recent research. Our analysis shows considerable heterogeneity in point estimates of wage disadvantage across definitions of earnings and sample selections; however, all estimates are consistent with the growing international literature finding that gay men suffer labour market penalties and lesbians experience wage premiums. We argue that Census data provides a robust estimate of wage inequality, as it offers a sufficiently large sample of sexual minorities. Unfortunately, the Census remains limited, due to its failure to identify unmarried LGBTQ persons.

Identifying sexual minorities
Three sources of population data have been used to study earnings differences between sexual minorities and heterosexual Canadians: the General Social Survey (GSS), Canadian Community Health Survey (CCHS), and the Census. While each provides information on the sexual orientation of respondents, measures of sexual orientation vary across the surveys, and in ways that affect which sexual minorities are identified. The GSS, CCHS, and the Census all allow researchers to identify sexual orientation through partnership status. Namely, gay men and lesbian women are identified by their common-law or marital partnership with a person of the same self-identified sex. This method excludes people without a married or common-law partner and ignores bisexuality. The CCHS and GSS further ask a direct question on sexual identity, asking the respondent, "Do you consider yourself to be… (1) heterosexual (sexual relations with people of the opposite sex), (2) homosexual, that is, lesbian or gay (sexual relations with people of your own sex); [or] (3) bisexual (sexual relations with people of both sexes)." While the GSS and CCHS characterize the question as one of identity and not behaviour, the wording of the clarifications defines identity through sexual practice/behaviour (Carpenter 2008). Previous research shows that individuals are more likely to report same-sex sexual behaviour than a same-sex identity (Badgett 2009). Thus, the question, in addition to including single gay men and women, may also capture individuals who do not necessarily identify as gay but select that response because they have engaged, or continue to engage, in occasional same-sex sexual relations. Conversely, it may exclude people who have only ever engaged in same-sex sexual relations but nevertheless identify as heterosexual.
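The partnership-based identification described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure used by Statistics Canada or by any of the cited studies; the function name, the sex codes "M" and "F", and the returned labels are hypothetical.

```python
from typing import Optional


def classify_by_partnership(sex: str, partner_sex: Optional[str]) -> str:
    """Classify a respondent using only the sex of a married/common-law partner.

    sex and partner_sex use the hypothetical codes "M" and "F"; partner_sex is
    None when the respondent reports no married or common-law partner.
    """
    if partner_sex is None:
        # Unpartnered respondents cannot be classified with this method.
        return "unidentified"
    if sex == partner_sex:
        return "gay man" if sex == "M" else "lesbian woman"
    # Different-sex partnership: coded heterosexual, so bisexuality is invisible.
    return "heterosexual man" if sex == "M" else "heterosexual woman"


print(classify_by_partnership("F", "F"))   # lesbian woman
print(classify_by_partnership("M", None))  # unidentified
```

The sketch makes the two limitations visible: single respondents fall into the unidentified group, and anyone in a different-sex partnership is coded heterosexual regardless of identity.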
Why do these definitional differences matter? In the case of the Census, clearly a major issue is that information on unmarried sexual minorities is lost, and inferences about the sexual minority population as a whole are then trickier, requiring some information about how differences in selectivity into partnership vary by sexual orientation and how this selectivity may relate to earnings (Carpenter 2008). On the other hand, using partnership status may offer some advantages (Klawitter 2015). Those who are in long-term same-sex relationships may be less willing and/or less able to conceal their sexual orientation. Some individuals in same-sex partnerships may also have incentives to disclose their sexual identity, in order to receive employment-provided fringe benefits, like dental insurance, for their partner. If discrimination is a key mechanism producing earnings disadvantage, bosses and co-workers must somehow become aware of an individual's sexual orientation. Single people have less of an incentive to disclose their identity in the workplace, and individuals who engage occasionally in same-sex sexual relations may not convey a non-heterosexual identity at all. Thus, there is reason to believe that across surveys, the population identified varies, and in ways that shape the mechanisms that impact earnings disadvantage.

Earnings three ways
Both the couples approach and the GSS/CCHS question identify important aspects of sexual orientation that may be relevant in influencing labour market outcomes. However, neither the CCHS nor the GSS measures earnings; both instead provide (at times crude) indicators of income. The GSS contains a categorical variable on total individual income from all sources, with values from 1 to 12 representing incomes from "no income or loss" up to "$100,000 or more." The top-coding of income will understate the wage disadvantage if high-earning heterosexuals earn more than high-earning sexual minorities; this pattern is documented in Waite and Denier (2015). The CCHS, on the other hand, provides a continuous income variable for some respondents; those who do not provide an exact value are then probed with a series of categories that their income may fall into. The CCHS income variable is tangential to the main aims of the health survey, and survey documentation warns that it should be used with caution and as a control variable. Thus, studies using the CCHS income variable are prone to measurement error in the dependent variable. It is not clear if these errors vary across sexual orientation, but if this is the case, estimates of wage gaps may be biased. The Census, however, offers fairly high-quality earnings and income data. Starting in 2006, Canadians had the option of linking their Census responses to their tax records, with over 80 per cent of respondents allowing the linkage (Statistics Canada 2008). For those who did not give permission, the questionnaire asked about detailed income components, divided into various sources to facilitate accurate recall (Statistics Canada 2008). Thus, the Census data offers superior data on both earnings and income.
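To see why top-coding can mask a gap that arises among high earners, consider the toy calculation below. It is only a sketch: the incomes are invented for illustration and are not estimates from any survey, and the cap of $100,000 is an assumption chosen to mirror a "$100,000 or more" top category.

```python
from statistics import mean

TOP_CODE = 100_000  # assumed cap, mirroring a "$100,000 or more" top category


def top_code(income: float, cap: float = TOP_CODE) -> float:
    """Return the income as a top-coded variable would record it."""
    return min(income, cap)


def mean_gap(group_a, group_b) -> float:
    """Difference in mean income, group_a minus group_b."""
    return mean(group_a) - mean(group_b)


# Invented incomes: the two groups differ only among the highest earners.
heterosexual = [40_000, 60_000, 180_000]
sexual_minority = [40_000, 60_000, 120_000]

true_gap = mean_gap(sexual_minority, heterosexual)
coded_gap = mean_gap([top_code(x) for x in sexual_minority],
                     [top_code(x) for x in heterosexual])

print(round(true_gap), round(coded_gap))  # -20000 0
```

Because both top earners are recorded at the cap, the entire difference between the groups disappears from the top-coded variable.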
Using individual income to study earnings also poses an issue, as it often includes non-wage income sources; paramount among them are government transfers, which ultimately depend on family relationships. There is wide variability in the receipt of government transfers across the earnings distribution (Heisz 2007), which may systematically impact the income of gay and heterosexual individuals. Given that lower-wage workers tend to have lower incomes at similar hours worked, a larger portion of low-wage workers will receive income that is not wage and salary income, inflating the "earnings" of low earners relative to high earners in the sample. Depending on the program, government transfers may be means-tested against total household income. Thus, the lower-earning partner in a high-earning household may have lower individual income than the same low earner in a low-earning household. For example, lesbians may be eligible for more transfers than a woman partnered with a high-earning man, because their partner is another woman and women are in general paid less in the Canadian labour market (married heterosexual women earn less, but their husbands earn more). Lesbian incomes may be inflated precisely because their position in a lower-earning all-female household allows them to qualify for more non-wage income. At the same time, heterosexual women are more likely to receive transfers that are targeted at families with children, since coupled heterosexual women are more likely to have children than coupled lesbians (Waite and Denier 2015). 5 Thus, reliance on an income variable may bias estimates of earnings differences by either understating or overstating actual labour market earnings.
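The household means-testing argument above can be made concrete with a toy calculation. The benefit schedule below is entirely hypothetical (maximum benefit, threshold, and phase-out rate are invented and do not describe any Canadian program); it only shows how the same earner can report different individual income depending on a partner's earnings.

```python
def transfer(household_income: float,
             max_benefit: float = 3_000.0,
             threshold: float = 40_000.0,
             phase_out_rate: float = 0.05) -> float:
    """Hypothetical benefit that shrinks once household income passes a threshold."""
    reduction = phase_out_rate * max(0.0, household_income - threshold)
    return max(0.0, max_benefit - reduction)


def individual_income(own_earnings: float, partner_earnings: float) -> float:
    """Own earnings plus a transfer that is tested on total household income."""
    return own_earnings + transfer(own_earnings + partner_earnings)


# Same earner, two households: the partner earns $35,000 or $90,000.
low_household = individual_income(30_000, 35_000)
high_household = individual_income(30_000, 90_000)

print(low_household > high_household)  # True
```

Under these assumed parameters, identical earnings of $30,000 correspond to a higher individual income in the lower-earning household, which is the mechanism by which an income variable can overstate one group's labour market position.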

Current estimates
Using these data sources, five studies have provided evidence of an earnings gap for sexual minorities in Canada. 6 Table 1 presents findings from the most fully specified model in each of these studies, and reports estimates for coupled sexual minorities relative to coupled heterosexuals. 7 At first glance, the estimates vary widely across studies (part of the impetus for formulating this note). Two papers draw samples of couples to estimate sexual minority earnings disadvantage. Mueller (2014) examines the 2006-10 GSS, limiting his sample to those likely earning most of their total income, and finds no wage disadvantage for gay men but a large wage advantage of about 16 per cent for lesbian women. Waite and Denier (2015), on the other hand, find using Census data that gay men earn 5.1 per cent less than heterosexual men and lesbians earn 8 per cent more than heterosexual women. The other three studies, using the CCHS, include both singles and individuals in couples. Carpenter (2008) acknowledges that he is examining income and includes a broad sample, taking care to avoid relating income differences directly to labour market processes. He finds that on average, gay men have total incomes that are about 12 per cent lower than heterosexual men, while lesbians have incomes about 15 per cent higher. Carpenter (2008) further shows that these income differences are larger when restricting the sample to those in couples, with the income penalty around 20 per cent for partnered gay men relative to partnered heterosexual men, and the income advantage at about 43 per cent for lesbians relative to coupled heterosexual women. LaFrance, Warman, and Wooley (2009) are primarily interested in how wage differentials vary across partnership status.
Limiting their CCHS sample to individuals who work 30+ hours a week, they find that gay men in a married/common-law relationship make about 20 per cent less than married (not cohabiting) straight men, while lesbian women in marital/common-law relationships make about 10 per cent more than married straight women. Single gay men make about 24 per cent less than married heterosexual men, while single heterosexual men make about 14 per cent less. Single lesbians and single heterosexual women both have incomes that are about 10 per cent higher than married heterosexual women. These differences hold even when including only people whose main source of income is wages and salaries or who only receive income from wages and salaries. Cerf (2016) uses the 2000-09 CCHS and finds that partnered gay men have incomes about 13 per cent less than heterosexual men, while partnered lesbian women have incomes about 8 per cent higher than partnered straight women. Notably, in contrast to LaFrance et al. (2009) and consistent with Carpenter (2008), he finds no wage difference for single gay men and women. The variability in estimates for gay men suggests substantially different conclusions, ranging from no disadvantage to considerable earnings gaps, even after accounting for work effort and occupation and industry choice. For lesbians, the magnitude of the wage advantage over heterosexual women also remains unclear. In the following section, we attempt to uncover some of the sources of these differences.

Reconciling results
The available data present challenges to identifying sexual minority earnings gaps, as evidenced by the breadth of previous findings. Perhaps the single greatest challenge is that most surveys measuring sexual orientation do not measure earnings (or, conversely, most high-quality labour market studies do not measure sexual orientation). A second practical challenge is that researchers often specify different analytic samples, making it difficult to pinpoint whether it's the data or the sample that is driving the result. In order to better understand how sample composition and variable definitions impact estimates of sexual minority wage gaps, we use couples data from the 2006 Census to replicate the sample selection criteria and earnings/income variables used in some previous research.
We are interested in two types of comparisons: across variable definition and across sample specification. For variable definitions, we are primarily focused on how different income and earnings concepts change our understanding of pay (dis)advantage. We generate annual and hourly earnings variables, which directly reflect labour market processes. We further examine hourly and annual income measures, like those that would be found in the CCHS. 8 Finally, we generate a series of "discrete" income and earnings measures that reflect the type of imputation strategy required in surveys that have categorical measures of income, like the GSS. We follow Mueller (2014) and take the midpoint of each of the 12 income categories available in the GSS (from $0 to $100,000+) and assign those values to the corresponding continuous income/earnings levels of our respondents in the Census. We calculate average earnings/income differences between coupled gay men and coupled heterosexual men, and between coupled lesbian women and coupled heterosexual women, using OLS regressions with robust standard errors. We control for age, education, potential work experience, common-law status, presence of children in the household, rural residence, and province of residence. For annual earnings and income models, we further control for weeks worked and part-time status. We also present models controlling for occupation and industry of employment in Appendix A. Occupation is coded using the National Occupational Classification for Statistics major groups, and industry is coded using the North American Industry Classification System at the sector level. Appendix B further presents annual earnings and income differences unadjusted for labour supply.
5. Practically, the transfers are commonly assigned to the adult female in the household.
6. While Carpenter (2008) sheds light on the economic situation of gay and lesbian Canadians, he is focused on identifying income differences, and thus does not comment on labour market dynamics, particularly discrimination.
7. Waite (2015) explored whether sexual minority wage gaps attenuated between 2001 and 2011 using Canadian census and survey data. We do not include this study in our table since the sample, methodologies and point estimates are comparable to Waite and Denier (2015).
8. This is not to say that the CCHS variable will be as high-quality as that in the Census, which is drawn largely from tax data.
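The "discrete" measures described above can be sketched as a mapping from a continuous dollar amount to the midpoint of the bracket it falls in. This is a minimal sketch, not the authors' code: the bracket boundaries in EDGES and the value assigned to the open-ended top bracket (TOP_VALUE) are assumptions made for illustration and should be checked against the survey documentation.

```python
from bisect import bisect_right

# Assumed bracket boundaries: "no income or loss", ten closed brackets, and an
# open-ended top bracket give 12 categories in total.
EDGES = [0, 5_000, 10_000, 15_000, 20_000, 30_000,
         40_000, 50_000, 60_000, 80_000, 100_000]
TOP_VALUE = 100_000  # assumed value for the open-ended "$100,000 or more" bracket


def discretize(income: float) -> float:
    """Replace a continuous amount with the midpoint of its assumed bracket."""
    if income <= 0:
        return 0.0  # "no income or loss"
    if income >= EDGES[-1]:
        return float(TOP_VALUE)
    i = bisect_right(EDGES, income)  # EDGES[i - 1] <= income < EDGES[i]
    return (EDGES[i - 1] + EDGES[i]) / 2


print(discretize(72_000))   # 70000.0
print(discretize(250_000))  # 100000.0
```

Any two respondents in the same bracket receive the same value, and everyone above the top boundary is assigned a single amount, so differences within brackets and above the cap are not reflected in the discrete measure.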
We compare these earnings/income differences across two samples. With our data, we are only able to address differences across a coupled sample, and therefore focus on the samples of two studies that use data on couples. The first replicates Waite and Denier (2015), examining a sample of Canadian-born, non-visible minority, non-aboriginal employees between the ages of 25 and 64, with at least $1,000 in annual earnings. The second approximates Mueller (2014) by focusing on a sample aged 20-60, not in school, with discrete hourly incomes between $5 and $500. The major differences across the samples will reflect the combined effect of changes in age and the inclusion of immigrant and aboriginal populations. We are interested in this combined impact, particularly as Mueller (2014) reports no significant wage disadvantage for gay men using a coupled sample, a finding that contradicts previous research.
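The two sets of selection criteria can be written as simple filters, which makes it easy to see who is in one sample but not the other. The sketch below is illustrative only: the Person record and its field names are hypothetical, and treating the age and income bounds as inclusive is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Person:
    """Hypothetical record; field names are illustrative, not Census variables."""
    age: int
    canadian_born: bool
    visible_minority: bool
    aboriginal: bool
    employee: bool
    in_school: bool
    annual_earnings: float
    discrete_hourly_income: float


def in_waite_denier_sample(p: Person) -> bool:
    """Canadian-born, non-visible minority, non-aboriginal employees aged 25-64
    with at least $1,000 in annual earnings."""
    return (p.canadian_born and not p.visible_minority and not p.aboriginal
            and p.employee and 25 <= p.age <= 64 and p.annual_earnings >= 1_000)


def in_mueller_type_sample(p: Person) -> bool:
    """Aged 20-60, not in school, discrete hourly income between $5 and $500."""
    return (20 <= p.age <= 60 and not p.in_school
            and 5 <= p.discrete_hourly_income <= 500)


# A 22-year-old immigrant employee: excluded from the first sample, kept in the second.
young_immigrant = Person(age=22, canadian_born=False, visible_minority=False,
                         aboriginal=False, employee=True, in_school=False,
                         annual_earnings=28_000, discrete_hourly_income=14.0)

print(in_waite_denier_sample(young_immigrant), in_mueller_type_sample(young_immigrant))
```

The example respondent illustrates the combined difference discussed above: the second sample is younger and includes immigrant and aboriginal respondents.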
Taken together, our comparisons reconcile divergent findings in two previous studies that identify sexual orientation through partnership with a member of the same sex (Mueller 2014; Waite and Denier 2015). They further illustrate how studies using an income variable, like those that draw on the CCHS (LaFrance, Warman, and Wooley 2009; Cerf 2016), relate to estimates using an earnings variable. We cannot directly reconcile the results of all the previous studies reported in Table 1, as we lack a data source that includes a measure of sexual orientation (and thus identifies both singles and couples), as well as both earnings and income data. Making a direct comparison between estimates from our coupled sample and estimates derived from both singles and couples would be imprudent; the populations potentially differ in ways we are unable to quantify. This would not only directly impact the average wage differences between sexual minorities and heterosexuals, but could also potentially indirectly impact estimates by modifying the relationship between important control variables and sexual orientation wage gaps. 9 Instead, we focus on how the use of an income variable may generally affect the conclusions drawn in those studies (LaFrance, Warman, and Wooley 2009; Cerf 2016).
9. This could be important if, for instance, the impact of variables like age or education on the wages of sexual minorities and heterosexuals varies based on their relationship status. For example, it may be that older heterosexual men who remain single possess characteristics that make them both less attractive partners and less attractive workers, weakening the positive relationship between age/potential experience and earnings. Older gay men who remain unmarried may have done so as a result of discriminatory barriers uncorrelated with their productive capabilities. Such compositional changes to the sample would yield a lower pay gap for gay men. Comparing estimates drawn from a sample of singles and couples to one drawn only from couples would not be able to identify these types of differences, specifically whether they are due to the changing composition of heterosexuals or sexual minorities present in the sample.
Tables 2 and 3 present estimates by variable definition and sample specification for gay men compared to heterosexual men, and lesbian women compared to heterosexual women. Our results suggest that the definition of earnings used introduces nuanced differences in the estimates. Comparing first earnings and income concepts in a single sample across the row (using the Waite and Denier 2015 sample), for gay men both the annual and the hourly income disadvantage is larger than the annual and hourly earnings disadvantage. This means that using an income variable, like that available in the CCHS, would likely overstate gay men's earnings disadvantage. The second important definitional distinction is that between continuous measures and discrete measures based on an imputation of categorical income measures, like those available in the GSS. In every instance, the discrete earnings/income measure understates the wage disadvantage of gay men.
This is likely a result of top coding in the dependent variable, and suggests that a meaningful portion of the gay wage penalty and lesbian wage advantage emerges at the top of the earnings distribution. This is consistent with the larger wage disadvantage that Waite and Denier (2015) observed for gay men in the tenth percentile of the wage distribution. For lesbian women, on the other hand, differences in the discrete and continuous earnings/income measures are not large. Nevertheless, as for gay men, there are differences in the magnitude of advantage across earnings and income measures; the use of income rather than earnings actually understates the lesbian wage premium observed in the labour market. We then turn to differences across sample specification; here we compare estimates across the samples for similar earnings concepts (i.e., a comparison down the column). We focus on the two dependent variables used by Waite and Denier (2015) and Mueller (2014): continuous annual earnings and discrete hourly income, respectively. For gay men, what is striking is the sensitivity of the results to the inclusion/exclusion of the aboriginal and immigrant populations, depending on the measure of earnings. The Mueller (2014) sample produces slightly lower estimates of continuous annual earnings, but a full 3 percentage point difference in the pay gap based on the discrete hourly income measure. Considering a broader range of measures, the lower wage disadvantage reported using the Mueller (2014) sample seems to arise in particular with measures of income and with measures based on discrete transformations of the variable. Notably, the models with additional controls account for both aboriginal group membership and immigration status, suggesting that these groups may be underrepresented among coupled gay men, or that they may modify the impact of other control variables in mediating the relationship between sexual orientation and earnings.
These changes in estimates across the samples, conditional on the dependent variable, may help explain the null finding reported in Mueller (2014). The results for lesbians, in Table 3, similarly vary by sample specification. However, for all measures, the lesbian wage advantage is lower using the Mueller (2014) sample than the Waite and Denier (2015) sample, the opposite of the pattern in the published studies, in which Waite and Denier (2015) reported a lower lesbian wage advantage. This suggests that perhaps the couples in the GSS and Census are qualitatively different.
Appendix A presents the results controlling for occupation and industry of employment, two important mechanisms accounting for observed (dis)advantage for gay men and lesbians, as documented in previous studies. The general shape of changes across definition and sample remains true when controlling for occupation and industry. For gay men, income measures tend to generate larger earnings disadvantage than earnings measures. And again, when using the Mueller (2014) type sample and an income measure, the wage disadvantage of gay men is lower: it falls by more than half when considering discrete hourly income, and is reduced to non-significance and approaches zero when examining discrete hourly earnings. For lesbian women, when controlling for occupation and industry, differences between discrete and continuous measures are not large. However, the controls do seem to reduce differences across the samples, particularly for the annual measures of income and earnings, suggesting that some unique aspects of the two samples may be accounted for with observable differences.

Conclusion
Recent research has generated interesting and important questions about the role of sexual orientation in labour market outcomes. This research has also generated a wide range of estimates of the wage penalties for gay men relative to heterosexual men and the wage premiums for lesbian women relative to heterosexual women in Canada. In this note, we provided evidence on the likely sources of some of these disparities, and our findings point to a few key reasons.
First, the use of total income rather than earnings can distort pictures of earnings inequality. For unadjusted estimates, a continuous income variable (like that in the CCHS) produces larger, although consistent, estimates of earnings disparities. However, the relationship between key explanatory variables, particularly occupation and industry, varies considerably across the income and earnings specification for gay men. Total income that is top coded (like that in the GSS) introduces larger distortions, particularly for gay men, because an important part of the gay pay penalty emerges at the top of the earnings distribution.
Second, using a younger sample that also includes the aboriginal and immigrant populations is associated with a lower estimate of the gay pay gap. This may be for a number of reasons. Immigrants, in particular, are less likely in the Census sample to be in same-sex couples than native-born Canadians (this may be for a variety of reasons, including previous immigration rules that may not have allowed same-sex couples to migrate together or cultural/religious intolerance towards homosexuality within certain immigrant communities). Immigrants are also more likely than the native-born to earn less at similar levels of education and potential work experience; thus including the immigrant population may lower the average wages of the heterosexual population relatively more than the gay population. Similarly, a younger sample may lead to lower estimates of disadvantage for gay men, particularly if older gay men gained much of their labour market experience during a time when there was less social acceptance of the LGBTQ community. Waite (2015), however, documents a larger earnings penalty for younger gay men. He offers that older gay men may have been more likely to conceal their sexual orientations in the past, perhaps subjecting them to less overt discrimination.
Taken together, this analysis helps to reconcile some differences observed in previous estimates. Specifically, Mueller's (2014) finding that there is no wage gap for gay men is likely influenced by low sample sizes in the GSS and a combination of sample and variable definition, difficulties unique to studying the population at hand. For example, estimates for gay men are more sensitive to sample specification than are those for lesbian women. Consideration of such issues will benefit future research.
Moreover, we outlined important deficiencies in identifying sexual minorities in population-based data in Canada. The CCHS and GSS do well in providing a question on sexual orientation. Yet, the question conflates sexual identity and sexual behaviour (Carpenter 2008). This distinction is important in identifying the mechanisms that may lead to disadvantage, particularly in teasing out whether it is choice or constraint that is responsible for leading to observed differences. Additionally, for many substantive outcomes of interest, these surveys do not provide a large enough sample of sexual minorities. The Census, which does provide large sample sizes of sexual minorities, does not ask a question about sexual identity. This omission continues to limit research on the economic lives of gay and lesbian Canadians.

Mueller, R.E. 2014. Wage differentials of males and females in same-sex and different-sex couples in Canada, 2006-2010.

Notes: * P ≤ .05; ** P ≤ .01; *** P ≤ .001. Standard errors given in parentheses. Model 1 controls age, education, work experience, common-law status, presence of children in the household, rural residence, and province of residence. Models for annual earnings and income further control weeks worked and part-time status. Models for Mueller (2014) with additional controls also include controls for aboriginal status and immigration status.