Revisiting gender bias research in bibliometrics:
Standardizing methodological variability using
Scholarly Data Analysis (SoDA) Cards
Abstract
Gender biases in scholarly metrics remain a persistent concern, despite numerous bibliometric studies exploring their presence and absence across productivity, impact, acknowledgment, and self-citations. However, methodological inconsistencies, particularly in author name disambiguation and gender identification, limit the reliability and comparability of these studies, potentially perpetuating misperceptions and hindering effective interventions. A review of 70 relevant publications over the past 12 years reveals a wide range of approaches, from name-based and manual searches to more algorithmic and gold-standard methods, with no clear consensus on best practices. This variability, compounded by challenges such as accurately disambiguating Asian names and managing unassigned gender labels, underscores the urgent need for standardized and robust methodologies. To address this critical gap, we propose the development and implementation of “Scholarly Data Analysis (SoDA) Cards.” These cards will provide a structured framework for documenting and reporting key methodological choices in scholarly data analysis, including author name disambiguation and gender identification procedures. By promoting transparency and reproducibility, SoDA Cards will facilitate more accurate comparisons and aggregations of research findings, ultimately supporting evidence-informed policymaking and enabling the longitudinal tracking of analytical approaches in the study of gender and other social biases in academia.
1 Introduction
Gender inequality and bias in academia persist through systemic discrimination, implicit prejudices, and structural barriers, continuing to disadvantage women across various stages of their careers, hindering both their development and productivity [5, 13]. These inequities and biases manifest in multiple forms, such as biased hiring evaluations and unequal access to funding [10, 79]. Such challenges create a compounded disadvantage, adversely affecting female researchers’ recruitment, retention, promotion, and overall career progression [33, 80, 97]. Over time, these cumulative effects lead to the persistent underrepresentation of women at various stages in academic roles. For instance, according to the 2023 CRA Taulbee Survey [17], women represented approximately 22% of bachelor’s degree recipients, around 20% of master’s degree recipients, and roughly 23% of new Ph.D. graduates in Computer Science and closely related fields, consistently falling short of parity with their male counterparts across all degree levels. These disparities become even more pronounced in senior academic roles [3]: women represent just 15% of tenure-track engineering faculty and 14% of tenure-track computer science faculty.
Such disparities are reported to be driven in part by ingrained cultural biases and “masculine defaults”, underscoring the need to address systemic factors rather than viewing the problem solely at the individual level [73, 15, 44]. Understanding and addressing these layers of gender inequalities and biases is crucial for fostering an inclusive and transformative intellectual landscape. To this end, numerous studies have examined the extent of gender bias in terms of scholarly metrics (e.g., number of citations and h-index) in academia [56, 75, 64, 61, 58, 50, 31, 25, 2]. For instance, studies have continued to report that papers authored by male first authors receive significantly higher citation counts than those authored by female first authors [86, 14]. Other studies have demonstrated persistent gender disparities in both scholarly recognition and authorship patterns. An analysis of cardiovascular research articles by Chatterjee and Werner [14] found that women were underrepresented as both primary and senior authors: out of 5,322 primary authors, 35.6% were women, and out of 4,940 senior authors, 25.8% were women. Similarly, Mohammad [60] observed that women accounted for 29.7% of all authors, 29.2% of first authors, and 25.5% of last authors. Papers with female first authors received, on average, 37.6 citations, considerably fewer than the 50.4 citations garnered by papers with male first authors. Such patterns of under-representation and lower citation counts are consistent with additional research across various scientific disciplines [65], underscoring the continued influence of gender bias on scholarly recognition. However, there are also studies that report the reduction, absence, or reversal of gender bias [58, 82, 63, 35]. For example, Mishra et al. [58] reported that the gender difference in self-citation disappears when controlling for authors’ career length.
These seemingly contradictory findings highlight the complex nature of gender bias in academia, with factors like career length often playing a crucial role. These insights can be leveraged to design and implement effective interventions that address the nuanced challenges faced by women in academia.
These systemic issues have the potential to affect policy decisions and perpetuate gender inequality and bias in academia while underscoring the need to identify, articulate, and address these biases. Accurately understanding the extent of gender bias is crucial, as these findings guide evidence-based policy decisions to mitigate this impact. It is, therefore, essential to ensure that the underlying research methods of gender bias studies are rigorously designed, executed, validated, and standardized to produce reliable and actionable results. However, existing studies often exhibit methodological inconsistencies, particularly in how authors’ names are disambiguated and genders are identified. Such variations in approach can significantly influence accuracy and reliability [38, 58, 98]. These discrepancies within and between methodology pipelines may lead to unreliable conclusions about gender effects and differences in scholarly metrics and emphasize the importance of developing and adopting a more standardized approach to designing and conducting reliable studies.
To address these challenges, we conducted a literature review of 70 papers published over the past 12 years, focusing on the methods used for author name disambiguation and gender identification. Through this review, we gained insights into recent practices and identified key inconsistencies. Building on these insights, we propose “Scholarly Data Analysis (SoDA) Cards,” a standardized framework for documenting and reporting methodological details in gender bias studies. By facilitating more accurate comparisons and aggregations across research, SoDA Cards can enhance evidence-based policy-making and help track methodological and analytical improvements over time. While developing SoDA Cards, we also identified a principled workflow (Figure 1), which we suggest for conducting future scholarly data analysis research. In the following sections, we explain the motivation and objectives of our study, describe the methodology of our systematic literature review and annotation process, present our findings, and introduce the SoDA Cards.
2 A landscape of scholarly data analysis research
Scholarly data analysis is a broad research topic that focuses on the analysis of scholarly research publications. This ranges from analysis of citation networks of research papers and authors, to text analysis of research papers. Our work focuses on the practice of working with scholarly data with a focus on identifying author demographic-based patterns in scholarly metrics like citations, publication count, research careers, etc. To conduct such analysis, authors often utilize a few common methods like author name disambiguation and author demographic assignment, e.g., identification of gender, age, ethnicity, and race. Furthermore, scholarly data analysis often focuses on causal relationships between author demographics and scholarly metrics. We describe these methods and analysis in the following sections and highlight how the common methods in existing scholarly work motivated us to analyze the distribution of these methods in popular research and propose the construction of the SoDA Cards to standardize the practice of reporting scholarly data analysis practices in future works.
2.1 Author name disambiguation
Research papers list individuals involved with the research work in the paper byline. These individuals are often identified via their name, institution, and other identifiers. For the purpose of analysis, these listed identifiers on a research paper are called authorships, while the individuals associated with these identifiers are considered authors. This difference is illustrated in Figure 2. Wu [98] also provides a good overview of gender bias in citation research based on authorships and authors. The process of going from a list of authorships to authors is commonly referred to as author name disambiguation or AND.
Author name disambiguation is an essential task in bibliometric research. It involves accurately identifying and distinguishing unique authors, accounting for variations in names that may be identical or similar, and finding and indexing variations of unique authors, which is a frequent challenge in large bibliometric databases [39]. Typically, author names are disambiguated using similarity measures to assign authorship of bibliographic records to individual authors based on data from a compiled dictionary or bibliographic databases. AND is particularly important in gender bias studies, as it ensures the precise attribution of publications to the correct author. Ambiguities in author names can lead to misinformation about scholarly output, research impact, and patterns of collaboration [38, 58]. The goal of AND is to distinctly identify and index or represent each author, eliminating biases associated with names that could affect subsequent gender identification analyses.
There are four primary approaches to author name disambiguation, each with its advantages and limitations. The first approach relies solely on heuristic or name-based methods, using partial or exact matches of names [8, 21]. While this method is straightforward and easy to implement, it can be prone to errors, e.g., in cases where multiple authors share similar names. The second approach uses algorithms that compare name strings and additional features, such as affiliation, to disambiguate authors’ names [48, 69, 57]. The third approach involves manual searches, such as consulting online profiles, databases, or institutional directories [51, 52]. Although this strategy can yield highly accurate results, it is labor-intensive, time-consuming, and does not scale to large datasets. Fourth, some studies use self-reported data or draw upon identities established through a gold-standard process [23, 84]. The gold-standard process typically involves curating a validated reference set of author identities by carefully verifying author information through trusted sources or expert judgment. This method provides a reliable benchmark for evaluating and refining other disambiguation techniques, but it demands extensive effort and resources to establish and maintain the reference set.
Author name disambiguation is challenging for many reasons, such as variations in the spelling of names, the indexing of authors via initials instead of full names, inconsistent reporting of affiliation changes, and cultural differences in the ordering of names. Additional complexities arise, particularly with names of Asian origin as these names may not adhere to Western naming conventions, which increases the risk of misidentification of authors [38, 20]. Such misidentification can introduce significant biases into study results by incorrectly attributing publications to the wrong authors or failing to recognize distinct individuals as unique authors. For example, in the case of Asian-origin names, common surnames such as “Kim” or “Wang” combined with limited given-name differentiation can result in multiple authors being grouped under a single identity or a single author being split into multiple records [37]. Therefore, the effectiveness of the chosen author name disambiguation method in accurately distinguishing between authors is critical. Ensuring the precise identification of authors, particularly those with Asian-origin names, is essential to maintain the validity and reliability of gender bias studies in bibliometric research.
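The merging problem described above can be made concrete with a minimal sketch of a name-based heuristic that keys each authorship on surname plus first initial. The function names and the bylines are hypothetical and are not taken from any reviewed study; the sketch also assumes names are written in "Given Surname" order, which is itself one of the cultural assumptions that breaks down in practice.

```python
from collections import defaultdict


def initials_key(full_name: str) -> str:
    """Reduce a 'Given [Middle] Surname' string to 'surname, first-initial'.

    Assumption: the surname is the last whitespace-separated token.
    """
    parts = full_name.strip().split()
    surname, given = parts[-1], parts[0]
    return f"{surname.lower()}, {given[0].lower()}"


def group_by_initials(authorships):
    """Group authorship name strings that collapse to the same key."""
    groups = defaultdict(set)
    for name in authorships:
        groups[initials_key(name)].add(name)
    return dict(groups)


# Hypothetical bylines, for illustration only.
authorships = ["Jin Kim", "Jae Kim", "Jun Kim", "Maria Lopez", "Maria Lopez"]
groups = group_by_initials(authorships)

for key, names in sorted(groups.items()):
    print(key, "->", sorted(names))
```

Here three differently named authorships collapse into a single "kim, j" identity, so the heuristic would report two authors where the name strings alone already indicate at least four distinct names.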
2.2 Gender identification
Once an author has been correctly identified, associating them with further demographic information often uses only the author’s name. The task of identifying the gender of authors in bibliometric studies is another critical aspect in the investigation of gender bias. This data can then be used to assess patterns of gender bias within a research field. Traditional gender identification methods have relied on gender-name dictionaries or lists, which associate first names with a specific gender based on societal norms or legal documents [46]. These databases are often built from census data or similar large-scale demographic datasets. While effective for many Western names, these methods can struggle with unisex or non-Western names, leading to inaccuracies in gender assignment [102].
Recently, more sophisticated methods of gender identification have been developed to address these limitations. Some studies utilize machine learning algorithms trained on large datasets to predict the gender of an author based on their first name. These models can incorporate additional features, such as a name’s cultural, institutional, or linguistic context, improving their accuracy, especially for non-Western names. Despite these advancements, gender identification in bibliometric studies still faces numerous challenges. For instance, handling unknown or ambiguous gender remains a significant issue. Many studies default to a binary gender model, which can inadvertently exclude or misclassify authors who do not identify as male or female. Furthermore, cultural and regional differences in names can complicate gender identification. Names from many Asian cultures, for instance, may not follow Western naming conventions, and it can be difficult to determine gender based solely on these names [90, 88].
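To illustrate how the handling of unknown or ambiguous names enters such a pipeline, the following is a minimal sketch of a dictionary-style assignment with an explicit "unknown" label. The lookup table, its counts, the thresholds, and the function names are all hypothetical and purely synthetic; the binary label set is used only because it mirrors the common practice discussed above, not because it is recommended.

```python
# Hypothetical name-to-gender counts as (female, male); synthetic values only.
NAME_COUNTS = {
    "maria": (980, 20),
    "john": (5, 995),
    "jordan": (400, 600),
}


def assign_gender(first_name, min_share=0.9, min_count=100):
    """Return 'female', 'male', or 'unknown' for a first name.

    A label is assigned only if the name is in the table, has enough
    observations, and one gender reaches the share threshold.
    """
    counts = NAME_COUNTS.get(first_name.strip().lower())
    if counts is None:
        return "unknown"
    female, male = counts
    total = female + male
    if total < min_count:
        return "unknown"
    if female / total >= min_share:
        return "female"
    if male / total >= min_share:
        return "male"
    return "unknown"


def label_report(first_names):
    """Label a list of names and report the share left unassigned."""
    labels = [assign_gender(name) for name in first_names]
    unknown_share = labels.count("unknown") / len(labels)
    return labels, unknown_share


labels, unknown_share = label_report(["Maria", "John", "Jordan", "Xiaoming"])
print(labels, unknown_share)
```

Reporting the thresholds and the share of unassigned names alongside the results is the kind of detail that makes two studies comparable, since both choices change which authors remain in the analysis.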
2.3 Demographic bias investigation research
Investigating gender bias in scholarly research has far-reaching implications, as the outcomes of such studies can inform academic practices and guide the development or adjustment of policies aimed at promoting equality [5, 30]. The reliability of these studies fundamentally depends on the accuracy of the author name disambiguation and gender prediction methods or algorithms on which they are based. Various approaches to author name disambiguation and gender identification exist, including algorithmic solutions, name-based inference, manual verification, and the use of authors’ self-reported data.
Algorithmic solutions for gender identification, such as Genderize.io, Gender API, the gender package in R, SexMachine in Python, and Namsor, rely on large datasets of name-to-gender associations. Tools like Genderize.io and Gender API predict gender by analyzing the frequency of names in databases and utilizing machine learning techniques. We refer the reader to the survey of these techniques by [77] for details.
Name-based inference methods use predefined name lists or databases to infer gender, but their reliability can be limited due to cultural variability in naming conventions. Manual verification involves researchers assigning gender based on context, such as reviewing academic profiles, but this approach is time-intensive and prone to subjective biases. Alternatively, self-reported data, collected through surveys or directly from authors, offer the highest accuracy but may not always be available or complete. However, these methods exhibit significant variability in reliability and transparency, with many of them providing limited or inconsistent levels of accuracy [76, 49].
2.4 Causal factors influencing gender disparities
Understanding the root causes of gender disparities in academia requires more than identifying gaps — it requires researchers to analyze the underlying variables that may contribute to these disparities [58, 32]. To thoroughly examine gender disparities, it is critical to incorporate causal factors into the analysis to assess whether these factors account for or mitigate observed gender gaps. For example, factors such as career length (e.g., seniority), institutional rank, publication venue, and journal impact factor have been used to explore how structural or individual variables contribute to disparities [48, 35]. By including these variables, researchers can determine whether the disparities are reduced or explained away when considering specific contributing factors, providing a more nuanced understanding of gender bias. We observed that while studies we reviewed tended not to include potential causal factors in their analysis, recent papers increasingly incorporate these variables to add rigor to their conclusions.
Some causal factors frequently included are authors’ career lengths (i.e., seniority), year of publication, and number of authors. Furthermore, the papers studying causal factors tend to consider numerous variables, such as country, affiliation rank (e.g., whether the author is associated with a high-ranking or lower-ranking institution based on metrics such as global university rankings), venue of publication, institutional affiliation, academic roles (e.g., full, associate, or assistant professors), journal impact factors, and area of research. For instance, Caplar et al. [8] included the seniority of the first author, number of references, total number of authors, year of publication, journal of publication, field of study, and geographical region of the first author’s institution as causal factors in their analysis, in order to observe the effect size of the gender bias after controlling for these factors. They still found gender bias, and controlling for causal factors makes their claims more credible. Similarly, Dworkin et al. [21] and Wang et al. [95] incorporated the year of publication, the number of authors, whether the paper was a review article, and the seniority of the paper’s first and last authors into their analysis.
Papers that included causal factors show mixed results on the effect size, some discovering a decreased gender gap and some reporting no change in gender gap after controlling for causal factors. For instance, Huang et al. [35] reported that after controlling for career length as a causal factor, the gender gap in total productivity reduced from 31.0% to 7.8%, while including the country and affiliation rank did not further significantly affect the gender bias result, and considering total productivity as a causal factor eliminated the gender gap. Dion et al. [19] discovered that more gender-diverse subfields and disciplines produce smaller gender citation gaps. However, Odic and Wojcik [65] observed that male first authors achieve higher publication counts and citation counts even when affiliations are controlled for. Furthermore, they found no evidence for the assumption that the publication rate in individual subfields can account for the observed publication gender gap. This decrease in the gender gap after controlling for some causal factors implies that it is crucial to include causal factors in the analysis to add more rigor to the results.
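The idea of a gap shrinking once a causal factor is controlled for can be illustrated with a small stratification sketch. The records below are synthetic toy values chosen for illustration, the field names are hypothetical, and stratifying by career stage is a deliberately simple stand-in for the regression-style controls used in the cited studies; it does not reproduce any of their analyses.

```python
from statistics import mean


def relative_gap(records):
    """Relative productivity gap: (mean_men - mean_women) / mean_men."""
    men = [r["papers"] for r in records if r["gender"] == "M"]
    women = [r["papers"] for r in records if r["gender"] == "W"]
    return (mean(men) - mean(women)) / mean(men)


def gap_by_stratum(records, key):
    """Compute the relative gap separately within each level of `key`."""
    strata = {}
    for r in records:
        strata.setdefault(r[key], []).append(r)
    return {level: relative_gap(rows) for level, rows in strata.items()}


def make(gender, stage, counts):
    """Build toy author records (synthetic data, illustration only)."""
    return [{"gender": gender, "stage": stage, "papers": c} for c in counts]


# Synthetic sample in which men are over-represented among senior authors.
records = (
    make("W", "early", [4, 5, 3, 4])
    + make("M", "early", [4, 5])
    + make("W", "senior", [18])
    + make("M", "senior", [20, 20, 20])
)

raw_gap = relative_gap(records)
stratum_gaps = gap_by_stratum(records, "stage")
print(f"raw gap: {raw_gap:.1%}")
for stage, gap in stratum_gaps.items():
    print(f"{stage}: {gap:.1%}")
```

In this toy sample the pooled gap is much larger than the gap within either career stage, because career stage is unevenly distributed across the two groups. Note that such stratification requires author-level data, which in turn depends on author name disambiguation.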
2.5 Motivation for our work
The choice and application of author name disambiguation and gender identification methods can significantly influence the results and their interpretation [38]. For instance, [38] showed how the academic community’s reliance on the seemingly benign practice of using initials for author name disambiguation leads to significant distortions in the statistical properties of coauthorship networks, including underestimating the number of unique authors and network components and overestimating average productivity, number of co-authors per author, areas of expertise per author, and network density. Such errors caused by methodological flaws in author name disambiguation and gender identification underscore a) the profound influence that these inaccuracies can have on results and subsequent analyses, and b) the need for an in-depth assessment of methodological approaches and their implications on downstream research tasks. To address these needs, our study provides a comprehensive literature review that examines the range of techniques used for author name disambiguation and gender identification. This review spans a variety of disciplines, time periods, and locations, aiming to ensure robust, consistent, and objective findings.
By rigorously investigating these methodologies, our paper seeks to achieve three primary objectives:
• A comprehensive analysis of the methods used for author name disambiguation and gender identification in 70 papers.
• The identification of common challenges and patterns in implementing author name disambiguation and gender identification methods.
• A standardization of the scholarly data analysis process through SoDA Cards as a solution to the outlined issues with variation and transparency in methods.
2.6 Scholarly Data Analysis (SoDA) Cards
Standardized methods for benchmarking and documenting data provenance and measurement choices are integral to many fields [28]. As a prime example, the machine learning community employs practices such as providing Model Cards [59] and Datasheets [28]. Model Cards provide detailed performance reports for trained machine learning models across various conditions, ensuring clear and human-readable communication about a model’s capabilities, intent of use, and limitations. Similarly, Datasheets serve as a standardized means of conveying important characteristics of datasets used in machine learning. Although the machine learning community widely embraces transparent reporting practices, these methods can be overlooked in bibliometrics.
Building upon the conventions of standardized practices in machine learning like Model Cards and Datasheets, this study proposes SoDA Cards, tailored for bibliometric research to foster transparency in reporting practices. While such reporting practices might appear to be predominantly suited for machine learning applications, they are equally relevant for bibliometric and scientometric studies because these fields heavily depend on author name disambiguation and gender identification methods and algorithms. The inherent complexity and variety of these methods accentuate the need for transparent and standardized reporting, which would foster greater reproducibility and comparability across studies. More importantly, it would pave the way for a more comprehensive understanding of the challenges and progress in the ongoing quest to understand and address gender and other demographic biases in the measurement and practices of the evolution of science and the scientific workforce.
From our literature review, we learned that there are inconsistencies within and across methodological pipelines. In this section, we present our framework in the form of concise and standardized summaries documented on the “Scholarly Data Analysis Card”, which we designed as a potential solution to mitigate inconsistencies in adopting author name disambiguation and gender identification methods. These cards aim to standardize the reporting of methodologies so that a variety of stakeholders in academia, e.g., universities, funders, policymakers, students, and algorithm developers, can compare results across individual studies as well as the methodological practices that these studies used to arrive at their claims.
3 Patterns in scholarly data analysis for demographic bias
The presented literature review followed a rigorous and systematic approach to ensure comprehensive coverage of relevant research. We utilized Google Scholar as our primary search engine to identify relevant literature. We used a set of keywords selected to maximize the breadth of the search, which we provide in Appendix A.1. Our sample of surveyed papers consisted of published articles that had undergone a peer-review process. We excluded unpublished and preprint papers (e.g., arXiv papers). Details of our paper selection are provided in the following sections.
3.1 Paper search strategy and sampling criteria
There has been significant research on various dimensions of demographic biases in scholarly work in recent years. We employed the following sampling method to retrieve papers likely to have influenced the study of demographic biases in scholarly research:
• Identify papers to include in the review by searching Google Scholar with the keywords (in Appendix A.1). These keywords cover a wide selection of papers, e.g., on Google Scholar, “gender bias scholarly analysis” led to 17K papers (footnote 1: https://scholar.google.com/scholar?as_ylo=2009&as_yhi=2024&q=gender+bias+scholarly+data&btnG=).
• Select papers with high citation counts per keyword as reported by Google Scholar. This choice is motivated by the idea that highly cited papers can shape the methodology used in future papers in a field, so prioritizing highly cited papers allows us to detect dominant methodological practices in the field. We are aware that highly cited papers can be cited for reasons other than their methodology. However, our observation of these citations reveals that most papers citing these influential papers use methodological practices similar to those of the cited papers. This supports our assumption that these papers can be used to understand dominant practices in scholarly data analysis research.
• To ensure diversity in terms of time and venues of papers in our sample, select papers published between 2009 and 2023 in peer-reviewed journals and conferences.
A chart illustrating the distribution of publication years and their corresponding citation counts is shown in Figure 3. It shows that most papers in our sample are from the recent past, primarily between 2018 and 2022. Our sample has a good diversity of papers in terms of citation counts, as 30% (~20) of the papers have fewer than 30 citations.
3.2 Annotation
We annotated the papers in our sample, focusing on two key aspects: methods (developed and) used for author name disambiguation (AND) and gender identification. We first discuss how we annotated for AND, followed by gender identification. We identified five commonly used approaches for AND: no disambiguation, algorithmic methods, name-based approaches, manual searches, and self-reported data (gold standards). Each paper was categorized into one of these approaches, with detailed descriptions provided in a codebook (Table 1).
To annotate gender identification methods, we developed a detailed codebook (see Table 1) to account for the diverse approaches used across various studies. Unlike author name disambiguation, which typically relies on a single method, gender identification often involves multiple strategies tailored to effectively identify names from different countries or cultural backgrounds. Consequently, our annotation process for gender identification differs from that of AND. Specifically, we evaluated each paper based on two distinct facets: the number of gender identification methods employed and the types of methods used. Initially, we identified the number of gender identification methods utilized in each paper, categorizing them as either “Single” or “Multiple.” This classification allowed us to capture whether a study relied on one method or incorporated several techniques for gender identification. Subsequently, we categorized each gender identification method into one of four types: algorithmic, self-reported, manual search, and name-based/heuristics. For example, if a paper exclusively employed algorithmic methods for gender identification, we annotated it as using a “Single” method and classified the method as “Algorithmic.” If a paper utilized multiple gender identification methods, we labeled it with each applicable method type, such as both “Algorithmic” and “Manual Search.”
This annotation approach allows us to comprehensively understand the methodological landscape of gender identification in scholarly literature. By distinguishing between the number and types of gender identification methods, we can better analyze trends and preferences in research approaches, highlighting whether studies employ single-method approaches or integrate multiple techniques, e.g., to enhance accuracy and reliability. Detailed descriptions of each gender identification approach and the criteria for categorization are provided in Table 1. During the gender identification process, gender can be classified as a binary value or into more than two categories. Some studies also incorporate an “unknown” gender category in the analysis.
| Category | Definition |
|---|---|
| 1. Author Name Disambiguation: Methods | |
| No Disambiguation | Studies that 1) conduct analyses at the authorship level, 2) do not explain if and how they used any author name disambiguation methods, and 3) use data from Web of Science or Microsoft Academic Graph (MAG) without specifying whether the unique author identifiers provided in these datasets were used to distinguish between authors with similar or identical names. Simply relying on these datasets does not ensure disambiguation unless the identifiers are explicitly utilized to resolve ambiguities before analysis. |
| Algorithm | Studies that use an algorithmic approach that utilizes name strings and additional features, such as affiliation, to disambiguate authors’ names. |
| Name-based (heuristics) | Studies that disambiguate author names based on name strings (e.g., first name initials or all initials) |
| Manual search | Studies in which human annotators search the web for images and institutional web pages to disambiguate the authors’ names. |
| Gold standards | Studies that use survey data, specifically self-reported data. |
| 2. Gender Identification: Number of Methods Used | |
| Single Method | Studies that used only one method to identify the gender identities of the authors. |
| Multiple Methods | Studies that used more than one method to identify the gender identities of the authors. |
| 3. Gender Identification: Methods | |
| Gold standards | Studies that rely on self-reported gender identities provided by the authors. |
| Manual search | Studies that rely on external sources (e.g., SSN or database), photographs, titles (e.g., Mr., Mrs.), or pronouns (e.g., he, him) to identify the gender identities of the authors. |
| Heuristics | Studies that use only authors’ name strings (e.g., via Genderize.io or gender guesser) to identify the gender identities of the authors. |
| Algorithm | Studies that use algorithms that rely on more than just the authors’ name strings (e.g., year of birth or location) to identify the gender identities of the authors, e.g., Genni 2.0+Ethnea [89]. |
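One way to keep annotations consistent with the codebook in Table 1 is to encode its categories as a small typed record. The following is a minimal sketch under that assumption; the class names, field names, and the rule that every paper has at least one gender identification method are our own illustrative choices and are not part of the codebook itself.

```python
from dataclasses import dataclass
from enum import Enum


class ANDMethod(Enum):
    """Author name disambiguation categories from Table 1."""
    NONE = "No Disambiguation"
    ALGORITHM = "Algorithm"
    NAME_BASED = "Name-based (heuristics)"
    MANUAL = "Manual search"
    GOLD = "Gold standards"


class GenderMethod(Enum):
    """Gender identification method categories from Table 1."""
    GOLD = "Gold standards"
    MANUAL = "Manual search"
    HEURISTICS = "Heuristics"
    ALGORITHM = "Algorithm"


@dataclass(frozen=True)
class PaperAnnotation:
    """One annotated paper: a single AND label, one or more gender labels."""
    paper_id: str  # hypothetical identifier
    and_method: ANDMethod
    gender_methods: frozenset

    def __post_init__(self):
        # Assumption: every reviewed paper uses at least one method.
        if not self.gender_methods:
            raise ValueError("at least one gender identification method is required")

    @property
    def number_of_methods(self) -> str:
        """Derive the 'Single' / 'Multiple' facet from the method set."""
        return "Single" if len(self.gender_methods) == 1 else "Multiple"


annotation = PaperAnnotation(
    paper_id="paper-001",
    and_method=ANDMethod.ALGORITHM,
    gender_methods=frozenset({GenderMethod.ALGORITHM, GenderMethod.MANUAL}),
)
print(annotation.number_of_methods)
```

Deriving the "Single" or "Multiple" facet from the set of method types, rather than annotating it separately, prevents the two facets from contradicting each other.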
3.3 Evaluation of annotations using kappa scores
We calculate Cohen’s kappa ($\kappa$) to measure the interrater reliability of annotating author name disambiguation and gender identification methods [16]. Cohen’s kappa can range from -1 to 1, where 0 represents the amount of agreement that can be expected by chance, and 1 represents perfect agreement between the raters. If $p_o$ gives the proportion of times two raters agree, and $p_e$ gives the proportion of expected agreement, then $\kappa$ can be represented by:
$$\kappa = \frac{p_o - p_e}{1 - p_e} \tag{1}$$
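For readers who want to reproduce this kind of agreement score, the following is a minimal sketch of Cohen’s kappa for two raters, computed from the observed and chance-expected agreement. The function name and the toy labels are hypothetical; they are not the annotations from this study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(rater_a)

    # Observed agreement: share of items with identical labels.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Expected agreement: chance overlap given each rater's label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = set(counts_a) | set(counts_b)
    p_e = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)

    if p_e == 1:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (p_o - p_e) / (1 - p_e)


# Toy labels for six items (hypothetical, for illustration only).
rater_a = ["algorithm", "none", "none", "manual", "none", "algorithm"]
rater_b = ["algorithm", "none", "name", "manual", "none", "none"]
kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 3))
```

In this toy case the raters agree on four of six items (observed agreement 2/3) while chance agreement is 1/3, giving a kappa of 0.5.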
This study employed three doctoral researchers as annotators in a collaborative data coding process. A corpus of 70 papers was analyzed with respect to three key aspects: the Author Name Disambiguation methods implemented, the number and specific type(s) of Gender Identification methods utilized, and instances of gender identification performed using a single method. To ensure methodological transparency, inter-annotator agreement was assessed prior to the resolution of discrepancies, with corresponding Cohen’s kappa coefficients reported for each aspect, as detailed below.
• Author Name Disambiguation: 0.81
• Gender Identification (Number of methods): 0.85
• Gender Identification (Single): 0.71
We consider scores in the range of 0.71 – 0.86 to indicate moderately strong agreement [81, 55], reflecting a generally consistent annotation process. The remaining disagreements between annotations were resolved through discussion, and the original annotations were adjusted to reflect the consensus, ultimately achieving 100% agreement.
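As a minimal sketch of the computation in Equation (1), the following Python function computes Cohen’s kappa for one pair of raters who assign categorical labels. The label names and the ten toy annotations are hypothetical and are not the study’s data. The text does not specify how agreement was aggregated across the three annotators, so the sketch covers a single pair only.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length lists of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must label the same non-empty set of items.")
    n = len(rater_a)
    # Observed agreement p_o: share of items that received identical labels.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement p_e: chance agreement implied by each rater's label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used the same single label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical annotations of ten papers (illustrative only).
rater_1 = ["none", "none", "algorithm", "heuristics", "manual",
           "none", "algorithm", "none", "heuristics", "gold"]
rater_2 = ["none", "none", "algorithm", "heuristics", "manual",
           "none", "heuristics", "none", "heuristics", "gold"]

kappa = cohens_kappa(rater_1, rater_2)
print(round(kappa, 3))
```

In this toy example the raters disagree on one of ten papers, so $p_o = 0.9$, while the label frequencies give $p_e = 0.26$.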
4 Results
The results are organized into subsections that discuss key themes and insights from the review process, focusing on author name disambiguation and gender identification.
4.1 Author Name Disambiguation
This subsection outlines the distribution of the disambiguation methods used in the papers that are included in our sample. Additionally, we discuss how these studies addressed the challenge of disambiguating Asian names, highlighting their approaches and implications.
4.1.1 Analysis of author name disambiguation methods
In this section, we first present the distribution of the AND methods used in the 70 papers in our sample (3), and then explain potential issues with each method. We found that 51.4% (36/70) of the analyzed papers did not use any author name disambiguation (i.e., authorship-level analysis). Among those that employed disambiguation methods, 21.4% (15/70) used algorithmic disambiguation, 12.9% (9/70) name-based (heuristic) approaches, 10.0% (7/70) manual search methods, and 4.3% (3/70) gold-standard author identity data. The precision of AND is essential for conducting analyses at the individual level, rather than just aggregating authorship-level data. However, current practices fall short of achieving this precision. One potential issue is the prevalent underutilization of AND methods, leading to substantial limitations of research analysis results and conclusions. The failure to accurately identify individual authors can narrow the scope of analysis as it may exclude critical causal factors. For a comprehensive and nuanced understanding, it is crucial to account for factors such as an author’s career duration, publication history, and field of work [58]. This level of detail requires a thorough disambiguation of each author’s identity, moving beyond authorship-level analysis.
Relying on name-based methods for AND, such as matching full names or combinations of initials with last names, often proves inadequate and leads to errors in identifying authors: a name-based approach becomes problematic in cases of name overlap or inconsistent indexing. Two kinds of error result: distinct authors with similar names are merged into a single identity, and a single author’s papers are split across multiple identities [38]. Such inaccuracies can distort collaboration networks and misattribute scholarly output, which may undermine the reliability of research findings, particularly in gender bias studies that seek to accurately track the contributions and impact of individual researchers. Furthermore, the lack of widespread implementation of robust methods, such as algorithmic disambiguation or the use of gold-standard data, is a limitation in many studies that utilize author name disambiguation. The prevailing reliance on less accurate, heuristic approaches highlights a pressing need for the adoption of more sophisticated and reliable methods in author name disambiguation. This is particularly critical in research analytics and bibliometrics, where the precise attribution of work is essential.
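The two error types can be illustrated with a small sketch. The records, the author identifiers, and both key functions below are hypothetical and chosen for illustration only. They do not reproduce the heuristic of any reviewed study.

```python
from collections import defaultdict


def initial_key(full_name):
    """Hypothetical heuristic key: first initial plus last name, lowercased."""
    parts = full_name.replace(".", " ").split()
    return f"{parts[0][0]} {parts[-1]}".lower()


def full_name_key(full_name):
    """Hypothetical heuristic key: the full name string, normalized."""
    return " ".join(full_name.replace(".", " ").split()).lower()


# (true author id, name string as indexed); toy data for illustration.
records = [
    ("A1", "Wei Zhang"),
    ("A2", "Wen Zhang"),   # a different person with the same initial and last name
    ("A3", "Jane Smith"),
    ("A3", "J. Smith"),    # the same person, indexed inconsistently
]


def cluster_errors(records, key_fn):
    """Return (merged keys, split author ids) produced by a name-based key."""
    ids_per_key = defaultdict(set)
    keys_per_id = defaultdict(set)
    for author_id, name in records:
        key = key_fn(name)
        ids_per_key[key].add(author_id)
        keys_per_id[author_id].add(key)
    merged = sorted(k for k, ids in ids_per_key.items() if len(ids) > 1)
    split = sorted(a for a, keys in keys_per_id.items() if len(keys) > 1)
    return merged, split


print(cluster_errors(records, initial_key))    # merges A1 and A2 under one key
print(cluster_errors(records, full_name_key))  # splits A3 across two keys
```

The coarser key merges distinct authors, while the stricter key splits one author, so tightening or loosening a name-based rule trades one error type for the other rather than removing both.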
4.1.2 Overview of elaboration on Asian names
Some studies have examined how names are treated across different cultural and geographic regions by specifying the countries included in their analyses. For example, Huang et al. [35] and Mishra et al. [58] listed the countries covered in their research. Huang et al. [35], in particular, considered 83 countries in their country-specific analysis. This level of detail is crucial for understanding the geographic and cultural scope of their analyses, especially when dealing with the diversity and complexity of naming conventions. Names can vary significantly across regions in terms of structure, order of family and given names, and the frequency of certain surnames [54]. Such differences require researchers to adopt culturally sensitive methodologies to ensure accurate identification and categorization. By providing transparent documentation of the countries included in their studies, researchers enhance the interpretability of their findings and allow readers to assess the extent to which their methods account for cultural variations.
| Category | Count (percentage) | List of papers |
|---|---|---|
| No Disambiguation | 36 out of 70 (51.4%) | [14, 46, 40, 35, 51, 100, 36, 99, 67, 60, 19, 34, 29, 87, 93, 22, 82, 6, 9, 24, 27, 47, 96, 66, 101, 18, 7, 78, 83, 70, 25, 31, 64, 75, 56, 87] |
| Algorithm | 15 out of 70 (21.4%) | [68, 48, 74, 69, 1, 45, 63, 92, 104, 58, 57, 43, 103, 26, 61] |
| Name-based (heuristics) | 9 out of 70 (12.9%) | [8, 21, 95, 85, 41, 65, 42, 94, 50] |
| Manual search | 7 out of 70 (10.0%) | [4, 52, 53, 12, 91, 62, 2] |
| Gold standards | 3 out of 70 (4.3%) | [23, 84, 71] |
4.2 Gender Identification
4.2.1 Overview of gender identification methods
In this section, we present the distribution of gender identification methods in the papers in our corpus. Compared to the author name disambiguation methods, which usually involve a single method for disambiguating authors’ names, numerous studies relied on multiple methods for identifying authors’ gender identities. We found that 45/70 (64.3%) [68, 8, 46, 48, 74, 85, 69, 40, 35, 1, 45, 100, 36, 99, 67, 41, 58, 65, 52, 19, 29, 53, 6, 9, 12, 27, 47, 23, 96, 91, 84, 94, 18, 7, 43, 11, 71, 62, 70, 2, 31, 50, 61, 64, 75] of the papers used a single gender identification method, whereas 25/70 (35.7%) of the papers [14, 21, 95, 51, 4, 63, 92, 104, 60, 34, 22, 82, 57, 9, 24, 42, 101, 78, 83, 103, 87, 93, 26, 25, 56] used more than one method for identifying authors’ gender identities. Moreover, even within the studies using multiple approaches, we observed that the order in which they used each method and the combination of multiple methods varied. To accurately display the distribution of gender identification methods used across the 70 reviewed papers, we counted the frequency of each method (Table 3). This count includes the methods (e.g., algorithm, heuristics, manual search, and gold standard) used in the papers we reviewed, with special consideration for papers employing multiple approaches. In these cases, each method was counted individually, so a single paper can contribute to multiple method categories and the method counts sum to 98 rather than 70. This approach yields a detailed representation of the methods’ distribution that reflects the complexity of multi-method studies.
| Category | Count (percentage) | Single method | Multiple methods |
|---|---|---|---|
| Algorithm | 18 out of 98 (18.4%) | [48, 74, 85, 69, 40, 100, 58, 6, 61] | [21, 95, 104, 82, 42, 101, 83, 87, 56] |
| Heuristics | 41 out of 98 (41.8%) | [68, 8, 46, 35, 36, 41, 65, 19, 29, 12, 27, 47, 96, 94, 7, 43, 31, 50] | [14, 21, 95, 51, 4, 92, 104, 60, 34, 22, 82, 57, 9, 24, 42, 101, 78, 83, 103, 87, 93, 26, 25] |
| Manual search | 22 out of 98 (22.4%) | [52, 53, 70, 2, 64, 75] | [14, 51, 4, 63, 92, 34, 22, 57, 9, 24, 78, 103, 93, 26, 56, 25] |
| Gold standards | 17 out of 98 (17.3%) | [1, 45, 99, 67, 23, 66, 91, 84, 18, 11, 71, 62] | [51, 63, 60, 4, 78] |
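The counting rule behind this table, in which a multi-method paper contributes once to every method it uses, can be sketched as follows. The paper identifiers and their method assignments are hypothetical placeholders, not entries from the reviewed corpus.

```python
from collections import Counter

# Hypothetical mapping from paper id to the gender identification methods it used.
papers = {
    "paper_a": ["heuristics"],
    "paper_b": ["heuristics", "manual search"],
    "paper_c": ["algorithm", "heuristics"],
    "paper_d": ["gold standard"],
    "paper_e": ["heuristics", "manual search", "gold standard"],
}

# Each distinct method of a paper is counted once, so one paper can add to several categories.
method_uses = Counter(m for methods in papers.values() for m in set(methods))
total_uses = sum(method_uses.values())

single_method_papers = sum(len(set(m)) == 1 for m in papers.values())
multi_method_papers = len(papers) - single_method_papers

for method, count in method_uses.most_common():
    # Shares are relative to method uses, not to the number of papers.
    print(f"{method}: {count} of {total_uses} ({100 * count / total_uses:.1f}%)")
```

Because the denominator is the number of method uses, these shares are not directly comparable with the paper-level shares of single-method and multi-method studies.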
We found that the name-based (heuristics) approach was the most frequently used among all gender identification methods (41 of the 98 recorded method uses, 41.8%). This approach uses authors’ first names to identify associated gender identities. Genderize.io, for instance, falls into this category because it relies on authors’ name strings to determine gender identities, though the specific method it employs may vary. The name-based approach has two limitations. First, many names are unisex or map to different genders depending on the culture or the country of affiliation (e.g., Andrea is typically a male name in Italy and related cultures but a female name in the U.S. and English-speaking cultures, yet both NamSor and Genderize.io assign it the female gender). Second, the gender associated with a first name can depend on the full name (e.g., Harpreet is a unisex name in the Punjab region of India, and the gender can often only be identified from the full name: Harpreet Kaur is female, while Harpreet Singh is more likely to be male). On the other hand, the gold standard approach, which is ideal because it relies on self-reported gender identity, was utilized least frequently, accounting for only 17 of the 98 method uses (17.3%). Manual search (22.4%) and algorithm-based methods (18.4%) made up the remainder.
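The following sketch illustrates why a first-name-only lookup cannot resolve the Andrea and Harpreet examples. All three lookup tables are hypothetical toy data that mirror the examples in the text. They are not taken from Genderize.io, NamSor, or any other service, and no real API is called.

```python
# Hypothetical toy lookup tables that mirror the examples in the text.
NAME_ONLY = {"andrea": "female", "harpreet": "unknown"}
NAME_BY_COUNTRY = {("andrea", "IT"): "male", ("andrea", "US"): "female"}
# Assumption for illustration: the last name acts as a hint only when the first name is unisex.
LAST_NAME_HINTS = {"kaur": "female", "singh": "male"}


def infer_gender(first, last=None, country=None):
    """Return a gender label using progressively more context when it is available."""
    first = first.lower()
    # Country-specific evidence takes precedence over the global name table.
    if country and (first, country) in NAME_BY_COUNTRY:
        return NAME_BY_COUNTRY[(first, country)]
    label = NAME_ONLY.get(first, "unknown")
    # Fall back to the full name only when the first name alone is inconclusive.
    if label == "unknown" and last and last.lower() in LAST_NAME_HINTS:
        return LAST_NAME_HINTS[last.lower()]
    return label


print(infer_gender("Andrea"))                # name string only
print(infer_gender("Andrea", country="IT"))  # with country of affiliation
print(infer_gender("Harpreet"))              # unisex first name
print(infer_gender("Harpreet", last="Kaur")) # with full name
```

A name-only lookup returns a single label for Andrea regardless of country and cannot label Harpreet at all, whereas adding the country or the full name changes the outcome. The "male" hint for Singh encodes a likelihood, as in the text, and not a certainty.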
4.2.2 Use of gender labels and handling of unknown gender identities
Gender categories in the identification process are typically binary (i.e., male and female), which were also the most commonly observed identities in our sample. However, the reliance on binary labels for gender categorization presents a significant limitation as it restricts the acknowledgment of diverse gender identities, including non-binary, by forcing them into one of two predefined categories [102]. Further, this binary constraint hinders a more comprehensive understanding of gender bias at a fine-grained level. The complexity of this issue is compounded by the fact that gender identification algorithms predominantly use binary labels, thus perpetuating this narrow classification framework. Nevertheless, some studies have attempted to move beyond binary labels to provide a more comprehensive view of gender inequalities in academia. For instance, Larivière et al. [46] used the binary categories and assigned a separate unisex label to names that appeared in both lists.
Additionally, we noticed that some studies introduced a separate ‘Unknown’ category for authors whose gender identities could not be determined [69, 58, 60, 82, 94]. Including a specific label for gender-unidentified authors, rather than disregarding them, represents a step towards inclusivity and a more nuanced understanding of gender dynamics in academia. We also noted a methodological concern: numerous studies exclude authors with indeterminate gender identities when analyzing gender disparities. This approach may have drawbacks, particularly in the context of Asian names, such as those of Chinese or Korean origin, where determining gender identity is often more challenging. Excluding authors whose gender cannot be ascertained affects some ethnicities disproportionately, potentially skewing the sample representation and further biasing the study outcomes. This exclusionary practice also diminishes the inclusivity of research, as it overlooks valuable contributions from gender-unclassified authors, thereby underrepresenting the diversity of scholarly efforts. Therefore, this method risks perpetuating two kinds of bias: ethnic bias, as certain groups are more likely to be excluded due to cultural naming conventions, and systemic bias, where existing methodologies fail to evolve with changing societal and academic norms. These biases collectively limit the scope of analysis, hinder equitable representation, and potentially obscure critical insights into the intersections of identity and academic contributions.
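A small sketch shows how excluding authors with an unknown gender label can shift the regional composition of a sample. The counts below are invented solely to illustrate the mechanism and do not describe any real corpus or any reviewed study.

```python
from collections import Counter

# Invented author records (region, inferred gender label); illustrative only.
authors = (
    [("Europe", "female")] * 30 + [("Europe", "male")] * 60 + [("Europe", "unknown")] * 10
    + [("East Asia", "female")] * 15 + [("East Asia", "male")] * 35 + [("East Asia", "unknown")] * 50
)


def region_shares(records):
    """Share of records that come from each region."""
    counts = Counter(region for region, _ in records)
    total = sum(counts.values())
    return {region: count / total for region, count in counts.items()}


# Common practice criticized in the text: drop every author without a gender label.
kept = [(region, gender) for region, gender in authors if gender != "unknown"]

before = region_shares(authors)
after = region_shares(kept)
unknown_rate = {
    region: sum(1 for r, g in authors if r == region and g == "unknown")
    / sum(1 for r, _ in authors if r == region)
    for region in before
}

for region in before:
    print(f"{region}: unknown {unknown_rate[region]:.0%}, "
          f"share before {before[region]:.1%}, after exclusion {after[region]:.1%}")
```

With these invented numbers, both regions start at half of the sample, but the region with the higher unknown rate shrinks to roughly a third after exclusion. Reporting the unknown rate per group, as the SoDA Card fields on unidentified gender suggest, makes such shifts visible.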
4.2.3 Considerations of Asian names
Our literature review highlighted challenges in handling Asian-origin names, particularly in the context of gender identification. The prevalence of unisex names for scholars from these regions makes it difficult to accurately determine the gender of authors with Asian names. We observed that some studies opted to exclude authors with Asian names from their analyses. This exclusion, which impacts authors from certain ethnic groups, is sometimes explicitly mentioned in the papers and introduces biases in the analysis. For instance, overlooking influential scholars with high citation counts from specific fields can distort the representation of academic impact and productivity. In studies focusing on citation counts or publication volumes, such exclusions may lead to incomplete or misleading conclusions about disparities in academic contributions across genders and ethnic groups.
Moreover, we noticed that most papers lacked detailed information about how names from different regions, particularly Asian countries, were handled. Specifically, there is often little to no information on the proportion of names from Asian countries or the differential treatment of authors with Asian origins. However, some studies have addressed this issue directly. For instance, Huang et al. [35] provided a comprehensive explanation for their decision to exclude certain ethnic groups from their analysis: they acknowledged that while many name disambiguation algorithms effectively help to reconstruct the careers of authors with European names, they struggle with those of Asian origin. This, combined with the difficulty in inferring gender from Asian names, led them to exclude researchers from China (including mainland, Hong Kong, Macau, and Taiwan), the Democratic People’s Republic of Korea (North Korea), Japan, Malaysia, the Republic of Korea (South Korea), and Singapore [35]. Such disclosures are vital as they inform readers about potential limitations to the generalizability of a study’s results.
5 Scholarly Data Analysis (SoDA) Cards
In this section, we explain how we developed the concept of SoDA Cards and how we suggest that researchers and practitioners use them. We acknowledge that relying on the authors’ self-reported data, which is the most accurate method, becomes impractical when working with large datasets. Our review of the literature has revealed that many studies adopt alternative methods for author name disambiguation and gender identity determination. In particular, we observed a lack of clear descriptions of implementation details for both tasks, which complicates comparisons between studies. To address this issue in future research, we introduce the SoDA Cards.
The SoDA Cards are designed to provide comprehensive and concise information about study objectives and the data analysis process. These cards comprise several subsections, including study specifications, corpus profiling, and detailed descriptions of the methods used for author name disambiguation and gender identification. The cards also cover the causal factors and methodologies used to analyze gender bias, along with the results of these analyses. We provide an unfilled template of the Scholarly Data Analysis Card in Figure 7 and a filled card for illustrative purposes in Figure 6. The “Scholarly Data Analysis Card” enables researchers to transparently and accessibly document their analytical workflows, enhancing reproducibility in the field of bibliometrics. The SoDA Cards are available in the GitHub repository: https://github.com/HaeJinLee41/scholarly_bias_study.
5.1 Study specification
In the study specification section, authors are encouraged to clearly define the performance metrics used in their study, such as the number of citations, author composition, impact factors, and so on. Additionally, they should indicate whether the study’s code or data is publicly accessible and, if so, provide relevant links or references.
5.2 Corpus profile
In this section, authors are expected to provide details about the dataset used in their study.
Data source(s): The source(s) of the dataset, including whether they merged data from multiple sources. Furthermore, authors should document the data sources used with data citations.
Domain: Whether the dataset represents a specific domain or field, or if it encompasses interdisciplinary perspectives to gain broader insights into gender bias across various disciplines.
Corpus volume: The initial and final size of the dataset used for analysis.
Geographic scope: The geographical focus of their study. Is it exploring gender bias in a specific country or a particular continent, or does it aim to understand broader bias trends across multiple countries?
Temporal scope: The period of the dataset should be clearly stated.
5.3 Author name disambiguation
Authors should provide comprehensive information about the author name disambiguation methods they used and about how those methods were evaluated.
Disambiguation method: The techniques used for author name disambiguation. This could include heuristic approaches, specific algorithms, or manual searches. If multiple methods were used, all should be detailed. In cases where disambiguation was not performed, an explanation is required. For example, reliance on self-reported data might justify the absence of disambiguation.
Evaluation method: If the effectiveness of the disambiguation technique was assessed, authors should describe the methods employed for this evaluation.
Evaluation data: The dataset(s) used to assess the disambiguation method, whether a random sample check or a different dataset.
Accuracy: If the disambiguation method was evaluated, the achieved accuracy should be stated.
Handling of names across diverse cultural and ethnic backgrounds: Authors should clarify how they addressed variations in author names from diverse cultural and ethnic backgrounds. This includes considering naming conventions, such as the placement of family names and prevalent surnames, and whether specialized algorithms or datasets for specific cultural or linguistic contexts were used.
5.4 Gender identification
Detailed information on the methods used for gender identification, including any limitations faced, should be outlined.
Gender identification method: The approaches used for determining gender identities and the number and type of methods (algorithms, gold standards, etc.) employed.
Name part used for gender identification: Which part of the name (first, last) was utilized to identify gender.
Gender categories used: The labels for gender categories. This includes any unique categories, like ‘unidentified’.
Percentage of unidentified gender: The proportion of gender identities that remained unidentified.
Handling of unknown gender: How data from authors with unidentified gender identities was managed - whether they were excluded or categorized separately.
Evaluation method, data, and accuracy: The evaluation method, data, and accuracy of the gender identification process.
Handling of names across different ethnic groups for gender identification: How names were handled, including specific considerations or methodologies used to account for variations across different ethnic or cultural groups. For instance, authors could elaborate on how they handled names of Asian origin (e.g., did they disregard them or use a different approach for this group?).
5.5 Analysis
Authors should describe the methods used to analyze gender bias, including relevant specifics, causal factors considered, and the analytical pipeline.
Method: The methods they employed in their gender bias analysis. If multiple methods were used, each should be thoroughly described.
Control for causal factors: Whether the analysis accounted for factors like the year of publication, the author’s career age, the field of work, and prior publication history.
5.6 Results
Authors should report the findings of the study, with a particular emphasis on results related to gender bias.
Presence of gender bias: Whether their analysis revealed any gender bias. If the findings were insignificant, they should provide insights into potential reasons.
Overall effect size and effect size post-control: The overall effect size that was discovered in the study. Additionally, authors need to detail the effect size after considering various causal factors. This includes explaining how the results were influenced by adjustments for specific factors and if such adjustments were made in the analysis.
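To show how the card sections described in this section could be recorded in a machine-readable form, the sketch below models a card as Python dataclasses and lists the entries that were left undocumented. The class names, field names, and the `missing_fields` helper are assumptions made for illustration. They are not the format used in the authors’ repository. The example values are taken from the filled card for Caplar et al. (2017).

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List, Optional


@dataclass
class MethodReport:
    """Fields shared by the disambiguation and gender identification sections."""
    methods: List[str]
    evaluation_method: Optional[str] = None
    evaluation_data: Optional[str] = None
    accuracy: Optional[float] = None
    cross_cultural_handling: Optional[str] = None


@dataclass
class GenderReport(MethodReport):
    name_part: Optional[str] = None
    categories: Optional[List[str]] = None
    unidentified_share: Optional[float] = None
    unknown_handling: Optional[str] = None


@dataclass
class SoDACard:
    paper_title: str
    performance_metrics: List[str]
    code_data_accessible: bool
    domain: str
    data_sources: List[str]
    corpus_volume: int
    geographic_scope: str
    temporal_scope: str
    disambiguation: MethodReport
    gender_identification: GenderReport
    analysis_method: str
    controls: Dict[str, bool]
    bias_present: bool
    overall_effect_size: Optional[str] = None
    effect_size_after_controls: Optional[str] = None


def missing_fields(card):
    """Return dotted paths of entries left as None (not applicable or undocumented)."""
    missing = []

    def walk(node, prefix):
        for key, value in node.items():
            if isinstance(value, dict):
                walk(value, f"{prefix}{key}.")
            elif value is None:
                missing.append(f"{prefix}{key}")

    walk(asdict(card), "")
    return missing


card = SoDACard(
    paper_title="Quantitative evaluation of gender bias in astronomical "
                "publications from citation counts",
    performance_metrics=["number of citations"],
    code_data_accessible=True,
    domain="astronomy",
    data_sources=["SAO/NASA ADS", "arXiv"],
    corpus_volume=149741,
    geographic_scope="global",
    temporal_scope="1950-2015",
    disambiguation=MethodReport(methods=["name-based"],
                                evaluation_method="did not evaluate"),
    gender_identification=GenderReport(
        methods=["algorithmic"], evaluation_method="did not evaluate",
        name_part="first name", categories=["female", "male"],
        unidentified_share=0.015, unknown_handling="excluded from analysis"),
    analysis_method="machine learning (random forest model)",
    controls={"year_of_publication": True, "career_age": True,
              "field_of_work": True, "prior_publications": False},
    bias_present=True,
    overall_effect_size="men received around 6% more citations on average than women",
    effect_size_after_controls="papers authored by women receive 10.4 ± 0.9% fewer citations",
)

missing = missing_fields(card)
card_json = json.dumps(asdict(card), indent=2, ensure_ascii=False)
print(missing)
```

A structured form like this would let cards from different studies be aggregated or compared field by field, which is the kind of cross-study comparison the cards are intended to support.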
6 Limitations
This study presents a framework for accurate scholarly data analysis; however, both the findings derived from prior research and the proposed framework have limitations. Regarding the analysis of existing scholarly literature, while the results are considered representative, they are subject to limitations imposed by the sample size. The papers were selected based on representative keywords related to scholarly data analysis and demographic bias, identified through standard search terms. While this approach captures prominent research in the field, it may exclude relevant publications that do not explicitly feature these keywords or are not readily indexed by Google Scholar. Given the prevalence of the chosen keywords, this potential omission is not expected to impact the overall findings significantly. Furthermore, the selection process prioritized highly cited papers (as detailed in Section 3.1). These influential papers are frequently referenced for their methodological contributions, suggesting that subsequent studies adopting similar methodologies will likely exhibit comparable characteristics to those identified in our analysis.
The proposed framework also has inherent limitations. Specifically, the emphasis on demographic attributes such as gender may pose challenges in contexts where these attributes cannot be reliably inferred due to privacy concerns or local regulations. However, this limitation is mitigated by the understanding that datasets with such restrictions should not be used for aggregated demographic scholarly data analysis in the first place. Therefore, the framework’s focus on transparency regarding gender assignment methodologies can contribute to compliance with relevant policies. While not a primary objective, this potential framework application warrants further investigation.
7 Discussion and Conclusion
Our study highlights the critical need to document and refine author name disambiguation and gender identification practices in scholarly research. Improving these processes is essential to enhance the accuracy and reliability of studies investigating gender biases. To conduct reliable investigations of gender biases at the authors’ career level, it is vital to adhere to key methodological steps. These include rigorous author identification and disambiguation and the accurate attribution of demographic details such as gender, race, and nationality. Moreover, it is crucial to control for causal factors that might impact the variables under study. These essential steps are illustrated in Figure 1. Inaccuracies in author identification not only compromise the integrity of data but also lead to skewed analyses, which can yield biased interpretations of gender disparities in academia. This hinders a comprehensive understanding of the factors driving these disparities and might adversely impact policymaking.
Given these observations, our findings have significant implications for future research in this domain. Emphasizing precise author attribution and acknowledging individual contributions become pivotal in studies exploring gender bias. As the field progresses, developing more sophisticated and reliable methods for author name disambiguation is essential to validate findings in gender bias research. To move forward, we propose the introduction of a model card system for bibliometric studies. These SoDA Cards serve as a standardized tool for researchers to document and share their methodologies and results. They include detailed information about the data, analytical methods employed, and the controls implemented for causal factors. Such a system would not only promote transparency and replicability in gender bias research but also encourage the adoption of best practices across the field, ultimately leading to more accurate and insightful findings.
1 Paper Title
Quantitative evaluation of gender bias in astronomical publications from citation counts (Caplar et al., 2017)
2 Study Specification
• Performance metric(s): number of citations
• Code/Data accessibility: code and data are accessible
3 Corpus Profile
• Domain: astronomy
• Data source: SAO/NASA ADS, arXiv
• Corpus volume: 149,741 papers
• Geographical scope: global-level
• Temporal scope: 1950-2015
4 Author Name Disambiguation
• Disambiguation method: name-based method
• Evaluation method: did not evaluate
• Evaluation data: not applicable
• Accuracy: not applicable
• Elaboration on names with Asian origins: not applicable
5 Gender Identification
• Gender identification method: algorithmic
• Name part used for gender identification: first name
• Gender categories used: binary (female and male)
• Percentage of unidentified gender: 1.5% (2260)
• Handling of unknown gender: excluded from analysis
• Evaluation method: did not evaluate
• Evaluation data: not applicable
• Accuracy: not applicable
• Elaboration on names with Asian origins: not applicable
6 Analysis
• Method: machine learning (random forest model)
• Control for year of publication?: yes
• Control for author career age?: yes (seniority)
• Control for field of work?: yes
• Control for author prior publication?: no
7 Results
• Presence of gender bias: yes
• Overall effect size: men received around 6% more citations on average than women (considering year)
• Effect size after controlling causal factors: papers authored by women receive 10.4 ± 0.9% fewer citations than would be expected if the papers with the same non-gender-specific properties were written by men
1 Paper Title
2 Study Specification
• Performance metric(s):
• Code/Data accessibility:
3 Corpus Profile
Information about data
• Domain:
• Data source:
• Corpus volume:
• Geographical scope:
• Temporal scope:
4 Author Name Disambiguation
Information about author name disambiguation process, evaluation method, and outputs
• Disambiguation method:
• Evaluation method:
• Evaluation data:
• Accuracy:
• Elaboration on names with Asian origins:
5 Gender Identification
Information about author gender identification process, evaluation method, and outputs
• Gender identification method:
• Name part used for gender identification:
• Gender categories used:
• Percentage of unidentified gender:
• Handling of unknown gender:
• Evaluation method:
• Evaluation data:
• Accuracy:
• Elaboration on names with Asian origins:
6 Analysis
Information about causal factors and analysis
• Method:
• Control for year of publication?:
• Control for author career age?:
• Control for field of work?:
• Control for author prior publication?:
7 Results
Information about study results
• Presence of gender bias:
• Overall effect size:
• Effect size after controlling causal factors:
References
- Abramo et al. [2009] Giovanni Abramo, Ciriaco Andrea D’Angelo, and Alessandro Caprasecca. Gender differences in research productivity: A bibliometric analysis of the Italian academic system. Scientometrics, 79(3):517–539, June 2009. ISSN 1588-2861. doi: 10.1007/s11192-007-2046-8. URL https://doi.org/10.1007/s11192-007-2046-8.
- Ahmadia et al. [2021] Gabby N Ahmadia, Samantha H Cheng, Dominic A Andradi-Brown, Stacy K Baez, Megan D Barnes, Nathan J Bennett, Stuart J Campbell, Emily S Darling, Estradivari, David Gill, et al. Limited progress in improving gender and geographic representation in coral reef science. Frontiers in Marine Science, 8:731037, 2021.
- American Association of University Women (2024) [AAUW] American Association of University Women (AAUW). Fast facts: Academia, 2024. URL https://www.aauw.org/resources/article/fast-facts-academia/. Accessed: 2024-12-12.
- Azoulay and Lynn [2020] Pierre Azoulay and Freda Lynn. Self-Citation, Cumulative Advantage, and Gender Inequality in Science. Sociological Science, 7:152–186, 2020. ISSN 23306696. doi: 10.15195/v7.a7. URL https://www.sociologicalscience.com/articles-v7-7-152/.
- Belingheri et al. [2021] Paola Belingheri, Filippo Chiarello, Andrea Fronzetti Colladon, and Paola Rovelli. Twenty years of gender equality research: A scoping review based on a new semantic indicator. Plos one, 16(9):e0256474, 2021. doi: https://doi.org/10.1371/journal.pone.0256474.
- Bendels et al. [2018] Michael HK Bendels, Ruth Müller, Doerthe Brueggmann, and David A Groneberg. Gender disparities in high-quality research revealed by nature index journals. PloS one, 13(1):e0189136, 2018.
- Benjamens et al. [2020] Stan Benjamens, Louise BD Banning, Tamar AJ van den Berg, and Robert A Pol. Gender disparities in authorships and citations in transplantation research. Transplantation direct, 6(11), 2020.
- Caplar et al. [2017] Neven Caplar, Sandro Tacchella, and Simon Birrer. Quantitative evaluation of gender bias in astronomical publications from citation counts. Nature Astronomy, 1(6):1–5, May 2017. ISSN 2397-3366. doi: https://doi.org/10.1038/s41550-017-0141. URL https://www.nature.com/articles/s41550-017-0141. Number: 6 Publisher: Nature Publishing Group.
- Card et al. [2020] David Card, Stefano DellaVigna, Patricia Funk, and Nagore Iriberri. Are referees and editors in economics gender neutral? The Quarterly Journal of Economics, 135(1):269–327, 2020.
- Carlsson et al. [2021] Magnus Carlsson, Henning Finseraas, Arnfinn H Midtbøen, and Guðbjörg Linda Rafnsdóttir. Gender bias in academic recruitment? evidence from a survey experiment in the nordic region. European Sociological Review, 37(3):399–410, 2021.
- Carr et al. [2018] Phyllis L Carr, Anita Raj, Samantha E Kaplan, Norma Terrin, Janis L Breeze, and Karen M Freund. Gender differences in academic medicine: retention, rank, and leadership comparisons from the national faculty survey. Academic medicine: journal of the Association of American Medical Colleges, 93(11):1694, 2018.
- Carter et al. [2017] T Edison Carter, Thomas E Smith, and Philip J Osteen. Gender comparisons of social work faculty using h-index scores. Scientometrics, 111:1547–1557, 2017.
- Casad et al. [2021] Bettina J Casad, Jillian E Franks, Christina E Garasky, Melinda M Kittleman, Alanna C Roesler, Deidre Y Hall, and Zachary W Petzel. Gender inequality in academia: Problems and solutions for women faculty in stem. Journal of neuroscience research, 99(1):13–23, 2021. doi: 10.1002/jnr.24631.
- Chatterjee and Werner [2021] Paula Chatterjee and Rachel M. Werner. Gender Disparity in Citations in High-Impact Journal Articles. JAMA Network Open, 4(7):e2114509, July 2021. ISSN 2574-3805. doi: https://doi.org/10.1001/jamanetworkopen.2021.14509. URL https://doi.org/10.1001/jamanetworkopen.2021.14509.
- Cheryan and Markus [2020] Sapna Cheryan and Hazel Rose Markus. Masculine defaults: Identifying and mitigating hidden cultural biases. Psychological Review, 127(6):1022, 2020.
- Cohen [1960] Jacob Cohen. A coefficient of agreement for nominal scales. Educational and psychological measurement, 20(1):37–46, 1960.
- Computing Research Association (2024) [CRA] Computing Research Association (CRA). 2023 CRA Taulbee Survey report, 2024. URL https://cra.org/wp-content/uploads/2024/05/2023-CRA-Taulbee-Survey-Report.pdf. Accessed: 2024-12-12.
- Copenheaver et al. [2010] Carolyn A Copenheaver, Kyrille Goldbeck, and Paolo Cherubini. Lack of gender bias in citation rates of publications by dendrochronologists: What is unique about this discipline? Tree-Ring Research, 66(2):127–133, 2010.
- Dion et al. [2018] Michelle L Dion, Jane Lawrence Sumner, and Sara McLaughlin Mitchell. Gendered citation patterns across political science and social science methodology fields. Political analysis, 26(3):312–327, 2018.
- Du and Zhang [2024] Xiaocong Du and Haipeng Zhang. For the misgendered chinese in gender bias research: Multi-task learning with knowledge distillation for pinyin name gender prediction. In Kate Larson, editor, Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, IJCAI-24, pages 7233–7241. International Joint Conferences on Artificial Intelligence Organization, 8 2024. doi: 10.24963/ijcai.2024/800. URL https://doi.org/10.24963/ijcai.2024/800. AI for Good.
- Dworkin et al. [2020] Jordan D. Dworkin, Kristin A. Linn, Erin G. Teich, Perry Zurn, Russell T. Shinohara, and Danielle S. Bassett. The extent and drivers of gender imbalance in neuroscience reference lists. Nature Neuroscience, 23(8):918–926, August 2020. ISSN 1546-1726. doi: https://doi.org/10.1038/s41593-020-0658-y. URL https://www.nature.com/articles/s41593-020-0658-y.
- Edwards et al. [2018] Hannah A Edwards, Julia Schroeder, and Hannah L Dugdale. Gender differences in authorships are not associated with publication bias in an evolutionary journal. PLoS One, 13(8):e0201725, 2018.
- Eloy et al. [2013] Jean Anderson Eloy, Peter Svider, Sujana S Chandrasekhar, Qasim Husain, Kevin M Mauro, Michael Setzen, and Soly Baredes. Gender disparities in scholarly productivity within academic otolaryngology departments. Otolaryngology–head and neck surgery, 148(2):215–222, 2013.
- Filardo et al. [2016] Giovanni Filardo, Briget Da Graca, Danielle M Sass, Benjamin D Pollock, Emma B Smith, and Melissa Ashley-Marie Martinez. Trends and comparison of female first authorship in high impact medical journals: observational study (1994-2014). bmj, 352, 2016.
- Fox et al. [2018] Charles W Fox, Josiah P Ritchey, and CE Timothy Paine. Patterns of authorship in ecology and evolution: First, last, and corresponding authorship vary with gender and geography. Ecology and evolution, 8(23):11492–11507, 2018.
- Fulvio et al. [2021] Jacqueline M Fulvio, Ileri Akinnola, and Bradley R Postle. Gender (im) balance in citation practices in cognitive neuroscience. Journal of Cognitive Neuroscience, 33(1):3–7, 2021.
- Gayet-Ageron et al. [2019] Angele Gayet-Ageron, Antoine Poncet, and Thomas Perneger. Comparison of the contributions of female and male authors to medical research in 2000 and 2015: a cross-sectional study. BMJ open, 9(2):e024436, 2019.
- Gebru et al. [2021] Timnit Gebru, Jamie Morgenstern, Briana Vecchione, Jennifer Wortman Vaughan, Hanna Wallach, Hal Daumé III, and Kate Crawford. Datasheets for datasets. Communications of the ACM, 64(12):86–92, 2021. doi: https://doi.org/10.1145/3458723.
- González-Álvarez and Cervera-Crespo [2019] Julio González-Álvarez and Teresa Cervera-Crespo. Contemporary psychology and women: A gender analysis of the scientific production. International Journal of Psychology, 54(1):135–143, 2019.
- Guthridge et al. [2022] Michaela Guthridge, Maggie Kirkman, Tania Penovic, and Melita J. Giummarra. Promoting Gender Equality: A Systematic Review of Interventions. Social Justice Research, 35(3):318–343, September 2022. ISSN 0885-7466, 1573-6725. doi: 10.1007/s11211-022-00398-z. URL https://link.springer.com/10.1007/s11211-022-00398-z.
- Hagan et al. [2020] Ada K Hagan, Begüm D Topçuoğlu, Mia E Gregory, Hazel A Barton, and Patrick D Schloss. Women are underrepresented and receive differential outcomes at asm journals: a six-year retrospective analysis. MBio, 11(6):10–1128, 2020.
- Hannak et al. [2020] Aniko Hannak, Kenneth Joseph, Andrei Cimpian, and Daniel B Larremore. Explaining gender differences in academics’ career trajectories. arXiv preprint arXiv:2009.10830, 2020.
- Heilman [2001] Madeline E Heilman. Description and prescription: How gender stereotypes prevent women’s ascent up the organizational ladder. Journal of social issues, 57(4):657–674, 2001. doi: https://doi.org/10.1111/0022-4537.00234.
- Holman et al. [2018] Luke Holman, Devi Stuart-Fox, and Cindy E Hauser. The gender gap in science: How long until women are equally represented? PLoS biology, 16(4):e2004956, 2018.
- Huang et al. [2020] Junming Huang, Alexander J. Gates, Roberta Sinatra, and Albert-László Barabási. Historical comparison of gender inequality in scientific careers across countries and disciplines. Proceedings of the National Academy of Sciences, 117(9):4609–4616, March 2020. doi: 10.1073/pnas.1914221117. URL https://www.pnas.org/doi/full/10.1073/pnas.1914221117. Publisher: Proceedings of the National Academy of Sciences.
- Jemielniak et al. [2023] Dariusz Jemielniak, Agnieszka Sławska, and Maciej Wilamowski. Covid-19 effect on the gender gap in academic publishing. Journal of Information Science, 49(6):1587–1592, 2023.
- Kim and Diesner [2015] Jinseok Kim and Jana Diesner. The effect of data pre-processing on understanding the evolution of collaboration networks. Journal of Informetrics, 9(1):226–236, 2015.
- Kim and Diesner [2016] Jinseok Kim and Jana Diesner. Distortive effects of initial-based name disambiguation on measurements of large-scale coauthorship networks. Journal of the Association for Information Science and Technology, 67(6):1446–1461, 2016. doi: https://doi.org/10.1002/asi.23489.
- Kim et al. [2014] Jinseok Kim, Heejun Kim, and Jana Diesner. The impact of name ambiguity on properties of coauthorship networks. Journal of Information Science Theory and Practice, 2(2):6–15, 2014.
- King and Frederickson [2021] Molly M. King and Megan E. Frederickson. The Pandemic Penalty: The Gendered Effects of COVID-19 on Scientific Productivity. Socius: Sociological Research for a Dynamic World, 7:237802312110069, January 2021. ISSN 2378-0231, 2378-0231. doi: 10.1177/23780231211006977. URL http://journals.sagepub.com/doi/10.1177/23780231211006977.
- King et al. [2017] Molly M. King, Carl T. Bergstrom, Shelley J. Correll, Jennifer Jacquet, and Jevin D. West. Men Set Their Own Cites High: Gender and Self-citation across Fields and over Time. Socius: Sociological Research for a Dynamic World, 3:237802311773890, January 2017. ISSN 2378-0231, 2378-0231. doi: 10.1177/2378023117738903. URL http://journals.sagepub.com/doi/10.1177/2378023117738903.
- Kong et al. [2022] Hyunsik Kong, Samuel Martin-Gutierrez, and Fariba Karimi. Influence of the first-mover advantage on the gender disparities in physics citations. Communications Physics, 5(1):243, 2022.
- Kozlowski et al. [2022] Diego Kozlowski, Vincent Larivière, Cassidy R Sugimoto, and Thema Monroe-White. Intersectional inequalities in science. Proceedings of the National Academy of Sciences, 119(2):e2113067119, 2022.
- Kramer [2023] Andrea S. Kramer. Beyond Bias: How to Fix the System, Not the Symptoms of Gender Inequality at Work. John Murray Press, 2023. URL https://uk.bookshop.org/p/books/beyond-bias-how-to-fix-the-system-not-the-symptoms-of-gender-inequality-at-work-andrea-s-kramer/7445884.
- Larivière et al. [2011] Vincent Larivière, Etienne Vignola-Gagné, Christian Villeneuve, Pascal Gélinas, and Yves Gingras. Sex differences in research funding, productivity and impact: an analysis of Québec university professors. Scientometrics, 87(3):483–498, June 2011. ISSN 0138-9130, 1588-2861. doi: 10.1007/s11192-011-0369-y. URL http://link.springer.com/10.1007/s11192-011-0369-y.
- Larivière et al. [2013] Vincent Larivière, Chaoqun Ni, Yves Gingras, Blaise Cronin, and Cassidy R. Sugimoto. Bibliometrics: Global gender disparities in science. Nature, 504(7479):211–213, December 2013. ISSN 1476-4687. doi: https://doi.org/10.1038/504211a. URL https://www.nature.com/articles/504211a.
- Lerchenmueller et al. [2019] Marc J Lerchenmueller, Olav Sorenson, and Anupam B Jena. Gender differences in how scientists present the importance of their research: observational study. bmj, 367, 2019.
- Liu et al. [2023] Fengyuan Liu, Petter Holme, Matteo Chiesa, Bedoor AlShebli, and Talal Rahwan. Gender inequality and self-publication are common among academic editors. Nature Human Behaviour, 7(3):353–364, January 2023. ISSN 2397-3374. doi: 10.1038/s41562-022-01498-1. URL https://www.nature.com/articles/s41562-022-01498-1.
- Lockhart et al. [2023] Jeffrey W Lockhart, Molly M King, and Christin Munsch. Name-based demographic inference and the unequal distribution of misrecognition. Nature Human Behaviour, 7(7):1084–1095, 2023.
- Maggio et al. [2023] Lauren A Maggio, Joseph A Costello, Anton Boudreau Ninkov, Jason R Frank, and Anthony R Artino Jr. The voices of medical education scholarship: Describing the published landscape. Medical education, 57(3):280–289, 2023.
- Maliniak et al. [2013] Daniel Maliniak, Ryan Powers, and Barbara F. Walter. The Gender Citation Gap in International Relations. International Organization, 67(4):889–922, October 2013. ISSN 0020-8183, 1531-5088. doi: 10.1017/S0020818313000209. URL https://www.cambridge.org/core/journals/international-organization/article/gender-citation-gap-in-international-relations/3A769C5CFA7E24C32641CDB2FD03126A. Publisher: Cambridge University Press.
- Mayer and Rathmann [2018] Sabrina J Mayer and Justus MK Rathmann. How does research productivity relate to gender? analyzing gender differences for multiple publication dimensions. Scientometrics, 117(3):1663–1693, 2018.
- McDermott et al. [2018] Mollie McDermott, Douglas J Gelb, Kelsey Wilson, Megan Pawloski, James F Burke, Anita V Shelgikar, and Zachary N London. Sex differences in academic rank and publication rate at top-ranked us neurology programs. Jama Neurology, 75(8):956–961, 2018.
- McElduff et al. [2008] Fiona McElduff, Pablo Mateos, Angie Wade, and Mario Cortina Borja. What’s in a name? the frequency and geographic distributions of uk surnames. Significance, 5(4):189–192, 2008.
- McHugh [2012] Mary L McHugh. Interrater reliability: the kappa statistic. Biochemia medica, 22(3):276–282, 2012.
- Merriman et al. [2021] Rebekah Merriman, Ilaria Galizia, Sonja Tanaka, Ashley Sheffel, Kent Buse, and Sarah Hawkes. The gender and geography of publishing: a review of sex/gender reporting and author representation in leading general medical and global health journals. BMJ global health, 6(5):e005672, 2021.
- Mihaljević-Brandt et al. [2016] Helena Mihaljević-Brandt, Lucía Santamaría, and Marco Tullney. The effect of gender in the publication patterns in mathematics. PLoS One, 11(10):e0165367, 2016.
- Mishra et al. [2018] Shubhanshu Mishra, Brent D. Fegley, Jana Diesner, and Vetle I. Torvik. Self-citation is the hallmark of productive authors, of any gender. PLOS ONE, 13(9):e0195773, September 2018. ISSN 1932-6203. doi: 10.1371/journal.pone.0195773. URL https://dx.plos.org/10.1371/journal.pone.0195773.
- Mitchell et al. [2019] Margaret Mitchell, Simone Wu, Andrew Zaldivar, Parker Barnes, Lucy Vasserman, Ben Hutchinson, Elena Spitzer, Inioluwa Deborah Raji, and Timnit Gebru. Model cards for model reporting. In Proceedings of the conference on fairness, accountability, and transparency, pages 220–229, 2019. doi: https://doi.org/10.1145/3287560.3287596.
- Mohammad [2020] Saif M. Mohammad. Gender gap in natural language processing research: Disparities in authorship and citations. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 7860–7870, Online, July 2020. Association for Computational Linguistics. doi: 10.18653/v1/2020.acl-main.702. URL https://aclanthology.org/2020.acl-main.702.
- Murphy et al. [2020] Mary C Murphy, Amanda F Mejia, Jorge Mejia, Xiaoran Yan, Sapna Cheryan, Nilanjana Dasgupta, Mesmin Destin, Stephanie A Fryberg, Julie A Garcia, Elizabeth L Haines, et al. Open science, communal culture, and women’s participation in the movement to improve science. Proceedings of the National Academy of Sciences, 117(39):24154–24164, 2020.
- Ni et al. [2021] Chaoqun Ni, Elise Smith, Haimiao Yuan, Vincent Larivière, and Cassidy R Sugimoto. The gendered nature of authorship. Science advances, 7(36):eabe4639, 2021.
- Nielsen [2016] Mathias Wullum Nielsen. Gender inequality and research performance: moving beyond individual-meritocratic explanations of academic advancement. Studies in Higher Education, 41(11):2044–2060, November 2016. ISSN 0307-5079. doi: 10.1080/03075079.2015.1007945. URL https://doi.org/10.1080/03075079.2015.1007945.
- Nunkoo et al. [2020] Robin Nunkoo, Michael Thelwall, Jeynakshi Ladsawut, and Sandhiya Goolaup. Three decades of tourism scholarship: Gender, collaboration and research methods. Tourism management, 78:104056, 2020.
- Odic and Wojcik [2020] Darko Odic and Erica H. Wojcik. The publication gender gap in psychology. American Psychologist, 75(1):92–103, January 2020. ISSN 1935-990X, 0003-066X. doi: 10.1037/amp0000480. URL http://doi.apa.org/getdoi.cfm?doi=10.1037/amp0000480.
- Østby et al. [2013] Gudrun Østby, Håvard Strand, Ragnhild Nordås, and Nils Petter Gleditsch. Gender gap or gender bias in peace research? publication patterns and citation rates for journal of peace research, 1983–2008. International Studies Perspectives, 14(4):493–506, 2013.
- Palser et al. [2022] Eleanor R Palser, Maia Lazerwitz, and Aikaterini Fotopoulou. Gender and geographical disparity in editorial boards of journals in psychology and neuroscience. Nature Neuroscience, 25(3):272–279, 2022.
- Pilkina and Lovakov [2022] Marina Pilkina and Andrey Lovakov. Gender disparities in Russian academia: a bibliometric analysis. Scientometrics, 127(6):3577–3591, June 2022. ISSN 1588-2861. doi: https://doi.org/10.1007/s11192-022-04383-w. URL https://doi.org/10.1007/s11192-022-04383-w.
- Pinheiro et al. [2022] Henrique Pinheiro, Matt Durning, and David Campbell. Do women undertake interdisciplinary research more than men, and do self-citations bias observed differences? Quantitative Science Studies, 3(2):363–392, June 2022. ISSN 2641-3337. doi: 10.1162/qss_a_00191. URL https://direct.mit.edu/qss/article/3/2/363/111199/Do-women-undertake-interdisciplinary-research-more.
- Qamar et al. [2020] Sadia Raheez Qamar, Kiran Khurshid, Sabeena Jalal, Matthew DF McInnes, Linda Probyn, Karen Finlay, Cameron J Hague, Rebecca M Hibbert, Manish Joshi, Frank J Rybicki, et al. Gender disparity among leaders of canadian academic radiology departments. American Journal of Roentgenology, 214(1):3–9, 2020.
- Raj et al. [2016] Anita Raj, Phyllis L Carr, Samantha E Kaplan, Norma Terrin, Janis L Breeze, and Karen M Freund. Longitudinal analysis of gender differences in academic productivity among medical faculty across 24 medical schools in the united states. Academic medicine: journal of the Association of American Medical Colleges, 91(8):1074, 2016.
- Rao and Taboada [2021] Prashanth Rao and Maite Taboada. Gender Bias in the News: A Scalable Topic Modelling and Visualization Framework. Frontiers in Artificial Intelligence, 4:664737, June 2021. ISSN 2624-8212. doi: 10.3389/frai.2021.664737. URL https://www.frontiersin.org/articles/10.3389/frai.2021.664737/full.
- Ridgeway [2014] Cecilia L Ridgeway. Why status matters for inequality. American sociological review, 79(1):1–16, 2014.
- Ross et al. [2022] Matthew B. Ross, Britta M. Glennon, Raviv Murciano-Goroff, Enrico G. Berkes, Bruce A. Weinberg, and Julia I. Lane. Women are credited less in science than men. Nature, 608(7921):135–145, August 2022. ISSN 0028-0836, 1476-4687. doi: 10.1038/s41586-022-04966-w. URL https://www.nature.com/articles/s41586-022-04966-w.
- Salerno et al. [2019] Patricia E Salerno, Mónica Páez-Vacas, Juan M Guayasamin, and Jennifer L Stynoski. Male principal investigators (almost) don’t publish with women in ecology and zoology. PloS one, 14(6):e0218598, 2019.
- Santamaría and Mihaljević [2018] Lucía Santamaría and Helena Mihaljević. Comparison and benchmark of name-to-gender inference services. PeerJ Computer Science, 4:e156, 2018.
- Santamaría and Mihaljević [2018] Lucía Santamaría and Helena Mihaljević. Comparison and benchmark of name-to-gender inference services. PeerJ Computer Science, 4:e156, July 2018. ISSN 2376-5992. doi: 10.7717/peerj-cs.156. URL https://doi.org/10.7717/peerj-cs.156.
- Schisterman et al. [2017] Enrique F Schisterman, Chandra W Swanson, Ya-Ling Lu, and Sunni L Mumford. The changing face of epidemiology: gender disparities in citations? Epidemiology (Cambridge, Mass.), 28(2):159, 2017.
- Schmaling and Gallo [2023] Karen B Schmaling and Stephen A Gallo. Gender differences in peer reviewed grant applications, awards, and amounts: a systematic review and meta-analysis. Research integrity and peer review, 8(1):2, 2023.
- Shen [2013] Helen Shen. Inequality quantified: Mind the gender gap. Nature News, 495(7439):22, 2013.
- Sim and Wright [2005] Julius Sim and Chris C Wright. The kappa statistic in reliability studies: use, interpretation, and sample size requirements. Physical therapy, 85(3):257–268, 2005.
- Squazzoni et al. [2021a] Flaminio Squazzoni, Giangiacomo Bravo, Mike Farjam, Ana Marusic, Bahar Mehmani, Michael Willis, Aliaksandr Birukou, Pierpaolo Dondio, and Francisco Grimaldo. Peer review and gender bias: A study on 145 scholarly journals. Science advances, 7(2):eabd0299, 2021a.
- Squazzoni et al. [2021b] Flaminio Squazzoni, Giangiacomo Bravo, Francisco Grimaldo, Daniel García-Costa, Mike Farjam, and Bahar Mehmani. Gender gap in journal submissions and peer review during the first wave of the covid-19 pandemic. a study on 2329 elsevier journals. PloS one, 16(10):e0257919, 2021b.
- Tao et al. [2017] Yu Tao, Wei Hong, and Ying Ma. Gender differences in publication productivity among academic scientists and engineers in the us and china: similarities and differences. Minerva, 55:459–484, 2017.
- Teich et al. [2022] Erin G. Teich, Jason Z. Kim, Christopher W. Lynn, Samantha C. Simon, Andrei A. Klishin, Karol P. Szymula, Pragya Srivastava, Lee C. Bassett, Perry Zurn, Jordan D. Dworkin, and Dani S. Bassett. Citation inequity and gendered citation practices in contemporary physics. Nature Physics, 18(10):1161–1170, October 2022. ISSN 1745-2473, 1745-2481. doi: 10.1038/s41567-022-01770-1. URL https://www.nature.com/articles/s41567-022-01770-1.
- Thelwall [2020] Mike Thelwall. Gender differences in citation impact for 27 fields and six english-speaking countries 1996–2014. Quantitative Science Studies, 1(2):599–617, 2020.
- Thomas et al. [2019] Emma G Thomas, Bamini Jayabalasingham, Tom Collins, Jeroen Geertzen, Chinh Bui, and Francesca Dominici. Gender disparities in invited commentary authorship in 2459 medical journals. JAMA network open, 2(10):e1913682–e1913682, 2019.
- Tien Pham and Thanh Nguyen [2023] Duong Tien Pham and Luan Thanh Nguyen. Gendec: A machine learning-based framework for gender detection from japanese names. In International Conference on Intelligent Systems Design and Applications, pages 235–244. Springer, 2023.
- Torvik and Agarwal [2016] Vetle I Torvik and Sneha Agarwal. Ethnea–an instance-based ethnicity classifier based on geo-coded author names in a large-scale bibliographic database. 2016.
- Van de Weijer et al. [2020] Jeroen Van de Weijer, Guangyuan Ren, Joost van de Weijer, Weiyun Wei, and Yumeng Wang. Gender identification in chinese names. Lingua, 234:102759, 2020.
- Van den Besselaar and Sandström [2016] Peter Van den Besselaar and Ulf Sandström. Gender differences in research performance and its impact on careers: a longitudinal case study. Scientometrics, 106:143–162, 2016.
- Van den Besselaar and Sandström [2017] Peter Van den Besselaar and Ulf Sandström. Vicious circles of gender bias, lower positions, and lower performance: Gender differences in scholarly productivity and impact. PloS one, 12(8):e0183301, 2017.
- Vranas et al. [2020] Kelly C Vranas, David Ouyang, Amber L Lin, Christopher G Slatore, Donald R Sullivan, Meeta Prasad Kerlin, Kathleen D Liu, Rebecca M Baron, Carolyn S Calfee, Lorraine B Ware, et al. Gender differences in authorship of critical care literature. American journal of respiratory and critical care medicine, 201(7):840–847, 2020.
- Vásárhelyi et al. [2021] Orsolya Vásárhelyi, Igor Zakhlebin, Staša Milojević, and Emőke-Ágnes Horvát. Gender inequities in the online dissemination of scholars’ work. Proceedings of the National Academy of Sciences, 118(39):e2102945118, September 2021. ISSN 0027-8424, 1091-6490. doi: 10.1073/pnas.2102945118. URL https://pnas.org/doi/full/10.1073/pnas.2102945118.
- Wang et al. [2021] X. Wang, J. D. Dworkin, D. Zhou, J. Stiso, E. B. Falk, D. S. Bassett, P. Zurn, and D. M. Lydon-Staley. Gendered citation practices in the field of communication. Annals of the International Communication Association, 45(2):134–153, April 2021. ISSN 2380-8985. doi: https://doi.org/10.1080/23808985.2021.1960180. URL https://doi.org/10.1080/23808985.2021.1960180.
- West et al. [2013] Jevin D West, Jennifer Jacquet, Molly M King, Shelley J Correll, and Carl T Bergstrom. The role of gender in scholarly authorship. PloS one, 8(7):e66212, 2013.
- Williams and Ceci [2015] Wendy M Williams and Stephen J Ceci. National hiring experiments reveal 2: 1 faculty preference for women on stem tenure track. Proceedings of the National Academy of Sciences, 112(17):5360–5365, 2015. doi: https://doi.org/10.1073/pnas.1418878112.
- Wu [2024] Cary Wu. The gender citation gap: Approaches, explanations, and implications. Sociology Compass, 18(2):e13189, 2024. doi: https://doi.org/10.1111/soc4.13189. URL https://compass.onlinelibrary.wiley.com/doi/abs/10.1111/soc4.13189.
- Wu et al. [2020] Cary Wu, Sylvia Fuller, Zhilei Shi, and Rima Wilkes. The gender gap in commenting: Women are less likely than men to comment on (men’s) published research. PLoS One, 15(4):e0230043, 2020.
- Yalamanchali et al. [2021] Anirudh Yalamanchali, Emily S Zhang, and Reshma Jagsi. Trends in female authorship in major journals of 3 oncology disciplines, 2002-2018. JAMA Network Open, 4(4):e212252–e212252, 2021.
- Yang et al. [2022] Yang Yang, Tanya Y Tian, Teresa K Woodruff, Benjamin F Jones, and Brian Uzzi. Gender-diverse teams produce more novel and higher-impact scientific ideas. Proceedings of the National Academy of Sciences, 119(36):e2200841119, 2022.
- You et al. [2024] Zhiwen You, HaeJin Lee, Shubhanshu Mishra, Sullam Jeoung, Apratim Mishra, Jinseok Kim, and Jana Diesner. Beyond binary gender labels: Revealing gender bias in LLMs through gender-neutral name predictions. In Agnieszka Faleńska, Christine Basta, Marta Costa-jussà, Seraphina Goldfarb-Tarrant, and Debora Nozza, editors, Proceedings of the 5th Workshop on Gender Bias in Natural Language Processing (GeBNLP), pages 255–268, Bangkok, Thailand, August 2024. Association for Computational Linguistics. doi: 10.18653/v1/2024.gebnlp-1.16. URL https://aclanthology.org/2024.gebnlp-1.16.
- Zhang et al. [2022] Lin Zhang, Yuanyuan Shang, Ying Huang, and Gunnar Sivertsen. Gender differences among active reviewers: an investigation based on publons. Scientometrics, 127(1):145–179, 2022.
- Zhao et al. [2023] Xinyi Zhao, Aliakbar Akbaritabar, Ridhi Kashyap, and Emilio Zagheni. A gender perspective on the global migration of scholars. Proceedings of the National Academy of Sciences, 120(10):e2214664120, March 2023. ISSN 0027-8424, 1091-6490. doi: 10.1073/pnas.2214664120. URL https://pnas.org/doi/10.1073/pnas.2214664120.
Appendix A Appendix
A.1 Paper sampling criteria
In this subsection, we list the keywords used to sample papers for this study. To capture a wide range of discussions on gender biases and related scholarly performance, we combined terms such as “gender bias,” “gender gap,” “gender disparities,” “gender differences,” “publication/academic/scientific performance,” “citation(s)/gendered citation,” “demographic bias,” “productivity,” and “scholarly analysis.” These keywords were applied individually and in combination to maximize the breadth and relevance of our literature search, while minimizing redundancy. For example, we paired “gender gap” with “citations” or “gender bias” with “productivity” to capture a variety of relevant contexts.
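The way keywords can be applied individually and in pairs is illustrated by the minimal Python sketch below. It is not the procedure used for this study: the split of the keywords into two groups, the function name `build_queries`, and the use of a Boolean `AND` operator between quoted phrases are all assumptions made for illustration.

```python
from itertools import product

# Terms taken from the keyword list above. Splitting them into a "gender"
# group and an "outcome" group is an illustrative assumption.
GENDER_TERMS = ["gender bias", "gender gap", "gender disparities", "gender differences"]
OUTCOME_TERMS = ["citations", "productivity", "publication performance", "scholarly analysis"]


def build_queries(gender_terms, outcome_terms):
    """Return single-term queries followed by every gender/outcome pairing."""
    def quoted(term):
        return f'"{term}"'

    singles = [quoted(t) for t in gender_terms + outcome_terms]
    pairs = [
        f"{quoted(g)} AND {quoted(o)}"
        for g, o in product(gender_terms, outcome_terms)
    ]
    # dict.fromkeys drops duplicate queries while preserving their order.
    return list(dict.fromkeys(singles + pairs))


queries = build_queries(GENDER_TERMS, OUTCOME_TERMS)
```

Generating the pairings programmatically makes the full query set easy to report alongside the sample, which is the kind of detail a SoDA Card is meant to record.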
A.2 Overview of trends in gender bias research
We analyzed themes and trends across domains, data sources, geographical and temporal data scopes, and scholarly performance indicators for the papers in our sample. We note that we do not provide specific percentages or proportions of papers for these dimensions, as our emphasis is on uncovering general trends rather than presenting granular quantitative breakdowns.
A.2.1 Domain
Gender bias refers to the systemic favoring of one gender over others, often resulting in unequal opportunities, representation, or treatment in societal, academic, or professional contexts. It has been studied in a wide range of fields. For instance, numerous studies in medicine [14, 24, 27] used bibliometric analyses to detect gender-based disparities in publication patterns. Similar large-scale data analyses have been conducted in astronomy [8, 95, 21, 53], physics [85], and psychology [65, 29, 52], revealing patterns of underrepresentation, differential citation rates, and uneven editorial review processes.
While many studies in our sample focus on gender inequalities within a specific field, others adopt interdisciplinary approaches by examining the issue across multiple disciplines. In fields such as Science, Technology, Engineering, and Mathematics (STEM) [40, 35, 34] and Science and Technology Studies (STS) [1], scholars have employed mixed-methods approaches, combining quantitative citation metrics with qualitative assessments of institutional practices to uncover persistent gender imbalances. Cross-domain analyses further illustrate that gender bias manifests in multiple contexts, consistently affecting authorship credit, editorial decisions, and scholarly influence [68, 46, 48, 74, 63, 41, 83]. In summary, studies of gender bias span many domains, and researchers employ varied analyses to uncover patterns of inequity in representation, research visibility, and scholarly recognition. This breadth indicates that gender bias is a pervasive issue across academic fields.
A.2.2 Geographical and temporal scope of data used in studies
The geographical scope of the studies on gender inequality in academia varies widely, with some papers adopting a global perspective while others opt for a country-specific approach. This breadth reflects two complementary goals: a comprehensive view of this persistent issue, and a detailed, context-specific understanding that could inform policy changes and support initiatives for attracting and retaining women in academia. Studies that have adopted a global perspective (i.e., covering multiple nations) have unveiled international trends and contributed to a broad overview of the gender inequality landscape in academia through comparative analyses [8, 46, 85, 40, 35, 41, 58, 65, 60, 34, 29, 57, 6, 9, 24, 47, 101, 94].
Several studies have focused on gender inequality in academia within particular countries. For instance, Pilkina and Lovakov [68] focused on Russia, while Ross et al. [74], McDermott et al. [53], Carter et al. [12], and Gayet-Ageron et al. [27] analyzed the gender imbalance within the context of the United States. Abramo et al. [1] concentrated on Italy, Larivière et al. [45] on Quebec, Rao and Taboada [72] on Canada, Nielsen [63] on Denmark, and Mayer and Rathmann [52] on Germany. These country-specific studies highlight the importance of a locally contextualized understanding of gender inequality, considering the unique cultural, social, and legislative landscapes that might influence this issue. They are an essential complement to global studies, providing a richer and more detailed picture of gender inequalities in academia.
A.2.3 Scholarly performance indicators
Numerous bibliometric indicators have been used to investigate gender bias. Many studies have scrutinized female and male researchers’ scholarly influence and prominence, often using a set of bibliometric and altmetric indicators [71, 26, 87]. Traditional bibliometric metrics include citation counts, patterns of self-citation and citing behavior, the impact factor of the publishing journal, and the h-index. More contemporary measures, such as social media metrics, are increasingly used to capture the broader academic impact, visibility, and attention that publications achieve [53, 12]. In our sample, the most frequently analyzed scholarly performance indicators were the number of publications, citations, self-citations, journal impact factor, h-index, and author by-line order.
Turning to authors’ productivity, we found that the number of publications was frequently used as a proxy [103, 65, 1, 45, 68, 6]. For instance, Abramo et al. [1] found that male researchers produced, on average, 16.8% more papers than female researchers. Similarly, Odic and Wojcik [65] found that male first authors achieve higher publication and citation counts even when affiliations are controlled for. Furthermore, Bendels et al. [6] found that women publish fewer articles than men and hold fewer prestigious authorship positions, and that articles with female key (first or last) authors are cited less frequently than those with male key authors, with these disparities being most pronounced in high-impact journals and highly collaborative articles.
To measure the impact of research papers, the studies in our sample used numerous metrics, such as the number of citations [43, 42, 9, 60, 63, 51, 35, 69, 85, 14, 8, 21, 95], h-index [12, 53], self-citation [58, 41, 4], and journal impact factor [57, 52]. For the number of citations, Chatterjee and Werner [14] found that articles written by women as first authors had fewer median citations than those written by men as first authors. Similarly, Caplar et al. [8] found that papers authored by women receive 10.4 ± 0.9% fewer citations than would be expected if men had written papers with the same non-gender-specific properties. For the h-index, Carter et al. [12] found that the gender gap in the h-index was largest at the full professor level and smallest at the associate professor level, where women’s h-index scores were close to those of men. For journal impact, Mihaljević-Brandt et al. [57] found that female authors published significantly less in top journals than their male counterparts.
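Because the h-index recurs among these indicators, a minimal Python sketch of its standard definition may be useful: an author has index h if h of their papers have at least h citations each. The function name `h_index` and the citation counts are hypothetical and are not drawn from any study in our sample.

```python
def h_index(citations):
    """Largest h such that at least h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Hypothetical citation counts for one author's five papers.
example_h = h_index([10, 8, 5, 4, 3])
```

Here `example_h` is 4, since four papers have at least four citations each but there are not five papers with at least five.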
Some studies [2, 25, 64, 75, 56, 67, 29, 22, 46, 40, 61] looked into the authorship positions and author by-line order to investigate gender biases. Authorship positions, such as first, middle, or last author, often carry distinct levels of recognition and prestige within academic publications. In contrast, the author by-line order refers to the sequence in which authors are listed, providing insight into collaboration dynamics and the relative contributions of each author. By analyzing these aspects, these studies aimed to uncover patterns of gender disparity in recognition, leadership roles, and collaborative influence within scholarly publishing.
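As an illustration of how authorship positions can be derived from by-line order, the following minimal Python sketch labels each author by position. The function name `authorship_position`, the separate "sole" label for single-author papers, and the example by-line are assumptions made for illustration. The studies cited above may define positions differently.

```python
def authorship_position(index, n_authors):
    """Label a 0-based by-line index as 'sole', 'first', 'middle', or 'last'."""
    if not 0 <= index < n_authors:
        raise ValueError("index must fall within the by-line")
    if n_authors == 1:
        return "sole"  # assumption: single-author papers get their own label
    if index == 0:
        return "first"
    if index == n_authors - 1:
        return "last"
    return "middle"


# Hypothetical four-author by-line.
byline = ["A. Author", "B. Author", "C. Author", "D. Author"]
positions = [authorship_position(i, len(byline)) for i in range(len(byline))]
```

How single-author and two-author papers are handled affects first- and last-author counts, so it is worth documenting this choice explicitly.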
Larivière et al. [46] found that articles first-authored by men outnumber those first-authored by women by nearly two to one (1.93). Filardo et al. [24] observed that the representation of women among first authors of original research in high-impact general medical journals was significantly higher in 2014 than 20 years earlier, but has plateaued in recent years and declined in some journals. In contrast, Murphy et al. [61] found that women are more likely to be represented in high-status author positions in open science.