Twitter Analysis Can Help Practitioners, Policymakers, and Researchers Better Understand Topics Relevant to American Indian/Alaska Native Youth
Children and youth under age 18 constitute nearly one third of the American Indian/Alaska Native (AI/AN) population in the United States, yet data on this sub-population’s needs and interests can be hard to come by. Social media data represent an increasingly popular source of information for many populations and topics. This brief and data visualization provide an overview of learnings from a recent analysis of Twitter engagement around AI/AN-focused hashtags. The interactive data visualization displays the top 50 hashtags for each month in our analysis (January 2015 through July 2020) with important AI/AN-specific and U.S. events provided on a timeline slider for context.
In this brief we share key implications for practitioners and policy stakeholders who work with AI/AN communities or on issues relevant to these communities. We also offer recommendations for interpreting these Twitter data that may be useful to researchers and others who seek to use similar social media data in their work. Exploring this data visualization may provide additional insight to practitioners, policymakers, and researchers interested in focusing in more depth on a specific topic or issue.
An estimated 5.6 million people in the United States (or 1.7% of the total population) self-identify as American Indian/Alaska Native (AI/AN), either alone or in combination with one or more other races; furthermore, approximately 29.3 percent of AI/AN individuals are under age 18 (versus 22.8% of the total U.S. population). Because AI/AN children and youth (i.e., individuals under age 18) make up a relatively small sub-population within the United States, it can be difficult to identify high-quality, timely data about them. Nationally representative data sets rarely include AI/AN samples large enough to present disaggregated results. And while recommended community-based and community-engaged approaches provide rich information, they are time- and resource-intensive and are not designed to produce an understanding of larger population trends.
Big data—particularly the unstructured data produced when people use social media—present an opportunity to efficiently learn about population-level trends worthy of greater focus in research, policy, and practice. While AI/AN issues have historically been marginalized within public discourse in the United States and elsewhere, social media provides an easily accessible avenue for AI/AN individuals, communities, and organizations to connect with one another and amplify their voices in broader conversations.
In addition, when content sharing and dialogue occur between AI/AN peoples and non-AI/AN entities, social media provides opportunities for collaborative relationships to form. AI/AN organizations and individuals, including youth, are increasingly active online and on social media. Social media has also played a critical role in mobilizing AI/AN communities around a number of important issues, with some causes championed by AI/AN communities (such as the youth-led pipeline protest on the Standing Rock reservation) recently gaining substantial traction with wider audiences.
Although social media offers users new connections and venues for dialogue, and researchers new opportunities to study those interactions, it can also reify detrimental and oppressive practices. AI/AN scholars have shown that social media platforms can be a space in which harmful stereotypes about AI/AN peoples are perpetuated, and where questionable claims of AI/AN identity may be prevalent.
Understanding trends in social media conversations about AI/AN issues could enhance dissemination of resources and information that both address emerging and underrepresented needs and speak to the strengths and interests of AI/AN peoples. Without an understanding of the social media environment in which these conversations take place, researchers, policymakers, and practitioners will be ill-equipped to take part in important engagement opportunities.
To gain an understanding of social media conversations on topics relevant to AI/AN children and youth, we first generated a list of AI/AN-focused hashtags (PDF). Then, we scraped from Twitter any tweet posted from January 1, 2015 to July 31, 2020 that included these hashtags. (Data scraping refers to the use of data science methods to extract data directly from the source—usually websites. In this case, we created a database of tweets that use specific hashtags.) Our data set also included initial replies to these tweets. To focus on conversations relevant to AI/AN children and youth, we narrowed the data set to tweets containing key age-related terms (PDF) and examined hashtags used in those tweets.
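The narrowing step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `AGE_TERMS` list and the sample tweets are hypothetical stand-ins (the brief's actual age-related terms are in the linked PDF), and the function names are invented for this example rather than taken from the project's code.

```python
import re

# Hypothetical age-related terms; the brief's real list lives in a linked PDF.
AGE_TERMS = ["youth", "child", "children", "teen", "kids"]

# Match any age term as a whole word, ignoring case.
AGE_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, AGE_TERMS)) + r")\b", re.IGNORECASE
)
# A hashtag is "#" followed by word characters (Unicode-aware in Python 3).
HASHTAG_PATTERN = re.compile(r"#(\w+)")


def mentions_age_term(text):
    """Return True if the tweet text contains any age-related term."""
    return AGE_PATTERN.search(text) is not None


def extract_hashtags(text):
    """Return the lowercased hashtags in a tweet, without the leading '#'."""
    return [tag.lower() for tag in HASHTAG_PATTERN.findall(text)]


def hashtags_in_age_related_tweets(tweets):
    """Keep only tweets that mention an age term, then collect their hashtags."""
    hashtags = []
    for text in tweets:
        if mentions_age_term(text):
            hashtags.extend(extract_hashtags(text))
    return hashtags


# Made-up tweets for illustration only.
sample = [
    "Native youth are leading the way #NativeYouth #MniWiconi",
    "Council meeting tonight #NativeVote",
    "Programs for children and families #education",
]
print(hashtags_in_age_related_tweets(sample))
```

Whole-word matching is a design choice in this sketch: it keeps a term such as "teen" from matching inside "thirteen", at the cost of missing variants that are not listed explicitly.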
We used qualitative methods to identify themes across the hashtags and incorporated primary themes (i.e., Activism; Identity, Nationality, or Nation; Politics; and Other) into the data visualization through color coding. In addition to facilitating visualization of trends over time, the themes may guide readers who are less familiar with some hashtags and topics (e.g., #mniwiconi is coded as Activism because mní wičóni translates to “water is life” in the Lakota language, and the phrase was used in protests against the Dakota Access Pipeline). Engagement metrics for each hashtag (i.e., number of users, tweets, retweets, and favorites) are also displayed in the data visualization.
In addition, our qualitative analysis allowed for the identification of a secondary theme and up to two sub-themes for each hashtag, as well as an assessment of whether each hashtag likely originated in or focused on Indigenous communities (e.g., #mniwiconi was coded Activism – Pipelines – DAPL – Mní wičóni; Indigenous); however, these results were not incorporated into the data visualization. More details on the methodology can be found in the Methodology Note below.
The primary themes among hashtags appearing in the data visualization were Activism (35%); Identity, Nationality, or Nation (19%); and Politics (11%). Hashtags with the primary theme Other (35%) had a variety of secondary themes, the most common of which was Artistic or Cultural Expression (20% of Other hashtags).
Among hashtags with the primary theme Activism, the most common secondary theme was Pipelines (22% of Activism hashtags). For hashtags with the primary theme Identity, Nationality, or Nation, the most common secondary theme was Racial or Ethnic Group (35% of Identity, Nationality, or Nation hashtags); and for hashtags with the primary theme Politics, the most common secondary theme was U.S. Politics (45% of Politics hashtags).
While we intentionally chose AI/AN-focused hashtags to create our initial data set, 69 percent of the hashtags in our final analysis are classified as non-Indigenous-specific. Examples of frequent non-Indigenous-specific hashtags include #cdnpoli (Canadian politics), #education, and #blacklivesmatter. Readers interested in learning more about the themes identified in the qualitative coding process are encouraged to review the supplemental file included in the Methodology Note below, which provides the full qualitative coding for each hashtag.
- As with any population, practitioners can use different social media platforms in different ways, depending on the specific AI/AN-related topic(s).
- Policy stakeholders should keep in mind that many people engaged in Twitter conversations about AI/AN children and youth use Activism- and Politics-themed hashtags.
- Researchers must understand that Indigenous identities, nations, and nationalities are intertwined and nuanced.
As with any population, practitioners can use different social media platforms in different ways, depending on the specific AI/AN-related topic(s).
While topics such as activism, politics, and identity were clearly featured in our analyses, only 7 percent of all hashtags were assigned the secondary theme Artistic or Cultural Expression. This result may reflect the choice of Twitter as the social media platform for this study. For instance, Social Distance Powwow was launched as a Facebook page in March 2020 and quickly became a cultural phenomenon due to the COVID-19 pandemic undermining the safety of large gatherings.
Although the event has transitioned to a movement that encompasses much more than cultural dancing and songs, and that often includes posts from AI/AN youth, it appears to have remained far more popular on Facebook than on Twitter; indeed, the hashtag #socialdistancepowwow was not a top-50 hashtag in any of the months we studied. An understanding of the social media landscape requires knowing not only what topics gain traction, but where—cultural events and posts, for example, may be more commonly featured on Facebook than on Twitter.
Policy stakeholders should keep in mind that many people engaged in Twitter conversations about AI/AN children and youth use Activism- and Politics-themed hashtags.
While the initial hashtag list used to generate our data set included several Activism hashtags, many additional Activism hashtags appeared in our final analysis, including #climatechange and #residentialschools. In fact, hashtags with the primary theme Activism (35% of all displayed hashtags) could be viewed as an undercount, given that some Twitter users may have used hashtags with the primary theme Politics for activism purposes. Politics hashtags represented only 4 percent of Indigenous-specific hashtags, but 14 percent of non-Indigenous-specific hashtags. We picked very few Politics hashtags to generate our data set (e.g., #NativeVote18), and only one (#nativevote) ended up in the final data visualization as one of the top-50 hashtags for any particular month. The Activism hashtags address a variety of topics, such as the environment, child welfare, criminal justice, and health. Twitter users who tweet about AI/AN issues may often use the platform to organize and discuss activism-related topics.
- When using social media to address AI/AN topics, stakeholders should be aware of intersectional identities and movements.
- Practitioners and policy stakeholders should consider prior trends in social media conversations to tailor future engagement with AI/AN communities.
- Researchers must remember that their assumptions and decision making can impact analyses, data visualization, and interpretation.
- Researchers should consider geography, privacy, and confidentiality when using social media data.
- Researchers also need to acknowledge the limitations of assessing identity when analyzing social media data.
- Finally, researchers should consider analyzing hashtags and metadata (data that describe other information about or characteristics of the data, such as what month a hashtag was tweeted).
When using social media to address AI/AN topics, stakeholders should be aware of intersectional identities and movements.
Our work demonstrates that topics not commonly associated with Indigenous populations are likely relevant to AI/AN individuals and communities. The plethora of non-Indigenous-specific hashtags prominent in our analysis highlights the ways in which AI/AN issues interact with issues critical to other populations, as well as with concerns relevant to everyone. When practitioners and policy stakeholders—including grassroots organizers, advocacy groups, and others who may engage in the policymaking process—create and share resources via social media, they should recognize that initiatives developed outside of (or that are non-specific to) AI/AN communities may still have relevance or application in AI/AN contexts.
The intersectional nature of our results is also relevant for research. When analyzing social media data to address AI/AN topics, researchers should look beyond AI/AN-focused hashtags to truly capitalize on a major benefit of unstructured data—the ability to observe intersecting interests and identities that exist in the real world, rather than an environment primarily driven by the research process. Our study started with a database of tweets generated from an initial list of AI/AN-focused hashtags. However, we deliberately chose to scrape Twitter for replies to those tweets to broaden the scope of our analysis, which produced a final list of hashtags that was largely non-Indigenous-specific.
Researchers should consider not only how AI/AN-specific needs and perspectives have often been marginalized, but also how AI/AN populations have often been represented as one-dimensional in the public sphere. Rather than control against complexity, similar use of big data techniques should seek to incorporate the complexity of lived experiences and interests to better capture the needs and perspectives of AI/AN populations.
Practitioners and policy stakeholders should consider prior trends in social media conversations to tailor future engagement with AI/AN communities.
Adding a temporal element to our analysis was important for understanding how conversations about issues salient to AI/AN communities have grown and changed over time. For instance, #MMIW (Missing and Murdered Indigenous Women) grew from nine users tweeting the hashtag in January 2015 to 238 in July 2020, and a number of related hashtags, such as #MMIWG (Missing and Murdered Indigenous Women and Girls) and #MMIP (Missing and Murdered Indigenous People), also began to show up on the data visualization over time.
In addition, common hashtags used in conjunction with recurring annual events are prominent in the data visualization, such as LGBT Pride Month (June), Indigenous Peoples Day (October), Native American Heritage Month (November), and a variety of conferences hosted by tribal organizations (multiple months). Understanding not just what—but when and to what extent—hashtags have been used on Twitter can facilitate a better connection with audiences engaged in these conversations.
For example, knowing which moments lead to increased engagement may be helpful for practitioners who aim to learn about specific AI/AN topics or disseminate new resources. Likewise, policy stakeholders—such as advocates and grassroots organizers already engaged with or hoping to engage AI/AN communities—could use the cadence and engagement metrics from our data visualization to inform their selection of hashtags and timing of social media posts for future outreach efforts.
Researchers must remember that their assumptions and decision making can impact analyses, data visualization, and interpretation.
Although the use of big data techniques in this project (namely, scraping Twitter data for trend analyses) may imply a straightforward, quantitative approach, our team remained keenly aware that our assumptions and decision making had implications for our analyses, as well as the resulting data visualization and its potential interpretation. For example, while our team included two individuals who identify as American Indian and First Nations (first author and expert consultant, respectively), there are 574 federally recognized tribes in the United States alone, and it would be impossible for two people to represent and/or possess deep knowledge of them all—not to mention the many other tribes that are not yet federally recognized or primarily reside in countries outside the United States (e.g., Canada, Australia).
The list of hashtags used to develop our data set of tweets likely missed key hashtags relevant to AI/AN peoples, such as those that use tribal languages or reflect specific cultural events and/or terminology. Even with some tribal languages present in our initial and final hashtag lists, careful human monitoring of automated text analyses was necessary to appropriately account for special characters and text combinations, highlighting the value of our mixed methods approach.
Finally, our team chose to analyze data from Twitter. Because the demographic characteristics of social media users vary by platform, the trends we observed in this analysis may not hold for similar analyses of data scraped from other social media platforms. Researchers should consider the broader identity characteristics of various social media platforms as they define their research questions and analyses.
Researchers should consider geography, privacy, and confidentiality when using social media data.
Our interest in social media conversations on topics relevant to AI/AN children and youth meant that our team had to consider both geography and the privacy and confidentiality of Twitter data. For example, how does one draw geographic borders around peoples and movements that are not limited to specific areas, and what geographic frameworks are appropriate when discussing AI/AN populations?
As noted above, although our database was generated from an initial list of AI/AN-focused hashtags, several hashtags related to Indigenous populations and movements in Canada were present in our analyses. Rather than attempt to remove these hashtags, we elected to leave them in, acknowledging that their presence likely reflects the shared histories and present-day connections that exist among Indigenous populations in North America, despite colonial borders.
Our team also considered geographic displays of the data and the inclusion of example tweets to demonstrate our qualitative coding procedures, but we refrained from both approaches given that, in some instances, the low number of users tweeting about specific topics raised identifiability concerns. Researchers must be cognizant of how social media data and analyses could be misused—especially when those data are related to populations for whom data have historically been collected for violent and oppressive purposes, and through exploitative means.
Researchers also need to acknowledge the limitations of assessing identity when analyzing social media data.
An important, related consideration is that our project does not allow for any conclusions about the identity of the Twitter users who posted the tweets in our analyses. While this is important in regard to identity protection for Twitter users who may belong to a relatively small population, it is also possible that some Twitter users have co-opted AI/AN identities or chosen to use AI/AN- or Indigenous-focused hashtags to capitalize on community values or movements.
For example, some hashtags in our analysis had a high volume of tweets but a very low number of users. Upon a review of the hashtag topic and example tweets, it became clear that some of these hashtags were likely related to advertising for goods and services that may not have originated with AI/AN individuals and companies, despite using terms such as Native or Indigenous. This consideration highlights the importance of reviewing both the number of users and the number of tweets when considering relevance.
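One simple way to surface such hashtags for human review is to compare tweet volume with the number of distinct users. The sketch below is an assumption-laden illustration, not the procedure used in this analysis: the function name, both thresholds, and the example counts are hypothetical, and a flag is only a prompt for manual review of the hashtag and its tweets.

```python
def flag_low_user_hashtags(stats, min_tweets_per_user=20.0, min_tweets=50):
    """Flag hashtags with many tweets coming from very few users.

    stats maps hashtag -> (number_of_tweets, number_of_users).
    Both thresholds are hypothetical and would need tuning on real data.
    """
    flagged = []
    for hashtag, (tweets, users) in stats.items():
        # Skip hashtags with no users or too little volume to judge.
        if users == 0 or tweets < min_tweets:
            continue
        if tweets / users >= min_tweets_per_user:
            flagged.append(hashtag)
    return sorted(flagged)


# Made-up counts for illustration only.
example_stats = {
    "#example_ad": (600, 4),       # 150 tweets per user
    "#example_topic": (900, 300),  # 3 tweets per user
    "#example_rare": (10, 1),      # too few tweets to judge
}
print(flag_low_user_hashtags(example_stats))  # ['#example_ad']
```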
Finally, researchers should consider analyzing hashtags and metadata.
Although an analysis of hashtags provides a useful gauge of the issues that Twitter users are promoting and the conversations in which they are engaging, tweets are often composed of text and images that go beyond hashtags, and many tweets include no hashtags at all. Metadata beyond the timestamps used in this data visualization can also offer rich insights into how these online conversations take shape. Our team hopes to conduct future analyses to further explore how social media data can be used in ways that are meaningful and appropriate for studying issues important to AI/AN populations, especially AI/AN children and youth.
This analysis offers insights into the varied and dynamic conversations that happen on Twitter related to topics relevant to AI/AN children and youth. Researchers need new tools for analyzing discussions of topics important to populations that are often left out of social science research and policy. Social media analyses present opportunities to dig deeper into these conversations as they play out over time, but researchers must remain aware of how to wield these analyses in a robust and ethical manner.
When studying social media conversations relevant to AI/AN populations, researchers must consider the history and fluidity of geographic borders, the social media platform they use for analysis (and how its particularities will affect the data), the co-opting of identities and online cultural appropriation that can muddy results, and the legacies of abusive data collection and analysis. Navigating these challenges allows for opportunities to create more inclusive data science methods.
Researchers are likely to find that, when analyzing the unstructured data of online conversations, many topics arise that are not specific to AI/AN populations; this is because conversations, like identities, are intersectional. Indeed, the phrase “topics relevant to AI/AN children and youth” may be a misnomer, because AI/AN individuals carry many identities, and because AI/AN communities interact and intersect with other communities and the topics relevant to them. In addition, Twitter is often used for collaboration and coalition-building across political movements and trends. Researchers should embrace this complexity, although it may add new challenges even while opening doors.
Finally, one major benefit of social media analysis is that it allows us to not only produce snapshots of online conversations, but also to track conversations over time. To engage with the needs and interests of AI/AN communities, policymakers and practitioners must remember that online conversations are not static. Understanding how these conversations evolve is another part of seeing AI/AN communities in a complex light as needs shift, new movements take hold, and trends wax and wane.
Our team is grateful to Erik Stegman, executive director of Native Americans in Philanthropy, for serving as an expert consultant on this project.
In collaboration with our expert consultant, we generated a list of 69 AI/AN-focused hashtags (PDF). Tweets including any hashtag from our list that were posted from January 1, 2015 to July 31, 2020—as well as the tweet’s initial reply—were extracted from Twitter to create a data set of 2,994,967 tweets. The data set was then restricted to tweets that included key age-related terms (PDF), resulting in an analytic sample of 323,222 tweets from 96,186 users with 91,928 unique hashtags. Hashtags were sorted by month and by number of users. The interactive data visualization displays the top 50 hashtags by number of users for each month and contains 515 total hashtags across all months. In cases where the number of users was the same for hashtags directly following the 50th rank-ordered hashtag, the subsequent hashtag(s) were also included in the data visualization, resulting in some months having 51 to 54 hashtags.
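The ranking rule described above (top 50 hashtags per month by number of users, with ties at the cutoff also included) can be sketched as follows. This is a minimal illustration rather than the project's actual code: the record layout `(month, hashtag, user_id)`, the function name, and the sample records are assumptions made for the example.

```python
from collections import defaultdict


def top_hashtags_by_month(records, n=50):
    """Rank hashtags per month by number of distinct users, keeping ties.

    records is an iterable of (month, hashtag, user_id) tuples, one per
    hashtag use. Returns {month: [(hashtag, user_count), ...]} sorted by
    user_count descending, then hashtag alphabetically.
    """
    # Collect the distinct users for each (month, hashtag) pair.
    users = defaultdict(set)
    for month, hashtag, user_id in records:
        users[(month, hashtag)].add(user_id)

    by_month = defaultdict(list)
    for (month, hashtag), user_set in users.items():
        by_month[month].append((hashtag, len(user_set)))

    result = {}
    for month, counts in by_month.items():
        counts.sort(key=lambda item: (-item[1], item[0]))
        if len(counts) > n:
            # Keep everything tied with the n-th ranked hashtag.
            cutoff = counts[n - 1][1]
            counts = [item for item in counts if item[1] >= cutoff]
        result[month] = counts
    return result


# Made-up records; n=2 stands in for the brief's cutoff of 50.
records = [
    ("2015-01", "a", 1), ("2015-01", "a", 2), ("2015-01", "a", 3),
    ("2015-01", "a", 3),  # repeat use by the same user is counted once
    ("2015-01", "b", 1), ("2015-01", "b", 2),
    ("2015-01", "c", 3), ("2015-01", "c", 4),
    ("2015-01", "d", 5),
]
print(top_hashtags_by_month(records, n=2))
```

With `n=2`, hashtags `b` and `c` tie at two users each, so both are kept alongside `a`. The same rule applied with a cutoff of 50 is what yields months with 51 to 54 hashtags.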
We used an iterative, inductive process that allowed themes to emerge from the data to qualitatively code hashtags appearing in the data visualization. The hashtags, along with three example tweets for each hashtag, were reviewed by two coders to identify primary themes and develop an exploratory set of thematic codes. A third coder reviewed the hashtags, example tweets, and exploratory codes to develop a formal coding strategy, which was then reviewed and finalized in collaboration with the primary author. The final coding strategy required that each hashtag be classified as likely originating from or focusing on Indigenous communities, or not; then, it would be coded according to the following levels:
- Level 1: Primary theme (i.e., Activism; Identity, Nationality, or Nation; Politics; or Other)
  - Activism: hashtag is associated with tweets that include a need for some type of change (e.g., social, political); or with historical inequity
  - Identity, Nationality, or Nation: hashtag could be used to describe a group of people, including a sovereign nation
  - Politics: hashtag is associated with political or government activities
  - Other: hashtag does not fit into the above themes
- Level 2: Secondary (content or topical) theme
  - Provides detail or clarification for Level 1 (e.g., Education, Health, Pipelines, Cultural appropriation)
- Level 3: Sub-theme (if applicable)
- Level 4: Sub-theme (if applicable)
The two initial coders then recoded the hashtags using the updated coding strategy. The third coder assessed inter-rater reliability for the Level 1 codes and the classification of hashtags as having an Indigenous origin/focus or not. The third coder also reviewed the coded hashtags to settle discrepancies between the codes applied by the two initial coders. In instances where resolving a discrepancy was challenging, the third coder consulted the primary author.
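The note does not say which inter-rater reliability statistic was used. As an illustration only, the sketch below computes Cohen's kappa, a common choice for two coders assigning categorical codes. The function name and the example codes are hypothetical and do not come from the project's data.

```python
from collections import Counter


def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two coders who labeled the same items.

    kappa = (observed agreement - chance agreement) / (1 - chance agreement)
    """
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("Both coders must label the same non-empty item list.")
    n = len(codes_a)

    # Share of items on which the two coders agree.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n

    # Agreement expected by chance, from each coder's label frequencies.
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        # Both coders used a single identical label for every item.
        return 1.0
    return (observed - expected) / (1 - expected)


# Made-up Level 1 codes for four hashtags.
coder_a = ["Activism", "Activism", "Politics", "Other"]
coder_b = ["Activism", "Politics", "Politics", "Other"]
print(round(cohens_kappa(coder_a, coder_b), 3))  # 0.636
```

In this example the coders agree on three of four items (0.75), chance agreement is 5/16, and kappa is therefore 7/11, or about 0.636.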
With an understanding that hashtags can be used and interpreted in many ways, our team developed and applied the coding strategy based on the context available and our focus on Twitter conversations relevant to AI/AN children and youth. For example, we inferred that #geni was primarily used in reference to the Generation Indigenous Initiative and not the Gen-I social media platform.
The data visualization displays the Level 1 code for each hashtag using color coding. As Other was our most common Level 1 theme, we encourage readers interested in learning more to review our supplemental file (PDF). The supplemental file provides additional details for each hashtag in the data visualization, including the full qualitative coding, number of tweets, number of users, and whether the hashtag appeared in our initial list of hashtags used to scrape tweets from Twitter.
 U.S. Census Bureau (2018). American Indian and Alaska Native Alone or in Combination with One or More Races, 2013-2018 American Community Survey 5-year estimates (B02010). Retrieved from https://data.census.gov/cedsci/
 U.S. Census Bureau (2016). Selected Population Profile in the United States, 2016 American Community Survey 1-year estimates (S0201).
 NCAI Policy Research Center. (2016). Disaggregating American Indian & Alaska Native data: A review of the literature. Washington, D.C.: National Congress of American Indians. https://www.policylink.org/sites/default/files/AIAN-report.pdf
 Tom-Orme, L. (2014). Guidelines for Conducting Successful Community-based Participatory Research in American Indian and Alaska Native Communities. Chapter 7. In Solomon, T. G. & Randall, L. L., Conducting Health Research with Native American Communities. Alpha Press. https://doi.org/10.2105/9780875532028
 Sabato, T. M. (2018). Utilization of Media-Driven Technology for Health Promotion and Risk Reduction among American Indian and Alaska Native Young Adults: An Exploratory Study. Journal of Health Disparities Research and Practice, 12(1), 4. https://digitalscholarship.unlv.edu/cgi/viewcontent.cgi?article=1815&context=jhdrp
 Northwest Portland Area Indian Health Board. (2016). We R Social: Findings from the 2016 Youth-Health-Tech Survey. http://www.npaihb.org/wpfb-file/we-r-social-youth-health-tech-survey-2016-pdf/
 Duarte, M. (2017). Connected activism: Indigenous uses of social media for shaping political change. Australasian Journal of Information Systems, 21: 1-12. http://marisaduarte.net/Duarte_Connected%20Activism.pdf
 Reclaiming Native Truth (2018a). Lessons Learned from Standing Rock. First Nations Development Institute. https://www.firstnations.org/publications/lessons-learned-from-standing-rock/
 Reclaiming Native Truth (2018b). Changing the Narrative about Native Americans: A Guide for Native Peoples and Organizations. First Nations Development Institute. https://www.firstnations.org/publications/changing-the-narrative-about-native-americans-a-guide-for-native-peoples-and-organizations/
 Reclaiming Native Truth (2018c). Compilation of All Research from the Reclaiming Native Truth Project. First Nations Development Institute. https://www.firstnations.org/publications/compilation-of-all-research-from-the-reclaiming-native-truth-project/
 Cox, S. (n.d.). What Does it Mean to Be Cherokee Online? Indigenous Engineering. https://indigenous.engineering/projects/Cherokee-Online.html
 Abourezk, K. (2020, March 30). We’re building faith: Social distance powwow brings Indian Country together despite coronavirus. Indianz. https://www.indianz.com/News/2020/03/30/were-building-faith-indian-country-share.asp
 Fonseca, F. (2020, April 12). Beautiful powwow there. Indian Country Today. https://indiancountrytoday.com/news/beautiful-powwow-there-yBmAwzhVN0iubkZkKtz8KQ
 Auxier, B. (2020). Activism on social media varies by race and ethnicity, age, political party. Fact Tank, Pew Research Center. https://www.pewresearch.org/fact-tank/2020/07/13/activism-on-social-media-varies-by-race-and-ethnicity-age-political-party/
 Pew Research Center (2019). Social Media Fact Sheet. Washington, D.C. https://www.pewresearch.org/internet/fact-sheet/social-media/
 Pacheco, C. M., Daley, S. M., Brown, T., Filippi, M., Greiner, K. A., & Daley, C. M. (2013). Moving forward: breaking the cycle of mistrust between American Indians and researchers. American Journal of Public Health, 103(12), 2152-2159. https://doi.org/10.2105/AJPH.2013.301480
 Kaity, M., & Balakrishnan, V. (2020). Sentiment lexicons and non-English languages: a survey. Knowledge and Information Systems, 1-36.