Search Results

Toronto Metropolitan University Dataverse Logo
Soares, Felipe Bonow; Saiphoo, Alyssa; Gruzd, Anatoliy; Mai, Philip 2022-08-03 As part of our ongoing research on misinformation and disinformation of various types, we conducted an exploratory analysis of tweets discussing an unverified report that Russian forces engaged in a chemical attack in Mariupol, Ukraine. This claim was made on April 11 by Ukraine’s Azov regiment. At the time when this claim was first reported, Mariupol was surrounded by Russian troops, making it difficult, if not nearly impossible, for journalists to gain access to the city and to interview local sources. We were interested in examining how this claim was discussed on social media because if it was true, it had the potential to galvanize the world’s sentiments in support of Ukraine and against Russia. Using Twitter’s Academic Track API, we retroactively collected 246,189 public tweets posted between April 6 and 13, 2022 to analyze how Twitter users were discussing this claim. We collected tweets related to this case a few days before and after April 11 to capture speculation before the accusation, and the reaction to it. We used the search query “chemical (weapons OR weapon) (Mariupol OR Ukraine)” to collect data. (For data completeness, we kept 12,193 tweets referenced by one of the tweets in the search results.)
Gruzd, Anatoliy 2021-01-07 On October 2nd, 2020, U.S. President Donald Trump tweeted that he and the First Lady of the United States (FLOTUS), Melania Trump, both tested positive for COVID-19 (https://twitter.com/realDonaldTrump/status/1311892190680014849). Within seconds, his tweet received thousands of replies. The dataset consists of 298,172 replies to Donald Trump’s tweet announcing his COVID-19 diagnosis, posted on October 2nd between 6am and 12:30pm (ET). It contains tweet ids, the toxicity scores (as calculated by Google's Perspective API via https://Communalytic.com) and tweet availability status values (via twarc library). Following Twitter’s API policy, we stripped metadata associated with each tweet. So, if you’d like to examine potential relationships between other metadata elements, you would need to recollect original tweets using tools like DocNow’s Hydrator first. The only downside of this approach is that tweets that have been blocked or deleted will not be recollected. To help you get started, we also shared our Exploratory Data Analysis (EDA) Python script at https://github.com/RUSocialMediaLab/toxicityanalysis
Penfound, Elissa; Vaz, Eric 2020-09-30 This is a dataset of historic southern Ontario road (line), rail (line), municipality (point), county (polygon) and district (polygon) shapefiles. The shapefiles were created based on georeferenced historic maps of Ontario from the years 1800, 1818, 1837, 1852, 1861, 1879, 1901, 1922, and 1955. Historic maps of southern Ontario, between the years 1800-1970, were collected from oldmapsonline.org to measure land use and land cover change in roughly 20-year intervals (1800, 1818, 1837, 1852, 1861, 1879, 1901, 1922, and 1955). Each historic map was georeferenced in QGIS with the georeferencer tool. The reference map used included a Great Lakes shapefile, an Ontario Lakes and Rivers shapefile and an Ontario Boundary shapefile. Bodies of water were used for the georeferencing to ensure consistency and accuracy over the extensive time period. 200 reference points were used for each map: 150 points for the initial georeferencing and an additional 50 points to account for errors. Each map was projected in the same coordinate reference system (EPSG:32617 – WGS 84 / UTM zone 17N). Once each map was georeferenced, it was exported as a geo.tiff image. The geo.tiff images were added to QGIS, and present-day (2011-2020) road network, rail network, municipality and county shapefile reference layers were layered on top of the geo.tiff images. When creating the historic road and rail network shapefile layers, lines were drawn over the reference (present-day) shapefile lines that corresponded with the road and rail lines on the historic geo.tiff maps. This was done to account for any distortion that remained from georeferencing. When creating the historic municipality point shapefile layers, points were drawn over the municipality points on the historic geo.tiff maps. 
To account for distortion from georeferencing, the drawn historic points were matched to the reference (present-day) shapefile points with the NNJoin (nearest neighbour join) algorithm. When creating the historic county/district polygon shapefile layers, reference (present-day) county polygons were duplicated and the boundaries edited to match the boundaries of the historic geo.tiff maps. Boundaries were not edited to align perfectly with the historic geo.tiff maps, but rather to mirror historic boundaries while remaining undistorted. All non-spatial data was removed from the attribute tables of the reference shapefiles used. Match: Y (yes) or N (no) refers to whether or not the shapefile feature is based on a feature in the present-day reference shapefile. Y indicates that it is based on a feature in the present-day reference shapefile; N indicates that it is based on the historic georeferenced map. Road type: in the Ontario1955RoadLines shapefile, roads are classified as 1, 2, 3, 4, or 5, as follows: 1 – dirt roads (unimproved); 2 – graded roads (drained and maintained); 3 – improved roads (gravel, stone, topsoil-sand clay); 4 – paved roads (asphalt, concrete, surface-treated); 5 – superhighways and toll highways (4 lanes or more).
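The nearest neighbour matching described in this record can be illustrated with a minimal Python sketch. It is not the NNJoin plugin itself; it is a brute-force stand-in that assumes points are already in a projected coordinate system measured in metres. The point names and coordinates are hypothetical and are not taken from the dataset.

```python
import math

# Hypothetical projected coordinates in metres (easting, northing); not from the dataset.
historic_points = {"hist_a": (630100.0, 4833900.0), "hist_b": (591050.0, 4789020.0)}
reference_points = {"ref_1": (630000.0, 4833500.0), "ref_2": (591000.0, 4789000.0)}


def nearest_neighbour_join(historic, reference):
    """Pair each historic point with its closest reference point (brute force)."""
    joined = {}
    for name, (hx, hy) in historic.items():
        # Pick the reference point with the smallest Euclidean distance.
        ref_name = min(
            reference,
            key=lambda r: math.hypot(reference[r][0] - hx, reference[r][1] - hy),
        )
        rx, ry = reference[ref_name]
        joined[name] = (ref_name, math.hypot(rx - hx, ry - hy))
    return joined


matches = nearest_neighbour_join(historic_points, reference_points)
for name, (ref_name, distance) in matches.items():
    print(f"{name} -> {ref_name} ({distance:.1f} m)")
```

A brute-force scan is adequate for a few hundred municipality points; a spatial index would be the usual choice for larger layers.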
Mai, Philip; Gruzd, Anatoliy 2022-09-14 The report provides a snapshot of the social media usage trends amongst online Canadian adults based on an online survey of 1500 participants. Canada continues to be one of the most connected countries in the world. An overwhelming majority of online Canadian adults (94%) have an account on at least one social media platform. However, the 2022 survey results show that the COVID-19 pandemic has ushered in some changes in how and where Canadians are spending their time on social media. Dominant platforms such as Facebook, messaging apps and YouTube are still on top but are losing ground to newer platforms such as TikTok and more niche platforms such as Reddit and Twitch.
Gruzd, Anatoliy; Mai, Philip; Recuero, Raquel; Soares, Felipe 2020-02-06 Data collection: Using Twitter's Search API, we collected 363,706 public tweets (in English) mentioning #elxn43 and directed at 1,116 candidates running for office during the 2019 Federal election in Canada. Tweets were collected between September 29 and October 28, 2019. Manual coding: The data set contains a random sample of 3,637 tweets (1% sample) hand-coded as either 'toxic' or 'insulting' by three coders. Only tweets that were flagged by all three coders were considered 'toxic' (TOXICITY_3CODERS_AGREE=1) or 'insulting' (INSULT_3CODERS_AGREE=1).
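The unanimous-agreement rule in this record can be sketched in Python as follows. The per-coder labels and tweet keys are hypothetical, since the record only names the resulting flag columns. The sketch derives a value analogous to TOXICITY_3CODERS_AGREE from three individual judgements.

```python
# Hypothetical rows: each tuple holds one tweet's labels from coders 1-3 (1 = flagged).
coded_tweets = {
    "tweet_a": (1, 1, 1),
    "tweet_b": (1, 0, 1),
    "tweet_c": (0, 0, 0),
}


def all_coders_agree(labels):
    """Return 1 only when every coder flagged the tweet, otherwise 0."""
    return int(len(labels) > 0 and all(label == 1 for label in labels))


# Analogous to the TOXICITY_3CODERS_AGREE column described in the record.
toxicity_3coders_agree = {
    tweet_id: all_coders_agree(labels) for tweet_id, labels in coded_tweets.items()
}
print(toxicity_3coders_agree)
```

Under this rule a tweet flagged by only two of the three coders is not counted as toxic.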
Gruzd, Anatoliy; Mai, Philip; Soares, Felipe Bonow; Saiphoo, Alyssa 2022-07-12 This report examines the extent to which Canadians are exposed to and might be influenced by pro-Kremlin propaganda on social media based on a census-balanced national survey of 1,500 Canadians conducted between May 12–31, 2022. Among other questions, the survey asked participants about their social media use, news consumption about the war in Ukraine, political leanings, as well as their exposure to and belief in common pro-Kremlin narratives.
Dubois, Elizabeth; Gruzd, Anatoliy; Mai, Philip; Jacobson, Jenna 2018-12-06 The report examines the ways online Canadian adults are engaging politically on social media. This is the third and final report based on a census-balanced survey of 1,500 Canadians using quota sampling by age, gender, and geographical region. The other two reports in this series are: "The State of Social Media in Canada 2017" and "Social Media Privacy in Canada". The series is published by the Social Media Lab, an interdisciplinary research lab at Ted Rogers School of Management, Ryerson University. The lab studies how social media is changing the ways in which people communicate, share information, conduct business and how these changes are impacting our society.
Gruzd, Anatoliy; Mai, Philip 2020-08-19 The current dataset contains 237M Tweet IDs for Twitter posts that mentioned "COVID" as a keyword or as part of a hashtag (e.g., COVID-19, COVID19) between March and July of 2020. Sampling Method: hourly requests sent to Twitter Search API using Social Feed Manager, an open source software that harvests social media data and related content from Twitter and other platforms. NOTE: 1) In accordance with Twitter API Terms, only Tweet IDs are provided as part of this dataset. 2) To recollect tweets based on the list of Tweet IDs contained in these datasets, you will need to use tweet 'rehydration' programs like Hydrator (https://github.com/DocNow/hydrator) or Python library Twarc (https://github.com/DocNow/twarc). 3) This dataset, like most datasets collected via the Twitter Search API, is a sample of the available tweets on this topic and is not meant to be comprehensive. Some COVID-related tweets might not be included in the dataset either because the tweets were collected using a standardized but intermittent (hourly) sampling protocol or because tweets used hashtags/keywords other than COVID (e.g., Coronavirus or #nCoV). 4) To broaden this sample, consider comparing/merging this dataset with other COVID-19 related public datasets such as: https://github.com/thepanacealab/covid19_twitter https://ieee-dataport.org/open-access/corona-virus-covid-19-tweets-dataset https://github.com/echen102/COVID-19-TweetIDs
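Before handing a Tweet ID list of this size to a rehydration tool, it can help to split it into fixed-size batches. The Python sketch below shows only that batching step and makes no call to Hydrator or Twarc. The batch size of 100 is an assumption about the lookup endpoint's per-request limit, and the IDs are placeholders.

```python
from itertools import islice


def batched(ids, size=100):
    """Yield successive lists of at most `size` IDs from any iterable."""
    iterator = iter(ids)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Placeholder IDs standing in for lines read from a Tweet ID file.
tweet_ids = [str(n) for n in range(250)]
batches = list(batched(tweet_ids, size=100))
print([len(batch) for batch in batches])
```

Because `batched` reads from an iterator, it can consume an open file of IDs line by line, so the full 237M-ID list never has to be held in memory.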
Gruzd, Anatoliy; Jacobson, Jenna; Mai, Philip; Dubois, Elizabeth 2018-06-04 This is the second report in the series based on an online survey of 1,500 Canadians. Building on the first report that provides a snapshot of the social media usage trends in Canada, this second report analyzes social media users’ privacy perceptions and expectations.
Gruzd, Anatoliy 2017 Data Collection: Data was collected using a custom web application (Communalytic, available at: https://communalytic.com/) that used Reddit’s public API (https://www.reddit.com/dev/api/). We sampled one percent of public Reddit comments posted in 2016 from AskScience. Since the dataset was collected retroactively, it does not include comments deleted by authors or moderators. Manual Coding: The sample comments were then manually coded using the 'Learning in the Wild' Coding Schema by three independent coders, each of whom had first completed a schema tutorial training module. Each coder (1) reviewed a submission that started a thread and was often framed as a question (see the "submissions_title" column) and then (2) assigned up to three applicable codes to the reply message (see the "text" column). The values stored under columns C1-C8 represent the number of coders who agreed on a given code.
Gruzd, Anatoliy; Mai, Philip 2020-05-10 To examine the “digital hygiene” practices of Canadians, we asked 1,500 online Canadian adults where they get news about COVID-19, how often they encounter misinformation on this topic, and what they do about it. The survey was conducted between April 9–17, 2020. This report was produced by the Social Media Lab at Ted Rogers School of Management, Ryerson University. It is released as part of the Social Media Data Stewardship Project funded by the Canada Research Chairs Program, and the COVID-19 Misinformation Portal, a rapid response project funded by the Canadian Institutes of Health Research.

Map search instructions

1. Turn on the map filter by clicking the “Limit by map area” toggle.
2. Move the map to display your area of interest. Holding the shift key and clicking to draw a box allows for zooming in on a specific area. Search results change as the map moves.
3. Access a record by clicking on an item in the search results or by clicking on a location pin and the linked record title.
Note: Clusters are intended to provide a visual preview of data location. Because there is a maximum of 1000 records displayed on the map, they may not be a completely accurate reflection of the total number of search results.