Data Download

The following links contain labeled datasets that are available for download by members of the research community. Questions about the content and structure of the data should be sent to April Edwards.

TextMining And CyberCrime Data (raw data)
GeneralData (Predation Labeled)
Formspring Labeled for Cyberbullying
MySpace Group Data Labeled for Cyberbullying

The following datasets are also available from the authors on request by emailing April Edwards:

  • A large manually labeled dataset (1.6 MB, archived size) for 170,019 posts from the dataset
  • Additional labeled cyberbullying data from Formspring
  • A dataset for the study of Internet Identity, in which posts and profiles belonging to the same user were collected from different platforms (seeded from 81 unique individuals)
  • A large unlabeled dataset of MySpace data (1.43 GB, archived size), from a Summer 2010 crawl, including profiles and user "wall" posts for 127,974 MySpace users.
  • A large unlabeled Formspring dataset (187 MB, archived size), from a Summer 2010 crawl containing all of the questions and answers for 18,554 Formspring users.

Additional information about these datasets can also be provided upon request.