Dataset release: 450,000 Instagram posts with the hashtag #100daysofpractice

[Image whose most prominent words are 100daysofpractice, music, practice, and violin.]

Hello! I’m glad to announce the release of 100daysofpractice-dataset!

Below you’ll find the description of the dataset.

Data from Instagram posts with the hashtag #100daysofpractice.

The file (40 MB) contains data from 450,000 Instagram posts with the hashtag #100daysofpractice. It contains two files:

  • posts.csv, which contains the posts data, and
  • metadata.txt, which contains the details about its generation.

posts.csv is in CSV format (every field quoted with ", separated by ,). The fields and a short explanation of each are:

  • post-id: post’s unique ID
  • shortcode: a short string that can be used to access the post in a web browser (see below for instructions)
  • taken_at_timestamp: the date when it was posted
  • owner-id: a unique ID representing the user who posted it; this value was anonymized for privacy reasons, so it is not the real user ID
  • is_video: 1 if it is a video, 0 otherwise
  • edge_liked_by-count: number of likes
  • edge_media_to_comment-count: number of comments
  • video_view_count: number of views
  • comments_disabled: 1 if comments were disabled, 0 otherwise
  • __typename: GraphImage if it is an image post, GraphVideo if it is a video post, or GraphSidecar if it is a post with more than one media item
  • hashtags: hashtags from the comments
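As a rough illustration of the field list above, here is a minimal Python sketch that reads rows in this layout with the standard csv module. It assumes that posts.csv starts with a header row using exactly the field names listed, which the description does not state explicitly. The two sample rows and the read_posts helper are made up for the example and are not taken from the dataset.

```python
import csv
import io

# Hypothetical sample in the described layout (all values invented for illustration).
# Assumption: the real posts.csv has a header row with these exact field names.
SAMPLE = '''"post-id","shortcode","taken_at_timestamp","owner-id","is_video","edge_liked_by-count","edge_media_to_comment-count","video_view_count","comments_disabled","__typename","hashtags"
"1","AAAAAAAAAAA","1500000000","42","1","10","2","35","0","GraphVideo","100daysofpractice violin"
"2","BBBBBBBBBBB","1500000100","43","0","7","0","","0","GraphImage","100daysofpractice"
'''


def read_posts(handle):
    """Yield one dict per post, converting a few fields to Python types."""
    reader = csv.DictReader(handle, quotechar='"', delimiter=",")
    for row in reader:
        yield {
            "shortcode": row["shortcode"],
            "is_video": row["is_video"] == "1",  # 1 means video, 0 otherwise
            "likes": int(row["edge_liked_by-count"]),
            "comments": int(row["edge_media_to_comment-count"]),
            "typename": row["__typename"],
        }


posts = list(read_posts(io.StringIO(SAMPLE)))

# Fraction of posts that are videos in the sample.
video_share = sum(p["is_video"] for p in posts) / len(posts)
```

To try this on the real file, one would presumably replace the in-memory sample with something like open("posts.csv", newline="", encoding="utf-8"); the encoding is also an assumption.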

To access a post in a web browser using its shortcode, just paste the shortcode after the base URL for Instagram posts. For instance, the first post with the hashtag #100daysofpractice has the shortcode BTrwiUuh8vV, so you may access it by appending BTrwiUuh8vV to that base URL. It was posted by the creator of the hashtag, @violincase, the violin virtuosa Hilary Hahn.
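The shortcode-to-link step can be sketched in a few lines of Python. The base URL below is an assumption: the original link did not survive in this text, and https://www.instagram.com/p/ is simply the pattern Instagram commonly uses for post pages. The post_url helper is hypothetical.

```python
# Assumption: Instagram post pages live under this prefix; the original link
# is missing from the text, so treat this value as a guess to verify.
ASSUMED_BASE_URL = "https://www.instagram.com/p/"


def post_url(shortcode, base=ASSUMED_BASE_URL):
    """Build a browser link by pasting the shortcode after the base URL."""
    return f"{base}{shortcode.strip()}/"


# Shortcode of the first #100daysofpractice post, as given in the text.
first_post = post_url("BTrwiUuh8vV")
```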

