Let’s Make(file) it more replicable!

Hello! I’m Eglur, and this is my first authored post here (all the previous ones were also written by me, but this is the first with my name, so I felt it was appropriate to introduce myself). Nice to meet you! How have you been?

Now where were we…? Oh yes, replicability! Here at Bits4Waves we take Sciencing very seriously! The most important premise is that we have to have massive fun in the process! But this does not mean that we shouldn’t follow some guidelines…

One very important component of the Scientific Method is replicability: other people should be able to obtain the same results if they apply the same methodology. The course Principles, Statistical and Computational Tools for Reproducible Data Science gives a very useful practical tip: avoid doing things manually; instead, create scripts for everything.

We’ve been running some commands manually, and now is a good time to create a “script” for them. Specifically, we’ll create a Makefile to generate the shortcodes.

The Makefile needs a recipe to obtain each target. Let’s review how we got each one of the targets:

  1. shortcodes-orig.txt: used a dedicated Makefile
  2. shortcodes-sort.txt: ran commands manually
  3. shortcodes-uniq.txt: ran commands manually
  4. shortcodes-test.txt: ran commands manually

Nothing like getting some perspective, huh?! It looks like we started things well, and… derailed a little bit afterwards. Nothing to worry about, though! Let’s fix it right away!

As we already have a Makefile, it seems natural to use it: we just have to add the remaining targets, shortcodes-sort.txt, shortcodes-uniq.txt, and shortcodes-test.txt.
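For the sort and uniq targets, here is a minimal sketch of what their recipes would need to do, written in Python to match get-shortcodes.py. It assumes, judging only by the file names, that shortcodes-sort.txt is the sorted original and shortcodes-uniq.txt is the sorted file with repeated lines removed. The function names sort_lines and uniq_lines are made up for illustration and are not part of the project. shortcodes-test.txt is left out because its recipe isn't described here.

```python
from pathlib import Path


def sort_lines(src: Path, dst: Path) -> None:
    """Write the lines of src to dst in sorted order (roughly `sort src > dst`)."""
    lines = src.read_text().splitlines()
    dst.write_text("".join(line + "\n" for line in sorted(lines)))


def uniq_lines(src: Path, dst: Path) -> None:
    """Drop adjacent duplicate lines (roughly `uniq src > dst`).

    Like `uniq`, this only removes *adjacent* repeats, so it expects
    a sorted input to remove every duplicate.
    """
    kept = []
    previous = None
    for line in src.read_text().splitlines():
        if line != previous:
            kept.append(line)
        previous = line
    dst.write_text("".join(line + "\n" for line in kept))
```

One caveat: Python's sorted() compares strings by code point, so the order can differ from the shell's sort when a locale is set. Each function takes one input file and produces one output file, which is the same shape as a Makefile rule with one prerequisite and one target.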

But some things changed after the Makefile was created—the plot thickens:

  • the shortcodes-* files earned the right to have their own folder, shortcodes/
  • the original file was renamed from shortcodes.txt to shortcodes-orig.txt (because OCD, that’s why :-).

Therefore, we’ll have to account for these changes while reconciling past, present and near future.

Practically, the Makefile should live in its proper context, so let’s move it to the shortcodes/ folder. (We won’t use a script for this, but we’ll document it here: it is a structural change that should really be done only once, so it doesn’t deserve a script of its own… Please share your thoughts in the comments below!)

PROJECT=~/sci/100daysofpractice-dataset
pushd $PROJECT
git mv Makefile shortcodes/

We have to make some accommodations inside the Makefile for its new location. First, it needs the correct Python virtual environment. Since the Makefile now sits one level below the project root, the path to the interpreter becomes:

PYTHON=../venv/bin/python

Now, the command inside the Makefile is not correct, so we need to fix it:

instaloader --login ${IG_USER} --no-profile-pic --no-pictures --no-videos --no-captions "#100daysofpractice"

To get the shortcodes, we used the script get-shortcodes.py. Let’s fix that:

SRC=../src

and

$(PYTHON) $(SRC)/get-shortcodes.py

The script get-shortcodes.py does not currently accommodate the OCD, as it creates the file shortcodes.txt instead of shortcodes-orig.txt:

import instaloader
import time
import os

I = instaloader.Instaloader()
I.interactive_login(os.getenv('IG_USER'))
query = instaloader.Hashtag.from_name(I.context, '100daysofpractice')
k = 1
for post in query.get_all_posts():
    print(k)
    shortcode = post.shortcode
    print(shortcode)
    with open('shortcodes.txt', 'a') as file_object:
        file_object.write(shortcode + '\n')
    time.sleep(1)
    k += 1

Let’s fix that…

import instaloader
import time
import os

I = instaloader.Instaloader()
I.interactive_login(os.getenv('IG_USER'))
query = instaloader.Hashtag.from_name(I.context, '100daysofpractice')
k = 1
for post in query.get_all_posts():
    print(k)
    shortcode = post.shortcode
    print(shortcode)
    with open('shortcodes-orig.txt', 'a') as file_object:
        file_object.write(shortcode + '\n')
    time.sleep(1)
    k += 1
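A possible next step, sketched here only as an idea: the rename forced us to edit the script, so one way to avoid that in the future is to let the caller (the Makefile, for instance) decide the output file. The helper below is hypothetical (write_shortcodes is not part of get-shortcodes.py); it just writes whatever shortcodes it is given to an already opened file, one per line.

```python
from typing import Iterable, TextIO


def write_shortcodes(shortcodes: Iterable[str], out: TextIO) -> int:
    """Write one shortcode per line to `out` and return how many were written.

    The file name is not hard-coded here: the caller opens the file,
    so renaming the output no longer requires touching this function.
    """
    count = 0
    for shortcode in shortcodes:
        out.write(shortcode + "\n")
        count += 1
    return count
```

In the script above, this could be fed with the post.shortcode values from the loop, with the file opened from a path passed on the command line. That wiring is an assumption on my part, not something the current script does.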

Done!

And I think we’ll call it a day! My dinner is getting colder here LOL

See you soon! Take care!

Published by eglur

I have a B.Sc. in Computer Science and an M.Sc. in Computer Engineering, both from the University of São Paulo, and have been programming for 16 years.
