May 311 min read

Harvard's CS50x Week 6: A Preface to Python

When people think of programming, Python is usually the name that comes to mind.

So why does CS50 pick up this universally simpler language in the second half of the course?

Professor David J. Malan prefaces his lecture with this slightly cryptic statement:

“The goal isn't just to throw another fire hose of content and syntax and whatnot at you, but rather, to really equip you all to actually teach yourself new languages in the future.”

Basically, Python will not always be the most popular language, so it’s more beneficial for us to understand the fundamentals of programming before abstracting away most of the nuances!

That aside, let’s go over my personal notes and observations on Week 6’s lecture.

Lecture Notes

Even doing a task as simple as printing a message to the terminal was a pain in C…

#include <stdio.h>

int main(void)
{
    printf("hello, world\n");
}

Python abstracts away most of the cryptic syntax of C, condensing four lines of code into just one!

print("hello, world")

To run this code, execute the command python hello.py in the terminal window. Here, python is the interpreter that runs our program (there is no separate compile step, so it takes the place of ./), hello is our filename, and .py is the file extension (like .c).

Now let’s try retrieving user input. As per our current knowledge, our program would probably look something like this:

#include <cs50.h>
#include <stdio.h>

int main(void)
{
    string answer = get_string("What's your name? ");
    printf("hello, %s\n", answer);
}

In Python, once again, this code can be simplified.

from cs50 import get_string

answer = get_string("What's your name? ")
print("hello, " + answer)

Notice how…

We retrieved the ‘get_string()’ function from the cs50 library, similar to how we used the #include directive in C.
The ‘main()’ function is no longer a necessity in Python
The + symbol concatenates (joins together) the strings “hello, ” and answer
Semicolons are not used in Python!

Another way of integrating variables into print() statements is through f-strings.

print(f"hello, {answer}")

Within f-strings, between two curly braces is where you can place a variable (like answer) such that it renders as the value stored inside of the variable.
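As a quick sketch of what the curly braces can hold (the name below is a hypothetical stand-in for user input, not something from the lecture), an f-string accepts any expression, and an optional format spec after a colon:

```python
# Sketch: f-strings evaluate whatever sits between the braces.
answer = "David"  # hypothetical stand-in for what the user typed

greeting = f"hello, {answer}"
shouted = f"hello, {answer.upper()}"  # any expression works, not just a variable name
rounded = f"{3.14159:.2f}"            # a format spec after the colon controls rendering

print(greeting)  # hello, David
print(shouted)   # hello, DAVID
print(rounded)   # 3.14
```

Forgetting the f prefix is a common slip: without it, the braces are printed literally.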

Data Types

In the example above, you might have noticed that we were able to declare the variable answer without a prepended datatype (unlike in C).

It turns out that when we declare or initialize variables in Python, we don’t need to specify the data type (the Python interpreter does that for us!)

Data types in Python include, but are not limited to:

bool
float
int
str
range
list
tuple
dict
set
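To see the interpreter inferring these types on its own, here is a small sketch using the built-in type() function. The sample values are arbitrary and chosen only for illustration:

```python
# One sample value per data type listed above (values are arbitrary).
samples = [True, 1.5, 3, "hi", range(3), [1, 2], (1, 2), {"a": 1}, {1, 2}]

# type(value).__name__ gives the name of the type Python assigned.
type_names = [type(value).__name__ for value in samples]
print(type_names)
# ['bool', 'float', 'int', 'str', 'range', 'list', 'tuple', 'dict', 'set']

# A variable is just a name, so it can later refer to a value of another type.
x = 3
x = "three"
print(type(x).__name__)  # str
```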

Let’s simulate a dictionary in Python. Notice how the first line assigns an empty set to words. A set is a collection of unique values that is both unindexed and unordered; under the hood it behaves much like a hash table, so checking whether a value is present is fast.

words = set()

def check(word):
    
    """Return true if word is in dictionary else false"""
    
    if word.lower() in words:
        return True
    else:
        return False

def load(dictionary):
    
    """Load dictionary into memory, returning true if successful else false"""
    
    file = open(dictionary, "r")

    for line in file:
        word = line.rstrip()
        words.add(word)

    file.close()
    return True

def size():

    """Returns number of words in dictionary if loaded else 0 if not yet loaded"""

    return len(words)

def unload():

    """Unloads dictionary from memory, returning true if successful else false"""

    return True

Our first function, ‘check()’, determines if a certain word is in our set of words. Right now, words is empty, but it can be filled using our ‘load()’ function.

load() takes a filename as a parameter, opens that file in read mode (“r”), and then loops over every line in the file, adding each line to our set, words. Finally, the file is closed so that the operating system can release it. You might have noticed the method rstrip() was applied to each of our lines. This is because every line read from a file ends with a newline character (\n), which rstrip() removes for us along with any other trailing whitespace.

size() returns the number of elements in words via the ‘len()’ function

unload() simply returns True as Python handles memory management on its own (so there’s no need to worry about freeing memory!)
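Here is a minimal sketch of what rstrip() and the set are doing inside load(). To keep it self-contained, io.StringIO stands in for a real dictionary file, which is an assumption of this example rather than something from the lecture:

```python
import io

# io.StringIO behaves like an open text file; it replaces a real dictionary here.
fake_file = io.StringIO("cat\ndog\ncat\n")

words = set()
for line in fake_file:
    words.add(line.rstrip())  # "cat\n" becomes "cat"

print(len(words))      # 2, because a set keeps only one copy of "cat"
print("dog" in words)  # True
```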

Conditionals

Recall that in C, we were able to compare two integers by the following code:

#include <cs50.h>
#include <stdio.h>

int main(void)
{

    // Prompt user for integers

    int x = get_int("What's x? ");
    int y = get_int("What's y? ");

    // Compare integers

    if (x < y)
    {
        printf("x is less than y\n");
    }

    else if (x > y)
    {
        printf("x is greater than y\n");
    }

    else
    {
        printf("x is equal to y\n");
    }
}

In Python, conditionals look slightly different…

from cs50 import get_int

# Prompt user for integers

x = get_int("What's x? ")
y = get_int("What's y? ")

# Compare integers

if x < y:
    print("x is less than y")
elif x > y:
    print("x is greater than y")
else:
    print("x is equal to y")

A notable difference in using external libraries is that in C, you had to #include the entire cs50 library, for instance, whereas in Python, you have two options:
import cs50
from cs50 import get_int, get_float, get_string

The former imports the whole cs50 module, so its functions are called with a prefix (cs50.get_int), whereas the latter imports three specific functions that can then be called directly (in our case, we just imported get_int).
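The two import styles can be compared side by side. This sketch uses the standard library's math module instead of cs50, purely so that it runs without installing anything:

```python
import math                # whole module: names need the math. prefix
from math import sqrt, pi  # specific names: usable directly

print(math.floor(2.7))  # 2
print(sqrt(16))         # 4.0
print(pi == math.pi)    # True, both names refer to the same value
```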

Further, parentheses around the condition are no longer required, curly braces are replaced by a colon and indentation, and else if is now elif.

Loops

If we wanted to print a message thrice in C, our code would look something like this…

#include <stdio.h>

int main(void)
{
    int i = 0;
    while (i < 3)
    {
        printf("hello\n");
        i++;
    }
}

In Python, very little changes when dealing with while loops:

i = 0

while i < 3:
    print("hello")
    i += 1

One thing to note here: the ‘syntactic sugar’ of using i++ to increment in C is no longer an option in Python.

Likewise, for loops can be replicated in Python as such:

for i in range(3):
    print("hello")

Note that our call to range(3) produces the values zero, one, and two, one at a time (in Python 3 it returns a lazy range object rather than building an actual list). So in essence, our code behaves no differently than

for i in [0, 1, 2]:
    print("hello")
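For the curious, here is a short sketch of what range() actually hands back. Wrapping it in list() is only for display; a for loop does not need it:

```python
r = range(3)
print(r)                    # range(0, 3)
print(isinstance(r, list))  # False: a range is its own type
as_list = list(r)
print(as_list)              # [0, 1, 2]

# range also accepts a start, a stop (excluded), and a step.
stepped = list(range(1, 10, 3))
print(stepped)              # [1, 4, 7]
```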

Calculator and Compare

from cs50 import get_int

x = get_int("x: ")
y = get_int("y: ")

print(x + y)

While this program works as intended, let’s try to remove the training wheels that the cs50 library provides us via the ‘get_int()’ function.

x = input("x: ")
y = input("y: ")

print(x + y)

Running this code, input() does seem to take user input just like get_int(), but when the values are added, the output strangely places both inputs side by side.

This happens because input() always returns a string, so x + y concatenates the two strings instead of adding numbers. To fix this, rewrite your code as follows:

x = int(input("x: "))
y = int(input("y: "))

print(x + y)

Now, using the ‘int()’ function built into Python, we’re able to convert our strings to integers.
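The difference is easy to reproduce without typing anything in. In this sketch, the two strings are hard-coded stand-ins for what input() would return:

```python
x = "1"  # what input() returns if the user types 1
y = "2"

as_text = x + y               # strings: + concatenates
as_numbers = int(x) + int(y)  # ints: + adds

print(as_text)     # 12
print(as_numbers)  # 3
```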

Similarly, division also seems to work at first, yielding a decimal value.

x = int(input("x: "))
y = int(input("y: "))

print(x / y)

However, upon dividing ints like 2 and 3, for instance, floating-point imprecision shows up just like in C: the value displayed (0.6666666666666666) is only an approximation of two-thirds, because a float is stored in a finite number of bits.
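A small sketch makes the imprecision visible. The 50-digit format spec is just one way to peek past the digits Python normally shows:

```python
x, y = 2, 3

quotient = x / y  # true division always yields a float
floored = x // y  # floor division keeps the result an int

print(quotient)            # 0.6666666666666666
print(floored)             # 0
print(f"{quotient:.50f}")  # asks for 50 decimal places, exposing the approximation

# The classic symptom: these two are not exactly equal.
print(0.1 + 0.2 == 0.3)    # False
```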

Let’s move onto comparisons. Previously, we looked at comparing integer values. We can apply the same for comparing characters. Starting with C…

#include <cs50.h>
#include <stdio.h>

int main(void)
{
    char c = get_char("Do you agree? ");

    if (c == 'Y' || c == 'y')
    {
        printf("Agreed.\n");
    }

    else if (c == 'N' || c == 'n')
    {
        printf("Not agreed.\n");
    }
}

In Python, we can tackle this problem in two different ways.

from cs50 import get_string

s = get_string("Do you agree? ")

if s == "Y" or s == "y":
    print("Agreed.")

elif s == "N" or s == "n":
    print("Not agreed.")

The primary difference between these programs is that chars don’t exist in Python – instead, strings are used. Also, the vertical bars signifying ‘or’ in C are simply replaced with the keyword or.

Another, arguably more concise, way of implementing this program is through the keyword in. Replacing the conditionals above, rewrite your code as follows:

if s in ["y", "yes"]:
    print("Agreed.")

elif s in ["n", "no"]:
    print("Not agreed.")

This modified version checks whether the string s is equal to one of the values in each list. Note that the lists now hold lowercase spellings only, so an uppercase “Y” no longer matches.

Object-Oriented Programming (OOP)

Objects are values that have specific functions attached to them alongside their attributes.

Recall that in C, we were able to create a struct that essentially melds together multiple variable data types into one.

In Python, something similar can be achieved with a class, which is in essence a new data type: objects of that class hold data and also have particular functions attached to them, more technically, methods.

One object in disguise is, in fact, a string! Strings in Python come with built-in methods. Let’s take a look at how we can improve our previous program.

s = get_string("Do you agree? ")

if s.lower() in ["y", "yes"]:
    print("Agreed.")

elif s.lower() in ["n", "no"]:
    print("Not agreed.")

Note how the lower() method is attached to our string in somewhat of a dot notation. Now, instead of having to write the uppercase variations of the words in the list, you could simply convert the user’s input to lowercase.

Depending on your needs, this program could be further enhanced by calling lower() once, so as to improve run time:

s = get_string("Do you agree? ")
s = s.lower()

if s in ["y", "yes"]:
    print("Agreed.")

elif s in ["n", "no"]:
    print("Not agreed.")

This program could also work by using the upper() method and replacing the contents of the list with their uppercase counterparts!
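To try the idea without typing into a prompt, the same logic can be wrapped in a function. The function name and the extra strip() call are my own additions for illustration, not part of the lecture code:

```python
def classify(reply):
    """Return "yes", "no", or None for a reply we don't recognize."""
    s = reply.strip().lower()  # strip() (an addition here) drops stray spaces
    if s in ["y", "yes"]:
        return "yes"
    if s in ["n", "no"]:
        return "no"
    return None


print(classify("YES"))    # yes
print(classify(" n "))    # no
print(classify("maybe"))  # None
```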

Abstraction Through Functions

Let’s circle back to our loop program, in which we printed a message a set number of times. We can add a couple of functions to better organize our code:

def main():
    hello(3)


def hello(n):
    for i in range(n):
        print("hello")


main()

Unlike in C, remember, our ‘main()’ function is not required for the program to run, but is used by convention to better organize our code.

hello() abstracts away printing the message “hello” n number of times, where n is the parameter passed to hello() that determines the number of times the loop will run.

Print Statements and Mario

One of our first challenges in C was to replicate the obstacles in the Mario game. In Python, we can implement such a program in various ways.

A vertical tower is simple enough:

for i in range(3):
    print("#")

To add a degree of dynamicity to our code, we can add a ‘get_height()’ function that retrieves a specific height from the user.

from cs50 import get_int


def main():
    height = get_height()

    for i in range(height):
        print("#")


def get_height():
    while True:
        n = get_int("Height: ")

        if n > 0:
            return n


main()

Notice here that because Python doesn’t have do-while loops like C does, we have to resort to a while loop fitted with a conditional inside.

Now, let’s modify get_height() by removing the guard rails provided by the cs50 library and using input() to get a string and int() to convert that string to an int.

Hold on… what if the user enters non-integer characters? In such a case, the program would throw a ValueError. In order to catch this error, we can use the try and except blocks.

def get_height():
    while True:
        try:
            n = int(input("Height: "))

            if n > 0:
                return n

        except ValueError:
            print("Not an integer")

Code indented inside of the try block is what the program attempts to perform without errors. If a ValueError is raised (as indicated by ‘except ValueError’), the program will run the code within the except block. Because everything sits inside while True, the user is then prompted again.

It’s important to note that we can change ValueError to any type of error for the program to catch. Also, if we were to simply write ‘except:’ without a specific error type after, the program would catch any and all errors, which can hide genuine bugs, so naming the error is usually the safer choice.
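The try and except pattern can also be tested without a keyboard by passing the text in as an argument. This is only a sketch, and parse_height is a hypothetical helper name of mine:

```python
def parse_height(text):
    """Return text as a positive int, or None if it isn't one."""
    try:
        n = int(text)  # raises ValueError for input like "cat" or "3.5"
    except ValueError:
        return None
    return n if n > 0 else None


print(parse_height("4"))    # 4
print(parse_height("cat"))  # None
print(parse_height("-2"))   # None
```

Keeping only the risky call, int(), inside the try block makes it clearer which line the except is guarding.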

So how could we implement horizontal rows of blocks? Again, Python provides us with multiple options:

for i in range(4):
    print("#", end="")

print()

The end argument inside of print() allows us to override its typical behavior, replacing the \n that is normally printed at the end with an empty string. The final empty print() then moves the cursor to the next line.

print("#" * 4)

Much like how we can add strings, there is an option for multiplying them as well. In this case, we get the character # repeated four times without any lines skipped.

Finally, in order to create a square block, we need to account for both rows and columns – this clearly calls for a nested for loop.

for i in range(3):
    for j in range(3):
        print("#", end="")

    print()

Notice how after every three blocks are printed in a row, an empty print statement skips a line, allowing for a new row on the next line.
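As a possible alternative to the nested loop (a sketch of my own, not shown in the lecture), string multiplication can build each row and join() can stack the rows:

```python
def square(n):
    """Return an n-by-n block of # characters as one string."""
    row = "#" * n                            # one row, e.g. "###"
    return "\n".join(row for _ in range(n))  # n rows separated by newlines


print(square(3))
# ###
# ###
# ###
```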

Lists

Another data structure in Python is the list. In fact, lists are considered objects like strings because they have built-in methods.

A list in Python is comparable to an array in C. Let’s take a look at an example:

from cs50 import get_int

scores = []

for i in range(3):
    score = get_int("Score: ")
    scores.append(score)

average = sum(scores) / len(scores)

print(f"Average: {average}")

Assigning an empty set of brackets to the variable scores initializes our list (albeit with no values yet).

Then, we get an integer score from the user and append it (add it to the end of the list) via the append() method, which we repeat thrice.

Finally, we use the sum() function that is included with Python, along with the len() function, to add up all values in scores and divide it by its length, to then print this average value using an f-string.

An alternative to using the append() method would be…

scores += [score]

Here, we wrap our score variable in a list of size one and concatenate that onto our main list scores.
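Both approaches end up with the same list, which this sketch checks using three made-up scores in place of user input:

```python
with_append = []
with_plus = []

for score in [72, 73, 33]:  # hypothetical scores instead of get_int()
    with_append.append(score)
    with_plus += [score]

print(with_append == with_plus)  # True

average = sum(with_append) / len(with_append)
print(f"Average: {average:.2f}")  # Average: 59.33
```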

Command-Line Arguments and Sys

Another library that is part of Python is sys, which allows us to do a number of things, including accessing command-line arguments.

from sys import argv

if len(argv) == 2:
    print(f"hello, {argv[1]}")
else:
    print("hello, world")

Here, our program checks if the length of the list of arguments is exactly two: the filename and the user’s input. If so, the user’s input, or element one of argv, is printed.

What if we wanted to print all of the elements within the arguments array?

Intuitively, you would probably go about this by looping over a list the length of argv and then printing the ith element of that list… but here a problem arises.

Let’s think about what the interpreter would make of our command-line arguments:

python hello.py Bob

Here, python is analogous to ./ in C, so it’s not included in argv. So, because we don’t want to include the filename in our arguments list, we would have to access every element in argv after the first as such:

from sys import argv

for arg in argv[1:]:
    print(arg)

Notice how we’ve used square brackets to slice the list, accessing everything from the second element (index 1) to the last. The syntax for slicing a list is list[initial : final], where the element at index final is not included.

In our case, we started at element one, but where there seemingly should be len(argv), there is nothing. This is simply because leaving the space after the colon empty automatically means ‘up until the end of the list!’
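Slicing is easier to get a feel for with a concrete list. The list below is a hypothetical argv, as if the program had been run with three names after the filename:

```python
argv = ["hello.py", "Alice", "Bob", "Carol"]  # hypothetical command-line arguments

print(argv[1:])   # ['Alice', 'Bob', 'Carol']  (index 1 through the end)
print(argv[1:3])  # ['Alice', 'Bob']           (index 3 is excluded)
print(argv[:1])   # ['hello.py']               (an empty start means index 0)
print(argv[-1])   # Carol                      (negative indexes count from the end)
```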

Another feature, if you will, that the sys library provides us is exit codes:

import sys

if len(sys.argv) != 2:
    print("Missing command-line argument")
    sys.exit(1)

print(f"hello, {sys.argv[1]}")

sys.exit(0)

The ‘exit()’ function simply quits the program, stopping all procedures. Because we imported the entire sys library instead of a specific function from the library, we have to use dot notation to access exit() from sys.

Further, the argument passed into exit() is the specific exit code that can potentially indicate a specific type of error.
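Under the hood, sys.exit() works by raising a SystemExit exception that carries the exit code. The sketch below uses that fact to observe the codes from inside Python; the run and exit_code_of helpers are hypothetical names of mine, and the argument lists are made up:

```python
import sys


def run(argv):
    """Same logic as the program above, but taking argv as a parameter."""
    if len(argv) != 2:
        print("Missing command-line argument")
        sys.exit(1)
    print(f"hello, {argv[1]}")
    sys.exit(0)


def exit_code_of(argv):
    """Run the program and report the exit code it asked for."""
    try:
        run(argv)
    except SystemExit as e:
        return e.code


print(exit_code_of(["hello.py"]))         # 1
print(exit_code_of(["hello.py", "Bob"]))  # 0
```

By convention, 0 means success and any other value signals some kind of failure.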

Searching and Swapping

Recall that in C, you had to loop over every element in a list in order to search for a specific item. In Python, this process can be greatly simplified:

import sys

names = ["Bill", "Charlie", "Fred", "George", "Ginny", "Percy", "Ron"]
name = input("Name: ")

for n in names:
    if n == name:
        print("Found")
        sys.exit(0)

print("Not found")

sys.exit(1)

This implementation of a simple search program is similar to what you would do in C, in that it manually loops over the list via a for loop.

Also, you might have noticed that, unlike in C, Python allows you to compare strings using the == operator, rather than having to use library functions like ‘strcmp()’ (which compares the strings character by character via pointers).

Python also offers a much simpler route in place of the loop:

if name in names:
    print("Found")
    sys.exit(0)

This first line is, in essence, a linear search algorithm, checking if the variable name is in our list names!
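To convince ourselves that in and the manual loop agree, here is a sketch that writes the loop as a function (found_with_loop is a name I made up) and compares the two on the same list:

```python
def found_with_loop(name, names):
    """Linear search written out by hand, as in the loop version."""
    for n in names:
        if n == name:
            return True
    return False


names = ["Bill", "Charlie", "Fred", "George", "Ginny", "Percy", "Ron"]

print(found_with_loop("Ron", names), "Ron" in names)  # True True
print(found_with_loop("ron", names), "ron" in names)  # False False (comparison is case-sensitive)
```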

Swapping in Python is also much easier. Instead of having to create a temporary variable, you could swap two values as follows:

x = 1
y = 2

print(f"x is {x}, y is {y}")
x, y = y, x
print(f"x is {x}, y is {y}")

Phonebook: Dictionaries and CSV

A dictionary is a data type with each element containing a key and value (key-value pairs).

An implementation of a phonebook in Python using dictionaries, or dicts, could look something like this:

from cs50 import get_string

people = {
    "Carter": "+1-617-495-1000",
    "David": "+1-949-468-2750"
}

name = get_string("Name: ")

if name in people:
    print(f"Number: {people[name]}")

Our dictionary people contains two elements, each with a key (in this case a name), and a value (a phone number).

The program checks if a user-inputted name exists in the dictionary and accordingly prints out the value corresponding to the key name by indexing into the dictionary.
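Indexing with a key that isn't in the dictionary raises a KeyError, which is why the program checks with in first. One alternative, sketched here with a hypothetical lookup helper, is the dict method get(), which returns a fallback value instead:

```python
people = {
    "Carter": "+1-617-495-1000",
    "David": "+1-949-468-2750",
}


def lookup(name):
    """Return the number for name, or "Not found" if the key is missing."""
    return people.get(name, "Not found")


print(lookup("Carter"))  # +1-617-495-1000
print(lookup("Eve"))     # Not found
```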

One more thing – if you try calling the numbers listed in the ‘phonebook’, you’ll find a special easter egg from the course!

Finally, to wrap up this lecture, we’ll take a look at how Python can handle CSV files via the csv library.

import csv

name = input("Name: ")
number = input("Number: ")

with open("phonebook.csv", "a") as file:
    writer = csv.writer(file)
    writer.writerow([name, number])

Evidently, this program first retrieves a name and number from the user. We’re more focused on what happens after.

We’ve seen the ‘open()’ function operate under read mode, or “r” in Python. This time, “a” signifies append mode, which creates the file phonebook.csv if it doesn’t exist yet and otherwise adds to the end of it.

Where previously we assigned the result of ‘open()’ to a variable and then called the ‘close()’ method ourselves, this time we use with, which lets us write to the file using writer in the indented area below and then closes the file for us automatically when that block ends.

As you might have guessed, ‘as file’ is the same as assigning the value returned by open() to a variable file. Finally, writerow() appends the user-inputted name and number as a new row of the CSV file.

Perhaps a more organized version of this code would be through use of a DictWriter to create specifically named columns. We can modify the contents of the with block as follows:

with open("phonebook.csv", "a") as file:
    writer = csv.DictWriter(file, fieldnames=["name", "number"])
    writer.writerow({"name": name, "number": number})

name and number are defined under fieldnames and then are accessed later in a dictionary-like format (each column name is the key and the user-inputted variable is the value).
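The named columns pay off when reading the data back. The sketch below writes one row and then reads it with csv.DictReader; to stay self-contained it uses an in-memory io.StringIO buffer in place of phonebook.csv and adds a header row with writeheader(), both of which are my assumptions rather than part of the lecture code:

```python
import csv
import io

buffer = io.StringIO()  # in-memory stand-in for phonebook.csv

writer = csv.DictWriter(buffer, fieldnames=["name", "number"])
writer.writeheader()  # writes the "name,number" header row
writer.writerow({"name": "Carter", "number": "+1-617-495-1000"})

buffer.seek(0)  # rewind so we can read what was just written
rows = list(csv.DictReader(buffer))  # each row comes back keyed by column name

print(rows[0]["name"])    # Carter
print(rows[0]["number"])  # +1-617-495-1000
```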

Final Thoughts

Next week will focus on SQL, a language often used alongside Python that will allow us to manipulate databases (and a personal favorite of mine).

As always, give this week’s Problem Set a shot. That’s all for Week 6!

See you soon! Meanwhile, stay tuned for updates by following my blog and LinkedIn page.
