Documentation & Code Style#

Q&A (5/28)

Q: can we go over relative and absolute paths? I don’t recall going over that one too in depth (i may have been distracted lwk)
A: I won’t spend time in class, as it’s not a main focus going forward, but to review: https://shanellis.github.io/pythonbook/content/02-getting-started/file_paths.html

Q: I’m starting to get lost, are there any more video quizzes to clarify that go more in depth?
A: No more quizzes/videos, but there is text in the textbook in the Strong Code section: https://shanellis.github.io/pythonbook/content/08-good-code/good_code_intro.html

Q: How will this be used for our final project? Will we be discussing the final project anymore in class?
A: Your final project/exam will require code be stored in a module. We will continue to discuss the final project.

Course Announcements (5/28)

Due this week:

  • A5 due Sunday (Scientific Computing)

Notes:

  • CL8 due next Friday (Testing & Documentation)

  • Please complete your SETs

  • Practice Final available today at 5P

    • cumulative (Variables-Code Projects)

    • 6 MC; 5 SA; 1 Debugging; 1 Testing

    • take-home final focuses on material post-E2

Q&A (6/2)

Q: what is an edge case?
A: By definition “a situation that occurs at the extreme boundaries of expected input or behavior.” In practice this means testing/considering the slightly less expected inputs to a function.
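
As a small illustration of that definition, here is a sketch that checks one typical input and then a few edge cases. The `average` function and the inputs chosen are hypothetical, made up for this example rather than taken from the course materials.

```python
def average(numbers):
    """Return the mean of a list of numbers, or None for an empty list."""
    if len(numbers) == 0:
        return None
    return sum(numbers) / len(numbers)


# typical input
assert average([1, 2, 3]) == 2

# edge cases: inputs at the boundaries of what the function expects
assert average([]) is None      # empty list
assert average([5]) == 5        # single element
assert average([-1, 1]) == 0    # values that cancel out
```

Which edge cases matter depends on the function; empty, single-element, and negative inputs are common starting points.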

Q: How do we test code that involves user input?
A: Great question! You’re able to “mock” what the user input would be in your test. See examples here: https://realpython.com/python-unittest/#testing-with-fake-objects-unittestmock or this discussion I had with ChatGPT.
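
A minimal sketch of that idea, assuming a hypothetical `ask_name` function that calls `input()`. It uses `patch` from the standard library's `unittest.mock` to replace `input` during the test, so the test never waits for someone to type.

```python
from unittest.mock import patch


def ask_name():
    """Ask the user for their name and return a greeting."""
    name = input('What is your name? ')
    return 'Hello, ' + name + '!'


def test_ask_name():
    # replace the built-in input() so that it returns 'Ada'
    # instead of waiting for someone to type
    with patch('builtins.input', return_value='Ada'):
        assert ask_name() == 'Hello, Ada!'


test_ask_name()
```

Outside the `with` block, `input` behaves normally again.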

Q: how heavy is the final for this material? (Testing)
A: There is a “big” testing question at the end of the in-person exam where you’re asked to write the methods to test a provided function. On the take-home it will be required as well (but you’ll have your notes/Internet there).

Course Announcements (6/2)

Due:

  • CL8 due Friday (Testing & Documentation)

  • Final Project:

    • submit project by Sun

    • take oral exam next week at assigned slot

      • Note: you should already be signed up for a slot or have added your availability on the pinned Piazza post

  • Final Exam:

    • take in-person at assigned slot

      • Note: Practice Exam is on PL

    • submit take-home by Tues of finals week (released this Fri at 5P)

Notes:

  • Please complete your SETs

  • On Thurs we’ll discuss/demo code projects (how all the pieces fit together), breakdown of topics covered on in-person final exam, and wrap things up

Today’s Plan#

  • Documentation

    • code comments

    • docstrings

  • Code Style

    • readability

    • PEP8 (style guides)

Strong Code Projects#

Projects are trusted when they are:

  1. well written with good code style

  2. well documented

  3. tested

Code Documentation#

Code documentation is text that accompanies and/or is embedded within a software project, that explains what the code is and how to use it.

Stuff written for humans in human language to help the humans.

Code Comments vs. Documentation#

Comments#

Comments are notes written directly in the code (in Python, text following a #), typically directed at developers - people reading and potentially writing the code.

Documentation#

Documentation consists of descriptions and guides written for code users.

Docstrings#

Docstrings are in-code text that describe modules, classes and functions. They describe the operation of the code.

Numpy style docs#

Numpy style docs are a particular specification for docstrings.

Example Docstring#

def add(num1, num2):
    """Add two numbers together. 
    
    Parameters
    ----------
    num1 : int or float
        The first number, to be added. 
    num2 : int or float
        The second number, to be added.
    
    Returns
    -------
    answer : int or float
        The result of the addition. 
    """
    
    answer = num1 + num2
    
    return answer

Docstrings

  • multi-line string that describes what’s going on

  • starts and ends with triple quotes """

  • one sentence overview at the top - the task/goal of function

  • Parameters : description of function arguments, keywords & respective types

  • Returns : explanation of returned values and their types

Docstrings are available to you outside of the source code.

Note: GenAI is pretty good at generating docstrings…but read what it outputs if you use it

Docstrings are available through the code#

# In Jupyter/IPython, a trailing `?` displays the docstring
add?
# The `help` function prints out the `docstring` 
# of an object (module, function, class)
help(add)

__doc__#

# Docstrings get stored as the `__doc__` attribute
# can also be accessed from there
print(add.__doc__)

Documentation for a Software Project#

For our final project/exams, we’ll only require code comments and docstrings but documentation IRL goes beyond this…

Documentation Files:

  • A README is a file that provides an overview of the project to potential users and developers

  • A LICENSE file specifies the license under which the code is available (the terms of use)

  • An API Reference is a collection of the docstrings, listing public interfaces, parameters and return values

  • Tutorials and/or Examples walk through how to use the codebase
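
To illustrate how an API reference can be assembled from docstrings, here is a rough sketch using the standard library's `inspect` module. The `build_reference` helper and the two small functions are hypothetical, written only for this example; real projects typically rely on a documentation tool instead of a hand-written helper.

```python
import inspect


def add(num1, num2):
    """Add two numbers together."""
    return num1 + num2


def subtract(num1, num2):
    """Subtract the second number from the first."""
    return num1 - num2


def build_reference(functions):
    """Collect each function's signature and docstring into one string."""
    entries = []
    for function in functions:
        signature = inspect.signature(function)
        docstring = inspect.getdoc(function)
        entries.append(f'{function.__name__}{signature}\n    {docstring}')
    return '\n\n'.join(entries)


print(build_reference([add, subtract]))
```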

Documentation Sites#

Documentation sites host a package's documentation for code users.

Example: Function Documentation#

numpy.array : https://docs.scipy.org/doc/numpy/reference/generated/numpy.array.html

Example: Package Documentation#

scikit learn (sklearn) : https://scikit-learn.org/stable/index.html

Activity: Documentation#

Answer the following questions in the Google Form: https://forms.gle/ehEqET4L6pntWLAg6

  1. What should minimally be included in a numpy-style docstring?

  2. Use an LLM to generate (or do it by hand) a numpy-style docstring for the following class:

class Person:

    def __init__(self, name, age):
        self.name = name
        self.age = age

    def birthday(self):
        self.age += 1
        return 'Happy Birthday!'
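
For reference, one possible numpy-style docstring for this class is sketched below. The wording of the descriptions and the assumed types (`str` for `name`, `int` for `age`) are guesses based on how the class is used, not an official answer.

```python
class Person:
    """Represent a person with a name and an age.

    Parameters
    ----------
    name : str
        The person's name.
    age : int
        The person's age, in years.

    Attributes
    ----------
    name : str
        The person's name.
    age : int
        The person's current age, in years.
    """

    def __init__(self, name, age):
        self.name = name
        self.age = age

    def birthday(self):
        """Increase the person's age by one year.

        Returns
        -------
        str
            The message 'Happy Birthday!'.
        """
        self.age += 1
        return 'Happy Birthday!'
```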

Code Style#

Code Readability (API)#

“Code is more often read than written” - Guido van Rossum

So: code should be written to be readable by humans.

Note: one of those humans is future you.

The Zen of Python#

import this

Writing Readable Code#

So how do we write good code for humans?

  • Use good structure

  • Use good naming

  • Use code comments and include documentation

Good Structure#

If you design your program using separate functions for each task, avoid copying + pasting (use functions and loops instead), and consider structure beforehand, you’ll be set up for success.
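
A short sketch of the difference, using a made-up temperature conversion (the function name and values are hypothetical). The first two lines copy the same formula; the rest puts the task in one function and applies it with a loop.

```python
# Repetitive version: the same conversion is copied for each value
temp_1 = (32 - 32) * 5 / 9
temp_2 = (212 - 32) * 5 / 9


def fahrenheit_to_celsius(temp_f):
    """Convert a temperature from Fahrenheit to Celsius."""
    return (temp_f - 32) * 5 / 9


# Structured version: one function for the task, one loop to apply it
temps_c = []
for temp_f in [32, 212, 98.6]:
    temps_c.append(fahrenheit_to_celsius(temp_f))
```

If the formula ever needs to change, the structured version only has to be edited in one place.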

Good Naming#

Clear names are for humans. The computer doesn’t care, but you and others reading your code do.

Code Comments & Documentation#

Helpful comments and documentation take your code to the next level. The final piece in the trifecta of readable code!

Good code has good documentation - but code documentation should not be used to try and fix unclear names, or bad structure.

Rather, comments should add any additional context and information that helps explain what the code is, how it works, and why it works that way.

Motivation: Think-Pair-Share#

Consider both functions…for which one is it easier to figure out what it does? Discuss why.


def ff(jj):
    oo = []; jj = list(jj) 
    for ii in jj: oo.append(str(ord(ii)))
    return '+'.join(oo)
ff('Hello World.')

def convert_to_unicode(input_string):
    string = []
    input_list = list(input_string)
      
    for character in input_list: 
        string.append(str(ord(character)))
        
    output_string = '+'.join(string)
    return output_string

convert_to_unicode('Hello World.')
  • A) Returns unicode code points, as a list

  • B) Encodes a string as a cypher, returning a string of alphabetical characters

  • C) Returns unicode code points, as a string

  • D) Encodes the input as alphabetical characters, returned as a list

  • E) This code will fail

…and documentation#

That example from before…Improvement Considerations:

  • Structural considerations: indentations & spacing

  • Improved naming: functions & variables

  • Add Comments within code

  • Proper Documentation!

def convert_to_unicode(input_string):
    """Converts an input string into a string containing the unicode code points.
    
    Parameters
    ----------
    input_string : string
        String to convert to code points
        
    Returns
    -------
    output_string : string
        String containing the code points for the input string.
    """ 
    
    string = []
    # convert the input string to a list of its characters
    input_list = list(input_string)

    for character in input_list: 
        string.append(str(ord(character)))
        
    output_string = '+'.join(string)
    return output_string

convert_to_unicode('Hello World.')

Code Style Review#

Or: How to be Pythonic

Reasons to be Pythonic:

  • user friendly for humans

  • extra work up front for the developers (pays off in the long run)

  • best to practice this early on (I promise!)

Style Guides#

Coding style refers to a set of conventions for how to write good code.

Consistency is the goal. Rules help us achieve consistency.

Much of this will apply to other programming languages, so it’s good to learn…regardless of language.

Some of these Code Style notes will be more specific to Python, however.

Python Enhancement Proposals (PEPs)#

Python PEPs are proposals for how something should be / work in the Python programming language.

These are written by the people responsible for the Python Programming language.

PEPs are voted on before incorporation.

PEP8#

PEP8 is an accepted proposal that outlines the style guide for Python.

Defines the style guide for Pythonistas (people who code in Python).

Code Style: Structure#

  • blank lines

  • indentation

  • spacing

  • length <- NEW

  • imports

Blank Lines#

  • Use 2 blank lines between functions & classes, and 1 between methods

  • Use 1 blank line between segments to indicate logical structure
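
A small sketch of these blank-line rules in one place. The `circle_area` function and `Circle` class are hypothetical names chosen for the example.

```python
import math


def circle_area(radius):
    """Return the area of a circle."""
    return math.pi * radius ** 2


class Circle:
    """A circle with a given radius."""

    def __init__(self, radius):
        self.radius = radius

    def area(self):
        """Return the area of this circle."""
        return circle_area(self.radius)

    def describe(self):
        """Return a short text description of this circle."""
        # first segment: compute the value
        area = self.area()

        # second segment: build the output
        return f'Circle with area {area:.2f}'
```

Two blank lines separate the function from the class, one blank line separates the methods, and one blank line inside `describe` splits its two logical segments.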

Indentation#

Use spaces to indicate indentation levels, with each level defined as 4 spaces.

Spacing#

  • Put one (and only one) space between each element

  • Indexing and function calls don’t have a space before the opening ‘(’ or ‘[’

my_list = ['a', 'b', 'c']
# avoid having space after list and before square brackets
my_list [0]
# instead do this:
my_list[0]

Line Length (NEW)#

  • PEP8 recommends that each line be at most 79 characters long

Older terminals could only display about 80 characters per line, so this used to be a requirement.

But, super long lines are hard to read at a glance.

Multi-Line#

my_long_list = [1, 2, 3, 4, 5,
                6, 7, 8, 9, 10]
# Note: you can explicitly indicate a new line with '\'
my_string = 'Python is ' + \
            'a pretty great language.'
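
PEP8 prefers wrapping long lines with implicit continuation inside parentheses, brackets, or braces over using a backslash. A brief sketch of that alternative (the variable `total` is just an illustrative name):

```python
# implicit continuation: the parentheses let the expression span lines
my_string = ('Python is '
             'a pretty great language.')

# the same idea works for arithmetic; break before the operator
total = (1 + 2 + 3
         + 4 + 5 + 6)
```

Adjacent string literals inside the parentheses are joined into a single string, so no `+` is needed.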

One Statement Per Line#

  • While you can condense multiple statements into one line, you usually shouldn’t.

# Badness
for i in [1, 2, 3]: print(i**2 + i%2)
# Goodness
for i in [1, 2, 3]:
    print(i**2 + i%2)

Imports#

  • Import one module per line

  • Avoid * imports

  • Use the import order: standard library; 3rd party packages; local / custom code

# Badness
from numpy import *

import os, sys
# Goodness
import os
import sys

import numpy as np

Note: If you don’t know how to import a local/custom module, figure that out this week in office hours.

Naming (review)#

Valid Names#

  • Use descriptive names for all modules, variables, functions and classes, that are longer than 1 character

Naming Style#

  • CapWords (leading capitals, no separation) for Classes

  • snake_case (all lowercase, underscore separator) for variables, functions, and modules
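
A quick sketch of both conventions together. `ShoppingCart` and its methods are hypothetical names invented for this example.

```python
class ShoppingCart:
    """A cart that keeps track of item prices."""

    def __init__(self):
        self.item_prices = []

    def add_item(self, item_price):
        """Add the price of one item to the cart."""
        self.item_prices.append(item_price)

    def total_price(self):
        """Return the sum of all item prices in the cart."""
        return sum(self.item_prices)


my_cart = ShoppingCart()
my_cart.add_item(2.50)
my_cart.add_item(4.00)
```

The class uses CapWords, while the instance, attribute, methods, and parameters all use snake_case.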

Code Comments#

Inline code comments#

Inline code comments should use #, and be written at the same indentation level as the code they describe.

How to use comments:

  • Generally:

    • focus on the how and why, over literal ‘what is the code’

    • explain any context needed to understand the task at hand

    • give a broad overview of what approach you are taking to perform the task

    • if you’re using any unusual approaches, explain what they are, and why you’re using them

  • Comments need to be maintained - make sure to keep them up to date

Bad Comments

# This is a loop that iterates over elements in a list
for element in list_of_elements:
    pass

Good Comments

# Because of X, we will use approach Y to do Z
for element in list_of_elements:
    # comment for code block
    pass

Code Style: Comments#

Out-of-date comments are worse than no comments at all.

Keep your comments up-to-date.

Block comments#

  • apply to some (or all) code that follows them

  • are indented to the same level as that code.

  • Each line of a block comment starts with a # and a single space

# Badness
import random

def week_10():
# help try to destress students by picking one thing from the following list using random
    statements = ["You've totally got this!","You're so close!","You're going to do great!","Remember to take breaks!","Sleep, water, and food are really important!"]
    out = random.choice(statements)
    return out

week_10()
# Goodness
def week_10():
    
    # Randomly pick from list of de-stressing statements
    # to help students as they finish the quarter.
    statements = ["You've totally got this!", 
                  "You're so close!", 
                  "You're going to do great!",
                  "Remember to take breaks!",
                  "Sleep, water, and food are really important!"]
    
    out = random.choice(statements)
    
    return out

week_10()

Inline comments#

  • to be used sparingly

  • to be separated by at least two spaces from the statement

  • start with a # and a single space

# Badness
week_10()#words of encouragement
# Goodness
week_10()  # words of encouragement

Activity: Code Style#

Complete the Google Form: https://forms.gle/hxeW5qCXQAad7ZfH9

Improve the code style of the provided class, using PEP8 guidelines:

#original
class CoffeeTracker:
    def __init__(self,n):
        self.n=n
        self.ts=0.0
        self.tc=0
        self.ap=0.0
    def buy_coffee(self,p):
        for p1 in p:self.ts+=p1;self.tc+=1
        if self.tc>0:self.ap=self.ts/self.tc
        print(f"{self.n} bought {self.tc} coffees.")
        print(f"Average price per coffee: ${self.ap:.2f}")

Suggested improvements:

#improved
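
One possible improved version is sketched below. The longer attribute and parameter names (`name`, `total_spent`, `total_coffees`, `average_price`, `prices`) are guesses at what the original abbreviations meant, so treat them as assumptions rather than the official solution.

```python
class CoffeeTracker:
    """Track coffee purchases and the average price paid.

    Parameters
    ----------
    name : str
        Name of the person buying coffee.
    """

    def __init__(self, name):
        self.name = name
        self.total_spent = 0.0
        self.total_coffees = 0
        self.average_price = 0.0

    def buy_coffee(self, prices):
        """Record purchases and print a summary.

        Parameters
        ----------
        prices : list of float
            Price of each coffee bought.
        """
        for price in prices:
            self.total_spent += price
            self.total_coffees += 1

        # avoid dividing by zero when no coffee has been bought
        if self.total_coffees > 0:
            self.average_price = self.total_spent / self.total_coffees

        print(f"{self.name} bought {self.total_coffees} coffees.")
        print(f"Average price per coffee: ${self.average_price:.2f}")
```

The behavior is unchanged; the edits are descriptive names, one statement per line, spaces around operators and after commas, blank lines between methods and logical segments, and docstrings.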