# Code Style & Documentation

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/COGS18/LectureNotes-COGS18/blob/main/14-Documentation.ipynb)

**Q&A**

> Q: What is the difference between assertEqual and assertTrue?  
> A: `assertEqual` takes in two inputs and checks that the first input is equal to the second; `assertTrue` expects that whatever is in the parentheses evaluates to `True`
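
As a small illustration of that answer, here is a sketch with a made-up test class (`TestAssertExamples` is a hypothetical name, not from the course materials). Both tests check the same fact, once with each method:

```python
import unittest


class TestAssertExamples(unittest.TestCase):

    def test_equal(self):
        # assertEqual compares two values and passes when first == second
        self.assertEqual(2 + 2, 4)

    def test_true(self):
        # assertTrue takes one expression and passes when it is truthy
        self.assertTrue(2 + 2 == 4)


# Run the tests in a notebook-friendly way (no sys.exit)
suite = unittest.TestLoader().loadTestsFromTestCase(TestAssertExamples)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

When a test fails, `assertEqual` reports both values, which usually makes it the more informative choice for comparisons.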

**Course Announcements**

Due this week:
- **CL9** (optional, for 2 pts EC...but **you should do it** to get testing practice)
- E1 or E2 retake (*optional*)
- **Final Exam** (6/7-6/13)
- **Post-course assessment** (due 6/13; required; will be available Friday)

Notes: 
- Please complete your SETs
- Handful of questions about **exam retake**:
    - Your final E1/E2 score after retake is 75% of your highest and 25% of your lowest?
    - What does that math look like? Well, let's say you got a 9/12.5 originally and you get 11.5/12.5 on the retake: `0.25*9 + 0.75*11.5 = 10.875`, or about 10.88 (+1.88 to your final grade)
    - Could my score go down? Yes
    - Sign up on PrairieTest; take in same location (TTC-CBTF)
- **E2 Summary**:
    - Class Average: 70%
    - 31/580 (5%) earned perfect marks
    - 11% higher average for those coming to class than those who didn't
- Do you have to take the final exam if you took the course P/NP and already have >=70 pts? No.
    - If you have a 70% currently, do you have to take the exam? YES! 
- Canvas Gradebook will be updated by EOD Friday with Lecture EC and Oral Exam 2 score
- Office Hours end this Friday (no office hours during finals week; I will monitor Ed/email)

**Final Exam** (13 pts; 1h50min)

- 13 MC (6.5 pts total; 0.5 pt each)
    - Topics: Variables, Operators, Functions, Conditionals, Loops, Classes, Command Line, Imports/File Paths, Scientific Computing (2), Code Testing, Documentation, Code Style
- 2 Code Reading & Debugging Qs (3.5 pts) 
    -  Function (1.5 pts)
    -  Class (2 pts)
- Testing Questions (3pts; 1.5 pt each `unittest`)
    - 1 uses `pandas`
 
Notes:
- No mini-project
- **Practice exam will be very good practice for all of these**
- Information Provided:
    - `pandas` functions/methods
    - `unittest` framework w/ list of `assert` statements discussed in class

## Today's Plan

- Code Style
    - readability
    - PEP8 (style guides)
    - linters
- Documentation 
    - code comments
    - docstrings
- (Revisiting) testing
- versioning

## Code Style

### Code Readability (API)

"Code is more often read than written" - Guido van Rossum

So: code should be written to be readable by humans.

Note: one of those humans is future you.

### The Zen of Python

In [None]:
import this

### Writing Readable Code

So how do we write good code for humans?

- Use good structure
- Use good naming
- Use code comments and include documentation

### Good Structure

If you design your program using separate functions for each task, avoid copying + pasting (use functions and loops instead), and consider structure beforehand, you'll be set up for success.
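
A minimal sketch of that idea, using hypothetical names and data (`to_celsius`, `temps_f`) that are not from the lecture. Instead of copying the same calculation three times, one function does the task and one loop applies it:

```python
def to_celsius(temp_f):
    """Convert a temperature from Fahrenheit to Celsius."""
    return (temp_f - 32) * 5 / 9


# Hypothetical data, for illustration only
temps_f = [32, 212, 98.6]

# One function + one loop replaces three copy-pasted calculations
temps_c = []
for temp in temps_f:
    temps_c.append(round(to_celsius(temp), 1))

print(temps_c)
```

If the conversion ever needs to change, there is only one place to edit.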

### Good Naming

Clear names are for humans. The computer doesn't care, but you and others reading your code do.

### Code comments & Documentation

Helpful comments and documentation take your code to the next level. The final piece in the trifecta of readable code! 

Good code has good documentation - but code documentation should _not_ be used to try and fix unclear names, or bad structure. 

Rather, comments should add any additional context and information that helps explain what the code is, how it works, and why it works that way. 

### Motivation: Think-Pair-Share
What does the following code do?

In [None]:
def ff(jj):
    oo = []; jj = list(jj) 
    for ii in jj: oo.append(str(ord(ii)))
    return '+'.join(oo)
ff('Hello World.')

- A) Returns unicode code points, as a list
- B) Encodes a string as a cypher, returning a string of alphabetical characters
- C) Returns unicode code points, as a string
- D) Encodes inputs alphabetical characters, returned as a list
- E) This code will fail

Improvement Considerations (Style)
- Structural considerations: indentations & spacing
- Improved naming: functions & variables

In [None]:
def convert_to_unicode(input_string):
    string = []
    input_list = list(input_string)
      
    for character in input_list: 
        string.append(str(ord(character)))
        
    output_string = '+'.join(string)
    return output_string

convert_to_unicode('Hello World.')

- A) Returns unicode code points, as a list
- B) Encodes a string as a cypher, returning a string of alphabetical characters
- C) Returns unicode code points, as a string
- D) Encodes inputs alphabetical characters, returned as a list
- E) This code will fail

### Code Style Review

Or: How to be Pythonic

Reasons to be Pythonic:
- user friendly for humans
- extra work up-front for the developers (pays off in the long run)
- best to practice this early on (I promise!)

### Style Guides

<div class="alert alert-success">
Coding style refers to a set of conventions for how to write good code. 
</div>

Consistency is the goal. Rules help us achieve consistency.

Much of this will apply to other programming languages, so it's good to learn...regardless of language.

Some of these Code Style notes will be more specific to Python, however.

#### Python Enhancement Proposals (PEPs)

<div class="alert alert-success">
Python PEPs are proposals for how something should be / work in the Python programming language. 
</div>

These are written by the people responsible for the Python Programming language.

PEPs are reviewed and then accepted (or rejected) before incorporation.

#### PEP8

<div class="alert alert-info">
<b><a href="https://www.python.org/dev/peps/pep-0008/">PEP8</a></b> is an accepted proposal that outlines the style guide for Python.
</div>

Defines the style guide for Pythonistas (people who code in Python).

### Code Style: Structure

- blank lines
- indentation
- spacing 
- length <- NEW
- imports

#### Blank Lines

- Use 2 blank lines between functions & classes, and 1 between methods
- Use 1 blank line between segments to indicate logical structure

#### Indentation

Use spaces to indicate indentation levels, with each level defined as 4 spaces. 
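
A short sketch of what that looks like in practice (`count_even` is a hypothetical function written just for this illustration). Each nested block adds exactly 4 more spaces:

```python
def count_even(numbers):
    # level 1: inside the function (4 spaces)
    count = 0
    for number in numbers:
        # level 2: inside the loop (8 spaces)
        if number % 2 == 0:
            # level 3: inside the conditional (12 spaces)
            count += 1
    return count


count_even([1, 2, 3, 4])
```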

#### Spacing

- Put one (and only one) space between each element
- Indexing and function calls have no space before the opening '(' or '['

In [None]:
my_list = ['a', 'b', 'c']
# avoid having space after list and before square brackets
my_list [0]
# instead do this:
my_list[0]

#### Line Length (NEW)

- PEP8 recommends that each line be at most 79 characters long

Older terminals and tools were often limited to 80 columns, so this used to be a hard constraint.

But, super long lines are hard to read at a glance.
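
To make the limit concrete, here is a rough sketch of a checker that reports which lines of a piece of source code exceed it. The function name `find_long_lines` and the example text are assumptions made up for this illustration; a real linter does this (and much more) for you.

```python
MAX_LENGTH = 79  # PEP8's recommended maximum


def find_long_lines(source, max_length=MAX_LENGTH):
    """Return the 1-based line numbers of lines longer than max_length."""
    long_lines = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        if len(line) > max_length:
            long_lines.append(line_number)
    return long_lines


# Hypothetical source: a short first line and a very long second line
example = "short = 1\n" + "x = " + "1 + " * 25 + "1\n"
find_long_lines(example)
```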

#### Multi-Line

In [None]:
my_long_list = [1, 2, 3, 4, 5,
                6, 7, 8, 9, 10]

In [None]:
# Note: you can explicitly indicate a new line with '\'
my_string = 'Python is ' + \
            'a pretty great language.'
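
PEP8 generally prefers wrapping long lines with the implied continuation inside parentheses, brackets, or braces over a trailing `\`. A brief sketch of the same string written that way (the `total` variable is an extra, made-up example):

```python
# Parentheses let an expression span lines without a trailing '\'
my_string = ('Python is '
             'a pretty great language.')

# The same works for arithmetic and other long expressions
total = (1 + 2 + 3
         + 4 + 5)
```

Adjacent string literals inside the parentheses are joined into a single string, so no `+` is needed.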

#### One Statement Per Line

- While you *can* condense multiple statements into one line, you usually shouldn't.

In [None]:
# Badness
for i in [1, 2, 3]: print(i**2 + i%2)

In [None]:
# Goodness
for i in [1, 2, 3]:
    print(i**2 + i%2)

#### Imports

- Import one module per line
- Avoid `*` imports
- Use the import order: standard library; 3rd party packages; local / custom code

In [None]:
# Badness
from numpy import *

import os, sys

In [None]:
# Goodness
import os
import sys

import numpy as np

Note: If you don't know how to import a local/custom module, figure that out this week in office hours.

### Naming (review)

#### Valid Names

- Use descriptive names for all modules, variables, functions and classes, that are longer than 1 character

#### Naming Style

- CapWords (leading capitals, no separation) for Classes
- snake_case (all lowercase, underscore separator) for variables, functions, and modules
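
A small sketch showing both conventions side by side. The class `StudyGroup` and everything in it is hypothetical and exists only to illustrate the naming styles:

```python
class StudyGroup:  # CapWords for the class name
    """A small, hypothetical class used only to illustrate naming."""

    def __init__(self, group_name):  # snake_case for parameters
        self.group_name = group_name
        self.member_names = []  # snake_case for attributes

    def add_member(self, member_name):  # snake_case for methods
        self.member_names.append(member_name)


cogs_group = StudyGroup('COGS 18')  # snake_case for variables
cogs_group.add_member('Ada')
```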

### Code Comments

#### Inline code comments

Inline code comments should use `#` and be written at the same indentation level as the code they describe. 

How to use comments:
- Generally:
    - focus on the *how* and *why*, over literal 'what is the code'
    - explain any context needed to understand the task at hand
    - give a broad overview of what approach you are taking to perform the task
    - if you're using any unusual approaches, explain what they are, and why you're using them
- Comments need to be maintained - make sure to keep them up to date

**Bad Comments**

In [None]:
# This is a loop that iterates over elements in a list
for element in list_of_elements:
    pass

**Good Comments**

In [None]:
# Because of X, we will use approach Y to do Z
for element in list_of_elements:
    # comment for code block
    pass

#### Code Style: Comments

Out-of-date comments are worse than no comments at all.

Keep your comments up-to-date.

#### Block comments
- apply to some (or all) code that follows them
- are indented to the same level as that code. 
- Each line of a block comment starts with a # and a single space

In [None]:
# Badness
import random

def week_10():
# help try to destress students by picking one thing from the following list using random
    statements = ["You've totally got this!","You're so close!","You're going to do great!","Remember to take breaks!","Sleep, water, and food are really important!"]
    out = random.choice(statements)
    return out

week_10()

In [None]:
# Goodness
import random


def week_10():
    
    # Randomly pick from list of de-stressing statements
    # to help students as they finish the quarter.
    statements = ["You've totally got this!", 
                  "You're so close!", 
                  "You're going to do great!",
                  "Remember to take breaks!",
                  "Sleep, water, and food are really important!"]
    
    out = random.choice(statements)
    
    return out

week_10()

#### Inline comments
- to be used sparingly
- to be separated by at least two spaces from the statement
- start with a # and a single space

In [None]:
# Badness
week_10()#words of encouragement

In [None]:
# Goodness
week_10()  # words of encouragement

### Activity: Code Style

Complete the Google Form: [https://forms.gle/aS3VoiKuT8vNHEgK9](https://forms.gle/aS3VoiKuT8vNHEgK9)

> Improve the code style of the provided class, using PEP8 guidelines:

In [None]:
#original
class CoffeeTracker:
    def __init__(self,n):
        self.n=n
        self.ts=0.0
        self.tc=0
        self.ap=0.0
    def buy_coffee(self,p):
        for p1 in p:self.ts+=p1;self.tc+=1
        if self.tc>0:self.ap=self.ts/self.tc
        print(f"{self.n} bought {self.tc} coffees.")
        print(f"Average price per coffee: ${self.ap:.2f}")

Suggested improvements:
- meaningful variable names (n -> name; ts -> total_spent; tc -> total_coffees; ap -> avg_price; p -> prices)
- line spacing between methods
- compound statements (for/if) -> split across more than one line
- spacing after commas and around operators (self,n -> self, n; self.n=n -> self.n = n)

In [None]:
#improved
class CoffeeTracker:
    
    def __init__(self, name):
        self.name = name
        self.total_spent = 0.0
        self.total_coffees = 0
        self.avg_price = 0.0
        
    def buy_coffee(self, prices):
        
        for price in prices:
            self.total_spent += price
            self.total_coffees += 1
            
        if self.total_coffees > 0:
            self.avg_price = self.total_spent / self.total_coffees
            
        print(f"{self.name} bought {self.total_coffees} coffees.")
        print(f"Average price per coffee: ${self.avg_price:.2f}")

## Code Documentation

<div class="alert alert-success">
Code documentation is text that accompanies and/or is embedded within a software project, that explains what the code is and how to use it. 
</div>

Stuff written for humans in human language to help the humans.

### Code Comments vs. Documentation

#### Comments

Comments are text written directly in the code (and ignored by Python when it runs), typically directed at developers - people reading and potentially writing the code. 

#### Documentation

Documentation consists of descriptions and guides written for code users. 

## Returning to `convert_to_unicode()`
That example from before...Improvement Considerations:
- Structural considerations: indentations & spacing
- Improved naming: functions & variables
- Add Comments within code
- **Proper Documentation!**

In [None]:
def convert_to_unicode(input_string):
    """Converts an input string into a string containing the unicode code points.
    
    Parameters
    ----------
    input_string : string
        String to convert to code points
        
    Returns
    -------
    output_string : string
        String containing the code points for the input string.
    """ 
    
    string = []
    # convert input string to a list of its characters
    input_list = list(input_string)

    for character in input_list: 
        string.append(str(ord(character)))
        
    output_string = '+'.join(string)
    return output_string

convert_to_unicode('Hello World.')

## Docstrings

<div class="alert alert-success">
<b>Docstrings</b> are in-code text that describe modules, classes and functions. They describe the operation of the code.
</div>

### Numpy style docs
[Numpy style docs](https://numpydoc.readthedocs.io/en/latest/format.html) are a particular specification for docstrings. 

### Example Docstring

In [None]:
def add(num1, num2):
    """Add two numbers together. 
    
    Parameters
    ----------
    num1 : int or float
        The first number, to be added. 
    num2 : int or float
        The second number, to be added.
    
    Returns
    -------
    answer : int or float
        The result of the addition. 
    """
    
    answer = num1 + num2
    
    return answer

Docstrings

- multi-line string that describes what's going on
- starts and ends with triple quotes `"""`
- one sentence overview at the top - the task/goal of function
- **Parameters** : description of function arguments, keywords & respective types
- **Returns** : explanation of returned values and their types

**Docstrings** are available to you *outside* of the source code.

Note: ChatGPT is pretty good at generating docstrings...

### Docstrings are available through the code

In [None]:
add?

In [None]:
# The `help` function prints out the `docstring` 
# of an object (module, function, class)
help(add)

#### `__doc__`

In [None]:
# Docstrings get stored as the `__doc__` attribute
# can also be accessed from there
print(add.__doc__)
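
Another way to get at the same text is `inspect.getdoc` from the standard library, which returns the docstring with its indentation cleaned up. A quick sketch, using a shortened copy of the `add` function so the cell runs on its own:

```python
import inspect


def add(num1, num2):
    """Add two numbers together.

    Returns
    -------
    answer : int or float
        The result of the addition.
    """
    return num1 + num2


# Raw attribute: may still carry the indentation from the source code
raw_doc = add.__doc__

# inspect.getdoc returns the same text with the indentation cleaned up
clean_doc = inspect.getdoc(add)
print(clean_doc)
```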

In [None]:
class Person:

    def __init__(self, name, age):
        self.name = name
        self.age = age

    def birthday(self):
        self.age += 1
        return 'Happy Birthday!'

In [None]:
# with docstring
class Person:
    """
    A class to represent a person with a name and age.

    Attributes
    ----------
    name : str
        The name of the person.
    age : int
        The age of the person.

    Methods
    -------
    birthday():
        Increments the person's age by one and returns a birthday greeting.
    """

    def __init__(self, name, age):
        self.name = name
        self.age = age

    def birthday(self):
        """
        Increments the person's age by one year.

        Returns
        -------
        str
            A birthday greeting message.
        """
        self.age += 1
        return 'Happy Birthday!'


### Final Exam & Documentation

**Will I need to write docstrings on the final exam?**

No...not from scratch (but you need to know what they are and be able to read/understand one). LLMs are really good at this, so I won't make you write them by hand on the exam.

### Activity: Code Testing (Revisited)

Complete the Google Form: [https://forms.gle/jWDmi5Zokg5uXD197](https://forms.gle/jWDmi5Zokg5uXD197)

We now have this nicely-styled and documented function. How could we use `unittest` for this?

In [1]:
def convert_to_unicode(input_string):
    """Converts an input string into a string containing the unicode code points.
    
    Parameters
    ----------
    input_string : string
        String to convert to code points
        
    Returns
    -------
    output_string : string
        String containing the code points for the input string.
    """ 
    
    string = []
    # convert input string to a list of its characters
    input_list = list(input_string)
      
    for character in input_list: 
        string.append(str(ord(character)))
        
    output_string = '+'.join(string)
    return output_string

In [2]:
convert_to_unicode('abc')

'97+98+99'

<div class="alert alert-warning">
<b>Below is an example of the <code>unittest</code>  framework I'd provide on the final exam:</b> 
</div>

In [4]:
import unittest

class TestConvertToUnicode(unittest.TestCase):

    output = convert_to_unicode('abc')
    
    def test_output_type(self):
        self.assertIsInstance(self.output, str)
        
    def test_length(self):
        self.assertEqual(8, len(convert_to_unicode('abc')))
            
    def test_output(self):
        self.assertEqual('97+98+99', convert_to_unicode('abc'))
        
if __name__ == '__main__':
    suite = unittest.TestLoader().loadTestsFromTestCase(TestConvertToUnicode)
    unittest.TextTestRunner(verbosity=2).run(suite)

test_length (__main__.TestConvertToUnicode.test_length) ... ok
test_output (__main__.TestConvertToUnicode.test_output) ... ok
test_output_type (__main__.TestConvertToUnicode.test_output_type) ... ok

----------------------------------------------------------------------
Ran 3 tests in 0.003s

OK
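
Beyond the "typical" input, it can also be worth testing edge cases. The sketch below is one possible extension, not part of the exam framework: the class name `TestConvertToUnicodeEdgeCases` is made up, the function is repeated in shortened form so the cell runs on its own, and `assertRaises` may not be among the assert methods covered in class.

```python
import unittest


def convert_to_unicode(input_string):
    """Converts an input string into a string of '+'-joined code points."""
    string = []
    for character in list(input_string):
        string.append(str(ord(character)))
    return '+'.join(string)


class TestConvertToUnicodeEdgeCases(unittest.TestCase):

    def test_empty_string(self):
        # No characters means nothing to join, so the result is ''
        self.assertEqual('', convert_to_unicode(''))

    def test_single_character(self):
        # One character produces one code point and no '+' separator
        self.assertEqual('97', convert_to_unicode('a'))

    def test_non_iterable_input(self):
        # list(123) raises TypeError because an int is not iterable
        with self.assertRaises(TypeError):
            convert_to_unicode(123)


suite = unittest.TestLoader().loadTestsFromTestCase(
    TestConvertToUnicodeEdgeCases)
result = unittest.TextTestRunner(verbosity=2).run(suite)
```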


<div class="alert alert-danger">
<b>NOTE: Below this is not tested on your final exam.</b> 
</div>

### Documentation for a Software Project

Documentation Files:
- A `README` is a file that provides an overview of the project to potential users and developers
- A `LICENSE` file specifies the license under which the code is available (the terms of use)
- An `API Reference` is a collection of the docstrings, listing public interfaces, parameters and return values
- Tutorials and/or Examples walk through how to use the codebase

### Documentation Sites

<div class="alert alert-success">
Documentation sites host a package's documentation for code users. 
</div>

#### Example: Function Documentation
`numpy.array` :
https://docs.scipy.org/doc/numpy/reference/generated/numpy.array.html

#### Example: Package Documentation
**scikit learn** (`sklearn`) : https://scikit-learn.org/stable/index.html

## Linters

<div class="alert alert-success">
A linter is a tool that analyzes code for both programmatic errors and stylistic issues. 
</div>

`pylint` is available from Anaconda to check this for you. (Not available on PrairieLearn.)


```python
# to install on your computer
!pip install --user pylint
```
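
To get a feel for what a linter does, here is a toy sketch that checks just one style rule: function names should be snake_case. It uses the standard library's `ast` module to read code without running it. This is only an illustration, not how `pylint` itself is implemented, and `find_badly_named_functions` and `SNAKE_CASE` are names made up for this example.

```python
import ast
import re

# Lowercase letters, digits, and underscores; cannot start with a digit
SNAKE_CASE = re.compile(r'^[a-z_][a-z0-9_]*$')


def find_badly_named_functions(source):
    """Return names of functions in `source` that are not snake_case."""
    tree = ast.parse(source)
    bad_names = []
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            if not SNAKE_CASE.match(node.name):
                bad_names.append(node.name)
    return bad_names


# Hypothetical source code to check (as a string; it is never executed)
example_source = '''
def MyFunction(input_num):
    return input_num

def output_nums(input_num):
    return input_num
'''

find_badly_named_functions(example_source)
```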

#### Think-Pair-Share

How many PEP8 violations can you find in this code?

In [None]:
def MyFunction(input_num):
    
    my_list = [0,1,2,3]
    if 1 in my_list: ind = 1
    else:
      ind = 0
    qq = []
    for i in my_list [ind:]:
        qq.append(input_num/i)
    return qq

### list here
- conditional (if) -> multiline
- spacing with indexing
- qq -> good variable naming
- spacing for value in my_list (spaces after commas)
- function name is not snake case 

In [None]:
# Let's fix this code
def output_nums(input_num):
    
    my_list = [0, 1, 2, 3]
    
    if 1 in my_list: 
        ind = 1
    else:
        ind = 0
        
    output_list = []
    for i in my_list[ind:]:
        output_list.append(input_num/i)
        
    return output_list

In [None]:
# check using pylint
!pylint linter_example.py

## Software Versioning

When you make changes to the software you've released into the world, you have to change the version of that software to let people know changes have occurred.

### Versioning Schemes

The rules, if you're new to this, can be [dizzying](https://www.python.org/dev/peps/pep-0440/#version-scheme), so we'll [simplify](https://www.python.org/dev/peps/pep-0396/) for now:

- `<MAJOR>.<MINOR>`
    - e.g. 1.3
    
- `<MAJOR>.<MINOR>.<MAINTENANCE>`
    - e.g. 1.3.1

- `<MAJOR>` - increase by 1 w/ incompatible API changes
- `<MINOR>` - increase by 1 w/ added functionality in a backwards-compatible manner
- `<MAINTENANCE>` - (aka patch) increase by 1 w/  backwards-compatible bug fixes.
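
The rules above can be sketched in a few lines of code. The helper names `parse_version` and `bump_version` are hypothetical, and resetting the lower parts to 0 after a bump is an assumption borrowed from the common Semantic Versioning convention rather than something stated in these notes.

```python
def parse_version(version):
    """Split a '<MAJOR>.<MINOR>.<MAINTENANCE>' string into integers."""
    major, minor, maintenance = (int(part) for part in version.split('.'))
    return major, minor, maintenance


def bump_version(version, change):
    """Return the next version string for a given kind of change.

    `change` is one of 'major', 'minor', or 'maintenance'.
    """
    major, minor, maintenance = parse_version(version)

    if change == 'major':
        # incompatible API change (assumption: lower parts reset to 0)
        return f'{major + 1}.0.0'
    if change == 'minor':
        # backwards-compatible new functionality
        return f'{major}.{minor + 1}.0'
    if change == 'maintenance':
        # backwards-compatible bug fix
        return f'{major}.{minor}.{maintenance + 1}'
    raise ValueError(f'Unknown change type: {change}')


bump_version('1.3.1', 'minor')
```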

In [5]:
# see version information
import pandas as pd
pd.__version__

'2.2.3'

In Python package development, a `<MAJOR>` version of 0 conventionally signals that a package is still in early development.

In [6]:
# see version information
!pip show pandas

Name: pandas
Version: 2.2.3
Summary: Powerful data structures for data analysis, time series, and statistics
Home-page: https://pandas.pydata.org
Author: 
Author-email: The Pandas Development Team <pandas-dev@python.org>
License: BSD 3-Clause License

Copyright (c) 2008-2011, AQR Capital Management, LLC, Lambda Foundry, Inc. and PyData Development Team
All rights reserved.

Copyright (c) 2011-2023, Open source contributors.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
  list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
  this list of conditions and the following disclaimer in the documentation
  and/or other materials provided with the distribution.

* Neither the name of the copyright holder nor the names of its
  contributors may be u