Python Function Design Check List
Table of contents
- Introduction
- 1. Use meaningful name for your function
- 2. Function name should always imply what it will return.
- 3: Summarize into a function with no side effects.
- 4: Functionalize in a meaningful cohesive manner.
- 5: Do not make lists or dictionaries the default argument
- 6: Accepts int or str without a collection as an argument
- 7: Index numbers should not have meaning
- 8: Do not abuse variable length arguments in function arguments
- 9: Write "why" in your comments
- 10: Do not write processing in the controller
Introduction
It is very important to name your variable, function and class clearly because it will help not only your teammate but also yourself in the future. In this article, I will help you to learn how you should write your code for companies.
1. Use meaningful name for your function
- Use verb:
get_item_list
,calc_tax
,is_member
etc - Use noun:
current_date
,as_dict
etc - Use role:
json_parser
,file_uploader
etc
Use more specific verb to improve your code.
Bad example
def get_latest_news():
. . .
def get_tax_including(price: int):
. . .
def get_sum_price(items: List[int]):
. . .
Best Practice
from typing import List
def fetch_latest_news():
. . .
def calc_tax_including(price: int):
. . .
def aggregate_sum_price(items: List[int]):
. . .
2. Function name should always imply what it will return.
It is essential to have a function name that people can instinctively understand what it will return.
Let's think about how we can do it!
Bad example
def is_valid(name: str):
if name.endwith(".txt"):
return name[:-4] + ".md"
return name
Problem with this is that is_valid
implies its returning value is boolean. However, it actually returns modified string or original string. This is really confusing, and will contribute to bug as you progress.
So how can we fix this?
Best Practice
def modify_if_valid_name(name: str) -> str:
if name.endwith(".txt"):
return name[:-4]+ ".md"
return name
or
def is_valid(name: str) -> bool:
return not name.endwith(".txt"):
These looks much better to understand what function does.
3: Summarize into a function with no side effects.
It is very important to be aware of "side effects" in programming. A side effect is a change in some state as a result of the execution of a program. If a function is not affected by external variables or state, it should always produce the same output when given the same input. This makes the function easy to test and unaffected by state.
What are the side-effects of a function and what are the caveats?
Bad Example
def is_valid(article):
if article.title in INVALID_TITLES:
return False
# is_valid function is rewriting
# the value of article.valid
article.valid = True
# Saving data externally by calling .save()
article.save()
return True
def update_article(article, title, body):
article.title = title
article.body = body
if not is_valid(article):
return
article.save()
return article
In this case, just calling the is_valid
function will change the .valid
value of the article
. Side effects from various functions can cause data to be modified in ways that developers do not expect. Unexpected data changes can be a source of bugs and difficult troubleshooting.
Best Practice
In this case, the is_valid function should not cause any side effects. The function name is_valid_title
should be used to check whether the title is valid or not.
from pydantic import BaseModel
class Article:
title: str
body: str
valid: bool
def is_valid_title(title: str) -> bool:
return title not in INVALID_TITLES:
def update_article(article:Article, title: str, body: str) -> Article:
if not is_valid_title(title):
return
article.title = title
article.body = body
article.valid = True
article.save()
return article
This way, is_valid_title
has no side effects and can be easily reused from other processes. It also makes it clear that "calling update_article
has side-effects.
4: Functionalize in a meaningful cohesive manner.
Are you thoughtlessly grouping functions together? When separating functions, do not separate them by processing cohesion.
Bad Example
def main() -> None:
with open(...) as f:
reader = csv.reader(f)
for row in reader:
i = row[1]
if i < 100:
print(row[0])
This function is simply grouped together in the main() function as the "main processing". If this is separated by a "processing grouping", it becomes as follows.
def print_row(row) -> None:
i = row[1]
if i < 100:
print(row[0])
def main() -> None:
with open(...) as f:
reader = csv.reader(f)
for row in reader:
print_row(row)
Functionalization makes us feel like we have made an improvement, but this separation is problematic.
Best Practice
Separate functions and processes by the meaning and reusability of the process. Before getting too enthusiastic about "separating into functions," it is important to think about what each process can mean. The function is a way of expressing what it means to read a CSV and compare it to 100. In this case, let's assume that we have a specification that says, "If the price is less than 100 dollars, the item is subject to matching.
import csv
def read_items(path: str) -> tuple:
""" CSV data of the product list is read and returned in a tuple generator
Each product is returned as a "product name, price" tuple
"""
with open(path) as f:
reader = csv.reader(f)
for row in reader:
name = row[0]
price = int(row[1])
yield name, price
def is_addon_price(price: float) -> bool:
""" Returns True if the price is an "item eligible for matching purchase".
"""
return price < 100
def main() -> None:
items = read_items()
for name, price in items:
if is_addon_price(price):
print(name)
By grouping them together by meaning rather than process, it became clear what each process was for.
5: Do not make lists or dictionaries the default argument
Default arguments in Python are a useful feature, but there is a trap in using them. Have you ever written a program like the following example?
Bad Example
def foo(messages=[]):
messages.append("Hello world")
return messages
Do not write values=[]
if you want the argument values to be an empty list by default.
Best Practice
An updatable (mutable) value must not be specified as the default argument. Remember that lists, dictionaries, and sets should not be default arguments. Set the default argument to None and specify an empty list or dictionary if None is specified in the function.
def foo(messages: List[str] =None):
if messages is None:
messages = []
messages.append("Hello world")
return messages
This will always return ["Hello world"] no matter how many times foo() is called.
6: Accepts int or str without a collection as an argument
What values should I expect for function arguments? Thinking about function arguments is very important because it determines the input specification of the function.
What is the problem with the following function?
Bad Example
def calc_after_tax_price(item:dict , tax_rate:float=0.1) -> float :
return item['price'] * (1 + tax_rate)
This calc_after_tax_price
expects item
(a dictionary representing products) as an argument. This means that a dictionary with a 'price' key must be prepared every time a user simply wants to calculate sales tax. This reduces the reusability of the function.
Best Practice
It is best to accept function arguments that are not collections, such as numbers (int), floating-point numbers (float), or strings (str).
def calc_after_tax_price(price:float , tax_rate:float=0.1) -> float:
return price * (1 + tax_rate)
By making the function accept numeric values instead of a dictionary, the calc_after_tax_price
function can also be used to simply calculate the amount of after tax price.
7: Index numbers should not have meaning
There are very rare instances in Python where a list or tuple index number would make for a better program. What happens if you give meaning to the index number?
Bad Example
def item_available(item_id: int) -> bool:
# db query here
return is_available
def validate_sales(row):
"""
row is a tuple representing sales
1st element: sales price
2nd element: product ID
3rd element: user ID
Element 4: quantity
Element 5: date and time of sale
"""
# check if item is available
if not item_exists(row[1]):
raise ...
# check number of items available
if row[3] < 1:
raise ...
For example, in a dictionary, row['item_id']
, the process itself would represent the meaning. However, in this example, the row index number has meaning in the processing, making the program difficult to read. You have to remember that row[1]
is the item ID when you are reading the program.
Also, if you process by index number, the process will break if a new value is entered in between. For example, if the row specification has been changed so that the second element now contains the "dealer ID," the process for specifying subsequent elements must be rewritten. In this case, row[3]
must be changed to row[4]
.
Best Practice
Don't manage data by tuples, but by dictionaries or classes. Replacing row tuples with a class called Sale makes the validate_sales function much easier to read.
from pydantic import BaseModel
from datetime import datetime
class Sale(BaseModel):
sale_id: int
item_id: int
user_id: int
amount: int
sold_at: datetime
def validate_sales(sale):
""" Raise an error if sales sale is invalid data.
"""
if not item_exists(sale.item_id):
raise ...
if sale.amount < 1:
raise ...
8: Do not abuse variable length arguments in function arguments
Variable length arguments, *args
and **kwargs
, are a useful feature of Python, but if they are used to thoughtlessly, they can easily introduce bugs into a program.
What are the problems? Let's think about it while looking at the program.
Bad Example
class User:
def __init__(self, **kwargs):
self.mail = kwargs['mail']
self.password = kwargs.get('password')
This User can accept an email=
argument that is not expected by the class, as shown below. If you program it mistakenly as email=
, you will not get an error.
>>> user = User(name="adam", email="adam@example.com")
Here user.mail
is set to None. There is a problem with unexpected data being created but no error is generated, which can lead to errors elsewhere in the program and problems with needed data not being saved.
Best Practice
Do not carelessly use *args
and **kwargs
, but specify them as individual arguments.
class User:
def __init__(self, password: str, mail: str=None):
self.mail = mail
self.password = password
or
from pydantic import BaseModel
class User(BaseModel):
mail: str
password: str
In this case, specifying an argument that does not exist will result in an error.
9: Write "why" in your comments
What do you write in your program comments? There are some points to keep in mind when writing comments.
Bad Example
def fetch_users_by_ids(user_ids: List[int]) -> List[UserQuery]:
# fetch user from database filtered by user ids.
# user_id must be int
# users receives UserQuery
# Loop through the user_ids one by one and process
# SQL is executed on the backend when user_id loops
for user_id in user_ids:
...
return users
The problem is that this program has a lot of useless comments. If you read the program, you will understand the process. If the code is difficult to understand without comments, consider whether it could be rewritten in simple code before supplementing the explanation with comments.
Best Practice
Write "why" in comments. If you write a function specification, write it in a docstring, not in a comment.
from typing import List
def fetch_users_by_ids(user_ids: List[int]) -> List[UserQuery]:
""" ~~Process to fetch user
Do <fetch_users_by_ids> for multiple users.
~~ for ~~, we need to change the user's data.
"""
# To reduce the number of SQL executions, do not separate this loop into separate functions
for user in users:
...
return user
If a program is difficult to understand just by reading the code, write a comment explaining the meaning of the process and why it is written the way it is.
The comment can be thought of as an explanation of "why it is not processed this way. It is effective to annotate "why this processing is done this way" for a process that is not natural to the reader of the program.
10: Do not write processing in the controller
Are you writing too much in your main()
function or web framework controller (Django's View)?
Bad Example
Here is an example of a process written in a View function in the web framework Django.
@login_required
def product_list_view(request, store_id:int):
shop = get_object_or_404(Store, id=store_id)
if not request.user.memberships.filter(role=Membership.ROLE_OWNER, store=store).exists():
return HttpResponseForbidden()
products = product.objects.filter(store=store,
published_at__isnull=False)
if "search" in request.GET:
search_text = request.GET["search"]
if len(search_text) < 2:
return TemplateResnponse(request, "products/products_list.html",
{"products": products, "error": "text is too short"})
products = items.filter(name__contains=search_text)
prices = []
for product in products:
price = int(product.price * 1.1)
prices.append(f"$: {price}")
products = zip(products, prices)
return TemplateResponse(request, "products/products_list.html", {"products": products})
This program writes too much processing to the item_list_view
function.
Best Practice
The controller should only input and output values and control the entire process. If you implement detailed processing in the controller, too many programs will be written in the controller. Not only will this make the overall process less clear, it will also make unit testing more difficult. Most of the processing in the above product_list_view
View function should be implemented in separate functions or components.
The product_list_view
function after separating each process is as follows.
``` @login_required def product_list_view(request, store_id:int): store = get_object_or_404(Store, id=store_id) validate_membership_permission(request.user, shop, Membership.ROLE_OWNER)
products = product.objects.filter(store=store).published()
form = ProductSearchForm(request.GET)
if form.is_valid():
products = form.filter_products(products)
return TemplateResponse(request, "products/products_list.html",
{"products": products, "form": form})