
By optimizing Python code, you improve performance, reduce resource consumption, and enhance scalability. While Python is known for its simplicity and readability, these characteristics can sometimes come at the cost of efficiency.
In this article, we'll explore four ways to optimize your Python project and improve performance.
First, we'll look at how best to use data structures.
Efficient Use of Python Data Structures
We'll use some of the most well-known Python data structures to optimize our code.
Lists Vs. Tuples
Lists and tuples are probably the most basic and well-known data structures in Python. They serve different purposes, so they have different performance characteristics:
- Lists are mutable, which means they can be modified after creation.
- Tuples, on the other hand, are immutable: they cannot be changed after creation.
Before diving deep into why there are performance differences, let's write a code sample that creates a list and a tuple of 5 numbers.
```python
import timeit

# Calculate creation time
list_test = timeit.timeit(stmt="[1, 2, 3, 4, 5]", number=1000000)
tuple_test = timeit.timeit(stmt="(1, 2, 3, 4, 5)", number=1000000)

# Print results
print(f"List creation: {list_test: .3} seconds")
print(f"Tuple creation: {tuple_test: .3} seconds")
```
This results in:
```
List creation: 0.135 seconds
Tuple creation: 0.0207 seconds
```
To calculate performance differences, we use the `timeit` module like so:

- The `stmt` parameter defines the code snippet we want to evaluate. In the case of the `list_test` variable, it evaluates a list of five numbers; in `tuple_test`, it evaluates a tuple of five numbers.
- The `number` parameter specifies how many times the `stmt` snippet must be executed. In both cases, we run it 1,000,000 times, meaning the code creates the list and the tuple one million times each.
As the example shows, tuples are way faster than lists. Let's dig into why:
- Memory allocation:
- Due to their immutability, tuples are stored in a fixed-size block of memory. The size of this block is determined when the tuple is created, and it doesn’t change. This fixed size makes tuple memory allocation fast.
- Lists, on the other hand, need to support dynamic resizing. This means they often allocate extra space to accommodate potential growth without requiring frequent reallocations.
- Internal structure:
- The internal structure of a tuple consists essentially of a contiguous block of memory with a fixed layout. This layout includes the elements themselves and some metadata (like size), but since tuples are immutable, the structure remains simple.
- Lists have a more complex internal structure to manage their mutability. They need to keep track of their current size and allocated capacity, and must handle changes in size dynamically.
- Caching and optimization:
- Python can use various optimizations for tuples, such as caching, because their immutability guarantees that they won’t change after creation. These optimizations reduce the need for repeated memory allocation and speed up creation.
- While Python does optimize list operations, the potential for lists to change means that optimization is limited compared to tuples.
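You can see the allocation difference directly with `sys.getsizeof`. This is a rough sketch; the exact byte counts vary across Python versions and platforms, but the tuple is consistently smaller because it carries no spare growth capacity:

```python
import sys

# Same five elements, two different structures
numbers_tuple = (1, 2, 3, 4, 5)
numbers_list = [1, 2, 3, 4, 5]

# The tuple occupies a fixed-size block; the list also stores
# bookkeeping for dynamic resizing
print(f"Tuple size: {sys.getsizeof(numbers_tuple)} bytes")
print(f"List size:  {sys.getsizeof(numbers_list)} bytes")
```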
Dictionaries and Sets Vs. Lists in Python
In Python, dictionaries and sets are data structures that allow for fast lookups. Checking whether an item is in a set, or finding the value associated with a key in a dictionary, typically takes constant time: `O(1)` in Big O notation.
Given their structure, using dictionaries and sets can significantly improve performance when you need to frequently check for the existence of an item or access elements by a key.
Let's show this with a code snippet. For example, suppose we create a dictionary, a set, and a list with 1,000,000 numbers. We want to look for the number 999,999 and then work out how long it takes using these three different data structures:
```python
import time

# Create a large set, dictionary, and list
large_set = {i for i in range(1000000)}
large_dict = {i: str(i) for i in range(1000000)}
large_list = [i for i in range(1000000)]

# Define element to look up
element = 999999

# Timing set lookup
start_time = time.time()
found = element in large_set
end_time = time.time()
print(f"Set lookup took: {end_time - start_time:.8f} seconds")

# Timing dictionary lookup
start_time = time.time()
found = element in large_dict
end_time = time.time()
print(f"Dictionary lookup took: {end_time - start_time:.8f} seconds")

# Timing list lookup
start_time = time.time()
found = element in large_list
end_time = time.time()
print(f"List lookup took: {end_time - start_time:.8f} seconds")
```
The result is:
```
Set lookup took: 0.00000000 seconds
Dictionary lookup took: 0.00000000 seconds
List lookup took: 0.00771618 seconds
```
So, basically, the time needed to search for the element 999,999 in the set and the dictionary is so small that `time.time()`'s resolution reports (almost) 0 seconds.
Of course, if we compare the set and the dictionary directly, we'll find that the set lookup is faster (note that this micro-benchmark uses string keys for the dictionary and integers for the set, so part of the gap comes from the cost of hashing strings):
```python
import timeit

# Calculate timing performance for dictionary and set
dict_test = timeit.timeit(stmt="'a' in {'a': 1, 'b': 2, 'c': 3}", number=1000000)
set_test = timeit.timeit(stmt="1 in {1, 2, 3, 4, 5}", number=1000000)

# Print results
print(f"Dictionary lookup: {dict_test: .3} seconds")
print(f"Set lookup: {set_test: .3} seconds")
```
This results in:
```
Dictionary lookup: 0.0821 seconds
Set lookup: 0.0212 seconds
```
So, how do dictionaries and sets achieve `O(1)` lookups?
Well, dictionaries and sets use a data structure called a hash table. Here's a simplified explanation of how it works:
- Hashing: When you add a key to a dictionary or an item to a set, Python computes a hash value (a fixed-size integer) from the key or item. This hash value determines where the data is stored in memory.
- Direct access: With the hash value, Python can directly access the location where the data is stored without searching the entire data structure.
So it's very fast to check if an item exists in a set or dictionary, which is useful for operations that require frequent existence checks.
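A minimal sketch of the hashing step: `hash()` is the same function dictionaries and sets call internally, and it only works on immutable (hashable) objects, which is why a list can't be used as a dictionary key:

```python
# hash() produces the fixed-size integer that determines
# where the data is stored internally
print(hash(42))       # in CPython, small integers hash to themselves
print(hash("apple"))  # for strings, the value varies between interpreter runs

# Only hashable (immutable) objects can be dict keys or set members
valid = {("a", 1): "tuples are hashable"}
try:
    invalid = {["a", 1]: "lists are not"}
except TypeError as err:
    print(err)  # unhashable type: 'list'
```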
Choosing the Right Data Structure
Choosing the appropriate data structure based on the specific needs of your application leads to significant performance gains.
If you need to store data and you're sure it won't change over time, definitely use tuples to optimize your code.
When you need to frequently look for elements, prefer sets and dictionaries over lists or tuples.
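As a rule of thumb, when a list will be probed many times, converting it to a set once usually pays for itself. A small sketch:

```python
# One-time conversion, then every membership check is a hash lookup
user_ids = list(range(100000))
user_id_set = set(user_ids)   # pay the conversion cost once

print(99999 in user_id_set)   # True: average O(1) hash lookup
print(-1 in user_id_set)      # False: also O(1)
print(99999 in user_ids)      # True, but found via an O(n) linear scan
```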
Global Variables, Encapsulation, and Namespace
In Python, scope determines the visibility and lifetime of a variable in a program. Variables can have different scopes:
- Local Scope: This refers to variables defined within a function. They are only accessible inside that function.
- Global Scope: Variables defined at the top level of a script or module. They are accessible throughout the module.
- Class/instance Scope: Variables defined within a class, including class attributes and instance attributes.
This section describes code optimization by avoiding global variables, using class encapsulation, and managing a namespace correctly.
Avoiding Global Variables
Local variables are faster to access compared to global variables, primarily due to the way Python manages variable scopes and lookups.
In particular, Python uses the Local, Enclosing, Global, Built-in (LEGB) rule to resolve variable names:
- Local: Names defined within a function.
- Enclosing: Names in the local scopes of any enclosing functions.
- Global: Names at the top level of the module or script.
- Built-in: Preassigned names in the Python built-in namespace.
When accessing a variable, Python starts searching from the innermost scope (the local one). Since the local scope is limited to the function’s context, it contains fewer variables, making the search process quicker. On the contrary, global scope encompasses all top-level names in the module, resulting in a potentially larger search space.
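Here's a small sketch of the LEGB lookup order in action: the same name `x` is bound in three scopes, and Python resolves it to the innermost binding it finds:

```python
x = "global"              # Global scope

def outer():
    x = "enclosing"       # Enclosing scope
    def inner():
        x = "local"       # Local scope: found first under LEGB
        return x
    return inner()

print(outer())  # local
print(x)        # global (the module-level name is untouched)
```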
Here's an example to show the difference in performance when using global vs. local variables:
```python
import time

# Local variable test
def local_test():
    a = 0
    for var_1 in range(1000000):
        a += 1

# Global variable test
b = 0
def global_test():
    global b
    for var_2 in range(1000000):
        b += 1

start_time = time.time()
local_test()
local_time = time.time() - start_time
print(f"Local variable test: {local_time: .3} seconds")

start_time = time.time()
global_test()
global_time = time.time() - start_time
print(f"Global variable test: {global_time: .3} seconds")
```
And the result is:
```
Local variable test: 0.0441 seconds
Global variable test: 0.0685 seconds
```
So, whenever possible, prefer using local variables to global ones.
Encapsulation
Encapsulating variables within functions and classes can improve performance by reducing the scope and limiting the number of variables the interpreter needs to track.
Let's see the difference in performance between using encapsulation and not using it:
```python
import timeit

# Class without encapsulation
class Rectangle:
    '''This class creates a rectangle without encapsulation'''
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height

    def perimeter(self):
        return 2 * (self.width + self.height)

# Class with encapsulation
class EncapsulatedRectangle:
    '''This class creates a rectangle with encapsulation'''
    def __init__(self, width, height):
        self._width = width
        self._height = height

    def get_width(self):
        return self._width

    def set_width(self, width):
        self._width = width

    def get_height(self):
        return self._height

    def set_height(self, height):
        self._height = height

    def area(self):
        return self._width * self._height

    def perimeter(self):
        return 2 * (self._width + self._height)

# Create instances of both classes
rect = Rectangle(10, 20)
enc_rect = EncapsulatedRectangle(10, 20)

# Define the test functions
def test_rect_area():
    return rect.area()

def test_enc_rect_area():
    return enc_rect.area()

def test_rect_perimeter():
    return rect.perimeter()

def test_enc_rect_perimeter():
    return enc_rect.perimeter()

# Time the functions using timeit
iterations = 5000000
rect_area_time = timeit.timeit(test_rect_area, number=iterations)
enc_rect_area_time = timeit.timeit(test_enc_rect_area, number=iterations)
rect_perimeter_time = timeit.timeit(test_rect_perimeter, number=iterations)
enc_rect_perimeter_time = timeit.timeit(test_enc_rect_perimeter, number=iterations)

# Print results
print(f"Rectangle (no encapsulation) area time: {rect_area_time:.4f} seconds")
print(f"Encapsulated Rectangle area time: {enc_rect_area_time:.4f} seconds")
print(f"Rectangle (no encapsulation) perimeter time: {rect_perimeter_time:.4f} seconds")
print(f"Encapsulated Rectangle perimeter time: {enc_rect_perimeter_time:.4f} seconds")
```
The result is:
```
Rectangle (no encapsulation) area time: 2.1583 seconds
Encapsulated Rectangle area time: 2.2764 seconds
Rectangle (no encapsulation) perimeter time: 2.5185 seconds
Encapsulated Rectangle perimeter time: 2.4265 seconds
```
As the results show, the timing difference between the two classes is marginal here. Encapsulation's benefits show up in larger applications with numerous variables: by keeping variables local to functions and classes, you reduce the interpreter's workload, since it has fewer names to manage in any given scope. This can lead to faster execution in complex programs with many functions and classes.
These are the key benefits of encapsulation:
- Reduced scope: By limiting the scope of variables to the smallest necessary context, the interpreter has fewer variables to track, which leads to faster execution.
- Memory management: Local variables are automatically deallocated when a function exits, which helps with efficient memory use.
- Avoiding naming conflicts: Encapsulation prevents variable name clashes, making code easily maintainable and less error-prone.
So, you'd better use encapsulation to improve performance when creating classes. Also, note that encapsulation provides controlled access and modification of attributes, protecting data from outside modifications.
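Note that the explicit getter/setter style above is borrowed from other languages. The more idiomatic Python approach (not benchmarked in this article) is the `@property` decorator, which keeps plain attribute syntax while still allowing controlled access and validation. A sketch:

```python
class PropertyRectangle:
    '''Encapsulation via properties: attribute syntax, controlled access'''
    def __init__(self, width, height):
        self._width = width
        self._height = height

    @property
    def width(self):
        return self._width

    @width.setter
    def width(self, value):
        # Validation runs on every assignment
        if value <= 0:
            raise ValueError("width must be positive")
        self._width = value

    @property
    def area(self):
        return self._width * self._height

rect = PropertyRectangle(10, 20)
print(rect.area)   # 200
rect.width = 15    # goes through the setter's validation
print(rect.area)   # 300
```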
Correct Namespace Management
Minimizing global namespace pollution leads to better performance. The main idea, in this case, is to use modules and packages to organize your code and keep the global namespace clean. In other words, instead of defining many global variables, functions, and classes in a single file, split them into modules and import them.
Here's a general example of inefficient namespace management:
```python
global_var_1 = "..."

def first_function():
    '''A function'''
    ...

def second_function():
    '''Another function'''
    ...

def main_function():
    '''The main function of the program'''
    ...
```
The main idea is to change this to something like:
```python
from functions.function_1 import *
from functions.function_2 import *

def main_function():
    '''The main function of the program'''
    ...
```
To do so, you have to modularize your functions so that the structure of your folders becomes something like:
```
main_folder/
|__ main.py
|
|__ functions/
    |__ function_1.py
    |__ function_2.py
```
NOTE: In this case, `function_1.py` and `function_2.py` can be bigger than `first_function()` and `second_function()`.
So, whenever possible, create modules and packages from your code. This helps with:
- Performance: With a cleaner global namespace, the interpreter has fewer top-level names to search.
- Readability: Shorter files are generally easier to read than longer ones. It's better to have small, connected modules than one big program.
- Reuse: Any module or package you create can be used in other programs, helping you save time in the future.
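One caveat, though: the `import *` form shown above re-pollutes the namespace with everything a module exports. Here's a rough illustration using the standard `math` module (rather than the hypothetical `functions` package above); in your own code, listing names explicitly keeps the namespace minimal:

```python
# Compare what each import style adds to a namespace
explicit_ns = {}
exec("from math import sqrt, pi", explicit_ns)
explicit = [n for n in explicit_ns if not n.startswith("__")]

star_ns = {}
exec("from math import *", star_ns)
star = [n for n in star_ns if not n.startswith("__")]

print(f"Explicit import added {len(explicit)} names")  # 2
print(f"Star import added {len(star)} names")          # several dozen
```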
Utilize List Comprehensions and Generator Expressions
This section describes how code performance can be improved through list comprehension and generators.
List Comprehension
List comprehension is a fast and concise way to create a new list using the power of loops and statements with one line of code.
Let's see the difference in performance first:
```python
import timeit

# Define the code snippets as functions
def loop_code():
    '''This function creates a new list with a classic for-loop'''
    squares = []
    for x in range(10):
        squares.append(x**2)

def comprehension_code():
    '''This function creates a new list with a list comprehension'''
    squares = [x**2 for x in range(10)]

# Measure execution time
loop_test = timeit.timeit(loop_code, number=1000000)
comprehension_test = timeit.timeit(comprehension_code, number=1000000)

print(f"Loop: {loop_test: .4} seconds")
print(f"List comprehension: {comprehension_test: .4} seconds")
```
This leads to:
```
Loop: 2.642 seconds
List comprehension: 2.444 seconds
```
List comprehension is more performance-friendly than standard for-loops because of:
- Reduced overhead: List comprehensions are implemented in C within the Python interpreter, making them faster (as lower-level optimizations aren't accessible in a standard Python for-loop).
- No method calls: In a traditional for-loop, the `append()` method is called repeatedly, which adds some overhead. List comprehensions avoid this by constructing the list in a single expression.
- Local scope: Variables defined within a list comprehension are scoped more tightly than variables defined in a for-loop. This reduces the potential for variable conflicts and can sometimes make garbage collection more efficient.
So, whenever possible, always prefer using list comprehension to create a new list. This enhances performance and code readability.
Generator Expressions
Generator expressions in Python provide a concise way to create generators without writing a separate generator function with the `yield` statement. They are similar to list comprehensions; the key difference is that they produce values one at a time and only when needed, which makes them more memory-efficient for large data sets.
Let's see how generator expressions can improve performance:
```python
import timeit

# Define the size of the iterable
n = 1000000

# Measure the time taken by the generator expression
gen_time = timeit.timeit('sum((i for i in range(n)))', globals=globals(), number=10)

# Measure the time taken by the list comprehension
list_time = timeit.timeit('sum([i for i in range(n)])', globals=globals(), number=10)

print(f"Generator expression took {gen_time:.4f} seconds")
print(f"List comprehension took {list_time:.4f} seconds")
```
This results in:
```
Generator expression took 1.4823 seconds
List comprehension took 1.9738 seconds
```
Generator expressions are more efficient than list comprehension due to:
- Lazy evaluation: Generator expressions generate items on the fly. This means that they do not compute all items at once, which is memory efficient.
- Memory efficiency: Since values are produced one at a time, generator expressions use less memory compared to list comprehensions, especially for large datasets.
If you need to produce values and consume them only when needed, prefer generators for better memory efficiency and performance.
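The memory claim is easy to verify with `sys.getsizeof`: a generator object stays a few hundred bytes no matter how many values it will yield, while the list holds every element at once (exact sizes vary by Python version):

```python
import sys

n = 1000000
gen_expr = (i for i in range(n))
list_comp = [i for i in range(n)]

print(f"Generator: {sys.getsizeof(gen_expr)} bytes")   # a few hundred bytes
print(f"List:      {sys.getsizeof(list_comp)} bytes")  # several megabytes

# Values are produced lazily, one at a time
print(next(gen_expr))  # 0
print(next(gen_expr))  # 1
```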
Leveraging Built-in Functions and Libraries
This section describes how using built-in libraries and functions improves your code's performance.
Standard Library Efficiency
Python’s standard library functions are often implemented in C and optimized for speed. Using these functions leads to significant performance improvements, due to:
- Lower-level operations: C operates closer to the hardware level compared to Python, providing more efficient memory and CPU usage.
- Optimized algorithms: Standard library functions implement highly optimized, battle-tested algorithms for common tasks.
- Reduced overhead: Invoking a function implemented in C avoids the overhead associated with Python's dynamic typing and interpreted execution.
Suppose we sort a list with a lot of numbers. The following example compares performance when creating a custom function versus using a built-in one:
```python
import time

# Sorting via custom function
def bubble_sort(arr):
    n = len(arr)
    for i in range(n):
        for j in range(0, n-i-1):
            if arr[j] > arr[j+1]:
                arr[j], arr[j+1] = arr[j+1], arr[j]

arr = [i for i in range(10000, 0, -1)]
start_time = time.time()
bubble_sort(arr)
end_time = time.time()
print(f"Bubble sort took: {end_time - start_time} seconds")

# Sorting via built-in function
arr = [i for i in range(10000, 0, -1)]
start_time = time.time()
sorted(arr)
end_time = time.time()
print(f"Sorted function took: {end_time - start_time} seconds")
```
The performance results:
```
Bubble sort took: 25.067534685134888 seconds
Sorted function took: 0.0 seconds
```
The built-in `sorted()` function completes almost instantly; it's so fast that `time.time()`'s resolution reports 0.0 seconds. The custom bubble sort, on the other hand, takes about 25 seconds to complete.
Using Third-party Libraries
We can also use third-party libraries like NumPy (which brings the computational power of languages like C and Fortran to Python) and Pandas (a fast, powerful, and flexible open-source data analysis and manipulation tool built on top of Python) for performance optimizations. These libraries are highly optimized for numerical computations and data manipulation, so it's almost always better to use them than to write a custom function.
Suppose we want to add up an array's elements. We can do so with a custom function or with NumPy's `np.sum()` function:
```python
import time
import numpy as np

def sum_array(arr):
    total = 0
    for num in arr:
        total += num
    return total

# Create a large array
large_array = list(range(1, 10000001))

# Measure time for custom function
start_time = time.time()
custom_sum = sum_array(large_array)
custom_duration = time.time() - start_time

# Convert the list to a NumPy array
large_array_np = np.array(large_array)

# Measure time for NumPy function
start_time = time.time()
numpy_sum = np.sum(large_array_np)
numpy_duration = time.time() - start_time

print(f"Duration with custom function: {custom_duration: .4} seconds")
print(f"Duration with Numpy: {numpy_duration: .4} seconds")
```
And here's the result:
```
Duration with custom function: 1.023 seconds
Duration with Numpy: 0.008745 seconds
```
The difference in performance is huge in this case!
So, remember: you don't need to reinvent the wheel. One of Python's superpowers is that it relies on a vast range of both standard and third-party libraries. You can always use them to save coding and computation time.
Wrapping Up
In this article, we've described four ways to optimize your Python code to improve its performance (and save development time). We hope you find these tips and tricks useful.
Happy coding!

Federico Trotta
Guest author Federico is a freelance Technical Writer who specializes in writing technical articles and documenting digital products. His mission is to democratize software through technical content.
