Python
The purpose of this notebook is to give me a quick memory refresh of some key Python concepts. It is more of a mindmap than a thorough documentation/tutorial of the Python language.
- Language Notebook
- Built-in functions
- Enumeration, Iterators and Generators
- Generators
- * and ** Operators
- Lambda expressions
- Functional Programming Modules
- Map, Reduce, Filter with lambdas
- Magic Methods
- Annotations
- Function arguments
- Dictionary views
- If statements
- Looping and mutating strategies
- Exceptions
- del statement
- Sequences
- Sets
- Dictionaries
- Looping
- Scopes and namespaces
- I/O
- Coroutines
- Decorators
- Classes / OOP
- Misc
- Third party modules
- ML
Language Notebook
Built-in functions
- any(iterable): Return True if bool(x) is True for any x in the iterable.
- all(iterable): Return True if bool(x) is True for all values x in the iterable.
- dir([object]): If called without an argument, return the names in the current scope. Otherwise, return an alphabetized list of names comprising (some of) the attributes of the given object, and of attributes reachable from it.
- map(function, iterable): Return an iterator that applies function to every item of iterable, yielding the results.
- filter(function, iterable): Return an iterator yielding those items of iterable for which function(item) is true.
- zip(*iterables): Make an iterator that aggregates elements from each of the iterables.
- sum(iterable): Return the sum of the items of the iterable.
- ord(char): Get the Unicode code point of that character (identical to the ASCII code for ASCII characters).
- chr(code): Inverse of ord().
- Full list here
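A quick sketch of the built-ins listed above, using made-up sample values; the variable names are only illustrative:

```python
# Truthiness checks over an iterable
nums = [1, 2, 0, 4]
has_truthy = any(nums)   # True: at least one non-zero value
all_truthy = all(nums)   # False: 0 is falsy

# zip pairs elements; map and filter are lazy, so wrap them in list()
pairs = list(zip("abc", [1, 2, 3]))
doubled = list(map(lambda x: x * 2, nums))
non_zero = list(filter(None, nums))   # filter(None, ...) keeps truthy items

# ord and chr are inverses of each other
code = ord("a")
char = chr(code)
```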
Types and Data Structures
Sequence types
- Mutable
  - list()
    - .sort() :: in-place sort
  - bytearray()
    - Mutable counterpart to bytes()
    - List of methods here
  - collections.deque()
    - A generalization of stacks and queues
    - Appends and pops from either side occur at O(1)
    - Though list() objects support similar operations, they are optimized for fast fixed-length operations and incur O(n) memory movement costs for pop(0) and insert(0, v)
- Immutable
  - tuple()
    - Sequences typically used to store collections of heterogeneous data
    - Create using: tuple() or tuple(iterable), (), (1, 2), 1, 2, 3, and 1, or (1,) for a singleton tuple
  - range()
    - Sequence of numbers
  - str()
    - Sequence of Unicode code points
    - List of methods here
  - bytes()
    - Sequence of single bytes
  - collections.namedtuple()
    - Can be used wherever regular tuples are used, and adds the ability to access fields by name instead of position index
from collections import namedtuple
Point = namedtuple('Point', ['x', 'y'])
p = Point(11, y=22)
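As a rough illustration of the two collections types mentioned above, the following sketch shows deque operations at both ends and namedtuple field access. The sample values are arbitrary:

```python
from collections import deque, namedtuple

# deque: O(1) appends and pops at both ends
queue = deque([1, 2, 3])
queue.appendleft(0)      # deque([0, 1, 2, 3])
queue.append(4)          # deque([0, 1, 2, 3, 4])
first = queue.popleft()  # 0
last = queue.pop()       # 4

# namedtuple: fields by name or by position, still an immutable tuple
Point = namedtuple("Point", ["x", "y"])
p = Point(11, y=22)
x, y = p                 # unpacks like a regular tuple
moved = p._replace(x=0)  # returns a new Point; p itself is unchanged
```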
Common Sequence Operations
All sequence types support the following operations:
- x in s or x not in s
  - In str, bytes and bytearray sequences you can test for subsequences, like "gg" in "eggs"
- s + t (concatenation)
- s * n or n * s (concatenate n copies of s)
- s[i]
- s[i:j]
- s[i:j:k]
- len(s)
- min(s)
- max(s)
- s.index(x) (index of first occurrence of x)
- s.count(x) (total number of occurrences of x)
Mutable Sequence Operations
- s[i] = x
- s[i:j] = t :: t must be an iterable
- s[i:j:k] = t :: t must be an iterable with the same length as the slice
- s.append(x)
- s.clear() :: same as del s[:]
- s.copy() :: create shallow copy (same as s[:])
- s.extend(t) or s += t :: extend s with contents of t
- s *= n :: update s with its contents repeated n times
- s.insert(i, x) :: insert x into s at index i
- s.pop(i)
- s.remove(x)
- s.reverse() :: reverse in place
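A small sketch of a few of these mutating operations on a list, with the intermediate states noted in comments. The sample data is arbitrary:

```python
s = [0, 1, 2, 3, 4, 5]

s[1:3] = ["a", "b", "c"]   # slice assignment may change the length
# s is now [0, 'a', 'b', 'c', 3, 4, 5]

s[::2] = [9, 9, 9, 9]      # extended slice: t must match the slice length
# s is now [9, 'a', 9, 'c', 9, 4, 9]

copy = s.copy()            # shallow copy, same as s[:]
copy.reverse()             # reverses copy in place; s is unaffected

popped = s.pop(1)          # removes and returns the item at index 1
s.remove(4)                # removes the first item equal to 4
```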
Set Types
Unordered collection of distinct hashable objects. Common uses include membership testing, removing duplicates from a sequence, and computing mathematical operations such as intersection, union, difference, and symmetric difference.
- Mutable
  - set() :: can also be created with {1, 2}
    - add(item)
    - remove(item) :: removes item. Raises KeyError if item is not contained
    - discard(item) :: remove item if present
    - pop() :: remove an arbitrary element. Raises KeyError if set is empty
    - clear()
- Immutable
  - frozenset()
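The set operations and the remove/discard difference described above can be sketched as follows, using arbitrary sample sets:

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

union = a | b                 # {1, 2, 3, 4, 5}
intersection = a & b          # {3, 4}
difference = a - b            # {1, 2}
symmetric = a ^ b             # {1, 2, 5}

unique = sorted(set([3, 1, 3, 2, 1]))   # duplicates removed: [1, 2, 3]

a.discard(99)                 # no error when the item is absent
try:
    a.remove(99)              # remove() raises KeyError instead
    removed = True
except KeyError:
    removed = False

# frozenset is hashable, so it can be used as a dictionary key
labels = {frozenset({"x", "y"}): "pair"}
```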
Mapping Types
A mapping object maps hashable values to arbitrary objects. Mappings are mutable objects.
There is currently only one standard mapping type, the dictionary dict()
collections.ChainMap(*maps)
- A ChainMap class is provided for quickly linking a number of mappings so they can be treated as a single unit. It is often much faster than creating a new dictionary and running multiple update() calls.
collections.Counter(iterable-or-mapping)
- A Counter is a dict subclass for counting hashable objects.
- It is a collection where elements are stored as dictionary keys and their counts are stored as dictionary values.
collections.OrderedDict()
- Less relevant now (Python 3.7+) that dict() remembers insertion order
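A minimal sketch of ChainMap and Counter in use. The dictionaries and the sample string are made up for illustration:

```python
from collections import ChainMap, Counter

defaults = {"color": "red", "user": "guest"}
overrides = {"user": "admin"}

# Lookups search the maps left to right; no merged copy is created
settings = ChainMap(overrides, defaults)
user = settings["user"]      # found in overrides
color = settings["color"]    # falls through to defaults

# Counter: elements are keys, counts are values
counts = Counter("abracadabra")
top_two = counts.most_common(2)
missing = counts["z"]        # missing keys count as 0 instead of raising
```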
List comprehensions
""" Simple example """
squares = [ x**2 for x in range(10) ]
print(squares)
""" for .. for .. if example """
combinations = [ (x,y) for x in [1,2,3] for y in [3,1,4] if x != y ]
print(combinations)
""" Nested comprehension example """
matrix = [
    [1,2,3,4],
    [5,6,7,8],
    [9,10,11,12]
]
transpose = [[row[i] for row in matrix] for i in range(4)]
print(transpose)
""" Example with zip, which is simpler """
transp = list(zip(*matrix))
print(transp)
Enumeration, Iterators and Generators
enumerate()
The enumerate(iterable, start=0) function returns an iterator of (i, value) tuples, where i is an increasing counter for every value item of the iterable. Example:
>>> for i, v in enumerate(['a', 'b', 'c']):
...     print(i, v)
...
0 a
1 b
2 c
Iterator
- An object representing a stream of data.
- Must implement __next__()
- Repeated calls to the iterator's __next__() method (or passing the iterator object to the built-in next() function) return successive items in the stream. When no more data are available a StopIteration exception is raised.
- Iterator objects are required to have an __iter__() method that returns the iterator object itself, so every iterator is also iterable.
Iterable
- An object capable of returning its members one at a time. Examples are list, str, tuple, dict and file objects.
- Iterable classes must implement __iter__(self) or __getitem__(self, key)
  - __iter__() should return a new iterator
  - For __getitem__(), it should:
    - Accept integers and slice objects
    - Raise TypeError if key is of an inappropriate type
    - Raise IndexError if key is outside the set of valid indexes
- Iterables can be used in a for loop and in places where a sequence is needed (zip(), map() etc.)
- When an iterable object is passed as an argument to the built-in iter(), it returns an iterator for the object.
- The iterator returned by iter() is good for one pass over the set of values.
- When dealing with iterables you usually don't have to call iter() yourself. For example, the for statement does that automatically for you, creating an unnamed variable to hold the iterator for the duration of the loop.
__iter__()
Called when an iterator is required for a container. It should return a new iterator object that can iterate over all the objects in the container. For mappings, it should iterate over the keys of the container.
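To illustrate the __getitem__ route and the single-pass nature of iterators, here is a small sketch. The Squares class is hypothetical and, for brevity, only handles integer keys (no slice support):

```python
class Squares:
    """Hypothetical iterable that only implements __getitem__."""

    def __init__(self, n):
        self.n = n

    def __getitem__(self, key):
        if not isinstance(key, int):
            raise TypeError("index must be an integer")
        if not 0 <= key < self.n:
            raise IndexError("index out of range")  # ends the iteration
        return key * key


values = list(Squares(4))        # iter() falls back to __getitem__

it = iter([1, 2, 3])
first_pass = list(it)            # consumes the iterator
second_pass = list(it)           # already exhausted, so this is empty
```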
Iterator implementation
import random

class Item:
    """Example collection container that provides an iterator"""
    def __init__(self, n):
        """Create list of n random ints"""
        self.n = n
        self.items = [random.randint(0, n) for i in range(n)]
    def __iter__(self):
        return ItemIter(self.items)
    def loop(self):
        """Generator-based iterator"""
        for index in range(len(self.items)):
            yield self.items[index]

class ItemIter:
    """Iterator over the items of an 'Item' container (implements the iterator API)"""
    def __init__(self, items):
        self.items = items
        self.pos = 0
    def __next__(self):
        """Implement iterator API method: __next__"""
        if self.pos >= len(self.items):
            raise StopIteration
        item = self.items[self.pos]
        self.pos += 1
        return item
    def __iter__(self):
        """Implement iterator API method: __iter__"""
        return self
Generators
Generators are iterators, but you can iterate over them only once. The reason is that they do not store all the values in memory, they generate the values on the fly.
You use them by iterating over them, either with a for loop or by passing them to any function or construct that iterates.
They are written like regular functions but use the yield or yield from statement whenever they want to return data. They can be created like:
def reverse(data):
    for i in range(len(data)-1, -1, -1):
        yield data[i]
for char in reverse('golf'):
    print(char)
yield from allows a generator to delegate part of its operations to another generator. For simple iterators, it is essentially a shortened form of for item in iterable: yield item (replaced with yield from iterable).
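The equivalence between the explicit loop and yield from can be sketched like this. The function names are illustrative, and flatten is an extra example of delegating recursively:

```python
def inner():
    yield 1
    yield 2

def outer_loop():
    # Explicit form: re-yield every item of the sub-generator
    for item in inner():
        yield item
    yield 3

def outer_delegating():
    # Equivalent delegation with yield from
    yield from inner()
    yield 3

def flatten(nested):
    """Recursively flatten nested lists by delegating to itself."""
    for element in nested:
        if isinstance(element, list):
            yield from flatten(element)
        else:
            yield element
```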
Anything that can be done with generators can also be done with class-based iterators. What makes generators so compact is that the __iter__() and __next__() methods are created automatically.
Generator expressions
They are similar to list comprehensions, but with parentheses instead of square brackets. These expressions are designed for situations where the generator is used right away by an enclosing function.
s1 = sum(i*i for i in range(10)) # generator, sum of squares
s2 = sum([ i*i for i in range(10) ]) # same with list comprehension
print(s1)
print(s2)
* and ** Operators
What is the * operator?
This operator unpacks arguments that are already in a list or tuple, so that they can be passed to a function as separate positional arguments.
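A minimal sketch of unpacking in a function call; describe() and the sample data are hypothetical:

```python
def describe(name, age):
    return f"{name} is {age}"

person = ["Ada", 36]
from_list = describe(*person)          # same as describe("Ada", 36)

point = (3, 4)
from_tuple = "{} / {}".format(*point)  # works for tuples too
```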
Another example when using range(), which expects two arguments for start and stop, is:
args = [3,6]
print(list(range(*args)))
Similarly to *, the ** operator delivers keyword arguments from a dictionary.
def add3(term1, term2, term3):
    return term1 + term2 + term3
d = {
    "term1": 1,
    "term2": 2,
    "term3": 6
}
print(add3(**d))
Lambda expressions
Small anonymous functions can be created with the lambda keyword. They can be used wherever function objects are required. They are syntactically restricted to a single expression.
def make_incrementor(n):
    return lambda x: x + n
f = make_incrementor(2)
print(f(10))
print(f(15))
Functional Programming Modules
itertools
Full list: https://docs.python.org/3/library/itertools.html
- Infinite iterators
  - count()
    - Example: count(10) --> 10 11 12 13 14 ...
  - cycle()
    - Example: cycle('ABCD') --> A B C D A B C D ...
  - repeat()
    - Example: repeat(10, 3) --> 10 10 10
- Iterators terminating on the shortest input sequence
  - accumulate()
    - Example: accumulate([1,2,3,4,5]) --> 1 3 6 10 15
  - chain()
    - Example: chain('ABC', 'DEF') --> A B C D E F
  - chain.from_iterable()
    - Example: chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
  - compress()
    - Example: compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F
  - zip_longest()
    - Example: zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
- Combinatoric iterators
  - product()
    - Cartesian product, equivalent to a nested for-loop
  - permutations()
    - r-length tuples, all possible orderings, no repeated elements
  - combinations()
    - r-length tuples, in sorted order, no repeated elements
  - combinations_with_replacement()
    - r-length tuples, in sorted order, with repeated elements
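The itertools functions above return lazy iterators, so this sketch wraps them in list() to show the results. It also uses itertools.islice, which is not listed above, to take a finite prefix of an infinite iterator:

```python
from itertools import (
    accumulate, chain, combinations, combinations_with_replacement,
    compress, count, islice, permutations, product,
)

first_three = list(islice(count(10), 3))   # take 3 items from an infinite iterator
running = list(accumulate([1, 2, 3, 4, 5]))
flat = list(chain.from_iterable(["ABC", "DEF"]))
picked = list(compress("ABCDEF", [1, 0, 1, 0, 1, 1]))

pairs = list(product("AB", repeat=2))       # AA AB BA BB
orderings = list(permutations("ABC", 2))    # AB AC BA BC CA CB
subsets = list(combinations("ABC", 2))      # AB AC BC
with_repeats = list(combinations_with_replacement("AB", 2))  # AA AB BB
```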
functools
Full list: https://docs.python.org/3/library/functools.html
cached_property(fn)
cmp_to_key(fn)
lru_cache(fn)
reduce(fn, iterable[,initializer])
wraps() decorator
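A rough sketch of four of the functools helpers listed above. It assumes Python 3.8+ for cached_property; the Circle class and the comparison function are hypothetical:

```python
from functools import cached_property, cmp_to_key, lru_cache, reduce

@lru_cache(maxsize=None)
def fib(n):
    """Memoized Fibonacci: each n is computed only once."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)

class Circle:
    def __init__(self, radius):
        self.radius = radius
        self.computations = 0

    @cached_property
    def area(self):
        # Runs on first access only; the result is then stored on the instance
        self.computations += 1
        return 3.14159 * self.radius ** 2

def by_length_then_alpha(a, b):
    """Old-style comparison function: negative, zero or positive."""
    return (len(a) - len(b)) or ((a > b) - (a < b))

words = sorted(["pear", "fig", "apple", "kiwi"], key=cmp_to_key(by_length_then_alpha))
total = reduce(lambda acc, x: acc + x, [1, 2, 3, 4], 10)   # 10 is the initializer
```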
Map, Reduce, Filter with lambdas
map() and filter() return iterators, so we convert the results to a list with list(). functools.reduce() returns a single value, so no conversion is needed.
from functools import reduce
nums = list(range(1,10))
squared = list(map(lambda x: x * x, nums))
filtered = list(filter(lambda x: x > 5, nums))
product = reduce(lambda x, y: x * y, nums)
Magic Methods
See https://rszalski.github.io/magicmethods/ for the full list
- Construction and Initialization
__new__(cls, [...])
__init__(self, [...])
__del__(self)
- Comparison
__eq__(self, other)
__ne__(self, other)
__lt__(self, other)
__gt__(self, other)
__le__(self, other)
__ge__(self, other)
- Unary operators and functions
__pos__(self) (+some_object)
__neg__(self)
__abs__(self)
__invert__(self)
__round__(self, n)
__floor__(self)
- Normal arithmetic operators
__add__(self, other)
__sub__(self, other)
__mul__(self, other)
__truediv__(self, other) (was __div__ in Python 2)
__floordiv__(self, other)
- Reflected arithmetic operators
  - Same as the normal equivalents, except that they perform the operation with other as the first operand and self as the second, rather than the other way around.
  - For the reflected operators to be called, the object on the left-hand side of the operator (other in the example) must either not define the non-reflected version of the operation or return NotImplemented from it
__radd__(self, other)
__rsub__(self, other)
__rmul__(self, other)
__rtruediv__(self, other)
- Type conversion magic methods
__int__(self)
__long__(self) (Python 2 only)
__float__(self)
- Representing your Classes
__str__(self)
__repr__(self)
__format__(self, format_spec)
__hash__(self)
__sizeof__(self)
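A minimal sketch of how a few of these magic methods fit together. The Vector class is hypothetical and only implements what the example needs:

```python
class Vector:
    """Hypothetical 2D vector used only to illustrate a few magic methods."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __repr__(self):
        return f"Vector({self.x}, {self.y})"

    def __eq__(self, other):
        if not isinstance(other, Vector):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Defining __eq__ disables the default hash, so restore one
        return hash((self.x, self.y))

    def __neg__(self):
        return Vector(-self.x, -self.y)

    def __add__(self, other):
        if isinstance(other, Vector):
            return Vector(self.x + other.x, self.y + other.y)
        return NotImplemented

    def __mul__(self, scalar):
        if isinstance(scalar, (int, float)):
            return Vector(self.x * scalar, self.y * scalar)
        return NotImplemented

    def __rmul__(self, scalar):
        # Called for `2 * v`, after int.__mul__ returns NotImplemented
        return self * scalar


v = Vector(1, 2)
summed = v + Vector(3, 4)
scaled = 2 * v
```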
Annotations
When annotating, assignment is optional.
- Variable annotations are usually used for type hints:
count: int = 0
- Function annotations:
def sum_two_numbers(a: int, b: int) -> int:
    return a + b
Function arguments
There are two kinds of arguments:
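- Keyword arguments: preceded by an identifier (e.g. name=) in the call, or passed as the values of a dictionary preceded by **:
complex(real=3, imag=5)
complex(**{'real': 3, 'imag': 5})
- Positional arguments: all other arguments, passed in order or as the elements of an iterable preceded by *: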
complex(3,5)
complex(*[3,5])
Dictionary views
The objects returned from dict.keys(), dict.values(), and dict.items() are called dictionary views. They are dynamic: when the dictionary changes, the view reflects the change.
To force the dictionary view to become a full list use list(dictview).
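The difference between a live view and a list snapshot can be sketched like this, with made-up inventory data:

```python
inventory = {"apples": 3, "pears": 5}
keys = inventory.keys()

inventory["plums"] = 7          # views reflect later changes to the dict
snapshot = list(keys)           # a list is a static copy

inventory["kiwis"] = 1
live_count = len(keys)          # 4: the view sees the new key
frozen_count = len(snapshot)    # 3: the list does not

common = keys & {"apples", "figs"}   # key views support set operations
```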
If statements
x = 0
if x > 0:
    pass
elif x < 0:
    pass
else:
    pass
Looping and mutating strategies
""" Strategy: Iterate over a copy """
users = {}
for user, status in users.copy().items():
    if status == 'inactive':
        del users[user]
""" Strategy: Create a new collection """
active_users = {}
for user, status in users.items():
    if status == 'active':
        active_users[user] = status
Exceptions
- Errors detected during execution are called exceptions and are not unconditionally fatal
try:
    x = int('test')
except ValueError:
    print('Not a valid number')

""" Exception raising and defining """
class B(Exception):
    pass
class C(B):
    pass
class D(C):
    def __str__(self):
        return "Error def"

for cls in [B, C, D]:
    try:
        raise cls("Exception text")
    except D as err:
        print("D {0}".format(err))
    except C as err:
        print("C {0}".format(err))
    except B:
        print("B")
    finally:
        pass
- Predefined clean-up actions
  - This is called context management in Python
with open("./list-comprehension.py") as f:
    for line in f:
        print(line, end='')
del statement
You can delete items from a list like so:
- del a[0]
- del a[2:4]
Sequences
These are the tuple, list and range data types
For tuples:
- Tuple packing: t = 1,"test", 123
- Tuple unpacking: a, b, c = t
- Unpacking works with any sequence (list, range or tuple)
Sets
- To create an empty set use set(), not {}, which will create a dictionary
- 1 in {1,2,3}  # fast membership checking
- Set comprehensions work
  - {x for x in 'abracadabra' if x not in 'abc'}
  - Note that {x: x for x in 'abracadabra'} creates a dict, not a set
Dictionaries
> tel = { 'sape': 4139, 'guido': 4127, 'jack': 4098 }
> dict([('sape', 4139), ('guido', 4127), ('jack', 4098)])
> {x: x**2 for x in (2, 4, 6)}
Looping
for key, value in {'a': 1, 'b': 2}.items():
    print(key, value)
for i, v in enumerate(['a', 'b', 'c']):
    print(i, v)
Scopes and namespaces
def scope_test():
    def do_local():
        spam = "local spam"

    def do_nonlocal():
        nonlocal spam
        spam = "nonlocal spam"

    def do_global():
        global spam
        spam = "global spam"

    spam = "test spam"
    do_local()
    print("After local assignment:", spam)
    do_nonlocal()
    print("After nonlocal assignment:", spam)
    do_global()
    print("After global assignment:", spam)

scope_test()
print("In global scope:", spam)
The output of the example code is:
After local assignment: test spam
After nonlocal assignment: nonlocal spam
After global assignment: nonlocal spam
In global scope: global spam
I/O
import json
print(json.dumps({'kostas': 1, 'lekkas':2, 'age': 34}))
print("{:2.3}".format(1/3))
""" Use json.dump(x, f) to write to file """
""" Use x = json.load(f) to load from file """
"""
The following strategy iterates over the lines of all files listed in sys.argv[1:], defaulting to sys.stdin if the list is empty.
"""
import fileinput
for line in fileinput.input():
    process(line)  # process() is a placeholder for your own line handler
Coroutines
async def read_data(db):
    """ native coroutine """
    pass
async def read_data2(db):
    data = await db.fetch('SELECT ...')
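The snippets above need an event loop and a db object to run. A self-contained sketch, assuming asyncio and using a hypothetical fetch() coroutine that sleeps in place of real I/O, could look like this:

```python
import asyncio

async def fetch(name, delay):
    """Stand-in for real I/O: sleeps, then returns a value."""
    await asyncio.sleep(delay)
    return f"{name} done"

async def main():
    # gather() runs the coroutines concurrently and preserves argument order
    return await asyncio.gather(fetch("a", 0.02), fetch("b", 0.01))

results = asyncio.run(main())
```

Calling a coroutine function only creates a coroutine object; it runs when awaited or when handed to the event loop, here via asyncio.run().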
Decorators
- The following example is from https://github.com/chiphuyen/python-is-cool/blob/master/cool-python-tips.ipynb
Defining a timeit decorator:
import functools
import time

def timeit(fn):
    # *args and **kwargs are to support positional and named arguments of fn
    @functools.wraps(fn)
    def get_time(*args, **kwargs):
        start = time.time()
        output = fn(*args, **kwargs)
        print(f"Time taken in {fn.__name__}: {time.time() - start:.7f}")
        return output  # make sure that the decorator returns the output of fn
    return get_time
functools.wraps() is a convenience function for invoking functools.update_wrapper() as a function decorator. What functools.wraps() does is assign attributes of the original function to the wrapper function, like __name__, __module__, __annotations__, __qualname__ and __doc__. If we don't use functools.wraps() then the wrapped function loses its docstring and its name becomes that of the wrapper function.
Adding the decorator(s):
@functools.lru_cache()
def fib_helper(n):
    if n < 2:
        return n
    return fib_helper(n - 1) + fib_helper(n - 2)

@timeit
def fib(n):
    return fib_helper(n)
Classes / OOP
We have the following conventions with regards to naming variables and methods:
- _var: Hint that the method/variable is intended for internal use. Not enforced.
- var_: Sometimes used when the most fitting name is already taken by a keyword
- __var: The Python interpreter rewrites the attribute name in order to avoid naming conflicts in subclasses. This is called name mangling.
- __var__: Perhaps surprisingly, names with both leading and trailing double underscores are not name-mangled. This convention is reserved for special use in the language; such methods are also called dunder (i.e. double underscore) methods.
- _: A single underscore is sometimes used as a name to indicate that a variable/result is temporary or insignificant.
Other notes:
- The class Node(object): syntax is only needed in Python 2.x; in 3 the (object) part is the implicit default so it's not needed.
- isinstance(object, class): Check if object is an instance of class
- issubclass(class, class_or_tuple): Check if class is a subclass of another class
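Name mangling is easiest to see by inspecting the instance dictionary. In this sketch the Base and Child classes are hypothetical:

```python
class Base:
    def __init__(self):
        self._internal = "by convention only"   # still accessible from outside
        self.__secret = "base"                  # stored as _Base__secret

class Child(Base):
    def __init__(self):
        super().__init__()
        self.__secret = "child"                 # stored as _Child__secret, no clash

c = Child()
mangled_names = sorted(name for name in vars(c) if name.endswith("__secret"))
```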
@property decorator:
class Foo:
    def __init__(self, v):
        self.v = v  # goes through the setter below
    @property
    def v(self):
        return self.__v
    @v.setter
    def v(self, v):
        self.__v = v + 1
f = Foo(1)
f.v      # 2
f.v = 5
f.v      # 6
Class, instance and static methods
- instance methods
  - Regular methods, defined like def method(self[, args])
  - Can access attributes and methods of the same object through self
  - Can even modify class state through self.__class__
- class methods
  - Decorated with @classmethod
  - Defined like def clsmethod(cls[, args]). The cls parameter points to the class
  - They can't modify object instance state, only class state
  - Can be used as object factories, e.g. return cls([args])
- static methods
  - Decorated with @staticmethod
  - Defined like def staticmethod([args])
  - Can neither modify object state nor class state
  - Primarily a way to namespace methods
super()
- Allows you to call methods of the superclass
- e.g. in the subclass's __init__, super().__init__([args])
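The three method kinds and super() can be sketched together as follows. The Shape and Square classes are hypothetical examples:

```python
class Shape:
    count = 0                       # class state shared by all instances

    def __init__(self, name):
        self.name = name
        Shape.count += 1

    def describe(self):             # instance method: uses self
        return f"{self.name} ({self.__class__.__name__})"

    @classmethod
    def unit(cls):                  # class method used as a factory
        return cls("unit")

    @staticmethod
    def is_valid_name(name):        # static method: no self, no cls
        return bool(name.strip())


class Square(Shape):
    def __init__(self, name, side=1):
        super().__init__(name)      # run the superclass initializer first
        self.side = side


sq = Square.unit()                  # cls is Square, so a Square is built
```

Because unit() builds its result through cls rather than naming Shape directly, subclasses inherit a factory that returns instances of the subclass.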
setattr() and getattr()
These built-in functions set and get attributes of objects by name:
getattr(object, name[, default])
setattr(object, name, value)
Can be useful to minimize repetitions and perform update actions in bulk, like:
class Character:
    __slots__ = (
        "strength",
        "dexterity"
    )
    def __init__(self):
        for i in self.__slots__:
            setattr(self, i, 0)
Misc
- floor division is available with //, like 11 // 4 == 2
- import string for a collection of string constants, like string.ascii_lowercase
- recordclass.recordclass is basically a mutable collections.namedtuple
Packages
- A package is a collection of modules
- An __init__.py file is required to make Python treat directories containing the file as packages. __init__.py can be empty, execute initialization code or set the __all__ variable.
- __all__ is a list of module names to be imported on from package import *
- Intra-package references:
from . import echo
from .. import formats
from ..filters import equalizer
- setup.py
- This file is the build script for setuptools. It tells setuptools about your package (such as the name and version) as well as which code files to include.
- Mostly used to build and distribute a package
Third party modules
scipy
SciPy is an ecosystem with a few popular core packages:
- NumPy
  - NumPy is the fundamental package for scientific computing with Python. It contains among other things:
    - a powerful N-dimensional array object
    - sophisticated (broadcasting) functions
    - tools for integrating C/C++ and Fortran code
    - useful linear algebra, Fourier transform, and random number capabilities
- Core SciPy Library
  - Linear Algebra
    - scipy.linalg contains all the functions in numpy.linalg, plus some more advanced ones not contained in numpy.linalg.
  - Optimization
  - Integration
  - Signal processing & Fourier transforms
  - Graph routines
  - Statistics
  - etc.
- Matplotlib: 2D plotting
- SymPy: symbolic mathematics
- pandas: Data structures & analysis
  - Feature engineering
- IPython: interactive console, a core component of Jupyter
seaborn
- Plotting library e.g. for barplots
- How does it differ from matplotlib?
Seaborn is a Python visualization library based on matplotlib. It provides a high-level, dataset-oriented interface for creating attractive statistical graphics. The plotting functions in seaborn understand pandas objects and leverage pandas grouping operations internally to support concise specification of complex visualizations. Seaborn also goes beyond matplotlib and pandas with the option to perform statistical estimation while plotting, aggregating across observations and visualizing the fit of statistical models to emphasize patterns in a dataset.
ML
- TensorFlow
- PyTorch
- Keras
- Z3
- OR-Tools
- SciKit
- Bokeh