This is more of a braindump of algorithms, data structures and python notes useful for interviews.

  • Combinatorial Objects
    • Permutations
      • Arrangements or orderings of items
      • When asked for:
        • arrangement
        • tour
        • ordering
        • sequence
    • Subsets
      • Selections for a set of items
      • When asked for:
        • cluster
        • collection
        • comittee
        • group
        • packaging
        • selection
    • Trees
      • Reperesent hierarchical relationships between items
      • When asked for:
        • hierarchy
        • dominance relationship
        • taxonomy
    • Graphs
      • Represent relationships between arbitrary pairs of objects
      • When asked for:
        • network
        • circuit
        • web
        • relationship
    • Points
    • Polygons
    • Strings
  • Dominance Relations for O() functions:
    • n! » c^n » n^3 » n^2 » n^(1+ε) » nlogn » n » sqrt(n) » (logn)^2 » logn » loglogn » a(n) » 1
    • logn: Arise when things are repeatedly halfed
    • n^2 (quadratic): Look at most, or all, pairds of items in an n-element universe
    • n^3 (cubic): Enumerate through all triples of items in an n-element universe. Typically arise in dynamic programming algorithms
    • c^n (exponential): Arise when enumerating all subsets of n items
    • n! (factorial): Enumerate all permutations or orderings of n items

Data Structures

  • Abstract data types (containers, dictionaries, queues etc.) are implemented by data structures.
  • Contiguous vs linked
  • Arrays
    • Benefits
      • Constant access time
      • Space efficiency
      • Memory locality
    • Drawbacks
      • Not easily resisable
        • Note that total work of managing a dynamic list is O(n)
  • Lists vs Arrays
    • +Overflow can never happen unless memory is full
    • +Insert/delete simpler than for contiguous lists
    • +Moving pointers faster/easier than moving actual data
    • -Extra space needed for metadata
    • -Don’t allow random access
    • -arrays better with cache/memory locality
  • Stacks
    • LIFO
    • push() and pop()
  • Queues
    • FIFO
    • enqueue() and dequeue()
  • Dictionaries
    • Search()
    • Insert()
    • Delete()
    • Certain dict structures also support:
      • max(), min()
      • predecessor(k), successor(k)
  • Binary trees
    • search()
    • min(), max()
    • traverse()
      • in-order: process node according to sorting policy
      • pre-order: process node first (i.e. entering subtree with node as root)
      • post-order: process node last (i.e. exiting subtree with node as root)
    • insert()
    • delete()
  • Balanced Search Trees
    • Order of insertion has a great impact on the shape and height of the btree.
    • e.g. if we insert (1,2,3,4,), the tree will be ‘skinny’ with linear height
    • Importance of randomization
    • Blanced binary search tree data structures:
      • red-black trees
      • AVL trees
      • splay trees
  • Priority Queue
    • insert()
    • find-min(), find-max()
    • del-min(), del-max()
  • Hashing
    • Looking up an item in an array takes constant time once you have its index
    • Use value of hash function as an index into array
    • Collision resolution
      • Two distinct keys will occasionally hash to the same value
      • Chaining is the easiest approach
        • Hash table is an array of m linked lists
    • Open addressing
      • Array of elements, initialized to null
      • If desired position is empty, instert item
      • If not, find another place
        • Simplest approach is called sequential probing, which inserts the item in the next open spot in the table

Algorithms

  • Sorting and Searching

  • Applications of sorting
    • Searching in O(logn)
    • Finding closest pair in O(nlogn)
    • Find duplicates
    • Frequency distribution
    • Selection (kth largest item in array)
    • Convex hulls
  • Heaps
    • Data structure that enables us to represent binary trees as an array without pointers
    • Efficiently support insert, extract-max/min priority queue operations
    • They maintain partial order, which is weaker than sorted order, so it is efficient to maintain, yet stronger than random order, so the minimum or maxium element can be quickly identified
      • left child of k sits in position 2k
      • right child of k sits in position 2k+1
      • parent of k at floor(k/2)
    • min-heap, node dominates children by having smaller key
    • max-heap, by having bigger
    • Extract min or max: remove root, replace with rightmost leaf, then bubble down. Bubble down operation is called ‘heapify’
    • Insert node: insert in leftmost open spot in array, then bubble up as needed to preserve dominance order.
  • Heapsort
  • Insertion sort
    • O(n^2) in the worst case, performs considerably better if data is almost sorted
  • Mergesort
    • Divide and Conquer
    • Great to order linked lists, because it doesn’t require random access like quicksort or heapsort do.
    • Needs an auxilary array when soring arrays, but can merge linked lists without any extra space.
  • Quicksort
    • Worst case is O(n^2)
  • Bucketsort

Python

  • heapq is min heap
  • heapq .heappush(heap, item) .heappop(heap) .heapify(x) : heapify in-place, linear time .nlargest(n, iterable, key=None) .nsmallest(n, iterable, key=None)
  • Can work with (priorityNum, key) tuples

functools

  • functools.lru_cache
  • functools.lru_cache(maxsize=128)
  • functools.reduce(function, iterable[, initializer])

itertools

  • itertools.zip_longest(a, b, fillvalue=’-‘)
  • itertools.permutations(p, r)
  • itertools.combinations(p, r)
  • itertools.combinations_with_replacement(p, r)

Exceptions

  • try
    • raise Exception
  • except Exception
  • except (ValueError, Exception)

Regex

  • import re
    • text = “this 09 is a test”
    • res = re.match(r”test”, text)
      • None if not found
      • res.group() for result, res.span() for span
    • results = re.findall(r”[0-9]+”, text)
      • Returns list of matched strings

I/O

  • remember ‘with open(file) as f’ syntax
  • f = open(‘file.txt’, ‘r’)
    • f.read() reads the whole file
    • f.readline() reads line
    • for line in f: loops over lines
  • import pickle
    • pickle.dump(obj, open(‘file.dat’, ‘wb’))
    • obj = pickle.load(open(‘file.dat’, ‘rb’))
  • import json
    • json.dump(obj, open(‘file.json’, ‘w’))
    • obj = json.load(open(‘file.json’, ‘r’))
  • import requests
    • res = requests.get(‘https://lekkas.io’)
    • res.ok
    • res.status_code
    • res.text
    • d = res.json() - extract data to d dict

Testing

  • assert fib(n) == fib_m(n)
  • import unittest
  • from mylib import fun
    • class Tester(unittest.TestCase):
      • def test_fun(self):
        • self.assertEquals([], fun())
        • self.assertTrue(fun(1))

Subprocess

  • import subprocess
    • subprocess.run([“ls”, “-la”])

asyncio

  • Event loop framework
  • Coroutines
    • Can be awaited
    • import asyncio
    • async def nested():
      • return 42
    • async def main():
      • await nested()
    • asyncio.run(main())
  • Tasks
    • Corouties that are scheduled concurrently
    • task = asyncio.create_task(nested())
    • await task
  • await asyncio.gather(task1, task2, task3)
    • Run tasks concurrently

multiprocessing

  • Run on multiple processors
  • Better to use, due to GIL

packaging

  • init.py treads directory as package
    • can be empty, init or set the all variable
    • all : list of packages to be included on import
    • setup : mostly used to build/distribute package

Misc resources

  • https://sahandsaba.com/thirty-python-language-features-and-tricks-you-may-not-know.html
  • https://www.bigocheatsheet.com/
  • https://lekkas.io/notes/python/

Other

  • Heapsort and heap
  • Suffix Trees
  • Balancing Trees
  • Integer partitions