TIL: James Powell's Fast Python Techniques and Scalene Performance Profiler
Today I learned about advanced Python optimization techniques from James Powell's 'Fast and Furious Python' talk and discovered Scalene, a high-performance CPU and memory profiler for Python.
1. Use join for String Building:

```python
# Expensive: creates a new string object on every iteration
def slow_string_building(items):
    result = ""
    for item in items:
        result += str(item)  # Creates a new string each time
    return result

# Fast: use join for string concatenation
def fast_string_building(items):
    return "".join(str(item) for item in items)

# Benchmark the difference
import timeit

items = list(range(1000))
slow_time = timeit.timeit(lambda: slow_string_building(items), number=100)
fast_time = timeit.timeit(lambda: fast_string_building(items), number=100)
print(f"Slow: {slow_time:.4f}s, Fast: {fast_time:.4f}s")
# join is typically 10-100x faster on large inputs
```
2. Prefer Built-in Functions:

```python
# Slow: Python-level loop
def slow_sum(numbers):
    total = 0
    for num in numbers:
        total += num
    return total

# Fast: built-in function (implemented in C)
def fast_sum(numbers):
    return sum(numbers)

# Alternative formulation with reduce (usually no faster than sum)
import operator
from functools import reduce

def reduce_sum(numbers):
    return reduce(operator.add, numbers, 0)

# Specialized operations from the standard library
import math
import statistics

numbers = list(range(1, 1001))
mean_value = statistics.mean(numbers)    # Fast statistical functions
median_value = statistics.median(numbers)
sqrt_sum = math.sqrt(sum(x * x for x in numbers))
```
3. Use List Comprehensions and Generator Expressions:
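The code for this section appears to have been lost; a minimal sketch of the usual loop-vs-comprehension comparison, with illustrative function names of my own choosing:

```python
# Slow: explicit loop with a repeated .append method lookup and call
def squares_loop(n):
    result = []
    for i in range(n):
        result.append(i * i)
    return result

# Faster: list comprehension builds the same list via a specialized bytecode path
def squares_comprehension(n):
    return [i * i for i in range(n)]

# Memory-friendly: a generator expression yields values lazily, so no
# intermediate list is materialized when the consumer only needs one pass
def total_of_squares(n):
    return sum(i * i for i in range(n))
```

The generator form matters most when the sequence is large and consumed once: it keeps memory usage constant instead of proportional to `n`.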
4. Choose the Right Data Structures:

```python
# Slow: membership test against a list is a linear search
def slow_membership_test(items, targets):
    return [item for item in targets if item in items]

# Fast: build a set once for O(1) lookups
def fast_membership_test(items, targets):
    item_set = set(items)
    return [item for item in targets if item in item_set]

# Collections module optimizations
from collections import Counter, defaultdict, deque

# Fast queue operations
queue = deque()
queue.appendleft("item")  # O(1), vs list.insert(0, item) which is O(n)

# Fast counting
words = ["to", "be", "or", "not", "to", "be"]
word_counts = Counter(words)  # Much faster than manual dict counting

# Avoid KeyError with defaultdict
word_positions = defaultdict(list)
for i, word in enumerate(words):
    word_positions[word].append(i)  # No need to check whether the key exists
```
5. Mind Algorithmic Complexity:

```python
# O(n^2): avoid nested loops when possible
def slow_find_duplicates(items):
    duplicates = []
    for i, item1 in enumerate(items):
        for item2 in items[i + 1:]:
            if item1 == item2:
                duplicates.append(item1)
    return duplicates

# O(n): use data structures effectively
def fast_find_duplicates(items):
    seen = set()
    duplicates = set()
    for item in items:
        if item in seen:
            duplicates.add(item)
        else:
            seen.add(item)
    return list(duplicates)

# Even better: use Counter
from collections import Counter

def fastest_find_duplicates(items):
    counts = Counter(items)
    return [item for item, count in counts.items() if count > 1]
```
6. Reduce Memory Overhead:

```python
# Use __slots__ for classes with a fixed set of attributes
class Point:
    __slots__ = ["x", "y"]  # Saves memory, faster attribute access

    def __init__(self, x, y):
        self.x = x
        self.y = y

# Use array.array for homogeneous numeric data
import array

numbers = array.array("i", range(10000))  # Much less memory than a list of ints
```
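The `array.array` savings are easy to check with `sys.getsizeof` (a rough comparison: for the list it measures only the container of pointers, not the int objects themselves, so the real gap is even larger than shown):

```python
import array
import sys

n = 10_000
as_list = list(range(n))
as_array = array.array("i", range(n))

# The array stores raw 4-byte C ints contiguously; the list stores
# pointers to individual Python int objects
print(f"list container:  {sys.getsizeof(as_list)} bytes")
print(f"array container: {sys.getsizeof(as_array)} bytes")
```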
Scalene can also profile specific code regions programmatically:

```python
from scalene import scalene_profiler

def your_function():
    # Function to profile
    return [i * i for i in range(100000)]

# Run the script under Scalene with profiling initially off:
#   scalene --off your_script.py
scalene_profiler.start()
result = your_function()
scalene_profiler.stop()
```
These tools and techniques provide a comprehensive approach to Python performance optimization, from understanding algorithmic complexity to detailed profiling and measurement of actual performance characteristics.