{"id":366,"date":"2022-09-27T00:00:00","date_gmt":"2022-09-27T00:00:00","guid":{"rendered":"https:\/\/tac.debuzzify.com\/?p=366"},"modified":"2023-06-27T05:23:21","modified_gmt":"2023-06-27T05:23:21","slug":"speed-up-slow-for-loops-in-python","status":"publish","type":"post","link":"https:\/\/www.the-analytics.club\/speed-up-slow-for-loops-in-python\/","title":{"rendered":"Is Your Python For-loop Slow? Use NumPy Instead"},"content":{"rendered":"\n\n\n

Speed is always a concern for developers, especially in data-intensive work.<\/p>\n\n\n\n

The ability to iterate is the basis of all automation and scaling. The first choice for most of us is a for-loop. It\u2019s simple and flexible. Yet for-loops are not built for scaling up to massive datasets.<\/p>\n\n\n\n

This is where vectorization comes in. When you do extensive data processing in for-loops, consider vectorizing it. NumPy comes in handy there.<\/p>\n\n\n\n

This post explains how fast NumPy operations are compared to for-loops.<\/p>\n\n\n\n

\n
\n
\n

Grab your aromatic coffee <\/a>(or tea<\/a>) and get ready…!<\/p>\n<\/div>\n<\/div>\n<\/div>\n\n\n\n

 Comparing For-loops with NumPy<\/h2>\n\n\n\n

Let\u2019s take a simple summation operation. We have to sum up all the elements in a list.<\/p>\n\n\n\n

Summation is a built-in operation in Python<\/a> (the sum function) that you can use on a list of numbers. But let\u2019s assume there isn\u2019t one, and you need to implement it.<\/p>\n\n\n\n

Any programmer would opt to iterate over the list and add the numbers to a variable. But experienced developers know<\/a> the limitations and go for an optimized version.<\/p>\n\n\n\n

Here are both the for-loop and NumPy versions of our summation. We create an array of a million random integers from 0 to 99 (the upper bound of 100 passed to np.random.randint is exclusive). Then we run each method 100 times and record the total execution time.<\/p>\n\n\n\n

import numpy as np
import timeit


def sum_with_for_loop(array) -> int:
    # Named total so the built-in sum is not shadowed
    total = 0
    for i in array:
        total += i
    return total


def sum_with_np_sum(array) -> int:
    return np.sum(array)


if __name__ == "__main__":
    array = np.random.randint(0, 100, 1000000)

    # print time for for loop
    print("Summation time with for-loop: ", timeit.timeit(lambda: sum_with_for_loop(array), number=100))

    # print time for np.sum
    print("Summation time with np.sum: ", timeit.timeit(lambda: sum_with_np_sum(array), number=100))<\/code><\/pre>Python<\/span><\/div>\n\n\n\n
$ python main.py
Summation time with for-loop:  14.793345853999199
Summation time with np.sum:  0.1294808290003857<\/code><\/pre>Bash<\/span><\/div>\n\n\n\n

The NumPy version is far faster. Its 100 runs took about 0.13 seconds against 14.8 seconds for the for-loop, roughly one-hundredth of the time.<\/p>\n\n\n\n
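
One caveat worth checking on your own machine: the loop above walks a NumPy array element by element, which is usually slower than walking a plain list. Here is a minimal sketch, assuming a smaller input of 100,000 values so it runs quickly. The helper name time_sums and the sizes are illustrative choices, not part of the benchmark above. It times a for-loop over a list, the built-in sum, and np.sum side by side.

```python
import timeit

import numpy as np


def sum_with_for_loop(values):
    total = 0
    for value in values:
        total += value
    return total


def time_sums(size=100_000, repeats=10):
    """Return timings in seconds for three ways of summing the same data."""
    array = np.random.randint(0, 100, size)
    as_list = array.tolist()  # plain Python ints in a plain list
    return {
        "for-loop over list": timeit.timeit(lambda: sum_with_for_loop(as_list), number=repeats),
        "built-in sum": timeit.timeit(lambda: sum(as_list), number=repeats),
        "np.sum": timeit.timeit(lambda: np.sum(array), number=repeats),
    }


if __name__ == "__main__":
    for label, seconds in time_sums().items():
        print(f"{label}: {seconds:.4f}")
```

The exact numbers depend on hardware and NumPy version, so no output is shown here.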

Sum products in NumPy vs. Lists<\/h3>\n\n\n\n

The sum of products is a popular numerical computation; you can even use it in Excel (as SUMPRODUCT). Let\u2019s measure the performance of the for-loop and NumPy versions.<\/p>\n\n\n\n

The following code multiplies each element of an array with a corresponding element in another array. Finally, we sum up all the individual products.<\/p>\n\n\n\n

import numpy as np
import timeit


def sum_product_with_for_loop(array1, array2) -> int:
    # Named total so the built-in sum is not shadowed
    total = 0
    for i, j in zip(array1, array2):
        total += i * j
    return total


def sum_product_with_np_sum(array1, array2) -> int:
    return np.sum(array1 * array2)


if __name__ == "__main__":
    array1 = np.random.randint(0, 100, 1000000)
    array2 = np.random.randint(0, 100, 1000000)

    # Print the time taken to execute each function 100 times
    print("Sum of products with for loop: ", timeit.timeit(lambda: sum_product_with_for_loop(array1, array2), number=100))

    print("Sum of products with np.sum: ", timeit.timeit(lambda: sum_product_with_np_sum(array1, array2), number=100))<\/code><\/pre>Python<\/span><\/div>\n\n\n\n

Here\u2019s the output of the above code:<\/p>\n\n\n\n

$ python main.py
Sum of products with for loop:  26.099454337999987
Sum of products with np.sum:  0.28206900699990456<\/code><\/pre>Bash<\/span><\/div>\n\n\n\n

Once again, the NumPy version was roughly 90 times faster than iterating with a for-loop.<\/p>\n\n\n\n
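
As an aside, NumPy also offers np.dot (and the @ operator) for one-dimensional arrays. It computes the same sum of products without building the intermediate array1 * array2 array. A small sketch with made-up inputs (the names prices and quantities are purely illustrative):

```python
import numpy as np

prices = np.array([2, 3, 5])
quantities = np.array([10, 20, 30])

via_sum = np.sum(prices * quantities)   # element-wise product, then sum
via_dot = np.dot(prices, quantities)    # inner product in one call
via_operator = prices @ quantities      # same as np.dot for 1-D arrays

print(via_sum, via_dot, via_operator)  # 230 230 230
```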

Matrix multiplication performance of NumPy and lists<\/h3>\n\n\n\n

Matrix multiplication is an extended version of sum-product. It involves not a single array but an array of arrays.<\/p>\n\n\n\n
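
To make the link to sum-product concrete, here is a tiny sketch with two hand-picked 2x2 matrices (the values are arbitrary). Each entry of the product is the sum-product of one row of the first matrix and one column of the second.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

product = np.matmul(a, b)

# Entry (i, j) is the sum-product of row i of `a` and column j of `b`.
row, col = 0, 1
entry = np.sum(a[row, :] * b[:, col])  # 1*6 + 2*8

print(product)
print(entry == product[row, col])  # True
```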

Matrix multiplication is also very common when implementing algorithms that involve a lot of data. Here\u2019s the benchmark.<\/p>\n\n\n\n

import numpy as np
import timeit


def matrix_multiplication_with_np(matrix1, matrix2):
    return np.matmul(matrix1, matrix2)


def matrix_multiplication_with_for_loop(matrix1, matrix2):
    result = np.zeros((len(matrix1), len(matrix2[0])))

    for i in range(len(matrix1)):
        for k in range(len(matrix2)):
            for j in range(len(matrix2[0])):
                result[i][j] += matrix1[i][k] * matrix2[k][j]

    return result


if __name__ == "__main__":
    matrix1 = np.random.randint(1, 10, (1000, 1000))
    matrix2 = np.random.randint(1, 10, (1000, 1000))

    # Each version runs once; the for-loop version is very slow at this size
    print(
        "Matrix multiplication with for loop: ",
        timeit.timeit(
            lambda: matrix_multiplication_with_for_loop(matrix1, matrix2), number=1
        ),
    )
    print(
        "Matrix multiplication with numpy: ",
        timeit.timeit(lambda: matrix_multiplication_with_np(matrix1, matrix2), number=1),
    )<\/code><\/pre>Python<\/span><\/div>\n\n\n\n
$ python main.py
Matrix multiplication with for loop:  1597.9121425140002
Matrix multiplication with numpy:  2.8506258010002057<\/code><\/pre>Bash<\/span><\/div>\n\n\n\n

The results of using NumPy were striking. Our vectorized version finished in about 2.9 seconds against roughly 1,598 seconds for the for-loops, more than 500 times faster.<\/p>\n\n\n\n

NumPy\u2019s benefits are more prominent as the size and dimensions of arrays grow.<\/p>\n\n\n\n

Why is NumPy faster than lists?<\/h2>\n\n\n\n

Simple: they are designed for different purposes.<\/p>\n\n\n\n

NumPy\u2019s role is to provide an optimized interface for numerical computation. A Python list<\/a>, however, is only a collection of objects.<\/p>\n\n\n\n

A NumPy array allows only homogeneous data types<\/b>. Thus the NumPy operations don\u2019t have to worry about types before every step of an algorithm. This is where we gain a lot of speed \u2014 quick wins.<\/p>\n\n\n\n
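
A quick way to see the homogeneity rule in action is to build an array from a list of mixed numeric types. This is only an illustrative sketch with made-up values: the list keeps each element's own type, while NumPy settles on a single dtype for the whole array.

```python
import numpy as np

mixed_list = [1, 2.5, 3]          # the list keeps an int, a float, and an int
array = np.array(mixed_list)      # NumPy picks one dtype for every element

print([type(x).__name__ for x in mixed_list])  # ['int', 'float', 'int']
print(array.dtype)                             # float64
```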

Also, in NumPy, the whole array, not each individual element, is a single object, and its values are densely packed<\/b> in memory. Thus it takes much less memory.<\/p>\n\n\n\n
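
The memory difference can be estimated roughly as follows. This is a sketch that assumes CPython, where sys.getsizeof reports the size of the list's pointer table and of each integer object separately. The exact byte counts vary by platform and Python version, so none are shown.

```python
import sys

import numpy as np

numbers = list(range(1000, 2000))
array = np.array(numbers)

# A list stores pointers; each int is a separate Python object.
list_bytes = sys.getsizeof(numbers) + sum(sys.getsizeof(n) for n in numbers)
# An array stores the raw values back to back in one buffer.
array_bytes = array.nbytes

print(list_bytes, array_bytes)
```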

Further, NumPy operations are (primarily) implemented using C<\/b>, not in Python itself.<\/p>\n\n\n\n

Lists in Python<\/a> are no more than an object store. Each element is a separate object that takes up its own space, so you\u2019ll quickly need more memory to process them. Lists can also hold objects of different types. The downside is that the type of every element has to be checked on every operation, which makes it costly.<\/p>\n\n\n\n

Final thoughts<\/h2>\n\n\n\n

This post encourages you to convert your lists<\/a> to NumPy arrays and use vectorized operations to speed up execution.<\/p>\n\n\n\n

It\u2019s natural for people to use for-loops over a list because it\u2019s straightforward. But if it involves a lot of numbers, it\u2019s not the optimal way. To understand it better, we\u2019ve compared the performances of trivial operations such as summation, sum-product, and matrix multiplication. In all cases, NumPy performed far better than lists.<\/p>\n\n\n\n

For-loops, too, have their place in programming. The rule of thumb is to use them when your data structures are more complex and have fewer items to iterate.<\/p>\n\n\n\n

You may be better off summing a few hundred numbers without NumPy. Also, if you have to do more than numerical computation in each iteration, NumPy may not be the right option.<\/p>\n\n\n\n
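
To find where the crossover lies on your machine, a rough sketch like the one below can time the built-in sum on a list against np.sum on an array at a few sizes. The helper name compare and the chosen sizes are arbitrary assumptions for illustration. No output is shown because the numbers vary with hardware and NumPy version.

```python
import timeit

import numpy as np


def compare(size, repeats=1000):
    """Time built-in sum on a list against np.sum on an array of `size` ints."""
    array = np.random.randint(0, 100, size)
    as_list = array.tolist()
    list_time = timeit.timeit(lambda: sum(as_list), number=repeats)
    numpy_time = timeit.timeit(lambda: np.sum(array), number=repeats)
    return list_time, numpy_time


if __name__ == "__main__":
    for size in (10, 100, 1_000, 100_000):
        list_time, numpy_time = compare(size)
        print(f"{size:>7}: sum(list)={list_time:.5f}s  np.sum={numpy_time:.5f}s")
```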


\n\n\n\n
\n

<\/blockquote>\n","protected":false},"excerpt":{"rendered":"

Python has NumPy. It’s optimized for speed using C implementation. It’s the best alternative for Python for-loops. <\/p>\n","protected":false},"author":2,"featured_media":160,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_kad_blocks_custom_css":"","_kad_blocks_head_custom_js":"","_kad_blocks_body_custom_js":"","_kad_blocks_footer_custom_js":"","_kad_post_transparent":"","_kad_post_title":"","_kad_post_layout":"","_kad_post_sidebar_id":"","_kad_post_content_style":"","_kad_post_vertical_padding":"","_kad_post_feature":"","_kad_post_feature_position":"","_kad_post_header":false,"_kad_post_footer":false,"footnotes":""},"categories":[3],"tags":[30,27],"taxonomy_info":{"category":[{"value":3,"label":"Python"}],"post_tag":[{"value":30,"label":"optimization"},{"value":27,"label":"python"}]},"featured_image_src_large":["https:\/\/www.the-analytics.club\/wp-content\/uploads\/2023\/06\/loops-1024x576.jpg",1024,576,true],"author_info":{"display_name":"Thuwarakesh","author_link":"https:\/\/www.the-analytics.club\/author\/thuwarakesh\/"},"comment_info":0,"category_info":[{"term_id":3,"name":"Python","slug":"python","term_group":0,"term_taxonomy_id":3,"taxonomy":"category","description":"","parent":5,"count":52,"filter":"raw","cat_ID":3,"category_count":52,"category_description":"","cat_name":"Python","category_nicename":"python","category_parent":5}],"tag_info":[{"term_id":30,"name":"optimization","slug":"optimization","term_group":0,"term_taxonomy_id":30,"taxonomy":"post_tag","description":"","parent":0,"count":2,"filter":"raw"},{"term_id":27,"name":"python","slug":"python","term_group":0,"term_taxonomy_id":27,"taxonomy":"post_tag","description":"","parent":0,"count":9,"filter":"raw"}],"_links":{"self":[{"href":"https:\/\/www.the-analytics.club\/wp-json\/wp\/v2\/posts\/366"}],"collection":[{"href":"https:\/\/www.the-analytics.club\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.the-analytics.club\/wp-json\/wp
\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.the-analytics.club\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.the-analytics.club\/wp-json\/wp\/v2\/comments?post=366"}],"version-history":[{"count":3,"href":"https:\/\/www.the-analytics.club\/wp-json\/wp\/v2\/posts\/366\/revisions"}],"predecessor-version":[{"id":1269,"href":"https:\/\/www.the-analytics.club\/wp-json\/wp\/v2\/posts\/366\/revisions\/1269"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.the-analytics.club\/wp-json\/wp\/v2\/media\/160"}],"wp:attachment":[{"href":"https:\/\/www.the-analytics.club\/wp-json\/wp\/v2\/media?parent=366"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.the-analytics.club\/wp-json\/wp\/v2\/categories?post=366"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.the-analytics.club\/wp-json\/wp\/v2\/tags?post=366"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}

More examples of using NumPy to speed up calculations<\/h2>\n\n\n\n

NumPy is used heavily for numerical computation. If you\u2019re working with colossal datasets, vectorization and the use of NumPy are unavoidable.<\/p>\n\n\n\n

Most machine learning<\/a> libraries use NumPy under the hood to optimize algorithms. If you\u2019ve ever created a scikit-learn model, you\u2019ve used NumPy already.<\/p>\n\n\n\n

Here are some more examples you\u2019d frequently use when dealing with extensive numerical data.<\/p>\n\n\n\n
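
As one possible illustration (the temperature data and variable names are invented for this sketch, not taken from the original post), here are three common vectorized patterns: element-wise arithmetic, an aggregate, and boolean masking.

```python
import numpy as np

temps_c = np.array([21.5, 19.0, 25.0, 30.5])

temps_f = temps_c * 9 / 5 + 32        # element-wise arithmetic, no loop
average = temps_c.mean()              # aggregate over the whole array
warm_days = temps_c[temps_c > 24]     # boolean mask keeps matching values

print(temps_f)
print(average)
print(warm_days)
```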

Let\u2019s run the program<\/a> and see what we get. The output may look like the one below.<\/p>\n\n\n\n