
Parallelization in Python

In this article, we look at how to parallelize typical logic with Python's multiprocessing module.


Mansoor Ahmed

3 months ago | 2 min read

Introduction

Parallelization in Python lets the developer run several parts of a program at the same time, with the aim of reducing the total processing time. The multiprocessing module is used to run independent parallel processes.

These are run as sub-processes rather than threads, which lets us make use of multiple processors on a machine. On both Windows and Unix, the processes run in completely separate memory locations.


Description

To get the full speed advantage from parallelization in a Python program, we must divide the program into separate chunks of work. Each chunk can then be handed to a different thread or process. We also need to structure the application so that the parallel tasks don't duplicate each other's work and don't contend for shared resources such as memory and input/output channels.
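As a minimal sketch of this idea, the snippet below splits a list of numbers into independent chunks and hands each chunk to its own worker process. The helper names split_into_chunks, sum_of_squares and parallel_sum_of_squares, as well as the choice of two workers, are illustrative assumptions and not part of the original article.

```python
import multiprocessing as mp


def split_into_chunks(items, n_chunks):
    """Split items into n_chunks contiguous pieces of nearly equal size."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks get one additional element.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def sum_of_squares(chunk):
    """Work on one chunk only; no state is shared with other workers."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(items, workers=2):
    """Process each chunk in a separate worker and combine the partial results."""
    chunks = split_into_chunks(items, workers)
    with mp.Pool(workers) as pool:
        partial_sums = pool.map(sum_of_squares, chunks)
    return sum(partial_sums)


if __name__ == "__main__":
    numbers = list(range(10))
    print(split_into_chunks(numbers, 3))
    print(parallel_sum_of_squares(numbers))
```

Because every chunk is disjoint and each worker only returns a value, the tasks neither repeat work nor compete for a shared resource; the only coordination is the final sum in the parent process.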

Multiprocessing for parallel processing

Simple tasks can be parallelized by creating child processes with the standard multiprocessing module. This module provides an easy-to-use interface and includes a set of utilities for handling task submission and synchronization.

Process

We can create a process that runs independently by subclassing multiprocessing.Process. We may initialize resources by extending the __init__ method, and we write the code for the sub-process by implementing the Process.run() method. The following code shows how to create a process that prints its assigned id.

We need to initialize the Process object and call the Process.start() method to spawn the process. Process.start() creates a new process, which in turn calls the Process.run() method.

The code after p.start() is executed immediately, without waiting for process p to finish its task. We can use Process.join() to wait for the job to complete.

 

Complete Code

import multiprocessing
import time

class Process(multiprocessing.Process):
    def __init__(self, id):
        super(Process, self).__init__()
        self.id = id

    def run(self):
        # Runs in the child process once start() has been called.
        time.sleep(1)
        print("I'm the process with id: {}".format(self.id))

if __name__ == '__main__':
    p = Process(0)
    p.start()  # spawn the child process, which calls run()
    p.join()   # wait for the child to finish

    p = Process(1)
    p.start()
    p.join()
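Note that the code above joins the first process before starting the second, so the two processes run one after the other. A minimal sketch of running them side by side is to start every process first and only then wait for them. The Worker class, the run_workers helper and the use of a multiprocessing.Queue to collect the ids are assumptions made for this illustration.

```python
import multiprocessing
import time


class Worker(multiprocessing.Process):
    def __init__(self, worker_id, results):
        super().__init__()
        self.worker_id = worker_id
        self.results = results  # queue used to send the id back to the parent

    def run(self):
        time.sleep(0.5)
        self.results.put(self.worker_id)


def run_workers(n):
    """Start n workers at once, collect their ids, then wait for all of them."""
    results = multiprocessing.Queue()
    workers = [Worker(i, results) for i in range(n)]
    for w in workers:
        w.start()  # returns immediately, so all workers sleep concurrently
    # Read the queue before joining so no child blocks on an unread item.
    ids = [results.get() for _ in workers]
    for w in workers:
        w.join()
    return sorted(ids)


if __name__ == "__main__":
    started = time.time()
    print(run_workers(3))
    print("elapsed: {:.2f}s".format(time.time() - started))
```

On a machine with several cores the elapsed time should be close to the duration of a single sleep rather than three of them, although the exact figure depends on process start-up cost.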

Maximum parallel processes we can run

The number of processes that can truly run at the same time is limited by the number of processors in the computer. The cpu_count() function in multiprocessing returns how many processors the machine has.

import multiprocessing as mp

print("Number of processors: ", mp.cpu_count())
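One possible use of this number is to decide how many worker processes to start. The sketch below is only a suggestion: the helper name choose_worker_count and the idea of leaving one processor free for the rest of the system are assumptions, not a rule from the multiprocessing module.

```python
import multiprocessing as mp


def choose_worker_count(n_tasks, reserve=1):
    """Pick a worker count no larger than the task count or the free processors."""
    # Keep `reserve` processors free, but always allow at least one worker.
    available = max(1, mp.cpu_count() - reserve)
    return max(1, min(n_tasks, available))


if __name__ == "__main__":
    print("Workers for 100 tasks:", choose_worker_count(100))
```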

Synchronous and Asynchronous execution

There are two types of execution in parallel processing:

  1. Synchronous execution

In synchronous execution, the processes are completed in the same order in which they were started. This is achieved by locking the main program until the respective processes are finished.

  2. Asynchronous execution

Asynchronous execution doesn't involve locking. Consequently, the order of the results may get mixed up, but the work typically gets done faster. There are two main objects in multiprocessing for running a function in parallel: the Pool class and the Process class.
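The difference in result order can be sketched with a Pool. Here Pool.map() stands in for the synchronous style, since it blocks and returns results in input order, while Pool.imap_unordered() is used as one example of the asynchronous style, yielding results as workers finish. The function names and the artificial delays are assumptions chosen for this illustration.

```python
import multiprocessing as mp
import time


def slow_square(x):
    # Smaller inputs sleep longer, so they tend to finish last.
    time.sleep(max(0.0, 0.05 * (3 - x)))
    return x * x


def run_ordered(values, workers=3):
    """Blocks until every task is done; results follow the input order."""
    with mp.Pool(workers) as pool:
        return pool.map(slow_square, values)


def run_unordered(values, workers=3):
    """Yields results as they complete; the order is not guaranteed."""
    with mp.Pool(workers) as pool:
        return list(pool.imap_unordered(slow_square, values))


if __name__ == "__main__":
    print("ordered:  ", run_ordered([0, 1, 2]))
    print("unordered:", run_unordered([0, 1, 2]))
```

The ordered call always returns the squares in input order, whereas the unordered call contains the same values in whatever order the workers happened to finish.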

Parallelization by using Pool.map()

  • map() takes only one iterable as an argument.
  • To work with this, we give the minimum and maximum parameters default values and create a new howmany_within_range_rowonly() function.
  • This function takes only an iterable list of rows as input.
  • This is not an ideal use case for map().
  • However, it clearly shows how map() differs from apply().

# Parallelizing using Pool.map()
import multiprocessing as mp

# Redefine, with only 1 mandatory argument.
def howmany_within_range_rowonly(row, minimum=4, maximum=8):
    count = 0
    for n in row:
        if minimum <= n <= maximum:
            count = count + 1
    return count

if __name__ == '__main__':
    # `data` is assumed to be a list of rows (each a list of numbers) defined beforehand.
    pool = mp.Pool(mp.cpu_count())
    results = pool.map(howmany_within_range_rowonly, data)
    pool.close()

    print(results[:10])
    #> [3, 1, 4, 4, 4, 2, 1, 1, 3, 3]
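If the bounds need to vary, one alternative to hard-coded defaults is to bind them with functools.partial before calling map(). This is a sketch under stated assumptions: the names howmany_within_range and count_rows_in_parallel and the small data list are made up for illustration and are not the data behind the output shown above.

```python
import multiprocessing as mp
from functools import partial


def howmany_within_range(row, minimum, maximum):
    """Count the numbers in row that lie between minimum and maximum."""
    return sum(1 for n in row if minimum <= n <= maximum)


def count_rows_in_parallel(rows, minimum=4, maximum=8, workers=2):
    # partial() fixes the extra arguments, leaving a one-argument function
    # that map() can apply to each row of the single iterable it accepts.
    count_row = partial(howmany_within_range, minimum=minimum, maximum=maximum)
    with mp.Pool(workers) as pool:
        return pool.map(count_row, rows)


if __name__ == "__main__":
    data = [[1, 4, 9, 6], [5, 5, 2, 8], [0, 3, 10, 7]]  # hypothetical rows
    print(count_rows_in_parallel(data))
    print(count_rows_in_parallel(data, minimum=0, maximum=3))
```

The worker function is defined at module level so that it can be pickled and sent to the worker processes.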


For more details visit:
https://www.technologiesinindustry4.com/2021/12/parallelization-in-python.html

Created by

Mansoor Ahmed


Technologies in industry 4.0

Chemical Engineer, web developer and Tech writer

