Concurrency in Python - Multiprocessing
In this chapter, we will focus more on the comparison between multiprocessing and multithreading.
Multiprocessing
It is the use of two or more CPU units within a single computer system. It is the best approach for getting the full potential from our hardware, because it utilizes all the CPU cores available in the computer system.
Multithreading
It is the ability of a CPU, with operating system support, to execute multiple threads of a single process concurrently. The main idea of multithreading is to achieve parallelism by dividing a process into multiple threads.
The following table shows some of the important differences between multiprocessing and multiprogramming −
Multiprocessing | Multiprogramming |
---|---|
Multiprocessing refers to the processing of multiple processes at the same time by multiple CPUs. | Multiprogramming keeps several programs in main memory at the same time and executes them concurrently using a single CPU. |
It utilizes multiple CPUs. | It utilizes a single CPU. |
It permits parallel processing. | Context switching takes place. |
Jobs take less time to process. | Jobs take more time to process. |
It facilitates much more efficient utilization of the devices of the computer system. | It is less efficient than multiprocessing. |
It is usually more expensive. | Such systems are less expensive. |
Eliminating impact of global interpreter lock (GIL)
While working with concurrent applications, there is a limitation present in Python called the GIL (Global Interpreter Lock). In CPython, the GIL does not allow threads to run Python code on multiple CPU cores at the same time, which is why Python threads are often said not to be truly parallel. The GIL is a mutex (mutual exclusion lock) that keeps the interpreter's internal state thread safe. In other words, the GIL prevents multiple threads from executing Python code in parallel. The lock can be held by only one thread at a time, and a thread must acquire it before it can execute.
With the use of multiprocessing, we can effectively bypass the limitation caused by the GIL −
- By using multiprocessing, we utilize multiple processes, and each process has its own interpreter and therefore its own instance of the GIL.
- Due to this, the restriction that only one thread may execute bytecode at any one time applies within each process, not across the program as a whole.
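As a rough illustration of the two points above, the following sketch hands the same CPU-bound function to several worker processes, each of which is a separate interpreter with its own GIL. The function names `count_primes` and `count_primes_with_pid` and the input sizes are illustrative assumptions, not part of the original chapter.

```python
import multiprocessing
import os


def count_primes(limit):
    """CPU-bound work: count the primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def count_primes_with_pid(limit):
    """Return the PID of the process doing the work together with the result."""
    return os.getpid(), count_primes(limit)


if __name__ == "__main__":
    limits = [20000, 20000, 20000, 20000]
    # Each worker process runs its share of the tasks under its own GIL.
    with multiprocessing.Pool(processes=4) as pool:
        results = pool.map(count_primes_with_pid, limits)
    for pid, total in results:
        print("worker PID", pid, "found", total, "primes")
```

The printed PIDs show which worker handled each task; a PID may repeat if one worker picks up more than one task.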
Starting Processes in Python
The following three methods can be used to start a process in Python within the multiprocessing module −
- Fork
- Spawn
- Forkserver
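Which of these start methods are available depends on the platform. A minimal sketch for checking this at run time, assuming only the standard `multiprocessing` functions `get_all_start_methods()` and `get_start_method()` (the helper name `available_start_methods` is made up for this example):

```python
import multiprocessing


def available_start_methods():
    """Return the start methods supported on the current platform."""
    return multiprocessing.get_all_start_methods()


if __name__ == "__main__":
    print("Supported start methods:", available_start_methods())
    print("Default start method:", multiprocessing.get_start_method())
```

The result varies between operating systems and Python versions, so no fixed output is shown here.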
Creating a process with Fork
fork is a standard system call found on UNIX systems. It is used to create new processes called child processes. A child process runs concurrently with the process that created it, called the parent process. The child is a copy of its parent and inherits the resources available to the parent. The following system calls are used while creating a process with Fork −
- fork() − It is a system call generally implemented in the kernel. It is used to create a copy of the process.
- getpid() − This system call returns the process ID (PID) of the calling process.
Example
The following Python script example will help you understand how to create a new child process and get the PIDs of the child and parent processes −

    import os

    def child():
        # os.fork() returns the child's PID in the parent and 0 in the child.
        n = os.fork()
        if n > 0:
            print("PID of Parent process is : ", os.getpid())
        else:
            print("PID of Child process is : ", os.getpid())

    child()

Output

    PID of Parent process is : 25989
    PID of Child process is : 25990
Creating a process with Spawn
Spawn means to start something new. Hence, spawning a process means the creation of a new process by a parent process. The parent process continues its execution asynchronously or waits until the child process ends its execution. Follow these steps for spawning a process −
- Import the multiprocessing module.
- Create the Process object.
- Start the process activity by calling the start() method.
- Wait until the process has finished its work and exited by calling the join() method.
Example
The following example of a Python script helps in spawning three processes −

    import multiprocessing

    def spawn_process(i):
        print("This is process: %s" % i)
        return

    if __name__ == "__main__":
        Process_jobs = []
        for i in range(3):
            p = multiprocessing.Process(target=spawn_process, args=(i,))
            Process_jobs.append(p)
            p.start()
            # Joining inside the loop runs the processes one after another,
            # which keeps the output in order.
            p.join()

Output

    This is process: 0
    This is process: 1
    This is process: 2
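The example above uses whatever start method is the default on the platform. If the spawn method is wanted specifically, one way is to request a context with `multiprocessing.get_context("spawn")`. The sketch below assumes that approach; the helper name `describe` is made up for this example.

```python
import multiprocessing


def describe(i):
    """Build the message that a child process prints."""
    return "This is process: %s" % i


def spawn_process(i):
    print(describe(i))


if __name__ == "__main__":
    # Ask explicitly for the "spawn" start method instead of the platform default.
    ctx = multiprocessing.get_context("spawn")
    jobs = [ctx.Process(target=spawn_process, args=(i,)) for i in range(3)]
    for p in jobs:
        p.start()
    for p in jobs:
        p.join()
    print("exit codes:", [p.exitcode for p in jobs])
```

Because all three processes are started before any of them is joined, the order of the three messages is not guaranteed.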
Creating a process with Forkserver
The Forkserver mechanism is available only on those UNIX platforms that support passing file descriptors over Unix pipes. Consider the following points to understand the working of the Forkserver mechanism −
- When the Forkserver mechanism is used for starting new processes, a server process is started.
- The server then receives commands and handles all the requests for creating new processes.
- To create a new process, our Python program sends a request to the Forkserver, which creates the process for us.
- Finally, we can use this newly created process in our programs.
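The points above can be sketched as follows. The example asks for the forkserver start method where the platform offers it and otherwise falls back to spawn; the helper names `pick_start_method` and `greet` are assumptions made for illustration.

```python
import multiprocessing


def pick_start_method(preferred="forkserver"):
    """Return `preferred` if the platform supports it, otherwise 'spawn'."""
    if preferred in multiprocessing.get_all_start_methods():
        return preferred
    return "spawn"


def greet():
    # Runs in the child process created on our behalf.
    print("Hello from", multiprocessing.current_process().name)


if __name__ == "__main__":
    method = pick_start_method()
    print("Using start method:", method)
    ctx = multiprocessing.get_context(method)
    p = ctx.Process(target=greet)
    p.start()
    p.join()
```

The request to the server process is made inside `start()`; the calling code looks the same as for the other start methods.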
Daemon processes in Python
Python's multiprocessing module allows us to have daemon processes through the daemon attribute of a process. Daemon processes, that is, processes running in the background, follow a concept similar to daemon threads. To execute a process in the background, we need to set its daemon flag to True before starting it. The daemon process will continue to run as long as the main process is executing, and it will terminate after finishing its own work or when the main program exits or is killed.
Example
Here, we are using the same example as used for daemon threads. The only differences are the change of module from threading to multiprocessing and the setting of the daemon flag to True. However, there is a change in the output, as shown below −

    import multiprocessing
    import time

    def nondaemonProcess():
        print("starting my Process")
        time.sleep(8)
        print("ending my Process")

    def daemonProcess():
        while True:
            print("Hello")
            time.sleep(2)

    if __name__ == "__main__":
        # Use separate names for the Process objects so that they do not
        # shadow the target functions.
        nondaemon = multiprocessing.Process(target=nondaemonProcess)
        daemon = multiprocessing.Process(target=daemonProcess)
        daemon.daemon = True
        nondaemon.daemon = False
        daemon.start()
        nondaemon.start()

Output

    starting my Process
    ending my Process

The output differs from the one generated by daemon threads because only the process running in non-daemon mode is allowed to finish its work. The daemonic process is terminated automatically when the main program ends, so its "Hello" output is usually not seen; this avoids leaving background processes running after the program has finished.
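The daemon flag can also be passed to the Process constructor through its `daemon` keyword argument. The following is a minimal sketch under that assumption; the names `heartbeat` and `make_daemon` are made up for this example.

```python
import multiprocessing
import time


def heartbeat(interval):
    # Loops forever; it only stops when the process is terminated.
    while True:
        time.sleep(interval)


def make_daemon(target, *args):
    """Create (but do not start) a daemonic process for `target`."""
    return multiprocessing.Process(target=target, args=args, daemon=True)


if __name__ == "__main__":
    p = make_daemon(heartbeat, 0.5)
    p.start()
    print("daemon flag:", p.daemon)
    p.join(timeout=1)  # wait briefly; the loop never ends on its own
    print("still alive after join timeout:", p.is_alive())
    # No explicit terminate(): the daemon is terminated when the main program exits.
```

Without the timeout, `join()` would wait indefinitely here, because the daemon's loop never returns.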
Terminating processes in Python
We can kill or terminate a process immediately by using the terminate() method. Here, we use this method to terminate a child process, created from a function, before it completes its execution.
Example
    import multiprocessing
    import time

    def Child_process():
        print("Starting function")
        time.sleep(5)
        print("Finished function")

    if __name__ == "__main__":
        P = multiprocessing.Process(target=Child_process)
        P.start()
        print("My Process has terminated, terminating main thread")
        print("Terminating Child Process")
        # Stop the child at once, without waiting for its sleep to finish.
        P.terminate()
        print("Child Process successfully terminated")

Output

    My Process has terminated, terminating main thread
    Terminating Child Process
    Child Process successfully terminated

The output shows that the main program finishes before the child process, created from the Child_process() function, completes its execution; the child never prints "Finished function". This implies that the child process has been terminated successfully. Depending on timing, "Starting function" may also appear if the child starts running before it is terminated.
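To confirm a termination rather than infer it from the output, the process object can be inspected after `terminate()` and `join()`. This sketch uses the standard `is_alive()` method and `exitcode` attribute; the helper names `sleeper` and `terminate_and_report` are illustrative assumptions.

```python
import multiprocessing
import time


def sleeper():
    time.sleep(30)


def terminate_and_report(process):
    """Terminate `process`, wait for it to exit, and return (alive, exitcode)."""
    process.terminate()
    process.join()
    return process.is_alive(), process.exitcode


if __name__ == "__main__":
    p = multiprocessing.Process(target=sleeper)
    p.start()
    alive, exitcode = terminate_and_report(p)
    print("alive:", alive)
    print("exitcode:", exitcode)
```

A negative exit code -N indicates that the child was terminated by signal N.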
Identifying the current process in Python
Every process in the operating system has a process identity known as its PID. In Python, we can find out the PID of the current process with the help of the following commands −

    import multiprocessing
    print(multiprocessing.current_process().pid)
Example
The following example of a Python script helps find out the PID of the main process as well as the PID of a child process −

    import multiprocessing

    def Child_process():
        print("PID of Child Process is: {}".format(multiprocessing.current_process().pid))

    if __name__ == "__main__":
        print("PID of Main process is: {}".format(multiprocessing.current_process().pid))
        P = multiprocessing.Process(target=Child_process)
        P.start()
        P.join()

Output

    PID of Main process is: 9401
    PID of Child Process is: 9402
Using a process in subclass
We can create threads by sub-classing the threading.Thread class. Similarly, we can create processes by sub-classing the multiprocessing.Process class. For using a process in a subclass, we need to consider the following points −
- We need to define a new subclass of the Process class.
- If we override the __init__(self [, args]) method, it must call the base class constructor before doing anything else.
- We need to override the run(self [, args]) method to implement what the process does when it is started.
- We need to start the process by invoking the start() method.
Example
    import multiprocessing

    class MyProcess(multiprocessing.Process):
        def run(self):
            # run() is executed in the child process after start() is called.
            print("called run method in process: %s" % self.name)
            return

    if __name__ == "__main__":
        jobs = []
        for i in range(5):
            P = MyProcess()
            jobs.append(P)
            P.start()
            P.join()

Output

    called run method in process: MyProcess-1
    called run method in process: MyProcess-2
    called run method in process: MyProcess-3
    called run method in process: MyProcess-4
    called run method in process: MyProcess-5
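The example above overrides only run(). A subclass that also overrides `__init__()` to accept arguments might look like the following sketch; the class name `SquareProcess` and the use of a queue to send results back to the parent are assumptions made for illustration.

```python
import multiprocessing


class SquareProcess(multiprocessing.Process):
    def __init__(self, number, results):
        # Call the base class constructor before doing anything else.
        super().__init__()
        self.number = number
        self.results = results

    def run(self):
        # run() executes in the child process once start() is called.
        self.results.put((self.number, self.number ** 2))


if __name__ == "__main__":
    results = multiprocessing.Queue()
    jobs = [SquareProcess(n, results) for n in range(3)]
    for job in jobs:
        job.start()
    # Collect one result per process before joining, then sort them,
    # since the processes may finish in any order.
    squares = sorted(results.get() for _ in jobs)
    for job in jobs:
        job.join()
    print(squares)
```

A queue is used because attributes changed inside run() live in the child process and are not visible to the parent.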
Python Multiprocessing Module – Pool Class
For simple parallel processing tasks in our Python applications, the multiprocessing module provides the Pool class. The following methods of the Pool class can be used to run tasks on a number of child processes from within our main program −
apply() method
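This method calls a function with the given arguments in one of the pool's worker processes. It is comparable to calling the .submit() method of ThreadPoolExecutor and then waiting for the result straight away: it blocks until the result is ready.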
apply_async() method
When we need parallel execution of our tasks, we use the apply_async() method to submit tasks to the pool. It is an asynchronous operation: it returns a result object immediately and does not block the main thread while the child processes execute the tasks.
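The difference between the two methods can be sketched as follows. The function and helper names (`cube`, `run_blocking`, `run_async`) and the 10-second timeout are illustrative assumptions.

```python
import multiprocessing


def cube(n):
    return n ** 3


def run_blocking(pool, values):
    """apply(): each call waits for its result before the next one is submitted."""
    return [pool.apply(cube, (v,)) for v in values]


def run_async(pool, values):
    """apply_async(): submit every task first, then collect results with get()."""
    pending = [pool.apply_async(cube, (v,)) for v in values]
    return [result.get(timeout=10) for result in pending]


if __name__ == "__main__":
    with multiprocessing.Pool(processes=2) as pool:
        print("apply:      ", run_blocking(pool, [1, 2, 3]))
        print("apply_async:", run_async(pool, [1, 2, 3]))
```

Both helpers return the same values; only `run_async` lets the tasks run on several workers at the same time.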
map() method
Just like the apply() method, it also blocks until the result is ready. It is the parallel equivalent of the built-in map() function, although it accepts only one iterable: it splits the iterable data into a number of chunks and submits them to the process pool as separate tasks.
map_async() method
It is a variant of the map() method, just as apply_async() is a variant of the apply() method. It returns a result object. If a callback is supplied, that callable is applied to the result when it becomes ready. The callback must complete immediately; otherwise, the thread that handles the results will get blocked.
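A minimal sketch of map_async() with a callback is shown below. The helper name `squares_via_map_async` and the 10-second timeouts are assumptions made for illustration.

```python
import multiprocessing


def square(n):
    return n * n


def squares_via_map_async(pool, values):
    """Submit with map_async(), gather the results in a callback, then wait."""
    collected = []
    # The callback receives the complete list of results once it is ready.
    async_result = pool.map_async(square, values, callback=collected.extend)
    async_result.wait(timeout=10)
    return async_result.get(timeout=10), collected


if __name__ == "__main__":
    with multiprocessing.Pool(processes=2) as pool:
        results, from_callback = squares_via_map_async(pool, range(5))
    print("get():   ", results)
    print("callback:", from_callback)
```

The callback here only extends a list, in keeping with the requirement that it return quickly.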
Example
The following example will help you implement a process pool for performing parallel execution. A simple calculation of the square of a number is performed by applying the square() function through a multiprocessing.Pool. The pool's map() method is used to submit the five inputs, because the input is a list of integers from 0 to 4. The result is stored in p_outputs and then printed.

    import multiprocessing

    def square(n):
        result = n * n
        return result

    if __name__ == "__main__":
        inputs = list(range(5))
        p = multiprocessing.Pool(processes=4)
        # map() blocks until every input has been processed.
        p_outputs = p.map(square, inputs)
        p.close()
        p.join()
        print("Pool :", p_outputs)

Output

    Pool : [0, 1, 4, 9, 16]