Python - Generators
25. How do generators contribute to the efficient processing of big data sets in Python?
Generators contribute to the efficient processing of big data sets in Python by enabling lazy evaluation, so the entire dataset never has to be loaded into memory at once. This keeps memory consumption low and lets processing begin as soon as the first items are available. Let's explore an example to illustrate this:
# Function to read a large file line by line using a generator
def read_large_file(file_path):
    with open(file_path, 'r') as file:
        for line in file:
            yield line.strip()

# Function to filter lines containing a specific keyword
def filter_lines(lines, keyword):
    return (line for line in lines if keyword in line)

# Example: Processing a large log file
log_file_path = 'large_log_file.txt'

# Reading and filtering lines using generators
lines_generator = read_large_file(log_file_path)
filtered_lines_generator = filter_lines(lines_generator, 'error')

# Displaying the first 5 lines containing the keyword 'error'
first_5_error_lines = [next(filtered_lines_generator) for _ in range(5)]
print(first_5_error_lines)
Output:

['2022-01-01 12:01:30 - ERROR: Invalid input', '2022-01-01 12:05:45 - ERROR: Connection failed', '2022-01-01 12:10:20 - ERROR: File not found', '2022-01-01 12:15:12 - ERROR: Server timeout', '2022-01-01 12:20:05 - ERROR: Database error']
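Note that calling next() on an exhausted generator raises StopIteration, so the list comprehension above assumes the file contains at least five matching lines. If that is not guaranteed, itertools.islice is a safer way to take the first few items; a minimal sketch using the same hypothetical log file:

from itertools import islice

# islice stops gracefully if fewer than 5 matching lines exist
first_5_error_lines = list(islice(filter_lines(read_large_file(log_file_path), 'error'), 5))
print(first_5_error_lines)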
In this example, the read_large_file function generates lines from the large log file lazily, one at a time. The filter_lines function takes a generator of lines and returns another generator that yields only the lines containing a specific keyword ('error'). Chaining these generators allows large log files to be processed efficiently while keeping memory usage low, because lines are read and filtered on demand rather than loaded all at once.
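To see the memory effect directly, here is a small sketch comparing a fully materialized list with an equivalent generator expression; the exact byte counts are illustrative and vary by Python version and platform:

import sys

# List comprehension: materializes one million squares in memory
numbers_list = [n * n for n in range(1_000_000)]

# Generator expression: produces values on demand, nothing materialized yet
numbers_gen = (n * n for n in range(1_000_000))

print(sys.getsizeof(numbers_list))  # several megabytes
print(sys.getsizeof(numbers_gen))   # a couple of hundred bytes

The generator object occupies a small, near-constant amount of memory regardless of how many values it will eventually produce, which is exactly the property the log-file example relies on.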