Waste Generated in Distributed Data Processing Systems: Strategies and Future Directions
DOI:
https://doi.org/10.57041/78vyv598
Keywords:
Big Data Processing, Distributed Data Processing, Waste Reduction, Distributed System
Abstract
Distributed Data Processing (DDP) is integral to modern cloud and edge computing environments and is a fundamental component of big data analytics, particularly real-time analytics. Various frameworks, such as Spark FSpark, spark flow, and Storm, are designed for effective distributed data processing. However, DDP generates significant waste in terms of energy consumption, resource allocation, and data transmission. Managing this waste efficiently is crucial for improving system performance and minimizing environmental impact. This paper explores the types of waste that arise in distributed systems and the strategies used to reduce them, focusing on energy-efficient scheduling, network optimization, and resource management techniques. The results identify several types of waste: energy waste, improper or inefficient utilization or distribution of computational resources, carbon footprint waste, inefficient execution time, storage of unnecessary or duplicate data, and inefficient data transmission. The results also indicate that carbon- and energy-aware scheduling, task offloading, and data deduplication show promise in minimizing waste. The contribution of this paper lies in identifying the types of waste and the key approaches to reducing them, and in offering insights into their practical application in distributed data systems. The study should help researchers extend this work by proposing additional effective solutions or by verifying the reported solutions in terms of waste minimization.
License
Copyright (c) 2025 https://paas-pk.org/index.php/pjosr/cr

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
