This document proposes a simulation-based machine learning approach to identifying straggler tasks in Hadoop MapReduce jobs. A straggler is a task that runs much longer than its peers; because a job finishes only when its last task does, stragglers significantly increase overall job completion time. Hadoop's existing mechanism detects stragglers through simple profiling, which does not account for hardware variations across nodes. The proposed approach instead obtains an expected completion time for each task through simulation and applies k-means clustering to group these times into two clusters, representing fast and slow tasks. Tasks in the slow cluster are identified as stragglers and targeted for speculative execution on alternative nodes, which has the potential to reduce job completion time.
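
The clustering step can be illustrated with a minimal sketch in Python. It is not the implementation behind the proposal: the task names and simulated completion times are made up for illustration, the helper `two_means_1d` is hypothetical, and seeding the two centroids at the smallest and largest times is an assumption of this sketch. It only shows how one-dimensional k-means with k = 2 separates completion times into a fast and a slow group.

```python
from statistics import mean


def two_means_1d(times, max_iter=100):
    """Cluster 1-D completion times with k-means (k=2).

    Returns (labels, centroids); label 0 marks the fast cluster,
    label 1 the slow cluster. Hypothetical helper for illustration.
    """
    if len(set(times)) < 2:
        raise ValueError("need at least two distinct completion times")

    # Assumption of this sketch: seed the centroids at the two extremes.
    fast_c, slow_c = float(min(times)), float(max(times))

    labels = []
    for _ in range(max_iter):
        # Assignment step: each task joins the nearer centroid.
        labels = [0 if abs(t - fast_c) <= abs(t - slow_c) else 1 for t in times]

        # Update step: each centroid moves to the mean of its members.
        new_fast = mean(t for t, lab in zip(times, labels) if lab == 0)
        new_slow = mean(t for t, lab in zip(times, labels) if lab == 1)

        # Stop once the centroids no longer move.
        if new_fast == fast_c and new_slow == slow_c:
            break
        fast_c, slow_c = new_fast, new_slow

    return labels, (fast_c, slow_c)


# Made-up simulated completion times in seconds (illustrative only).
simulated_times = {
    "task_0": 41.0,
    "task_1": 39.5,
    "task_2": 44.0,
    "task_3": 118.0,
    "task_4": 40.5,
    "task_5": 125.0,
}

labels, centroids = two_means_1d(list(simulated_times.values()))

# Tasks in the slow cluster are the candidates for speculative execution.
stragglers = [name for name, lab in zip(simulated_times, labels) if lab == 1]
print(stragglers)  # ['task_3', 'task_5']
```

With two clusters on a single feature, the boundary between fast and slow is the midpoint between the two centroids, so the threshold adapts to each job's own distribution of simulated times rather than being fixed in advance.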