Abstract Fork-join queues are natural models for various computer and communications systems that involve parallel multitasking and the splitting and resynchronizing of data, such as parallel computing, query processing in distributed databases, and parallel disk access. Job response time in a fork-join queue is a critical performance indicator but its exact analysis is challenging. We introduce a stochastic model for K-node homogeneous fork-join queues (K≥2) that focuses on the difference in length between any node-queue and the shortest one, truncating the state space such that the maximum difference is at most a constant C. Whilst most previous methods focus on the mean response time, our model is also able to evaluate the response time distribution, as well as accommodating phase-type processing times and Markovian arrival processes. In order to tackle scenarios with high loads, which require a large value of C to provide sufficient accuracy, we develop an efficient algorithm using matrix-analytic methods. Tests against simulation show that the proposed model yields accurate results for 2-node fork-join queues. As the model becomes numerically intractable for large values of K, we further propose an approximate approach, based on properties of order statistics and extreme values. The approximation gives a high degree of accuracy on response time tails, and has the advantage of being efficient and scalable, requiring only the analytical results for a single-node and 2-node fork-join queues, which we obtain with the aforementioned matrix-analytic model. Comparison with simulation results shows that our approximation yields good fits for the tails, even in very large cases with general processing and inter-arrival times.