Fork–join queue

Type of queue


title: "Fork–join queue" type: doc version: 1 created: 2026-02-28 author: "Wikipedia contributors" status: active scope: public tags: ["single-queueing-nodes"] description: "Type of queue" topic_path: "general/single-queueing-nodes" source: "https://en.wikipedia.org/wiki/Fork–join_queue" license: "CC BY-SA 4.0" wikipedia_page_id: 0 wikipedia_revision_id: 0

::summary Type of queue ::

::figure[src="https://upload.wikimedia.org/wikipedia/commons/f/fb/Fork-join-queue.svg" caption="A fork–join queueing node"] ::

In queueing theory, a discipline within the mathematical theory of probability, a fork–join queue is a queue where incoming jobs are split on arrival for service by numerous servers and joined before departure. The model is often used for parallel computations or systems where products need to be obtained simultaneously from different suppliers (in a warehouse or manufacturing setting). The key quantity of interest in this model is usually the time taken to service a complete job. The model has been described as a "key model for the performance analysis of parallel and distributed systems." Few analytical results exist for fork–join queues, but various approximations are known.

The situation where jobs arrive according to a Poisson process and service times are exponentially distributed is sometimes referred to as a Flatto–Hahn–Wright model or FHW model.

Definition

On arrival at the fork point, a job is split into N sub-jobs, one for each of the N servers. After service, each sub-job waits until all other sub-jobs have also been processed. The sub-jobs are then rejoined and leave the system.
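The mechanics described above can be sketched as a small simulation. This is a minimal sketch, not a reference implementation: the function name `simulate_fork_join` and its parameters are hypothetical, and it assumes Poisson arrivals, exponential service at a common rate, and first-come first-served queues at every server, none of which the definition itself requires.

```python
import random


def simulate_fork_join(arrival_rate, service_rate, n_servers, n_jobs, seed=0):
    """Estimate the mean response time of a fork-join queue by simulation.

    Assumptions (illustrative only): Poisson arrivals, exponential service
    with the same rate at every server, FCFS order at each server.
    """
    rng = random.Random(seed)
    arrival = 0.0
    # Time at which each server finishes all work assigned to it so far.
    free_at = [0.0] * n_servers
    total_response = 0.0
    for _ in range(n_jobs):
        arrival += rng.expovariate(arrival_rate)
        finish_times = []
        for i in range(n_servers):
            # Fork: the sub-job starts once server i has cleared earlier work.
            start = max(arrival, free_at[i])
            free_at[i] = start + rng.expovariate(service_rate)
            finish_times.append(free_at[i])
        # Join: the job departs when its slowest sub-job has finished.
        total_response += max(finish_times) - arrival
    return total_response / n_jobs


if __name__ == "__main__":
    for n in (1, 2, 4):
        print(n, round(simulate_fork_join(0.5, 1.0, n, 200_000), 3))
```

With a single server the sketch reduces to an ordinary M/M/1 queue, which gives a quick sanity check on the estimate.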

For the fork–join queue to be stable, the input rate must be strictly less than the sum of the service rates at the service nodes.

Applications

Fork–join queues have been used to model zoned RAID systems, parallel computations and for modelling order fulfilment in warehouses.

Response time

The response time (or sojourn time) is the total amount of time a job spends in the system.

Distribution

Ko and Serfozo give an approximation for the response time distribution when service times are exponentially distributed and jobs arrive either according to a Poisson process or a general distribution. Qiu, Pérez and Harrison give an approximation method when service times have a phase-type distribution.

Average response time

An exact formula for the average response time is known only in the case of two servers (N=2) with exponentially distributed service times (where each server is an M/M/1 queue). In this situation, the mean response time (total time a job spends in the system) is

$$\frac{12-\rho}{8\mu(1-\rho)}$$

where

  • $\rho=\lambda/\mu$ is the utilization.
  • $\lambda$ is the arrival rate of jobs to all the nodes.
  • $\mu$ is the service rate across all the nodes.

In the situation where nodes are M/M/1 queues and N > 2, Varki's modification of mean value analysis can also be used to give an approximate value for the average response time.
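The two-server formula can be evaluated directly. The sketch below uses hypothetical function names and compares the result with the mean response time 1/(μ − λ) of a single M/M/1 queue at the same rates, which is a standard result rather than something stated in this article.

```python
def fork_join_mean_response_time_n2(arrival_rate, service_rate):
    """Mean response time of the two-server fork-join queue (N = 2)."""
    rho = arrival_rate / service_rate
    if not 0 <= rho < 1:
        raise ValueError("utilization must satisfy 0 <= rho < 1")
    return (12 - rho) / (8 * service_rate * (1 - rho))


def mm1_mean_response_time(arrival_rate, service_rate):
    """Mean response time of a single M/M/1 queue, for comparison."""
    if arrival_rate >= service_rate:
        raise ValueError("arrival rate must be below the service rate")
    return 1.0 / (service_rate - arrival_rate)


if __name__ == "__main__":
    lam, mu = 0.5, 1.0
    print(fork_join_mean_response_time_n2(lam, mu))  # 2.875
    print(mm1_mean_response_time(lam, mu))           # 2.0
```

The ratio of the two values is (12 − ρ)/8, so under these assumptions the cost of synchronising two servers lies between 1.375 and 1.5 times the single-queue response time.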

For general service times (where each node is an M/G/1 queue) Baccelli and Makowski give bounds for the average response time and higher moments of this quantity in both the transient and steady-state situations. Kemper and Mandjes show that for some parameters these bounds are not tight and demonstrate an approximation technique. For heterogeneous fork–join queues (fork–join queues with different service times), Alomari and Menasce propose an approximation based on harmonic numbers that can be extended to cover more general cases such as probabilistic fork, open and closed fork–join queues.

Subtask dispersion

The subtask dispersion, defined to be the range of service times, can be numerically computed and optimal deterministic delays introduced to minimize the range.

Stationary distribution

In general the stationary distribution of the number of jobs at each queue is intractable. Flatto considered the case of two servers (N=2) and derived the stationary distribution for the number of jobs at each queue via uniformization techniques. Pinotsi and Zazanis show that a product form solution exists when arrivals are deterministic as the queue lengths are then independent D/M/1 queues.

Heavy traffic/diffusion approximation

When the server is heavily loaded (service rate of the queue is only just larger than arrival rate) the queue length process can be approximated by a reflected Brownian motion which converges to the same stationary distribution as the original queueing process. Under limiting conditions the state space of the synchronisation queues collapses and all queues behave identically.

Join queue distribution

Once jobs are served, the parts are reassembled at the join queue. Nelson and Tantawi published the distribution of the join queue length in the situation where all servers have the same service rate. Li and Zhao consider heterogeneous service rates and the asymptotic behaviour of the distribution.

Networks of fork–join queues

An approximate formula can be used to calculate the response time distribution for a network of fork–join queues joined in series (one after the other).

Split–merge model

A related model is the split–merge model, for which analytical results exist. In this model a job is split on arrival into N sub-tasks which are serviced in parallel, and the next job can start only when all of these sub-tasks have finished service and rejoined. This leads to a longer response time on average. Exact results for the split–merge queue are given by Fiorini and Lipsky.
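One simple case shows why the split–merge model is easier to analyse. This sketch is not taken from the cited work. It assumes Poisson arrivals and independent exponential sub-task times with a common rate. Under those assumptions, the whole system behaves as an M/G/1 queue whose service time is the maximum of N exponentials, so the Pollaczek–Khinchine formula gives the mean response time. The function name is hypothetical.

```python
def split_merge_mean_response_time(arrival_rate, service_rate, n_servers):
    """Mean response time of a split-merge queue under the stated assumptions.

    The service time S of a whole job is the maximum of n_servers independent
    exponential sub-task times, so
        E[S]   = (1 + 1/2 + ... + 1/N) / mu
        Var[S] = (1 + 1/4 + ... + 1/N^2) / mu^2
    """
    indices = range(1, n_servers + 1)
    mean_s = sum(1.0 / i for i in indices) / service_rate
    var_s = sum(1.0 / i**2 for i in indices) / service_rate**2
    second_moment = var_s + mean_s**2
    load = arrival_rate * mean_s
    if load >= 1:
        raise ValueError("unstable: arrival_rate * E[S] must be below 1")
    # Pollaczek-Khinchine mean waiting time for an M/G/1 queue.
    waiting = arrival_rate * second_moment / (2 * (1 - load))
    return mean_s + waiting


if __name__ == "__main__":
    print(split_merge_mean_response_time(0.5, 1.0, 1))  # 2.0
    print(split_merge_mean_response_time(0.5, 1.0, 2))  # 5.0
```

For λ = 0.5 and μ = 1 with two servers this gives 5.0, compared with 2.875 from the exact two-server fork–join formula, which illustrates the longer response time of the split–merge discipline.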

Generalized (n,k) fork–join system

A generalization of the fork–join queueing system is the (n,k) fork–join system, where the job exits the system when any k out of n tasks are served. The traditional fork–join queueing system is the special case of the (n,k) system with k = n. Bounds on the mean response time of this generalized system were found by Joshi, Liu and Soljanin.
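As a rough illustration of why waiting for only k of n tasks helps, consider a single job in an otherwise empty system. Assume, as an illustrative choice not made in the text above, that the n task service times are independent and exponentially distributed with a common rate μ. The time until the k-th task completes is then the k-th order statistic X_{(k:n)}, whose mean is

```latex
\mathbb{E}\left[X_{(k:n)}\right]
  = \frac{1}{\mu}\sum_{i=n-k+1}^{n}\frac{1}{i}
  = \frac{H_n - H_{n-k}}{\mu},
\qquad H_m = \sum_{i=1}^{m}\frac{1}{i},\quad H_0 = 0 .
```

With k = n this reduces to H_n/μ, the mean of the maximum of n exponentials. The expression ignores queueing entirely, so it is only the no-contention service time and not the mean response time that the cited bounds address.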

References

  1. (1989). "Analysis of the fork-join queue". IEEE Transactions on Computers.
  2. (2009). "Basics of Applied Stochastic Processes".
  3. Boxma, Onno. (1996). "Queueing-theoretic Solution Methods for Models of Parallel and Distributed Systems".
  4. Wright, Paul E. (1992). "Two parallel processors with coupled inputs". Advances in Applied Probability.
  5. (2005). "Synchronized queues with deterministic arrivals". Operations Research Letters.
  6. (September 1989). "Stationary and Stability of Fork-Join Networks". Journal of Applied Probability.
  7. (2009). "Computer Performance Engineering".
  8. (2008). "Sojourn times in G/M/1 fork-join networks". Naval Research Logistics.
  9. (2015). "Beyond the mean in fork-join queues: Efficient approximation for response-time tails". Performance Evaluation.
  10. (1988). "Approximate analysis of fork/join synchronization in parallel queues". IEEE Transactions on Computers.
  11. "M/M/1 Fork-join queue with variable sub-tasks".
  12. (1985). "Simple computable bounds for the fork-join queue". National Institute for Research in Computer Science and Control Technical Report.
  13. (2011). "Mean sojourn times in two-queue fork-join systems: Bounds and approximations". OR Spectrum.
  14. (2013). "Efficient Response Time Approximations for Multiclass Fork and Join Queues in Open and Closed Queuing Networks". IEEE Transactions on Parallel and Distributed Systems.
  15. (2013). "Computer Performance Engineering".
  16. (2004). "Response times in M/M/s fork-join networks". Advances in Applied Probability.
  17. (1984). "Two Parallel Queues Created by Arrivals with Two Demands I". SIAM Journal on Applied Mathematics.
  18. (1996). "A fork-join queueing model: Diffusion approximation, integral representations and asymptotics". Queueing Systems.
  19. Varma, Subir. (1990). "Heavy and Light Traffic Approximations for Queues with Synchronization Constraints (PhD thesis)". University of Maryland.
  20. (2012). "2012 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton)".
  21. (2010). "On the Probability Distribution of Join Queue Length in a Fork-Join Model". Probability in the Engineering and Informational Sciences.
  22. (2007). "Computational Science and Its Applications – ICCSA 2007".
  23. (June 2007). "Response Time Approximations in Fork-Join Queues".
  24. (2003). "Computer Performance Evaluation. Modelling Techniques and Tools".
  25. (2015). "Exact Analysis of Some Split Merge Queues". SIGMETRICS Performance Evaluation Review.
  26. (Oct 2012). "Coding for Fast Content Download".
  27. (May 2014). "On the Delay-Storage trade-off in Content Download from Coded Distributed Storage".

::callout[type=info title="Wikipedia Source"] This article was imported from Wikipedia and is available under the Creative Commons Attribution-ShareAlike 4.0 License. Content has been adapted to SurfDoc format. Original contributors can be found on the article history page. ::
