ECE/CS 6960/5960 Fundamentals of Cloud Systems

Paper List

Part I: Cloud Computing

  • Coded Distributed Computing (Coded MapReduce)

    • Jeffrey Dean and Sanjay Ghemawat. 2008. MapReduce: simplified data processing on large clusters. Commun. ACM 51, 1 (January 2008), 107-113.

    • Zaharia, Matei, Mosharaf Chowdhury, Michael J. Franklin, Scott Shenker, and Ion Stoica. “Spark: Cluster computing with working sets.” In 2nd USENIX Workshop on Hot Topics in Cloud Computing (HotCloud), 2010.

    • Li, Songze, Mohammad Ali Maddah-Ali, and A. Salman Avestimehr. “Coded MapReduce.” In Communication, Control, and Computing (Allerton), 2015 53rd Annual Allerton Conference on, pp. 964-971. IEEE, 2015.

    • Li, Songze, Sucha Supittayapornpong, Mohammad Ali Maddah-Ali, and Salman Avestimehr. “Coded TeraSort.” In Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2017 IEEE International, pp. 389-398. IEEE, 2017. (Coded TeraSort implementations: here)

    • Li, Songze, Mohammad Ali Maddah-Ali, Qian Yu, and A. Salman Avestimehr. “A fundamental tradeoff between computation and communication in distributed computing.” IEEE Transactions on Information Theory 64, no. 1 (2018): 109-128.

    • Ji, Mingyue, Giuseppe Caire, and Andreas F. Molisch. “Fundamental limits of caching in wireless D2D networks.” IEEE Transactions on Information Theory 62, no. 2 (2016): 849-869.

    • Woolsey, Nicholas, Rong-Rong Chen and Mingyue Ji, “A New Combinatorial Design of Coded Distributed Computing,” 2018 IEEE International Symposium on Information Theory (ISIT), Vail, CO, 2018, pp. 726-730.

    • Li, Songze, Qian Yu, Mohammad Ali Maddah-Ali, and A. Salman Avestimehr. “A scalable framework for wireless distributed computing.” IEEE/ACM Transactions on Networking 25, no. 5 (2017): 2643-2654.

    • Li, Songze, Mohammad Ali Maddah-Ali, and A. Salman Avestimehr. “Coding for distributed fog computing.” IEEE Communications Magazine 55, no. 4 (2017): 34-40.

    • Reisizadeh, Amirhossein, Saurav Prakash, Ramtin Pedarsani, and Amir Salman Avestimehr. “Coded computation over heterogeneous clusters.” arXiv preprint arXiv:1701.05973 (2017).

    • Kiamari, Mehrdad, Chenwei Wang, and A. Salman Avestimehr. “On heterogeneous coded distributed computing.” In GLOBECOM 2017-2017 IEEE Global Communications Conference, pp. 1-7. IEEE, 2017.

    • Ezzeldin, Yahya H., Mohammed Karmoose, and Christina Fragouli. “Communication vs distributed computation: an alternative trade-off curve.” In Information Theory Workshop (ITW), 2017 IEEE, pp. 279-283. IEEE, 2017.

    • Konstantinidis, Konstantinos, and Aditya Ramamoorthy. “Leveraging Coding Techniques for Speeding up Distributed Computing.” arXiv preprint arXiv:1802.03049 (2018).

    • Prakash, Saurav, Amirhossein Reisizadeh, Ramtin Pedarsani, and Salman Avestimehr. “Coded Computing for Distributed Graph Analytics.” arXiv preprint arXiv:1801.05522 (2018).

    • Srinivasavaradhan, Sundara Rajan, Linqi Song, and Christina Fragouli. “Distributed Computing Trade-offs with Random Connectivity.” In 2018 IEEE International Symposium on Information Theory (ISIT), pp. 1281-1285. IEEE, 2018.

    • Song, Linqi, Sundara Rajan Srinivasavaradhan, and Christina Fragouli. “The benefit of being flexible in distributed computation.” In Information Theory Workshop (ITW), 2017 IEEE, pp. 289-293. IEEE, 2017.

    • Li, Songze, Mohammad Ali Maddah-Ali, and A. Salman Avestimehr. “Compressed Coded Distributed Computing.” arXiv preprint arXiv:1805.01993 (2018).

    • Yang, Yaoqing, Matteo Interlandi, Pulkit Grover, Soummya Kar, Saeed Amizadeh, and Markus Weimer. “Coded Elastic Computing.” arXiv preprint arXiv:1812.06411 (2018).

    • Woolsey, Nicholas, Rong-Rong Chen, and Mingyue Ji. “Cascaded Coded Distributed Computing on Heterogeneous Networks.” arXiv preprint arXiv:1901.07670 (2019).

  • Straggler Mitigation via Coding

    • Dean, Jeffrey, and Luiz André Barroso. “The tail at scale.” Communications of the ACM 56, no. 2 (2013): 74-80.

    • Weinberg, Jonathan. “Job Scheduling on Parallel Systems.” In Job Scheduling Strategies for Parallel Processing. 2002.

    • Harchol-Balter, Mor. “Task Assignment Policies for Server Farms.” Book chapter in Performance Modeling and Design of Computer Systems: Queueing Theory in Action. Cambridge University Press, 2013.

    • Zaharia, Matei, Mosharaf Chowdhury, Tathagata Das, Ankur Dave, Justin Ma, Murphy McCauley, Michael J. Franklin, Scott Shenker, and Ion Stoica. “Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing.” In Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation, pp. 2-2. USENIX Association, 2012.

    • Ananthanarayanan, Ganesh, Ali Ghodsi, Scott Shenker, and Ion Stoica. “Effective Straggler Mitigation: Attack of the Clones.” In NSDI, vol. 13, pp. 185-198. 2013.

    • Wang, Da, Gauri Joshi, and Gregory Wornell. “Using straggler replication to reduce latency in large-scale parallel computing.” ACM SIGMETRICS Performance Evaluation Review 43, no. 3 (2015): 7-11.

    • Lee, Kangwook, Maximilian Lam, Ramtin Pedarsani, Dimitris Papailiopoulos, and Kannan Ramchandran. “Speeding up distributed machine learning using codes.” IEEE Transactions on Information Theory 64, no. 3 (2018): 1514-1529.

    • Lee, Kangwook, Changho Suh, and Kannan Ramchandran. “High-dimensional coded matrix multiplication.” In Information Theory (ISIT), 2017 IEEE International Symposium on, pp. 2418-2422. IEEE, 2017.

    • Yu, Qian, Mohammad Maddah-Ali, and Salman Avestimehr. “Polynomial codes: an optimal design for high-dimensional coded matrix multiplication.” In Advances in Neural Information Processing Systems, pp. 4403-4413. 2017.

    • Yu, Qian, Mohammad Ali Maddah-Ali, and A. Salman Avestimehr. “Straggler mitigation in distributed matrix multiplication: Fundamental limits and optimal coding.” arXiv preprint arXiv:1801.07487 (2018).

    • Gardner, Kristen, Mor Harchol-Balter, and Alan Scheller-Wolf. “A better model for job redundancy: Decoupling server slowdown and job size.” In Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS), 2016 IEEE 24th International Symposium on, pp. 1-10. IEEE, 2016.

    • Joshi, Gauri, Emina Soljanin, and Gregory Wornell. “Efficient replication of queued tasks for latency reduction in cloud systems.” In Communication, Control, and Computing (Allerton), 2015 53rd Annual Allerton Conference on, pp. 107-114. IEEE, 2015.

    • Ousterhout, Kay, Patrick Wendell, Matei Zaharia, and Ion Stoica. “Sparrow: distributed, low latency scheduling.” In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, pp. 69-84. ACM, 2013.

    • Dutta, Sanghamitra, Mohammad Fahim, Farzin Haddadpour, Haewon Jeong, Viveck Cadambe, and Pulkit Grover. “On the optimal recovery threshold of coded matrix multiplication.” arXiv preprint arXiv:1801.10292 (2018).

    • Dutta, Sanghamitra, Viveck Cadambe, and Pulkit Grover. “Short-dot: Computing large linear transforms distributedly using coded short dot products.” In Advances In Neural Information Processing Systems, pp. 2100-2108. 2016.

    • Dutta, Sanghamitra, Viveck Cadambe, and Pulkit Grover. “Coded convolution for parallel and distributed computing within a deadline.” In Information Theory (ISIT), 2017 IEEE International Symposium on, pp. 2403-2407. IEEE, 2017.

    • Sheth, Utsav, Sanghamitra Dutta, Malhar Chaudhari, Haewon Jeong, Yaoqing Yang, Jukka Kohonen, Teemu Roos, and Pulkit Grover. “An Application of Storage-Optimal MatDot Codes for Coded Matrix Multiplication: Fast k-Nearest Neighbors Estimation.” arXiv preprint arXiv:1811.11811 (2018).

    • Dutta, Sanghamitra, Ziqian Bai, Haewon Jeong, Tze Meng Low, and Pulkit Grover. “A unified coded deep neural network training strategy based on generalized polydot codes.” In 2018 IEEE International Symposium on Information Theory (ISIT), pp. 1585-1589. IEEE, 2018.

    • Yu, Qian, Mohammad Ali Maddah-Ali, and A. Salman Avestimehr. “Coded fourier transform.” In Communication, Control, and Computing (Allerton), 2017 55th Annual Allerton Conference on, pp. 494-501. IEEE, 2017.

    • Tandon, Rashish, Qi Lei, Alexandros G. Dimakis, and Nikos Karampatziakis. “Gradient coding: Avoiding stragglers in distributed learning.” In International Conference on Machine Learning, pp. 3368-3376. 2017.

    • Raviv, Netanel, Itzhak Tamo, Rashish Tandon, and Alexandros G. Dimakis. “Gradient coding from cyclic MDS codes and expander graphs.” arXiv preprint arXiv:1707.03858 (2017).

    • Ye, Min, and Emmanuel Abbe. “Communication-computation efficient gradient coding.” arXiv preprint arXiv:1802.03475 (2018).

    • Yang, Yaoqing, Pulkit Grover, and Soummya Kar. “Coded distributed computing for inverse problems.” In Advances in Neural Information Processing Systems, pp. 709-719. 2017.

    • Maity, Raj Kumar, Ankit Singh Rawat, and Arya Mazumdar. “Robust gradient descent via moment encoding with ldpc codes.” arXiv preprint arXiv:1805.08327 (2018).

    • Severinson, Albin, Alexandre Graell i Amat, and Eirik Rosnes. “Block-diagonal and LT codes for distributed computing with straggling servers.” IEEE Transactions on Communications (2018).

    • Yang, Heecheol, and Jungwoo Lee. “Secure distributed computing with straggling servers using polynomial codes.” IEEE Transactions on Information Forensics and Security 14, no. 1 (2019): 141-150.

    • Aliasgari, Malihe, Osvaldo Simeone, and Joerg Kliewer. “Distributed and Private Coded Matrix Computation with Flexible Communication Load.” arXiv preprint arXiv:1901.07705 (2019).

    • Haddadpour, Farzin, Yaoqing Yang, Malhar Chaudhari, Viveck R. Cadambe, and Pulkit Grover. “Straggler-resilient and communication-efficient distributed iterative linear solver.” arXiv preprint arXiv:1806.06140 (2018).

    • Kiani, Shahrzad, Nuwan Ferdinand, and Stark C. Draper. “Exploitation of stragglers in coded computation.” In 2018 IEEE International Symposium on Information Theory (ISIT), pp. 1988-1992. IEEE, 2018.

    • Ferdinand, Nuwan, and Stark C. Draper. “Hierarchical coded computation.” In 2018 IEEE International Symposium on Information Theory (ISIT), pp. 1620-1624. IEEE, 2018.

  • Data Shuffling

    • Lee, Kangwook, Maximilian Lam, Ramtin Pedarsani, Dimitris Papailiopoulos, and Kannan Ramchandran. “Speeding up distributed machine learning using codes.” IEEE Transactions on Information Theory 64, no. 3 (2018): 1514-1529.

    • Attia, Mohamed A., and Ravi Tandon. “Near Optimal Coded Data Shuffling for Distributed Learning.” arXiv preprint arXiv:1801.01875 (2018).

    • Elmahdy, Adel, and Soheil Mohajer. “On the Fundamental Limits of Coded Data Shuffling for Distributed Learning Systems.” arXiv preprint arXiv:1807.04255 (2018).

    • Wan, Kai, Daniela Tuninetti, Mingyue Ji, and Pablo Piantanida. “Fundamental limits of distributed data shuffling.” arXiv preprint arXiv:1807.00056 (2018).

    • Chung, Jichan, Kangwook Lee, Ramtin Pedarsani, Dimitris Papailiopoulos, and Kannan Ramchandran. “UberShuffle: Communication-efficient Data Shuffling for SGD via Coding Theory.”

  • Tolerance to Adversarial Computing Nodes

    • Chen, Lingjiao, Hongyi Wang, Zachary Charles, and Dimitris Papailiopoulos. “DRACO: Byzantine-resilient Distributed Training via Redundant Gradients.” In International Conference on Machine Learning, pp. 902-911. 2018.

    • Kadhe, Swanand, O. Ozan Koyluoglu, and Kannan Ramchandran. “Gradient Coding Based on Block Designs for Mitigating Adversarial Stragglers.” arXiv preprint arXiv:1904.13373 (2019).

  • Secure Coded Computation

    • Bitar, Rawad, and Salim El Rouayheb. “Staircase codes for secret sharing with optimal communication and read overheads.” IEEE Transactions on Information Theory 64, no. 2 (2018): 933-943.

    • Bitar, Rawad, Parimal Parag, and Salim El Rouayheb. “Minimizing latency for secure coded computing using secret sharing via staircase codes.” arXiv preprint arXiv:1802.02640 (2018).

    • D'Oliveira, Rafael GL, Salim El Rouayheb, and David Karpuk. “GASP Codes for Secure Distributed Matrix Multiplication.” arXiv preprint arXiv:1812.09962 (2018).

    • Bitar, Rawad, Yuxuan Xing, Yasaman Keshtkarjahromi, Venkat Dasari, Salim El Rouayheb, and Hulya Seferoglu. “PRAC: Private and Rateless Adaptive Coded Computation at the Edge.” (2019).

  • Using Efficient Redundancy to Reduce Latency and Computing Cost in Cloud Systems

    • Wang, Da, Gauri Joshi, and Gregory Wornell. “Using straggler replication to reduce latency in large-scale parallel computing.” ACM SIGMETRICS Performance Evaluation Review 43, no. 3 (2015): 7-11.

    • Gauri Joshi, Emina Soljanin, and Gregory Wornell. “Efficient redundancy techniques for latency reduction in cloud systems.” ACM Transactions on Modeling and Performance Evaluation of Computing Systems (TOMPECS) 2, no. 2 (2017): 12.

    • Jiang, Zhiyuan, Sheng Zhou, Xueying Guo, and Zhisheng Niu. “Task replication for deadline-constrained vehicular cloud computing: Optimal policy, performance analysis, and implications on road traffic.” IEEE Internet of Things Journal 5, no. 1 (2018): 93-107.

    • Sun, Yin, C. Emre Koksal, and Ness B. Shroff. “On delay-optimal scheduling in queueing systems with replications.” arXiv preprint arXiv:1603.07322 (2016).

    • Aktas, Mehmet Fatih, Pei Peng, and Emina Soljanin. “Effective straggler mitigation: Which clones should attack and when?.” arXiv preprint arXiv:1710.00748 (2017).

    • Dutta, Sanghamitra, Gauri Joshi, Soumyadip Ghosh, Parijat Dube, and Priya Nagpurkar. “Slow and stale gradients can win the race: Error-runtime trade-offs in distributed SGD.” arXiv preprint arXiv:1803.01113 (2018).

    • Aktas, Mehmet Fatih, Pei Peng, and Emina Soljanin. “Straggler mitigation by delayed relaunch of tasks.” arXiv preprint arXiv:1710.00414 (2017).

    • van der Boor, Mark, Sem C. Borst, Johan SH van Leeuwaarden, and Debankur Mukherjee. “Scalable load balancing in networked systems: A survey of recent advances.” arXiv preprint arXiv:1806.05444 (2018).

    • Xu, Maotong, Sultan Alamro, Tian Lan, and Suresh Subramaniam. “Chronos: A unifying optimization framework for speculative execution of deadline-critical mapreduce jobs.” In 2018 IEEE 38th International Conference on Distributed Computing Systems (ICDCS), pp. 718-729. IEEE, 2018.

    • Beaumont, Olivier, Lionel Eyraud-Dubois, and Yihong Gao. “Influence of Tasks Duration Variability on Task-Based Runtime Schedulers.” (2018).

    • Behrouzi-Far, Amir, and Emina Soljanin. “On the Effect of Task-to-Worker Assignment in Distributed Computing Systems with Stragglers.” In 2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 560-566. IEEE, 2018.

    • Zeng, Yun, Jian Tan, and Cathy H. Xia. “Fork and Join Queueing Networks with Heavy Tails: Scaling Dimension and Throughput Limit.” ACM SIGMETRICS Performance Evaluation Review 46, no. 1 (2019): 122-124.

    • Chen, Lixing, and Jie Xu. “Task Offloading and Replication for Vehicular Cloud Computing: A Multi-Armed Bandit Approach.” arXiv preprint arXiv:1812.04575 (2018).

    • Qiu, Zhan, Juan F. Pérez, and Peter G. Harrison. “Tackling latency via replication in distributed systems.” In Proceedings of the 7th ACM/SPEC on International Conference on Performance Engineering, pp. 197-208. ACM, 2016.

    • Joshi, Gauri. “Synergy via redundancy: Boosting service capacity with adaptive replication.” ACM SIGMETRICS Performance Evaluation Review 45, no. 2 (2018): 21-28.

    • Wang, Weina, Mor Harchol-Balter, Haotian Jiang, Alan Scheller-Wolf, and R. Srikant. “Delay asymptotics and bounds for multi-task parallel jobs.” ACM SIGMETRICS Performance Evaluation Review 46, no. 3 (2019): 2-7.

    • Kaler, Tim, Yuxiong He, and Sameh Elnikety. “Optimal Reissue Policies for Reducing Tail Latency.” In Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures, pp. 195-206. ACM, 2017.

    • Zaryadov, Ivan, Andrey Kradenyh, and Anastasiya Gorbunova. “The Analysis of Cloud Computing System as a Queueing System with Several Servers and a Single Buffer.” In International Conference on Analytical and Computational Methods in Probability Theory, pp. 11-22. Springer, Cham, 2017.

    • Wang, Huajin, Jianhui Li, Zhihong Shen, and Yuanchun Zhou. “Approximations and Bounds for (n, k) Fork-Join Queues: A Linear Transformation Approach.” In 2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), pp. 422-431. IEEE, 2018.

    • Anderson, Sarah E., Ann Johnston, Gauri Joshi, Gretchen L. Matthews, Carolyn Mayer, and Emina Soljanin. “Service Rate Region of Content Access from Erasure Coded Storage.” In 2018 IEEE Information Theory Workshop (ITW), pp. 1-5. IEEE, 2018.

    • Mukherjee, Debankur. “Scalable load balancing algorithms in networked systems.” arXiv preprint arXiv:1809.02018 (2018).

  • Cloud Network Control

    • Nahir, Amir, Ariel Orda, and Danny Raz. “Resource allocation and management in cloud computing.” In Integrated Network Management (IM), 2015 IFIP/IEEE International Symposium on, pp. 1078-1084. IEEE, 2015.

    • Feng, Hao, Jaime Llorca, Antonia M. Tulino, and Andreas F. Molisch. “Optimal Control of Wireless Computing Networks.” IEEE Transactions on Wireless Communications 17, no. 12 (2018): 8283-8298.

    • Feng, Hao, Jaime Llorca, Antonia M. Tulino, and Andreas F. Molisch. “Optimal dynamic cloud network control.” IEEE/ACM Transactions on Networking (TON) 26, no. 5 (2018): 2118-2131.

    • Zhang, Jianan, Abhishek Sinha, Jaime Llorca, Antonia Tulino, and Eytan Modiano. “Optimal Control of Distributed Computing Networks with Mixed-Cast Traffic Flows.” arXiv preprint arXiv:1805.10527 (2018).

    • Wang, Chang-Heng, Jaime Llorca, Antonia M. Tulino, and Tara Javidi. “Dynamic Cloud Network Control under Reconfiguration Delay and Cost.” arXiv preprint arXiv:1802.06581 (2018).

    • Jiao, Lei, Antonia Maria Tulino, Jaime Llorca, Yue Jin, and Alessandra Sala. “Smoothed online resource allocation in multi-tier distributed cloud networks.” IEEE/ACM Transactions on Networking (TON) 25, no. 4 (2017): 2556-2570.

    • Mukherjee, Debankur. “Scalable load balancing algorithms in networked systems.” arXiv preprint arXiv:1809.02018 (2018).

  • Atomicity and Consistency

    • Cadambe, Viveck R., Nancy Lynch, Muriel Médard, and Peter Musial. “A coded shared atomic memory algorithm for message passing architectures.” Distributed Computing 30, no. 1 (2017): 49-73.

    • Cadambe, Viveck, Nicolas Nicolaou, Kishori M. Konwar, N. Prakash, Nancy Lynch, and Muriel Medard. “ARES: Adaptive, Reconfigurable, Erasure coded, atomic Storage.” arXiv preprint arXiv:1805.03727 (2018).

    • Konwar, Kishori M., N. Prakash, Nancy Lynch, and Muriel Médard. “A layered architecture for erasure-coded consistent distributed storage.” arXiv preprint arXiv:1703.01286 (2017).

    • Ali, Ramy E., and Viveck R. Cadambe. “Multi-version Coding for Consistent Distributed Storage of Correlated Data Updates.” arXiv preprint arXiv:1708.06042 (2017).

    • Ali, Ramy E., and Viveck Cadambe. “Harnessing Correlations in Distributed Erasure Coded Key-Value Stores.” arXiv preprint arXiv:1810.01527 (2018).

    • Wang, Zhiying, and Viveck Cadambe. “Multi-version coding in distributed storage.” In Information Theory (ISIT), 2014 IEEE International Symposium on, pp. 871-875. IEEE, 2014.

    • Ali, Ramy E., Viveck Cadambe, Jaime Llorca, and Antonia Tulino. “Multi-version Coding with Side Information.” arXiv preprint arXiv:1805.04337 (2018).

    • Wang, Zhiying, and Viveck R. Cadambe. “Multi-Version Coding—An Information-Theoretic Perspective of Consistent Distributed Storage.” IEEE Transactions on Information Theory 64, no. 6 (2018): 4540-456

Part II: Distributed Storage Theory (non-exhaustive)

  • Survey

    • A. G. Dimakis, K. Ramchandran, Y. Wu and C. Suh, “A Survey on Network Codes for Distributed Storage,” in Proceedings of the IEEE, vol. 99, no. 3, pp. 476-489, March 2011.

  • Functional Repair

    • A. G. Dimakis, P. B. Godfrey, Y. Wu, M. J. Wainwright and K. Ramchandran, “Network Coding for Distributed Storage Systems,” in IEEE Transactions on Information Theory, vol. 56, no. 9, pp. 4539-4551, Sept. 2010.

  • Exact Repair

    • Wu, Yunnan, and Alexandros G. Dimakis. “Reducing repair traffic for erasure coding-based storage via interference alignment.” In 2009 IEEE International Symposium on Information Theory, pp. 2276-2280. IEEE, 2009.

    • Rashmi, K. V., Nihar B. Shah, P. Vijay Kumar, and Kannan Ramchandran. “Explicit construction of optimal exact regenerating codes for distributed storage.” In 2009 47th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 1243-1249. IEEE, 2009.

    • Papailiopoulos, Dimitris S., Alexandros G. Dimakis, and Viveck R. Cadambe. “Repair optimal erasure codes through hadamard designs.” IEEE Transactions on Information Theory 59, no. 5 (2013): 3021-3037.

    • Shanmugam, Karthikeyan, Dimitris S. Papailiopoulos, Alexandros G. Dimakis, and Giuseppe Caire. “A repair framework for scalar MDS codes.” IEEE Journal on Selected Areas in Communications 32, no. 5 (2014): 998-1007.

    • Tamo, Itzhak, Zhiying Wang, and Jehoshua Bruck. “Zigzag codes: MDS array codes with optimal rebuilding.” IEEE Transactions on Information Theory 59, no. 3 (2013): 1597-1616.

    • Shah, Nihar B., K. Vinayak Rashmi, P. Vijay Kumar, and Kannan Ramchandran. “Distributed storage codes with repair-by-transfer and nonachievability of interior points on the storage-bandwidth tradeoff.” IEEE Transactions on Information Theory 58, no. 3 (2012): 1837-1852.

    • Shah, Nihar B., K. V. Rashmi, P. Vijay Kumar, and Kannan Ramchandran. “Interference alignment in regenerating codes for distributed storage: Necessity and code constructions.” IEEE Transactions on Information Theory 58, no. 4 (2012): 2134-2158.

    • Cadambe, Viveck R., Syed Ali Jafar, Hamed Maleki, Kannan Ramchandran, and Changho Suh. “Asymptotic interference alignment for optimal repair of MDS codes in distributed storage.” IEEE Transactions on Information Theory 59, no. 5 (2013): 2974-2987.

    • Suh, Changho, and Kannan Ramchandran. “Exact-repair MDS code construction using interference alignment.” IEEE Transactions on Information Theory 57, no. 3 (2011): 1425-1442.

    • El Rouayheb, Salim, and Kannan Ramchandran. “Fractional repetition codes for repair in distributed storage systems.” In 2010 48th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 1510-1517. IEEE, 2010.

    • Pawar, Sameer, Nima Noorshams, Salim El Rouayheb, and Kannan Ramchandran. “Dress codes for the storage cloud: Simple randomized constructions.” In 2011 IEEE International Symposium on Information Theory Proceedings, pp. 2338-2342. IEEE, 2011.

    • Tian, Chao. “Characterizing the rate region of the (4, 3, 3) exact-repair regenerating codes.” IEEE Journal on Selected Areas in Communications 32, no. 5 (2014): 967-975.

    • Elyasi, Mehran, and Soheil Mohajer. “A cascade code construction for (n, k, d) distributed storage systems.” In 2018 IEEE International Symposium on Information Theory (ISIT), pp. 1241-1245. IEEE, 2018.

    • Rashmi, K. V., Nihar B. Shah, and Kannan Ramchandran. “A piggybacking design framework for read-and download-efficient distributed storage codes.” IEEE Transactions on Information Theory 63, no. 9 (2017): 5802-5820.

  • Locally Repairable Codes

    • Papailiopoulos, Dimitris S., and Alexandros G. Dimakis. “Locally repairable codes.” IEEE Transactions on Information Theory 60, no. 10 (2014): 5843-5855.

    • Tamo, Itzhak, Dimitris S. Papailiopoulos, and Alexandros G. Dimakis. “Optimal locally repairable codes and connections to matroid theory.” IEEE Transactions on Information Theory 62, no. 12 (2016): 6661-6671.

    • Rawat, Ankit Singh, Dimitris S. Papailiopoulos, Alexandros G. Dimakis, and Sriram Vishwanath. “Locality and availability in distributed storage.” IEEE Transactions on Information Theory 62, no. 8 (2016): 4481-4493.

    • Sathiamoorthy, Maheswaran, Megasthenis Asteris, Dimitris Papailiopoulos, Alexandros G. Dimakis, Ramkumar Vadali, Scott Chen, and Dhruba Borthakur. “Xoring elephants: Novel erasure codes for big data.” In Proceedings of the VLDB Endowment, vol. 6, no. 5, pp. 325-336. VLDB Endowment, 2013.

    • Tamo, Itzhak, and Alexander Barg. “A family of optimal locally recoverable codes.” IEEE Transactions on Information Theory 60, no. 8 (2014): 4661-4676.

    • Shanmugam, Karthikeyan, and Alexandros G. Dimakis. “Bounding multiple unicasts through index coding and locally repairable codes.” In 2014 IEEE International Symposium on Information Theory, pp. 296-300. IEEE, 2014.

  • Random Linear Network Coding (RLNC)-Based Designs

    • A. G. Dimakis, P. B. Godfrey, Y. Wu, M. J. Wainwright and K. Ramchandran, “Network Coding for Distributed Storage Systems,” in IEEE Transactions on Information Theory, vol. 56, no. 9, pp. 4539-4551, Sept. 2010.

    • Fitzek, Frank HP, Tamas Toth, Aron Szabados, Morten V. Pedersen, Daniel E. Lucani, Marton Sipos, Hassan Charaf, and Muriel Medard. “Implementation and performance evaluation of distributed cloud storage solutions using random linear network coding.” In 2014 IEEE International Conference on Communications Workshops (ICC), pp. 249-254. IEEE, 2014.

    • V. Abdrashitov and M. Médard, “Durable network coded distributed storage,” 2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton), Monticello, IL, 2015, pp. 851-856.

    • Sipos, Márton, Patrik János Braun, Daniel Enrique Lucani, Frank HP Fitzek, and Hassan Charaf. “On the effectiveness of recoding-based repair in network coded distributed storage.” Periodica Polytechnica Electrical Engineering and Computer Science 61, no. 1 (2017): 12-21.

  • Applications

    • Pawar, Sameer, Salim El Rouayheb, Hao Zhang, Kangwook Lee, and Kannan Ramchandran. “Codes for a distributed caching based video-on-demand system.” In 2011 Conference Record of the Forty Fifth Asilomar Conference on Signals, Systems and Computers (ASILOMAR), pp. 1783-1787. IEEE, 2011.

  • Private Information Retrieval and Private Coded Computation

    • Sun, Hua, and Syed Ali Jafar. “The capacity of private information retrieval.” IEEE Transactions on Information Theory 63, no. 7 (2017): 4075-4088.

    • Sun, Hua, and Syed Ali Jafar. “The capacity of robust private information retrieval with colluding databases.” IEEE Transactions on Information Theory 64, no. 4 (2018): 2361-2370.

    • Sun, Hua, and Syed Ali Jafar. “The capacity of symmetric private information retrieval.” IEEE Transactions on Information Theory 65, no. 1 (2019): 322-329.

    • Sun, Hua, and Syed Ali Jafar. “Private Information Retrieval from MDS Coded Data With Colluding Servers: Settling a Conjecture by Freij-Hollanti et al.” IEEE Transactions on Information Theory 64, no. 2 (2018): 1000-1022.

    • Sun, Hua, and Syed Ali Jafar. “Optimal download cost of private information retrieval for arbitrary message length.” IEEE Transactions on Information Forensics and Security 12, no. 12 (2017): 2920-2932.

    • Sun, Hua, and Syed Ali Jafar. “Multiround private information retrieval: Capacity and storage overhead.” IEEE Transactions on Information Theory 64, no. 8 (2018): 5743-5754.

    • Sun, Hua, and Syed Ali Jafar. “The capacity of private computation.” IEEE Transactions on Information Theory (2018).

    • Tian, Chao, Hua Sun, and Jun Chen. “Capacity-achieving private information retrieval codes with optimal message size and upload cost.” arXiv preprint arXiv:1808.07536 (2018).

    • Banawan, Karim, and Sennur Ulukus. “The capacity of private information retrieval from coded databases.” IEEE Transactions on Information Theory 64, no. 3 (2018): 1945-1956.

    • Banawan, Karim, and Sennur Ulukus. “The capacity of private information retrieval from Byzantine and colluding databases.” IEEE Transactions on Information Theory 65, no. 2 (2019): 1206-1219.

    • Banawan, Karim, and Sennur Ulukus. “Multi-message private information retrieval: Capacity results and near-optimal schemes.” IEEE Transactions on Information Theory 64, no. 10 (2018): 6842-6862.

    • Wei, Yi-Peng, Karim Banawan, and Sennur Ulukus. “Fundamental limits of cache-aided private information retrieval with unknown and uncoded prefetching.” IEEE Transactions on Information Theory (2018).

    • Banawan, Karim, Batuhan Arasli, Yi-Peng Wei, and Sennur Ulukus. “The Capacity of Private Information Retrieval from Heterogeneous Uncoded Caching Databases.” arXiv preprint arXiv:1902.09512 (2019).

    • Wei, Yi-Peng, Batuhan Arasli, Karim Banawan, and Sennur Ulukus. “The capacity of private information retrieval from decentralized uncoded caching databases.” arXiv preprint arXiv:1811.11160 (2018).

    • Wei, Yi-Peng, Karim Banawan, and Sennur Ulukus. “Cache-aided private information retrieval with partially known uncoded prefetching: Fundamental limits.” IEEE Journal on Selected Areas in Communications 36, no. 6 (2018): 1126-1139.

    • Attia, Mohamed Adel, Deepak Kumar, and Ravi Tandon. “The capacity of private information retrieval from uncoded storage constrained databases.” arXiv preprint arXiv:1805.04104 (2018).

    • Tandon, Ravi. “The capacity of cache aided private information retrieval.” In 2017 55th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 1078-1082. IEEE, 2017.

    • Wang, Qiwen, and Mikael Skoglund. “Symmetric Private Information Retrieval from MDS Coded Distributed Storage with Non-colluding and Colluding Servers.” IEEE Transactions on Information Theory (2019).

    • Wang, Qiwen, Hua Sun, and Mikael Skoglund. “The capacity of private information retrieval with eavesdroppers.” IEEE Transactions on Information Theory (2018).

    • Shah, Nihar B., K. V. Rashmi, and Kannan Ramchandran. “One extra bit of download ensures perfectly private information retrieval.” In 2014 IEEE International Symposium on Information Theory, pp. 856-860. IEEE, 2014.

    • Fanti, Giulia, and Kannan Ramchandran. “Efficient private information retrieval over unsynchronized databases.” IEEE Journal of Selected Topics in Signal Processing 9, no. 7 (2015): 1229-1239.

    • Mirmohseni, Mahtab, and Mohammad Ali Maddah-Ali. “Private function retrieval.” In 2018 Iran Workshop on Communication and Information Theory (IWCIT), pp. 1-6. IEEE, 2018.

    • Woolsey, Nicholas, Rong-Rong Chen, and Mingyue Ji. “A New Design of Private Information Retrieval for Storage Constrained Databases.” arXiv preprint arXiv:1901.07490 (2019).

    • Sun, Hua, and Syed A. Jafar. “On the Capacity of Locally Decodable Codes.” arXiv preprint arXiv:1812.05566 (2018).

    • Woolsey, Nicholas, Rong-Rong Chen, and Mingyue Ji. “An Optimal Iterative Placement Algorithm for PIR from Heterogeneous Storage-Constrained Databases.” arXiv preprint arXiv:1904.02131 (2019).

    • Mousavi, Mohammad Hossein, Mohammad Ali Maddah-Ali, and Mahtab Mirmohseni. “Private Inner Product Retrieval for Distributed Machine Learning.” arXiv preprint arXiv:1902.06319 (2019).

    • Aliasgari, Malihe, Osvaldo Simeone, and Joerg Kliewer. “Distributed and Private Coded Matrix Computation with Flexible Communication Load.” arXiv preprint arXiv:1901.07705 (2019).

    • Raviv, Netanel, and David A. Karpuk. “Private polynomial computation from Lagrange encoding.” arXiv preprint arXiv:1812.04142 (2018).

    • Obead, Sarah A., Hsuan-Yin Lin, Eirik Rosnes, and Jörg Kliewer. “Capacity of private linear computation for coded databases.” In 2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 813-820. IEEE, 2018.

    • Heidarzadeh, Anoosheh, Swanand Kadhe, Salim El Rouayheb, and Alex Sprintson. “Single-Server Multi-Message Individually-Private Information Retrieval with Side Information.” arXiv preprint arXiv:1901.07509 (2019).

    • Bitar, Rawad, and Salim El Rouayheb. “Staircase-PIR: Universally Robust Private Information Retrieval.” In 2018 IEEE Information Theory Workshop (ITW), pp. 1-5. IEEE, 2018.

    • Tajeddine, Razane, Oliver W. Gnilke, and Salim El Rouayheb. “Private information retrieval from MDS coded data in distributed storage systems.” IEEE Transactions on Information Theory 64, no. 11 (2018): 7081-7093.

    • D'Oliveira, Rafael GL, and Salim El Rouayheb. “One-Shot PIR: Refinement and Lifting.” arXiv preprint arXiv:1810.05719 (2018).

    • D'Oliveira, Rafael GL, and Salim El Rouayheb. “Lifting private information retrieval from two to any number of messages.” In 2018 IEEE International Symposium on Information Theory (ISIT), pp. 1744-1748. IEEE, 2018.

    • Kadhe, Swanand, Brenden Garcia, Anoosheh Heidarzadeh, Salim El Rouayheb, and Alex Sprintson. “Private information retrieval with side information.” arXiv preprint arXiv:1709.00112 (2017).

    • Tajeddine, Razane, Oliver W. Gnilke, David Karpuk, Ragnar Freij-Hollanti, Camilla Hollanti, and Salim El Rouayheb. “Private information retrieval schemes for coded data with arbitrary collusion patterns.” In 2017 IEEE International Symposium on Information Theory (ISIT), pp. 1908-1912. IEEE, 2017.

    • Tajeddine, Razane, and Salim El Rouayheb. “Robust private information retrieval on coded data.” In 2017 IEEE International Symposium on Information Theory (ISIT), pp. 1903-1907. IEEE, 2017.

Part III: Distributed Machine Learning (non-exhaustive)

  • Synchronous (Stochastic) Gradient Descent

    • Yin, Dong, Ashwin Pananjady, Max Lam, Dimitris Papailiopoulos, Kannan Ramchandran, and Peter Bartlett. “Gradient diversity: a key ingredient for scalable distributed learning.” arXiv preprint arXiv:1706.05699 (2017).

    • Bottou, Léon, Frank E. Curtis, and Jorge Nocedal. “Optimization methods for large-scale machine learning.” Siam Review 60, no. 2 (2018): 223-311.

    • Bousquet, Olivier, and André Elisseeff. “Stability and generalization.” Journal of machine learning research 2, no. Mar (2002): 499-526.

    • Chen, Jianmin, Xinghao Pan, Rajat Monga, Samy Bengio, and Rafal Jozefowicz. “Revisiting distributed synchronous SGD.” arXiv preprint arXiv:1604.00981 (2016).

    • Cotter, Andrew, Ohad Shamir, Nati Srebro, and Karthik Sridharan. “Better mini-batch algorithms via accelerated gradient methods.” In Advances in neural information processing systems, pp. 1647-1655. 2011.

    • De, Soham, Abhay Yadav, David Jacobs, and Tom Goldstein. “Big batch SGD: Automated inference using adaptive batch sizes.” arXiv preprint arXiv:1610.05792 (2016).

    • Karimi, Hamed, Julie Nutini, and Mark Schmidt. “Linear convergence of gradient and proximal-gradient methods under the polyak-łojasiewicz condition.” In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 795-811. Springer, Cham, 2016.

    • Lee, Jason D., Qihang Lin, Tengyu Ma, and Tianbao Yang. “Distributed stochastic variance reduced gradient methods by sampling extra data with replacement.” The Journal of Machine Learning Research 18, no. 1 (2017): 4404-4446.

    • Li, Mu, Tong Zhang, Yuqiang Chen, and Alexander J. Smola. “Efficient mini-batch training for stochastic optimization.” In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 661-670. ACM, 2014.

    • Lian, Xiangru, Ce Zhang, Huan Zhang, Cho-Jui Hsieh, Wei Zhang, and Ji Liu. “Can decentralized algorithms outperform centralized algorithms? a case study for decentralized parallel stochastic gradient descent.” In Advances in Neural Information Processing Systems, pp. 5330-5340. 2017.

    • Zinkevich, Martin, Markus Weimer, Lihong Li, and Alex J. Smola. “Parallelized stochastic gradient descent.” In Advances in neural information processing systems, pp. 2595-2603. 2010.

    • Dekel, Ofer, Ran Gilad-Bachrach, Ohad Shamir, and Lin Xiao. “Optimal distributed online prediction using mini-batches.” Journal of Machine Learning Research 13, no. Jan (2012): 165-202.

    • M. P. Friedlander and M. Schmidt. Hybrid deterministic-stochastic methods for data fitting. SIAM Journal on Scientific Computing, 34(3):A1380–A1405, 2012.

    • M. Takáč, A. S. Bijral, P. Richtárik, and N. Srebro. Mini-batch primal and dual methods for SVMs. In ICML (3), pages 1022–1030, 2013.

    • P. Jain, S. M. Kakade, R. Kidambi, P. Netrapalli, and A. Sidford. Parallelizing stochastic approximation through mini-batching and tail-averaging. arXiv preprint arXiv:1610.03774, 2016.

  • Straggler Mitigation via Asynchronous (Stochastic) Gradient Descent

    • Mania, Horia, Xinghao Pan, Dimitris Papailiopoulos, Benjamin Recht, Kannan Ramchandran, and Michael I. Jordan. “Perturbed iterate analysis for asynchronous stochastic optimization.” arXiv preprint arXiv:1507.06970 (2015).

    • Pan, Xinghao, Maximilian Lam, Stephen Tu, Dimitris Papailiopoulos, Ce Zhang, Michael I. Jordan, Kannan Ramchandran, and Christopher Ré. “Cyclades: Conflict-free asynchronous machine learning.” In Advances in Neural Information Processing Systems, pp. 2568-2576. 2016.

    • Recht, Benjamin, Christopher Re, Stephen Wright, and Feng Niu. “Hogwild: A lock-free approach to parallelizing stochastic gradient descent.” In Advances in neural information processing systems, pp. 693-701. 2011.

    • Dutta, Sanghamitra, Gauri Joshi, Soumyadip Ghosh, Parijat Dube, and Priya Nagpurkar. “Slow and stale gradients can win the race: Error-runtime trade-offs in distributed SGD.” arXiv preprint arXiv:1803.01113 (2018).

    • X. Lian, Y. Huang, Y. Li, and J. Liu, “Asynchronous parallel stochastic gradient for nonconvex optimization,” in Advances in Neural Information Processing Systems, 2015, pp. 2737–2745.

    • Zheng, Shuxin, Qi Meng, Taifeng Wang, Wei Chen, Nenghai Yu, Zhi-Ming Ma, and Tie-Yan Liu. “Asynchronous stochastic gradient descent with delay compensation.” In Proceedings of the 34th International Conference on Machine Learning-Volume 70, pp. 4120-4129. JMLR.org, 2017.

    • X. Lian, W. Zhang, C. Zhang, and J. Liu, “Asynchronous decentralized parallel stochastic gradient descent,” arXiv preprint arXiv:1710.06952, 2017.

  • Straggler Mitigation via Coding

    • Dutta, Sanghamitra, Ziqian Bai, Tze Meng Low, and Pulkit Grover. “CodeNet: Training Large Scale Neural Networks in Presence of Soft-Errors.” arXiv preprint arXiv:1903.01042 (2019).

    • Dutta, Sanghamitra, Ziqian Bai, Haewon Jeong, Tze Meng Low, and Pulkit Grover. “A unified coded deep neural network training strategy based on generalized polydot codes for matrix multiplication.” arXiv preprint arXiv:1811.10751 (2018).

    • Sheth, Utsav, Sanghamitra Dutta, Malhar Chaudhari, Haewon Jeong, Yaoqing Yang, Jukka Kohonen, Teemu Roos, and Pulkit Grover. “An Application of Storage-Optimal MatDot Codes for Coded Matrix Multiplication: Fast k-Nearest Neighbors Estimation.” In 2018 IEEE International Conference on Big Data (Big Data), pp. 1113-1120. IEEE, 2018.

    • So, Jinhyun, Basak Guler, A. Salman Avestimehr, and Payman Mohassel. “CodedPrivateML: A Fast and Privacy-Preserving Framework for Distributed Machine Learning.” arXiv preprint arXiv:1902.00641 (2019).

    • Li, Songze, Seyed Mohammadreza Mousavi Kalan, Qian Yu, Mahdi Soltanolkotabi, and A. Salman Avestimehr. “Polynomially coded regression: Optimal straggler mitigation via data encoding.” arXiv preprint arXiv:1805.09934 (2018).

    • Avestimehr, A. Salman, Seyed Mohammadreza Mousavi Kalan, and Mahdi Soltanolkotabi. “Fundamental resource trade-offs for encoded distributed optimization.” arXiv preprint arXiv:1804.00217 (2018).

  • Communication Bottleneck and Gradient Quantization

    • Tsitsiklis, John N., and Zhi-Quan Luo. “Communication complexity of convex optimization.” Journal of Complexity 3, no. 3 (1987): 231-243.

    • Frank Seide, Hao Fu, Jasha Droppo, Gang Li, and Dong Yu. 1-bit stochastic gradient descent and its application to data-parallel distributed training of speech DNNs. In INTERSPEECH, 2014.

    • Nikko Strom. Scalable distributed DNN training using commodity GPU cloud computing. In INTERSPEECH, 2015.

    • Christopher M De Sa, Ce Zhang, Kunle Olukotun, and Christopher Ré. Taming the wild: A unified analysis of hogwild-style algorithms. In NIPS, 2015.

    • Alistarh, Dan, Demjan Grubic, Jerry Li, Ryota Tomioka, and Milan Vojnovic. “QSGD: Communication-efficient SGD via gradient quantization and encoding.” In Advances in Neural Information Processing Systems, pp. 1709-1720. 2017.

    • Wen, Wei, Cong Xu, Feng Yan, Chunpeng Wu, Yandan Wang, Yiran Chen, and Hai Li. “Terngrad: Ternary gradients to reduce communication in distributed deep learning.” In Advances in neural information processing systems, pp. 1509-1519. 2017.

  • Federated Learning

    • Konečný, Jakub, H. Brendan McMahan, Felix X. Yu, Peter Richtárik, Ananda Theertha Suresh, and Dave Bacon. “Federated learning: Strategies for improving communication efficiency.” arXiv preprint arXiv:1610.05492 (2016).

    • Wang, Shiqiang, Tiffany Tuor, Theodoros Salonidis, Kin K. Leung, Christian Makaya, Ting He, and Kevin Chan. “When edge meets learning: Adaptive control for resource-constrained distributed machine learning.” In IEEE INFOCOM 2018-IEEE Conference on Computer Communications, pp. 63-71. IEEE, 2018.

    • Tuor, Tiffany, Shiqiang Wang, Theodoros Salonidis, Bong Jun Ko, and Kin K. Leung. “Demo abstract: Distributed machine learning at resource-limited edge nodes.” In IEEE INFOCOM 2018-IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), pp. 1-2. IEEE, 2018.

    • Wang, Shiqiang, Tiffany Tuor, Theodoros Salonidis, Kin K. Leung, Christian Makaya, Ting He, and Kevin Chan. “Adaptive federated learning in resource constrained edge computing systems.” IEEE Journal on Selected Areas in Communications (2019).

    • Tuor, Tiffany, Shiqiang Wang, Kin K. Leung, and Kevin Chan. “Distributed machine learning in coalition environments: overview of techniques.” In 2018 21st International Conference on Information Fusion (FUSION), pp. 814-821. IEEE, 2018.

    • Yousefpour, Ashkan, Caleb Fung, Tam Nguyen, Krishna Kadiyala, Fatemeh Jalali, Amirreza Niakanlahiji, Jian Kong, and Jason P. Jue. “All one needs to know about fog computing and related edge computing paradigms: a complete survey.” Journal of Systems Architecture (2019).

    • Yang, Qiang, Yang Liu, Tianjian Chen, and Yongxin Tong. “Federated Machine Learning: Concept and Applications.” ACM Transactions on Intelligent Systems and Technology (TIST) 10, no. 2 (2019): 12.

    • B. McMahan and D. Ramage, “Federated learning: Collaborative machine learning without centralized training data,” Apr. 2017. Online. Available: https://ai.googleblog.com/2017/04/federated-learning-collaborative.html

    • Wang, Jianyu, and Gauri Joshi. “Adaptive Communication Strategies to Achieve the Best Error-Runtime Trade-off in Local-Update SGD.” arXiv preprint arXiv:1810.08313 (2018).

    • Wang, Jianyu, and Gauri Joshi. “Cooperative SGD: A unified framework for the design and analysis of communication-efficient SGD algorithms.” arXiv preprint arXiv:1808.07576 (2018).

    • K. Bonawitz, V. Ivanov, B. Kreuter, A. Marcedone, H. B. McMahan, S. Patel, D. Ramage, A. Segal, and K. Seth, “Practical secure aggregation for federated learning on user-held data,” in NIPS Workshop on Private Multi-Party Machine Learning, 2016.

    • J. Konečný, H. B. McMahan, D. Ramage, and P. Richtárik, “Federated optimization: Distributed machine learning for on-device intelligence,” 2016. Online. Available: https://arxiv.org/abs/1610.02527

    • T. Nishio and R. Yonetani, “Client selection for federated learning with heterogeneous resources in mobile edge,” arXiv preprint arXiv:1804.08333, 2018.

    • Y. Zhao, M. Li, L. Lai, N. Suda, D. Civin, and V. Chandra, “Federated learning with non-iid data,” arXiv preprint arXiv:1806.00582, 2018.

    • H. Yu, S. Yang, and S. Zhu, “Parallel restarted SGD with faster convergence and less communication: Demystifying why model averaging works for deep learning,” in AAAI Conference on Artificial Intelligence, Jan.–Feb. 2019.

    • C. Ma, J. Konečný, M. Jaggi, V. Smith, M. I. Jordan, P. Richtárik, and M. Takáč, “Distributed optimization with arbitrary local solvers,” Optimization Methods and Software, vol. 32, no. 4, pp. 813–848, 2017.