What is cloud computing
Examples on cloud computing (Amazon Web Services (AWS))
What we will/won't cover
Grading Polices
MapReduce and Spark
Coded MapReduce
References:
Jeffrey Dean and Sanjay Ghemawat. 2008. MapReduce: simplified data processing on large clusters. Commun. ACM 51, 1 (January 2008), 107-113.
Zaharia, Matei, Mosharaf Chowdhury, Michael J. Franklin, Scott Shenker, and Ion Stoica. “Spark: Cluster computing with working sets.” HotCloud 10, no. 10-10 (2010): 95.
Li, Songze, Mohammad Ali Maddah-Ali, and A. Salman Avestimehr. “Coded mapreduce.” In Communication, Control, and Computing (Allerton), 2015 53rd Annual Allerton Conference on, pp. 964-971. IEEE, 2015.
Li, Songze, Sucha Supittayapornpong, Mohammad Ali Maddah-Ali, and Salman Avestimehr. “Coded terasort.” In Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2017 IEEE International, pp. 389-398. IEEE, 2017. (Coded TeraShort Implementations: here)
Further reading:
Li, Songze, Mohammad Ali Maddah-Ali, Qian Yu, and A. Salman Avestimehr. “A fundamental tradeoff between computation and communication in distributed computing.” IEEE Transactions on Information Theory 64, no. 1 (2018): 109-128.
Ji, Mingyue, Giuseppe Caire, and Andreas F. Molisch. “Fundamental limits of caching in wireless D2D networks.” IEEE Transactions on Information Theory 62, no. 2 (2016): 849-869.
Coded Distributed Computing (CDC) and its constraints
References:
Li, Songze, Mohammad Ali Maddah-Ali, Qian Yu, and A. Salman Avestimehr. “A fundamental tradeoff between computation and communication in distributed computing.” IEEE Transactions on Information Theory 64, no. 1 (2018): 109-128.
Further reading:
Li, Songze, Qian Yu, Mohammad Ali Maddah-Ali, and A. Salman Avestimehr. “A scalable framework for wireless distributed computing.” IEEE/ACM Transactions on Networking 25, no. 5 (2017): 2643-2654.
Li, Songze, Mohammad Ali Maddah-Ali, and A. Salman Avestimehr. “Coding for distributed fog computing.” IEEE Communications Magazine 55, no. 4 (2017): 34-40.
Reisizadeh, Amirhossein, Saurav Prakash, Ramtin Pedarsani, and Amir Salman Avestimehr. “Coded computation over heterogeneous clusters.” arXiv preprint arXiv:1701.05973 (2017).
Kiamari, Mehrdad, Chenwei Wang, and A. Salman Avestimehr. “On heterogeneous coded distributed computing.” In GLOBECOM 2017-2017 IEEE Global Communications Conference, pp. 1-7. IEEE, 2017.
Ezzeldin, Yahya H., Mohammed Karmoose, and Christina Fragouli. “Communication vs distributed computation: an alternative trade-off curve.” In Information Theory Workshop (ITW), 2017 IEEE, pp. 279-283. IEEE, 2017.
Prakash, Saurav, Amirhossein Reisizadeh, Ramtin Pedarsani, and Salman Avestimehr. “Coded Computing for Distributed Graph Analytics.” arXiv preprint arXiv:1801.05522 (2018). (Coded PageRank Implementations: here)
Srinivasavaradhan, Sundara Rajan, Linqi Song, and Christina Fragouli. “Distributed Computing Trade-offs with Random Connectivity.” In 2018 IEEE International Symposium on Information Theory (ISIT), pp. 1281-1285. IEEE, 2018.
Song, Linqi, Sundara Rajan Srinivasavaradhan, and Christina Fragouli. “The benefit of being flexible in distributed computation.” In Information Theory Workshop (ITW), 2017 IEEE, pp. 289-293. IEEE, 2017.
Li, Songze, Mohammad Ali Maddah-Ali, and A. Salman Avestimehr. “Compressed Coded Distributed Computing.” arXiv preprint arXiv:1805.01993 (2018).
General Scheme for CDC
Coded Distributed Computing (CDC) with a hypercube design
Cascaded Coded Distributed Computing
Refeferences:
Woolsey, Nicholas, Rong-Rong Chen and Mingyue Ji, “A New Combinatorial Design of Coded Distributed Computing,” 2018 IEEE International Symposium on Information Theory (ISIT), Vail, CO, 2018, pp. 726-730.
Li, Songze, Mohammad Ali Maddah-Ali, Qian Yu, and A. Salman Avestimehr. “A fundamental tradeoff between computation and communication in distributed computing.” IEEE Transactions on Information Theory 64, no. 1 (2018): 109-128.
Further reading:
Konstantinidis, Konstantinos, and Aditya Ramamoorthy. “Leveraging Coding Techniques for Speeding up Distributed Computing.” arXiv preprint arXiv:1802.03049 (2018).
Woolsey, Nicholas, Rong-Rong Chen and Mingyue Ji, “A New Combinatorial Design of Coded Distributed Computing,” 2018 IEEE International Symposium on Information Theory (ISIT), Vail, CO, 2018, pp. 726-730.
Woolsey, Nicholas, Rong-Rong Chen, and Mingyue Ji. “Cascaded Coded Distributed Computing on Heterogeneous Networks.” arXiv preprint arXiv:1901.07670 (2019).
Yang, Yaoqing, Matteo Interlandi, Pulkit Grover, Soummya Kar, Saeed Amizadeh, and Markus Weimer. “Coded Elastic Computing.” arXiv preprint arXiv:1812.06411 (2018).
Overview of error control codes
References:
Roth, Ron. Introduction to coding theory. Cambridge University Press, 2006.
Straggler Mitigation using Codes
MDS Codes
Prouct Codes
Polynomial Codes
References:
Lee, Kangwook, Maximilian Lam, Ramtin Pedarsani, Dimitris Papailiopoulos, and Kannan Ramchandran. “Speeding up distributed machine learning using codes.” IEEE Transactions on Information Theory 64, no. 3 (2018): 1514-1529.
Lee, Kangwook, Changho Suh, and Kannan Ramchandran. “High-dimensional coded matrix multiplication.” In Information Theory (ISIT), 2017 IEEE International Symposium on, pp. 2418-2422. IEEE, 2017.
Yu, Qian, Mohammad Maddah-Ali, and Salman Avestimehr. “Polynomial codes: an optimal design for high-dimensional coded matrix multiplication.” In Advances in Neural Information Processing Systems, pp. 4403-4413. 2017.
Further Reading:
Huang, Kuang-Hua. “Algorithm-based fault tolerance for matrix operations.” IEEE transactions on computers 100, no. 6 (1984): 518-528.
Lee, Kangwook, Maximilian Lam, Ramtin Pedarsani, Dimitris Papailiopoulos, and Kannan Ramchandran. “Speeding up distributed machine learning using codes.” IEEE Transactions on Information Theory 64, no. 3 (2018): 1514-1529.
Yu, Qian, Mohammad Maddah-Ali, and Salman Avestimehr. “Polynomial codes: an optimal design for high-dimensional coded matrix multiplication.” In Advances in Neural Information Processing Systems, pp. 4403-4413. 2017.
Yu, Qian, Mohammad Ali Maddah-Ali, and A. Salman Avestimehr. “Coded fourier transform.” In Communication, Control, and Computing (Allerton), 2017 55th Annual Allerton Conference on, pp. 494-501. IEEE, 2017.
Tandon, Rashish, Qi Lei, Alexandros G. Dimakis, and Nikos Karampatziakis. “Gradient coding: Avoiding stragglers in distributed learning.” In International Conference on Machine Learning, pp. 3368-3376. 2017.
Raviv, Netanel, Itzhak Tamo, Rashish Tandon, and Alexandros G. Dimakis. “Gradient coding from cyclic MDS codes and expander graphs.” arXiv preprint arXiv:1707.03858 (2017).
Ye, Min, and Emmanuel Abbe. “Communication-computation efficient gradient coding.” arXiv preprint arXiv:1802.03475 (2018).
Straggler Mitigation using Codes
Polynomial Codes
MatDot Codes
References:
Yu, Qian, Mohammad Maddah-Ali, and Salman Avestimehr. “Polynomial codes: an optimal design for high-dimensional coded matrix multiplication.” In Advances in Neural Information Processing Systems, pp. 4403-4413. 2017.
Dutta, Sanghamitra, Mohammad Fahim, Farzin Haddadpour, Haewon Jeong, Viveck Cadambe, and Pulkit Grover. “On the optimal recovery threshold of coded matrix multiplication.” arXiv preprint arXiv:1801.10292 (2018).
Further Reading:
Yu, Qian, Mohammad Maddah-Ali, and Salman Avestimehr. “Polynomial codes: an optimal design for high-dimensional coded matrix multiplication.” In Advances in Neural Information Processing Systems, pp. 4403-4413. 2017.
Dutta, Sanghamitra, Mohammad Fahim, Farzin Haddadpour, Haewon Jeong, Viveck Cadambe, and Pulkit Grover. “On the optimal recovery threshold of coded matrix multiplication.” arXiv preprint arXiv:1801.10292 (2018).
Dutta, Sanghamitra, Viveck Cadambe, and Pulkit Grover. “Short-dot: Computing large linear transforms distributedly using coded short dot products.” In Advances In Neural Information Processing Systems, pp. 2100-2108. 2016.
Dutta, Sanghamitra, Viveck Cadambe, and Pulkit Grover. “Coded convolution for parallel and distributed computing within a deadline.” In Information Theory (ISIT), 2017 IEEE International Symposium on, pp. 2403-2407. IEEE, 2017.
Sheth, Utsav, Sanghamitra Dutta, Malhar Chaudhari, Haewon Jeong, Yaoqing Yang, Jukka Kohonen, Teemu Roos, and Pulkit Grover. “An Application of Storage-Optimal MatDot Codes for Coded Matrix Multiplication: Fast k-Nearest Neighbors Estimation.” arXiv preprint arXiv:1811.11811 (2018).
Dutra, Sanghamitra, Ziqian Bai, Haewon Jeong, Tze Meng Low, and Pulkit Grover. “A unified coded deep neural network training strategy based on generalized polydot codes.” In 2018 IEEE International Symposium on Information Theory (ISIT), pp. 1585-1589. IEEE, 2018.
Yu, Qian, Mohammad Ali Maddah-Ali, and A. Salman Avestimehr. “Coded fourier transform.” In Communication, Control, and Computing (Allerton), 2017 55th Annual Allerton Conference on, pp. 494-501. IEEE, 2017.
Tandon, Rashish, Qi Lei, Alexandros G. Dimakis, and Nikos Karampatziakis. “Gradient coding: Avoiding stragglers in distributed learning.” In International Conference on Machine Learning, pp. 3368-3376. 2017.
Raviv, Netanel, Itzhak Tamo, Rashish Tandon, and Alexandros G. Dimakis. “Gradient coding from cyclic MDS codes and expander graphs.” arXiv preprint arXiv:1707.03858 (2017).
Ye, Min, and Emmanuel Abbe. “Communication-computation efficient gradient coding.” arXiv preprint arXiv:1802.03475 (2018).
Yang, Yaoqing, Pulkit Grover, and Soummya Kar. “Coded distributed computing for inverse problems.” In Advances in Neural Information Processing Systems, pp. 709-719. 2017.
Haddadpour, Farzin, Yaoqing Yang, Malhar Chaudhari, Viveck R. Cadambe, and Pulkit Grover. “Straggler-resilient and communication-efficient distributed iterative linear solver.” arXiv preprint arXiv:1806.06140 (2018).
Kiani, Shahrzad, Nuwan Ferdinand, and Stark C. Draper. “Exploitation of stragglers in coded computation.” In 2018 IEEE International Symposium on Information Theory (ISIT), pp. 1988-1992. IEEE, 2018.
Ferdinand, Nuwan, and Stark C. Draper. “Hierarchical coded computation.” In 2018 IEEE International Symposium on Information Theory (ISIT), pp. 1620-1624. IEEE, 2018.
Straggler Mitigation using Codes
Polydot Codes
Entangled Polynomial Codes/Generalized Polydot Codes
Overview of Probability and Random Processes
References:
Yu, Qian, Mohammad Maddah-Ali, and Salman Avestimehr. “Polynomial codes: an optimal design for high-dimensional coded matrix multiplication.” In Advances in Neural Information Processing Systems, pp. 4403-4413. 2017.
Dutta, Sanghamitra, Mohammad Fahim, Farzin Haddadpour, Haewon Jeong, Viveck Cadambe, and Pulkit Grover. “On the optimal recovery threshold of coded matrix multiplication.” arXiv preprint arXiv:1801.10292 (2018).
Grimmett, Geoffrey, and David Stirzaker. Probability and random processes. Oxford university press, 2001.
Gallager, Robert G. Stochastic processes: theory for applications. Cambridge University Press, 2013.
Further Reading:
Same as in Lecture 7
Overview of Probability and Random Processes
Using Efficient Redundancy to Reduce Latency in Cloud Systems
References:
Grimmett, Geoffrey, and David Stirzaker. Probability and random processes. Oxford university press, 2001.
Gallager, Robert G. Stochastic processes: theory for applications. Cambridge University Press, 2013.
David, Herbert Aron, and Haikady Navada Nagaraja. “Order statistics.” Encyclopedia of Statistical Sciences 9 (2004) (Chapter 10).
De Haan, Laurens, and Ana Ferreira. Extreme value theory: an introduction. Springer Science & Business Media, 2007 (Chapter 1)
Wang, Da, Gauri Joshi, and Gregory Wornell. “Using straggler replication to reduce latency in large-scale parallel computing.” ACM SIGMETRICS Performance Evaluation Review 43, no. 3 (2015): 7-11.
Using Efficient Redundancy to Reduce Latency in Cloud Systems
References:
Wang, Da, Gauri Joshi, and Gregory Wornell. “Using straggler replication to reduce latency in large-scale parallel computing.” ACM SIGMETRICS Performance Evaluation Review 43, no. 3 (2015): 7-11.
Further reading:
S. Dutta, V. Cadambe and P. Grover, “Coded convolution for parallel and distributed computing within a deadline,” 2017 IEEE International Symposium on Information Theory (ISIT), Aachen, 2017, pp. 2403-2407.
Mallick, Ankur, Malhar Chaudhari, and Gauri Joshi. “Rateless codes for near-perfect load balancing in distributed matrix-vector multiplication.” arXiv preprint arXiv:1804.10331 (2018).
Aktas, Mehmet Fatih, Pei Peng, and Emina Soljanin. “Effective straggler mitigation: Which clones should attack and when?.” arXiv preprint arXiv:1710.00748 (2017).
Overview of Queueing Theory
References:
Harchol-Balter, Mor. Performance modeling and design of computer systems: queueing theory in action. Cambridge University Press, 2013. (Chapter 6)
Gallager, Robert G. Stochastic processes: theory for applications. Cambridge University Press, 2013.
Durrett, Richard, and R. Durrett. Essentials of stochastic processes. Vol. 1. New York: Springer, 1999.
Tasks Assignment Policies in Server Farms
References:
Harchol-Balter, Mor. Performance modeling and design of computer systems: queueing theory in action. Cambridge University Press, 2013. (Chapter 24)
Using Efficient Redundancy of Queued Tasks to Reduce Latency and Computing Cost in Cloud Systems
References:
Gauri Joshi, Emina Soljanin, and Gregory Wornell. “Efficient redundancy techniques for latency reduction in cloud systems.” ACM Transactions on Modeling and Performance Evaluation of Computing Systems (TOMPECS) 2, no. 2 (2017): 12.
Further Reading:
Gauri Joshi, Emina Soljanin, and Gregory Wornell. “Efficient redundancy techniques for latency reduction in cloud systems.” ACM Transactions on Modeling and Performance Evaluation of Computing Systems (TOMPECS) 2, no. 2 (2017): 12.
Aktas, Mehmet Fatih, Pei Peng, and Emina Soljanin. “Straggler mitigation by delayed relaunch of tasks.” arXiv preprint arXiv:1710.00414 (2017).
Shuai, Qiqi, Victor OK Li, and Zhiyi Lu. “Which achieves lower latency with redundant requests, replication or coding?.” In GLOBECOM 2017-2017 IEEE Global Communications Conference, pp. 1-6. IEEE, 2017.
Chakravarthy, Srinivas R., and Alexander Rumyantsev. “Efficient redundancy techniques in cloud and desktop grid systems using MAPGc-type queues.” Open Engineering 8, no. 1 (2018): 17-31.
Joshi, Gauri. “Synergy via redundancy: Boosting service capacity with adaptive replication.” ACM SIGMETRICS Performance Evaluation Review 45, no. 2 (2018): 21-28.
Wang, Weina, Mor Harchol-Balter, Haotian Jiang, Alan Scheller-Wolf, and R. Srikant. “Delay asymptotics and bounds for multi-task parallel jobs.” ACM SIGMETRICS Performance Evaluation Review 46, no. 3 (2019): 2-7.
Kaler, Tim, Yuxiong He, and Sameh Elnikety. “Optimal Reissue Policies for Reducing Tail Latency.” In Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures, pp. 195-206. ACM, 2017.
Qiu, Zhan, Juan F. Pérez, and Peter G. Harrison. “Tackling latency via replication in distributed systems.” In Proceedings of the 7th ACM/SPEC on International Conference on Performance Engineering, pp. 197-208. ACM, 2016.
Zaryadov, Ivan, Andrey Kradenyh, and Anastasiya Gorbunova. “The Analysis of Cloud Computing System as a Queueing System with Several Servers and a Single Buffer.” In International Conference on Analytical and Computational Methods in Probability Theory, pp. 11-22. Springer, Cham, 2017.
Wang, Huajin, Jianhui Li, Zhihong Shen, and Yuanchun Zhou. “Approximations and Bounds for (n, k) Fork-Join Queues: A Linear Transformation Approach.” In 2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), pp. 422-431. IEEE, 2018.
Mukherjee, Debankur. “Scalable load balancing algorithms in networked systems.” arXiv preprint arXiv:1809.02018 (2018).
Overview of Lyapunov Optimizaiton Framework
References:
Neely, Michael J. “Stochastic network optimization with application to communication and queueing systems.” Synthesis Lectures on Communication Networks 3, no. 1 (2010): 1-211.
Georgiadis, Leonidas, Michael J. Neely, and Leandros Tassiulas. “Resource allocation and cross-layer control in wireless networks.” Foundations and Trends® in Networking 1, no. 1 (2006): 1-144.
Optimal Dynamic Cloud Network Control
References:
Feng, Hao, Jaime Llorca, Antonia M. Tulino, and Andreas F. Molisch. “Optimal dynamic cloud network control.” IEEE/ACM Transactions on Networking (TON) 26, no. 5 (2018): 2118-2131.
Further Reading:
Feng, Hao, Jaime Llorca, Antonia M. Tulino, and Andreas F. Molisch. “Optimal Control of Wireless Computing Networks.” IEEE Transactions on Wireless Communications 17, no. 12 (2018): 8283-8298.
Wang, Chang-Heng, Jaime Llorca, Antonia M. Tulino, and Tara Javidi. “Dynamic Cloud Network Control under Reconfiguration Delay and Cost.” IEEE/ACM Transactions on Networking (2019).
Feng, Hao, Jaime Llorca, Antonia M. Tulino, Danny Raz, and Andreas F. Molisch. “Approximation algorithms for the NFV service distribution problem.” In IEEE INFOCOM 2017-IEEE Conference on Computer Communications, pp. 1-9. IEEE, 2017.
Introduction to Distributed Storage Theory
Network Coding Theory
References;
A. G. Dimakis, P. B. Godfrey, Y. Wu, M. J. Wainwright and K. Ramchandran, “Network Coding for Distributed Storage Systems,” in IEEE Transactions on Information Theory, vol. 56, no. 9, pp. 4539-4551, Sept. 2010.
Further Reading:
A. G. Dimakis, P. B. Godfrey, Y. Wu, M. J. Wainwright and K. Ramchandran, “Network Coding for Distributed Storage Systems,” in IEEE Transactions on Information Theory, vol. 56, no. 9, pp. 4539-4551, Sept. 2010.
A. G. Dimakis, K. Ramchandran, Y. Wu and C. Suh, “A Survey on Network Codes for Distributed Storage,” in Proceedings of the IEEE, vol. 99, no. 3, pp. 476-489, March 2011.
Li, S-YR, Raymond W. Yeung, and Ning Cai. “Linear network coding.” IEEE transactions on information theory 49, no. 2 (2003): 371-381.
Koetter, Ralf, and Muriel Médard. “An algebraic approach to network coding.” IEEE/ACM Transactions on Networking (TON) 11, no. 5 (2003): 782-795.
T. Ho et al., “A Random Linear Network Coding Approach to Multicast,” in IEEE Transactions on Information Theory, vol. 52, no. 10, pp. 4413-4430, Oct. 2006.
Information Flow Graph and Applications
References;
A. G. Dimakis, P. B. Godfrey, Y. Wu, M. J. Wainwright and K. Ramchandran, “Network Coding for Distributed Storage Systems,” in IEEE Transactions on Information Theory, vol. 56, no. 9, pp. 4539-4551, Sept. 2010.
Further Reading:
Same as in Lecture 16
Storage-Bandwidth Tradeoff
Proofs
Minimum Bandwidth Regenerating (MBR) codes
Minimum Storage Regerating (MSR) codes
References:
A. G. Dimakis, P. B. Godfrey, Y. Wu, M. J. Wainwright and K. Ramchandran, “Network Coding for Distributed Storage Systems,” in IEEE Transactions on Information Theory, vol. 56, no. 9, pp. 4539-4551, Sept. 2010.
Further Reading:
Same as in Lecture 16
Private Information Retrieval
References:
Sun, Hua, and Syed Ali Jafar. “The capacity of private information retrieval.” IEEE Transactions on Information Theory 63, no. 7 (2017): 4075-4088.
Further Reading:
Sun, Hua, and Syed Ali Jafar. “The capacity of private information retrieval.” IEEE Transactions on Information Theory 63, no. 7 (2017): 4075-4088.
Sun, Hua, and Syed Ali Jafar. “The capacity of robust private information retrieval with colluding databases.” IEEE Transactions on Information Theory 64, no. 4 (2018): 2361-2370.
Sun, Hua, and Syed Ali Jafar. “The capacity of symmetric private information retrieval.” IEEE Transactions on Information Theory 65, no. 1 (2019): 322-329.
Sun, Hua, and Syed Ali Jafar. “Private Information Retrieval from MDS Coded Data With Colluding Servers: Settling a Conjecture by Freij-Hollantiet al.” IEEE Transactions on Information Theory 64, no. 2 (2018): 1000-1022.
Sun, Hua, and Syed Ali Jafar. “Optimal download cost of private information retrieval for arbitrary message length.” IEEE Transactions on Information Forensics and Security 12, no. 12 (2017): 2920-2932.
Sun, Hua, and Syed Ali Jafar. “Multiround private information retrieval: Capacity and storage overhead.” IEEE Transactions on Information Theory 64, no. 8 (2018): 5743-5754.
Sun, Hua, and Syed Ali Jafar. “The capacity of private computation.” IEEE Transactions on Information Theory (2018).
Tian, Chao, Hua Sun, and Jun Chen. “Capacity-achieving private information retrieval codes with optimal message size and upload cost.” arXiv preprint arXiv:1808.07536 (2018).
Banawan, Karim, and Sennur Ulukus. “The capacity of private information retrieval from coded databases.” IEEE Transactions on Information Theory 64, no. 3 (2018): 1945-1956.
Banawan, Karim, and Sennur Ulukus. “The capacity of private information retrieval from Byzantine and colluding databases.” IEEE Transactions on Information Theory 65, no. 2 (2019): 1206-1219.
Banawan, Karim, and Sennur Ulukus. “Multi-message private information retrieval: Capacity results and near-optimal schemes.” IEEE Transactions on Information Theory 64, no. 10 (2018): 6842-6862.
Wei, Yi-Peng, Karim Banawan, and Sennur Ulukus. “Fundamental limits of cache-aided private information retrieval with unknown and uncoded prefetching.” IEEE Transactions on Information Theory (2018).
Banawan, Karim, Batuhan Arasli, Yi-Peng Wei, and Sennur Ulukus. “The Capacity of Private Information Retrieval from Heterogeneous Uncoded Caching Databases.” arXiv preprint arXiv:1902.09512 (2019).
Wei, Yi-Peng, Batuhan Arasli, Karim Banawan, and Sennur Ulukus. “The capacity of private information retrieval from decentralized uncoded caching databases.” arXiv preprint arXiv:1811.11160 (2018).
Wei, Yi-Peng, Karim Banawan, and Sennur Ulukus. “Cache-aided private information retrieval with partially known uncoded prefetching: Fundamental limits.” IEEE Journal on Selected Areas in Communications 36, no. 6 (2018): 1126-1139.
Attia, Mohamed Adel, Deepak Kumar, and Ravi Tandon. “The capacity of private information retrieval from uncoded storage constrained databases.” arXiv preprint arXiv:1805.04104 (2018).
Tandon, Ravi. “The capacity of cache aided private information retrieval.” In 2017 55th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 1078-1082. IEEE, 2017.
Wang, Qiwen, and Mikael Skoglund. “Symmetric Private Information Retrieval from MDS Coded Distributed Storage with Non-colluding and Colluding Servers.” IEEE Transactions on Information Theory (2019).
Wang, Qiwen, Hua Sun, and Mikael Skoglund. “The capacity of private information retrieval with eavesdroppers.” IEEE Transactions on Information Theory (2018).
Shah, Nihar B., K. V. Rashmi, and Kannan Ramchandran. “One extra bit of download ensures perfectly private information retrieval.” In 2014 IEEE International Symposium on Information Theory, pp. 856-860. IEEE, 2014.
Fanti, Giulia, and Kannan Ramchandran. “Efficient private information retrieval over unsynchronized databases.” IEEE Journal of Selected Topics in Signal Processing 9, no. 7 (2015): 1229-1239.
Mirmohseni, Mahtab, and Mohammad Ali Maddah-Ali. “Private function retrieval.” In 2018 Iran Workshop on Communication and Information Theory (IWCIT), pp. 1-6. IEEE, 2018.
Woolsey, Nicholas, Rong-Rong Chen, and Mingyue Ji. “A New Design of Private Information Retrieval for Storage Constrained Databases.” arXiv preprint arXiv:1901.07490 (2019).
Exact Repair: Minimum Bandwidth Regenerating Codes
Fractional Repetition Code
References:
Rashmi, K. V., Nihar B. Shah, P. Vijay Kumar, and Kannan Ramchandran. “Explicit construction of optimal exact regenerating codes for distributed storage.” In 2009 47th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 1243-1249. IEEE, 2009.
El Rouayheb, Salim, and Kannan Ramchandran. “Fractional repetition codes for repair in distributed storage systems.” In 2010 48th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 1510-1517. IEEE, 2010.
Pawar, Sameer, Nima Noorshams, Salim El Rouayheb, and Kannan Ramchandran. “Dress codes for the storage cloud: Simple randomized constructions.” In 2011 IEEE International Symposium on Information Theory Proceedings, pp. 2338-2342. IEEE, 2011.
Further Reading:
Rashmi, K. V., Nihar B. Shah, P. Vijay Kumar, and Kannan Ramchandran. “Explicit construction of optimal exact regenerating codes for distributed storage.” In 2009 47th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 1243-1249. IEEE, 2009.
El Rouayheb, Salim, and Kannan Ramchandran. “Fractional repetition codes for repair in distributed storage systems.” In 2010 48th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 1510-1517. IEEE, 2010.
Pawar, Sameer, Nima Noorshams, Salim El Rouayheb, and Kannan Ramchandran. “Dress codes for the storage cloud: Simple randomized constructions.” In 2011 IEEE International Symposium on Information Theory Proceedings, pp. 2338-2342. IEEE, 2011.
Rashmi, Korlakai Vinayak, Nihar B. Shah, and P. Vijay Kumar. “Optimal exact-regenerating codes for distributed storage at the MSR and MBR points via a product-matrix construction.” IEEE Transactions on Information Theory 57, no. 8 (2011): 5227-5239.
Pawar, Sameer, Salim El Rouayheb, Hao Zhang, Kangwook Lee, and Kannan Ramchandran. “Codes for a distributed caching based video-on-demand system.” In 2011 Conference Record of the Forty Fifth Asilomar Conference on Signals, Systems and Computers (ASILOMAR), pp. 1783-1787. IEEE, 2011.
Exact Repair: Minimum Bandwidth Regenerating Codes
DRESS code
Product-Matrix Code
References:
El Rouayheb, Salim, and Kannan Ramchandran. “Fractional repetition codes for repair in distributed storage systems.” In 2010 48th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 1510-1517. IEEE, 2010.
Pawar, Sameer, Nima Noorshams, Salim El Rouayheb, and Kannan Ramchandran. “Dress codes for the storage cloud: Simple randomized constructions.” In 2011 IEEE International Symposium on Information Theory Proceedings, pp. 2338-2342. IEEE, 2011.
Rashmi, Korlakai Vinayak, Nihar B. Shah, and P. Vijay Kumar. “Optimal exact-regenerating codes for distributed storage at the MSR and MBR points via a product-matrix construction.” IEEE Transactions on Information Theory 57, no. 8 (2011): 5227-5239.
Further Reading:
Rashmi, K. V., Nihar B. Shah, P. Vijay Kumar, and Kannan Ramchandran. “Explicit construction of optimal exact regenerating codes for distributed storage.” In 2009 47th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 1243-1249. IEEE, 2009.
El Rouayheb, Salim, and Kannan Ramchandran. “Fractional repetition codes for repair in distributed storage systems.” In 2010 48th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 1510-1517. IEEE, 2010.
Pawar, Sameer, Nima Noorshams, Salim El Rouayheb, and Kannan Ramchandran. “Dress codes for the storage cloud: Simple randomized constructions.” In 2011 IEEE International Symposium on Information Theory Proceedings, pp. 2338-2342. IEEE, 2011.
Rashmi, Korlakai Vinayak, Nihar B. Shah, and P. Vijay Kumar. “Optimal exact-regenerating codes for distributed storage at the MSR and MBR points via a product-matrix construction.” IEEE Transactions on Information Theory 57, no. 8 (2011): 5227-5239.
Pawar, Sameer, Salim El Rouayheb, Hao Zhang, Kangwook Lee, and Kannan Ramchandran. “Codes for a distributed caching based video-on-demand system.” In 2011 Conference Record of the Forty Fifth Asilomar Conference on Signals, Systems and Computers (ASILOMAR), pp. 1783-1787. IEEE, 2011.
Exact Repair: Minimum Storage Regenerating Codes
Product-Matrix Code
Exact Repair MDS Codes using Interference Alignment
Piggyback-RS Codes
Locally Repairable Codes
References:
Wu, Yunnan, and Alexandros G. Dimakis. “Reducing repair traffic for erasure coding-based storage via interference alignment.” In 2009 IEEE International Symposium on Information Theory, pp. 2276-2280. IEEE, 2009.
Rashmi, Korlakai Vinayak, Nihar B. Shah, and P. Vijay Kumar. “Optimal exact-regenerating codes for distributed storage at the MSR and MBR points via a product-matrix construction.” IEEE Transactions on Information Theory 57, no. 8 (2011): 5227-5239.
Rashmi, K. V., Nihar B. Shah, and Kannan Ramchandran. “A piggybacking design framework for read-and download-efficient distributed storage codes.” IEEE Transactions on Information Theory 63, no. 9 (2017): 5802-5820.
Sathiamoorthy, Maheswaran, Megasthenis Asteris, Dimitris Papailiopoulos, Alexandros G. Dimakis, Ramkumar Vadali, Scott Chen, and Dhruba Borthakur. “Xoring elephants: Novel erasure codes for big data.” In Proceedings of the VLDB Endowment, vol. 6, no. 5, pp. 325-336. VLDB Endowment, 2013.
Further Reading:
Wu, Yunnan, and Alexandros G. Dimakis. “Reducing repair traffic for erasure coding-based storage via interference alignment.” In 2009 IEEE International Symposium on Information Theory, pp. 2276-2280. IEEE, 2009.
Rashmi, K. V., Nihar B. Shah, and Kannan Ramchandran. “A piggybacking design framework for read-and download-efficient distributed storage codes.” IEEE Transactions on Information Theory 63, no. 9 (2017): 5802-5820.
Sathiamoorthy, Maheswaran, Megasthenis Asteris, Dimitris Papailiopoulos, Alexandros G. Dimakis, Ramkumar Vadali, Scott Chen, and Dhruba Borthakur. “Xoring elephants: Novel erasure codes for big data.” In Proceedings of the VLDB Endowment, vol. 6, no. 5, pp. 325-336. VLDB Endowment, 2013.
Rashmi, Korlakai Vinayak, Nihar B. Shah, and P. Vijay Kumar. “Optimal exact-regenerating codes for distributed storage at the MSR and MBR points via a product-matrix construction.” IEEE Transactions on Information Theory 57, no. 8 (2011): 5227-5239.
Cadambe, Viveck R., Syed Ali Jafar, Hamed Maleki, Kannan Ramchandran, and Changho Suh. “Asymptotic interference alignment for optimal repair of MDS codes in distributed storage.” IEEE Transactions on Information Theory 59, no. 5 (2013): 2974-2987.
Suh, Changho, and Kannan Ramchandran. “Exact-repair MDS code construction using interference alignment.” IEEE Transactions on Information Theory 57, no. 3 (2011): 1425-1442.
Papailiopoulos, Dimitris S., and Alexandros G. Dimakis. “Locally repairable codes.” IEEE Transactions on Information Theory 60, no. 10 (2014): 5843-5855.
Distributed Stochastic Gradient Descent with Mini-batches
Parameter Server-Worker architecture
Convergence Analysis
References:
Yin, Dong, Ashwin Pananjady, Max Lam, Dimitris Papailiopoulos, Kannan Ramchandran, and Peter Bartlett. “Gradient diversity: a key ingredient for scalable distributed learning.” arXiv preprint arXiv:1706.05699 (2017).
Further Reading:
Yin, Dong, Ashwin Pananjady, Max Lam, Dimitris Papailiopoulos, Kannan Ramchandran, and Peter Bartlett. “Gradient diversity: a key ingredient for scalable distributed learning.” arXiv preprint arXiv:1706.05699 (2017).
Bottou, Léon, Frank E. Curtis, and Jorge Nocedal. “Optimization methods for large-scale machine learning.” Siam Review 60, no. 2 (2018): 223-311.
Bousquet, Olivier, and André Elisseeff. “Stability and generalization.” Journal of machine learning research 2, no. Mar (2002): 499-526.
Chen, Jianmin, Xinghao Pan, Rajat Monga, Samy Bengio, and Rafal Jozefowicz. “Revisiting distributed synchronous SGD.” arXiv preprint arXiv:1604.00981 (2016).
Cotter, Andrew, Ohad Shamir, Nati Srebro, and Karthik Sridharan. “Better mini-batch algorithms via accelerated gradient methods.” In Advances in neural information processing systems, pp. 1647-1655. 2011.
De, Soham, Abhay Yadav, David Jacobs, and Tom Goldstein. “Big batch SGD: Automated inference using adaptive batch sizes.” arXiv preprint arXiv:1610.05792 (2016).
Karimi, Hamed, Julie Nutini, and Mark Schmidt. “Linear convergence of gradient and proximal-gradient methods under the polyak-łojasiewicz condition.” In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 795-811. Springer, Cham, 2016.
Lee, Jason D., Qihang Lin, Tengyu Ma, and Tianbao Yang. “Distributed stochastic variance reduced gradient methods by sampling extra data with replacement.” The Journal of Machine Learning Research 18, no. 1 (2017): 4404-4446.
Li, Mu, Tong Zhang, Yuqiang Chen, and Alexander J. Smola. “Efficient mini-batch training for stochastic optimization.” In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 661-670. ACM, 2014.
Distributed Stochastic Gradient Descent with Mini-batches
Gradient Diversity and applications
Asynchronous Parallel Sparse Gradient Descent
Problem Formulation
Examples
Hogwild
References:
Yin, Dong, Ashwin Pananjady, Max Lam, Dimitris Papailiopoulos, Kannan Ramchandran, and Peter Bartlett. “Gradient diversity: a key ingredient for scalable distributed learning.” arXiv preprint arXiv:1706.05699 (2017).
Recht, Benjamin, Christopher Re, Stephen Wright, and Feng Niu. “Hogwild: A lock-free approach to parallelizing stochastic gradient descent.” In Advances in neural information processing systems, pp. 693-701. 2011.
Mania, Horia, Xinghao Pan, Dimitris Papailiopoulos, Benjamin Recht, Kannan Ramchandran, and Michael I. Jordan. “Perturbed iterate analysis for asynchronous stochastic optimization.” arXiv preprint arXiv:1507.06970 (2015).
Further Reading:
Yin, Dong, Ashwin Pananjady, Max Lam, Dimitris Papailiopoulos, Kannan Ramchandran, and Peter Bartlett. “Gradient diversity: a key ingredient for scalable distributed learning.” arXiv preprint arXiv:1706.05699 (2017).
Recht, Benjamin, Christopher Re, Stephen Wright, and Feng Niu. “Hogwild: A lock-free approach to parallelizing stochastic gradient descent.” In Advances in neural information processing systems, pp. 693-701. 2011.
Mania, Horia, Xinghao Pan, Dimitris Papailiopoulos, Benjamin Recht, Kannan Ramchandran, and Michael I. Jordan. “Perturbed iterate analysis for asynchronous stochastic optimization.” arXiv preprint arXiv:1507.06970 (2015).
Pan, Xinghao, Maximilian Lam, Stephen Tu, Dimitris Papailiopoulos, Ce Zhang, Michael I. Jordan, Kannan Ramchandran, and Christopher Ré. “Cyclades: Conflict-free asynchronous machine learning.” In Advances in Neural Information Processing Systems, pp. 2568-2576. 2016.
Asynchronous Parallel Sparse Gradient Descent
Hogwild and convergence analysis
Asynchronous Mini-batch SGD (didn't cover in class)
References:
Recht, Benjamin, Christopher Re, Stephen Wright, and Feng Niu. “Hogwild: A lock-free approach to parallelizing stochastic gradient descent.” In Advances in neural information processing systems, pp. 693-701. 2011.
Mania, Horia, Xinghao Pan, Dimitris Papailiopoulos, Benjamin Recht, Kannan Ramchandran, and Michael I. Jordan. “Perturbed iterate analysis for asynchronous stochastic optimization.” arXiv preprint arXiv:1507.06970 (2015).
Dutta, Sanghamitra, Gauri Joshi, Soumyadip Ghosh, Parijat Dube, and Priya Nagpurkar. “Slow and stale gradients can win the race: Error-runtime trade-offs in distributed SGD.” arXiv preprint arXiv:1803.01113 (2018).
Further Reading:
Recht, Benjamin, Christopher Re, Stephen Wright, and Feng Niu. “Hogwild: A lock-free approach to parallelizing stochastic gradient descent.” In Advances in neural information processing systems, pp. 693-701. 2011.
Mania, Horia, Xinghao Pan, Dimitris Papailiopoulos, Benjamin Recht, Kannan Ramchandran, and Michael I. Jordan. “Perturbed iterate analysis for asynchronous stochastic optimization.” arXiv preprint arXiv:1507.06970 (2015).
Dutta, Sanghamitra, Gauri Joshi, Soumyadip Ghosh, Parijat Dube, and Priya Nagpurkar. “Slow and stale gradients can win the race: Error-runtime trade-offs in distributed SGD.” arXiv preprint arXiv:1803.01113 (2018).
Bottou, Léon, Frank E. Curtis, and Jorge Nocedal. “Optimization methods for large-scale machine learning.” Siam Review 60, no. 2 (2018): 223-311.
Federated Learning
System Design
Detailed Implementations using Raspberry Pi's and Sochket Programming
References:
Wang, Shiqiang, Tiffany Tuor, Theodoros Salonidis, Kin K. Leung, Christian Makaya, Ting He, and Kevin Chan. “When edge meets learning: Adaptive control for resource-constrained distributed machine learning.” In IEEE INFOCOM 2018-IEEE Conference on Computer Communications, pp. 63-71. IEEE, 2018.
Tuor, Tiffany, Shiqiang Wang, Theodoras Salonidis, Bong Jun Ko, and Kin K. Leung. “Demo abstract: Distributed machine learning at resource-limited edge nodes.” In IEEE INFOCOM 2018-IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), pp. 1-2. IEEE, 2018.
Wang, Shiqiang, Tiffany Tuor, Theodoros Salonidis, Kin K. Leung, Christian Makaya, Ting He, and Kevin Chan. “Adaptive federated learning in resource constrained edge computing systems.” IEEE Journal on Selected Areas in Communications (2019).
Further Reading:
Konečný, Jakub, H. Brendan McMahan, Felix X. Yu, Peter Richtárik, Ananda Theertha Suresh, and Dave Bacon. “Federated learning: Strategies for improving communication efficiency.” arXiv preprint arXiv:1610.05492 (2016).
Wang, Shiqiang, Tiffany Tuor, Theodoros Salonidis, Kin K. Leung, Christian Makaya, Ting He, and Kevin Chan. “When edge meets learning: Adaptive control for resource-constrained distributed machine learning.” In IEEE INFOCOM 2018-IEEE Conference on Computer Communications, pp. 63-71. IEEE, 2018.
Tuor, Tiffany, Shiqiang Wang, Theodoras Salonidis, Bong Jun Ko, and Kin K. Leung. “Demo abstract: Distributed machine learning at resource-limited edge nodes.” In IEEE INFOCOM 2018-IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), pp. 1-2. IEEE, 2018.
Wang, Shiqiang, Tiffany Tuor, Theodoros Salonidis, Kin K. Leung, Christian Makaya, Ting He, and Kevin Chan. “Adaptive federated learning in resource constrained edge computing systems.” IEEE Journal on Selected Areas in Communications (2019).
Tuor, Tiffany, Shiqiang Wang, Kin K. Leung, and Kevin Chan. “Distributed machine learning in coalition environments: overview of techniques.” In 2018 21st International Conference on Information Fusion (FUSION), pp. 814-821. IEEE, 2018.
Yousefpour, Ashkan, Caleb Fung, Tam Nguyen, Krishna Kadiyala, Fatemeh Jalali, Amirreza Niakanlahiji, Jian Kong, and Jason P. Jue. “All one needs to know about fog computing and related edge computing paradigms: a complete survey.” Journal of Systems Architecture (2019).
Yang, Qiang, Yang Liu, Tianjian Chen, and Yongxin Tong. “Federated Machine Learning: Concept and Applications.” ACM Transactions on Intelligent Systems and Technology (TIST) 10, no. 2 (2019): 12.
B. McMahan and D. Ramage, “Federated learning: Collaborative machine learning without centralized training data,” Apr. 2017. Online. Available: https:ai.googleblog.com201704/federated- learning-collaborative.html
J. Wang and G. Joshi, “Cooperative SGD: A unified framework for the design and analysis of communication-efficient SGD algorithms,” arXiv, Jan. 2019. Online. Available: http:arxiv.orgabs1808.07576
J. Wang and G. Joshi, “Adaptive communication strategies to achieve the best error-runtime trade-off in local-update SGD,” in SysML, Mar.–Apr. 2019. Online. Available: http:arxiv.orgabs1810.08313
Federated Learning
Convergence Analysis
References:
S. Wang, T. Tuor, T. Salonidis, K. K. Leung, C. Makaya, T. He, and K. Chan, “Adaptive federated learning in resource constrained edge computing systems,” Tech. Rep., 2019. Online. Available: https:arxiv.orgabs1804.05271
Further Reading:
Same as in Lecture 26
Communication Bottleneck and Gradient Quantization
References:
Alistarh, Dan, Demjan Grubic, Jerry Li, Ryota Tomioka, and Milan Vojnovic. “QSGD: Communication-efficient SGD via gradient quantization and encoding.” In Advances in Neural Information Processing Systems, pp. 1709-1720. 2017.
Wen, Wei, Cong Xu, Feng Yan, Chunpeng Wu, Yandan Wang, Yiran Chen, and Hai Li. “Terngrad: Ternary gradients to reduce communication in distributed deep learning.” In Advances in neural information processing systems, pp. 1509-1519. 2017.
Further Reading:
Frank Seide, Hao Fu, Jasha Droppo, Gang Li, and Dong Yu. 1-bit stochastic gradient descent and its application to data-parallel distributed training of speech dnns. In INTERSPEECH, 2014.
Nikko Strom. Scalable distributed DNN training using commodity GPU cloud computing. In INTERSPEECH, 2015.
Alistarh, Dan, Demjan Grubic, Jerry Li, Ryota Tomioka, and Milan Vojnovic. “QSGD: Communication-Efficient SGD via Gradient Quantization and Encoding.” arXiv preprint arXiv:1610.02132 (2016).
Wen, Wei, Cong Xu, Feng Yan, Chunpeng Wu, Yandan Wang, Yiran Chen, and Hai Li. “TernGrad: Ternary Gradients to Reduce Communication in Distributed Deep Learning.” arXiv preprint arXiv:1705.07878 (2017).
Christopher M De Sa, Ce Zhang, Kunle Olukotun, and Christopher Ré. Taming the wild: A unified analysis of hogwild-style algorithms. In NIPS, 2015.