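With the arrival of the Internet era, and the big-data era in particular, distributed systems have become an essential skill for every programmer. The field has produced excellent research and papers since the 1980s; here I list only the 15 papers from the last 15 years that I consider most influential (15 within 15).
1. The Google File System: A landmark paper in distributed file systems. Its ideas of multiple replicas, separation of control flow from data flow, and append-oriented writes have become close to standard in the field. Its 5000+ citations hint at how far its influence reaches; Apache Hadoop's well-known HDFS is modeled on GFS.
2. MapReduce: Simplified Data Processing on Large Clusters: Another major Google paper. With just two operations, Map and Reduce, it greatly reduces the complexity of distributed computing, so that any programmer who needs to can write a distributed computation. The techniques it uses are well worth studying: simple on the surface, but far from trivial. Hadoop built an open-source MapReduce based on this paper.
3. Bigtable: A Distributed Storage System for Structured Data: Google's distributed table system in the NoSQL space and the best example of the LSM tree in use. It is widely used for web index storage, YouTube data management, and other workloads. The corresponding open-source system in the Hadoop ecosystem is HBase. (At my previous company I also developed a comparable system called BladeCube, with several times the performance of HBase.)
4. The Chubby lock service for loosely-coupled distributed systems: Google's distributed lock service, based on the Paxos protocol. Fewer people may know this paper than the previous three, but almost every backend engineer has worked with its open-source counterpart, ZooKeeper, so its influence is really no smaller.
5. Finding a Needle in Haystack: Facebook's Photo Storage: Facebook's online photo storage system, currently one of the best solutions for small-file storage. Facebook stores more than 300 PB of data in it. A senior schoolmate of mine works on this team, and I have heard many interesting stories from him. (At my previous company I developed a similar system, pallas, which supports not only replication but also Reed Solomon-LRC, with a good number of performance optimizations.)
6. Windows Azure Storage: a highly available cloud storage service with strong consistency: An overview paper on Windows Azure and a good description of a cloud storage architecture. Its idea of using layering to provide both availability and consistency has given me many ideas in real work.
7. GraphLab: A New Framework for Parallel Machine Learning: CMU's graph-based distributed machine learning framework. A dedicated commercial company has since been founded around it, and it is very capable at distributed machine learning; its single-machine version, GraphChi, needs only 2 to 3 minutes for a matrix factorization with millions of dimensions.
8. Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing: This is Spark, the most popular in-memory computing model of the last couple of years. RDDs and lineage greatly simplify the distributed computing framework; a few lines of Scala can often solve what used to take over a thousand lines of MapReduce code, and it looks set to replace MapReduce.
9. Scaling Distributed Machine Learning with the Parameter Server: A major work by Baidu's Mu Li. Most companies doing large-scale distributed learning today use a parameter server (ps). It scales well, which makes large-scale distributed learning feasible in the big-data era; Google's deep learning models are also trained with a parameter server. It is currently the most popular distributed learning framework, and Douban's open-source Paracel is also a ps implementation.
10. Dremel: Interactive Analysis of Web-Scale Datasets: Google's large-scale (near) real-time data analysis system, said to answer analysis requests over 1 PB of data within 3 seconds. Internally it uses a query tree to speed up analysis. Its open-source implementation is Drill, which is also fairly influential in industry for real-time data analysis.
11. Pregel: a system for large-scale graph processing: Google's large-scale graph computing system, for a long time the main system for computing Google's PageRank, and a strong influence on open source (including GraphLab and GraphChi).
12. Spanner: Google's Globally-Distributed Database: The first distributed database that is global in the true sense, from Google. It covers many design considerations around consistency. To keep things simple, it uses GPS and atomic clocks to bound clock uncertainty (the paper reports it is generally under 10 ms), which guarantees the time order of transactions. It is likewise a strong reference for distributed systems work.
13. Dynamo: Amazon’s Highly Available Key-value Store: Amazon's distributed NoSQL database, as significant to Amazon as Bigtable is to Google. Unlike Bigtable, Dynamo guarantees AP in CAP and provides only weak consistency through vector clocks. Its best-known open-source descendant is Cassandra, which combines a Dynamo-style architecture with a Bigtable-style data model.
14. S4: Distributed Stream Computing Platform: Yahoo's stream computing system, one of the two most popular stream computing systems today (the other is Storm), and Yahoo's main ad computing platform.
15. Storm @Twitter: This system needs little introduction. It opened a new era in stream computing and is the first choice for stream processing at almost every company; definitely worth following.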
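The Map and Reduce operations described in item 2 can be illustrated with a small single-process sketch. This is only a toy under stated assumptions: the names `map_words`, `reduce_counts`, and `run_mapreduce` are hypothetical and do not come from the paper or from Hadoop, and the grouping step that a real cluster performs across machines (the shuffle) is simulated here with an in-memory dictionary.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_words(doc_id: str, text: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit (word, 1) for every word in one document."""
    for word in text.lower().split():
        yield word, 1


def reduce_counts(word: str, counts: List[int]) -> Tuple[str, int]:
    """Reduce step: sum all the counts emitted for one word."""
    return word, sum(counts)


def run_mapreduce(
    inputs: Iterable[Tuple[str, str]],
    mapper: Callable[[str, str], Iterable[Tuple[str, int]]],
    reducer: Callable[[str, List[int]], Tuple[str, int]],
) -> Dict[str, int]:
    """Hypothetical driver: run the mapper, group by key, run the reducer."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in inputs:
        for out_key, out_value in mapper(key, value):
            # In a real cluster this grouping is the distributed shuffle.
            groups[out_key].append(out_value)
    return dict(reducer(k, vs) for k, vs in sorted(groups.items()))


if __name__ == "__main__":
    docs = [("doc1", "the quick fox"), ("doc2", "the lazy dog the end")]
    print(run_mapreduce(docs, map_words, reduce_counts))
```

The user writes only the two small functions; partitioning, grouping, and (in a real system) fault tolerance live in the driver, which is the simplification the paper is credited with above.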
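I. Cloud computing concepts; II. History of cloud computing; III. Current state of cloud computing; IV. Future prospects of cloud computing; V. Implementing cloud computing and the problems that remain. When writing the paper, consult material on Huawei's, Windows', and Google's cloud computing. I'll take another look tomorrow and add more.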
Is this a purely theoretical paper, or does it involve programming?
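Hello, I have sent it to the QQ mailbox you specified; please check your inbox. It is for reference only; adapt it as you see fit. I hope it helps.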
Translator's preface to papers in the distributed systems field
A chronicle of SQL & NoSQL
SMAQ: storage, computation, and query of massive data

I. Google paper series
1. Translator's preface to the Google paper series
2. The Anatomy of a Large-Scale Hypertextual Web Search Engine (translated, repost)
3. Web Search for a Planet: The Google Cluster Architecture (translated)
4. GFS: The Google File System (translated)
5. MapReduce: Simplified Data Processing on Large Clusters (translated)
6. Bigtable: A Distributed Storage System for Structured Data (translated)
7. Chubby: The Chubby lock service for loosely-coupled distributed systems (translated)
8. Sawzall: Interpreting the Data--Parallel Analysis with Sawzall (translated, repost)
9. Pregel: A System for Large-Scale Graph Processing (translated)
10. Dremel: Interactive Analysis of Web-Scale Datasets (translated, repost)
11. Percolator: Large-scale Incremental Processing Using Distributed Transactions and Notifications (translated, repost)
12. MegaStore: Providing Scalable, Highly Available Storage for Interactive Services (translated, repost)
13. Case Study GFS: Evolution on Fast-forward (translated)
14. Google File System II: Dawn of the Multiplying Master Nodes
15. Tenzing - A SQL Implementation on the MapReduce Framework (translated)
16. F1 - The Fault-Tolerant Distributed RDBMS Supporting Google's Ad Business
17. Elmo: Building a Globally Distributed, Highly Available Database
18. PowerDrill: Processing a Trillion Cells per Mouse Click
19. Google-Wide Profiling: A Continuous Profiling Infrastructure for Data Centers
20. Spanner: Google’s Globally-Distributed Database (translated, repost)
21. Dapper, a Large-Scale Distributed Systems Tracing Infrastructure (notes)
22. Omega: flexible, scalable schedulers for large compute clusters
23. CPI2: CPU performance isolation for shared compute clusters
24. Photon: Fault-tolerant and Scalable Joining of Continuous Data Streams (translated)
25. F1: A Distributed SQL Database That Scales
26. MillWheel: Fault-Tolerant Stream Processing at Internet Scale (translated)
27. B4: Experience with a Globally-Deployed Software Defined WAN
28. The Datacenter as a Computer
29. Google Brain - Building High-level Features Using Large Scale Unsupervised Learning
30. Mesa: Geo-Replicated, Near Real-Time, Scalable Data Warehousing (translated, repost)
31. Large-scale cluster management at Google with Borg
Google paper series translations (collection)

II. Distributed theory series
00. Appraising Two Decades of Distributed Computing Theory Research
0. Translator's preface to the distributed theory series
1. A brief history of Consensus, 2PC and Transaction Commit (translated)
2. The Byzantine Generals Problem (translated) --Leslie Lamport
3. Impossibility of distributed consensus with one faulty process (translated)
4. Leases: the lease mechanism (translated)
5. Time, Clocks, and the Ordering of Events in a Distributed System (translated) --Leslie Lamport
6. On the history of Paxos
7. The Part-Time Parliament (translated, repost) --Leslie Lamport
8. How to Build a Highly Available System Using Consensus (translated)
9. Paxos Made Simple (translated) --Leslie Lamport
10. Paxos Made Live - An Engineering Perspective (translated)
11. 2 Phase Commit (translated)
12. Consensus on Transaction Commit (translated) --Jim Gray & Leslie Lamport
13. Why Do Computers Stop and What Can Be Done About It? (translated) --Jim Gray
14. On Designing and Deploying Internet-Scale Services (translated) --James Hamilton
15. Single-Message Communication (translated)
16. Implementing fault-tolerant services using the state machine approach
17. Problems, Unsolved Problems and Problems in Concurrency
18. Hints for Computer System Design
19. Self-stabilizing systems in spite of distributed control
20. Wait-Free Synchronization
21. White Paper Introduction to IEEE 1588 & Transparent Clocks
22. Unreliable Failure Detectors for Reliable Distributed Systems
23. Life beyond Distributed Transactions: an Apostate’s Opinion (translated, repost)
24. Distributed Snapshots: Determining Global States of a Distributed System --Leslie Lamport
25. Virtual Time and Global States of Distributed Systems
26. Timestamps in Message-Passing Systems That Preserve the Partial Ordering
27. Fundamentals of Distributed Computing: A Practical Tour of Vector Clock Systems
28. Knowledge and Common Knowledge in a Distributed Environment
29. Understanding Failures in Petascale Computers
30. Why Do Internet services fail, and What Can Be Done About It?
31. End-To-End Arguments in System Design
32. Rethinking the Design of the Internet: The End-to-End Arguments vs. the Brave New World
33. The Design Philosophy of the DARPA Internet Protocols (translated, repost)
34. Uniform consensus is harder than consensus
35. Paxos made code - Implementing a high throughput Atomic Broadcast
36. RAFT: In Search of an Understandable Consensus Algorithm
Distributed theory series translations (collection)

III. Database theory series
0. A Relational Model of Data for Large Shared Data Banks 1970
1. SEQUEL: A Structured English Query Language 1974
2. Implementation of a Structured English Query Language 1975
3. System R: Relational Approach to Database Management 1976
4. Granularity of Locks and Degrees of Consistency in a Shared DataBase --Jim Gray 1976
5. Access Path Selection in a RDBMS 1979
6. The Transaction Concept: Virtues and Limitations --Jim Gray
7. 2PC (two-phase commit): Notes on Data Base Operating Systems --Jim Gray
8. 3PC (three-phase commit): Nonblocking Commit Protocols
9. MVCC: Multiversion Concurrency Control - Theory and Algorithms --1983
10. ARIES: A Transaction Recovery Method Supporting Fine-Granularity Locking and Partial Rollbacks Using Write-Ahead Logging - 1992
11. A Comparison of the Byzantine Agreement Problem and the Transaction Commit Problem --Jim Gray
12. A Formal Model of Crash Recovery in a Distributed System - Skeen, D. Stonebraker
13. What Goes Around Comes Around - Michael Stonebraker, Joseph M. Hellerstein
14. Anatomy of a Database System - Joseph M. Hellerstein, Michael Stonebraker
15. Architecture of a Database System (translated, repost) - Joseph M. Hellerstein, Michael Stonebraker, James Hamilton

IV. Large-scale storage and computation (NoSQL theory series)
0. Towards Robust Distributed Systems: Brewer's 2000 PODC keynote
1. The CAP theorem
2. Harvest, Yield, and Scalable Tolerant Systems
3. On CAP
4. The BASE model: BASE: An ACID Alternative
5. Eventual consistency
6. Scalability design patterns
7. Scalability principles
8. The NoSQL ecosystem
9. scalability-availability-stability-patterns
10. The 5 Minute Rule and the 5 Byte Rule (translated)
11. The Five-Minute Rule Ten Years Later and Other Computer Storage Rules of Thumb
12. The Five-Minute Rule 20 Years Later (and How Flash Memory Changes the Rules)
13. The MapReduce debate
14. MapReduce: A Major Step Backwards
15. MapReduce: A Major Step Backwards (II)
16. MapReduce and parallel databases: friends or foes? (repost)
17. MapReduce and Parallel DBMSs - Friends or Foes (translated)
18. MapReduce: A Flexible Data Processing Tool (translated)
19. A Comparison of Approaches to Large-Scale Data Analysis (translated)
20. Can MapReduce no longer hold up? (repost)
21. Beyond MapReduce: an overview of graph computation
22. Map-Reduce-Merge: simplified relational data processing on large clusters
23. MapReduce Online
24. Graph Twiddling in a MapReduce World
25. Spark: Cluster Computing with Working Sets
26. Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing
27. Big Data Lambda Architecture
28. The 8 Requirements of Real-Time Stream Processing
29. The Log: What every software engineer should know about real-time data's unifying abstraction
30. Lessons from Giant-Scale Services

V. Basic algorithms and data structures
1. A summary of methods for processing very large volumes of data
2. A summary of methods for processing very large volumes of data (continued)
3. Consistent Hashing And Random Trees
4. Merkle Trees
5. Scalable Bloom Filters
6. Introduction to Distributed Hash Tables
7. B-Trees and Relational Database Systems
8. The log-structured merge-tree (translated)
9. lock free data structure
10. Data Structures for Spatial Database
11. Gossip
12. lock free algorithm
13. The Graph Traversal Pattern

VI. Basic systems and practical experience
1. The data structures and algorithms behind MySQL indexes
2. Dynamo: Amazon’s Highly Available Key-value Store (translated, repost)
3. Cassandra - A Decentralized Structured Storage System (translated, repost)
4. PNUTS: Yahoo!’s Hosted Data Serving Platform (translated, repost)
5. An introduction to and reflections on PNUTS, Yahoo!'s distributed data platform (repost)
6. LevelDB: a fast, lightweight key-value storage library (translated)
7. Theoretical foundations of LevelDB
8. LevelDB: implementation (translated)
9. The LevelDB SSTable format in detail
10. The LevelDB Bloom filter implementation
11. Sawzall: principles and applications
12. Storm: principles and implementation
13. Designs, Lessons and Advice from Building Large Distributed Systems --Jeff Dean
14. Challenges in Building Large-Scale Information Retrieval Systems --Jeff Dean
15. Experiences with MapReduce, an Abstraction for Large-Scale Computation --Jeff Dean
16. Taming Service Variability, Building Worldwide Systems, and Scaling Deep Learning --Jeff Dean
17. Large-Scale Data and Computation: Challenges and Opportunities --Jeff Dean
18. Achieving Rapid Response Times in Large Online Services --Jeff Dean
19. The Tail at Scale (translated) --Jeff Dean & Luiz André Barroso
20. How To Design A Good API and Why it Matters
21. Event-Based Systems: Architect's Dream or Developer's Nightmare?
22. Autopilot: Automatic Data Center Management

VII. Other supporting systems
1. The ganglia distributed monitoring system: design, implementation, and experience
2. Chukwa: A large-scale monitoring system
3. Scribe: a way to aggregate data and why not, to directly fill the HDFS?
4. Benchmarking Cloud Serving Systems with YCSB
5. A brief overview of Dynamo, Dremel, ZooKeeper, and Hive

VIII. Hadoop-related
0. Hadoop Reading List
1. The Hadoop Distributed File System (translated)
2. HDFS scalability: the limits to growth (translated)
3. Name-node memory size estimates and optimization
4. HBase Architecture (translated)
5. HFile: A Block-Indexed File Format to Store Sorted Key-Value Pairs
6. HFile V2
7. Hive - A Warehousing Solution Over a Map-Reduce Framework
8. Hive - A Petabyte Scale Data Warehouse Using Hadoop

When reposting, please credit the author: phylips@bmy 2011-4-30