The Practice of Cloud System Administration: Designing and Operating Large Distributed Systems, Volume 2, by Thomas A. Limoncelli, Strata R. Chalup, and Christina J. Hogan. Also listed as The Practice of Cloud System Administration: DevOps and SRE Practices for Web Services.
The Practice of Cloud System Administration was written by Thomas A. Limoncelli, Strata R. Chalup, and Christina J. Hogan. I'm reading this book in the hope that understanding the art of being an amazing cloud sysadmin will help me become an excellent cloud security architect.
Types of Clouds. There are four cloud models that you can subscribe to according to business needs. Private cloud: computing resources are deployed for one particular organization and are governed, owned, and operated by that same organization. This model is used mostly for intra-business interactions. Community cloud: computing resources are provided for a community of organizations and are owned, governed, and operated by a government, academic, or business organization.
Monitor: Online access control systems send real-time alerts to administrators or security should any irregularity or attempted breach take place at any access point, allowing them to investigate immediately and record the event. Troubleshoot: Modern access control systems allow administrators to remotely configure permissions, or seek support from the vendor, should access points or users have issues—a huge advantage over locally-hosted systems.
In addition, it helps certain sectors meet special requirements. Scale: Businesses can perform regularly-scheduled system reviews to make sure everything on the access control system is set up properly. It can also tell them if someone no longer employed by the company has been inadvertently left in the system. Suspicious Events: Since many access points are routinely tracked during any access event, auditing can prove useful to security officers when investigating unusual behavior.
The data can be used to flag or highlight unusual access behavior or analyze it against historical data. Compliance Reports: Companies that process sensitive data like patient healthcare information, banking financial reports, or credit card payments must deal with audit requirements in the access control space when filing compliance reports in accordance with HIPAA, SOC2 or PCI. Some special categories like cyber security or ISO certifications also require managed and auditable access control.
The audit phase can pull up the proper data for these periodic reports. This can create confusion for anyone charged with outfitting their facility with one—but if they take it step by step, everything will come together. The first step a company should take is obvious—do a count of all the doors that need to be secured; not just the entry doors, but also IT room doors where expensive equipment and security-related devices are installed, and for companies handling sensitive healthcare or financial data, the file rooms or offices where computers processing this data are kept.
The new companion to the best-selling first volume, The Practice of System and Network Administration, Second Edition, this guide offers expert coverage of many crucial topics.
Being available is important because the Internet is open 24/7 and has users in every time zone.
Being fast is important because users are frustrated by slow services, so slow services lose out to faster rivals.
Being secure is important because, as caretakers of other people's data, we are duty-bound and legally responsible to protect people's data. These requirements are intermixed. If a site is not secure, by definition, it cannot be made reliable. If a site is not fast, it is not sufficiently available. If a site is down, by definition, it is not fast. The most visible cloud-scale services are web sites. However, there is a huge ecosystem of invisible internet-accessible services that are not accessed with a browser.
For example, smartphone apps use API calls to access cloud-based services. For the remainder of this book we will tend to use the term distributed computing rather than cloud computing. Cloud computing is a marketing term that means different things to different people.
Distributed computing describes an architecture where applications and services are provided using many machines rather than one.
This is a book of fundamental principles and practices that are timeless. Therefore we don't make recommendations about which specific products or technologies to use.
We could provide a comparison of the top five most popular web servers or NoSQL databases or continuous build systems. If we did, then this book would be out of date the moment it is published. Instead, we discuss the qualities one should look for when selecting such things.
We provide a model to work from. This approach is intended to prepare you for a long career where technology changes over time but you are always prepared. We will, of course, illustrate our points with specific technologies and products, but not as an endorsement of those products and services.
This book is, at times, idealistic. This is deliberate. We set out to give the reader a vision of how things can be, what to strive for. We are here to raise the bar. Part I captures our thinking on the design of large, complex, cloud-based distributed computing systems. After the Introduction, we tackle each element of design from the bottom layers to the top.
We cover distributed systems from the point of view of a system administrator, not a computer scientist. To operate a system, one must be able to understand its internals. The first chapters cover the most fundamental issues. Later chapters delve into more esoteric technical activities, then high-level planning and strategy that tie together all of the above.
At the end is extra material including an assessment system for operations teams, a highly biased history of distributed computing, templates for forms mentioned in the text, recommended reading, and other reference material. We're excited to present a new feature of our book series: our operational assessment system. This system consists of a series of assessments you can use to evaluate your operations and find areas of improvement.
The assessment questions and Look For recommendations are found in Appendix A. Chapter 20 is the instruction manual.
Acknowledgments
This book wouldn't have been possible without the help and feedback we received from our community and people all over the world.
The DevOps community was generous in its assistance. Your love and patience make all this possible. If we have seen further, it is by standing on the shoulders of giants. Thanks to Gene Kim for the strategic inspiration and encouragement.
Dozens of people helped us, some by supplying anecdotes, some by reviewing part or all of the book. Last but not least, thanks to everyone from Addison-Wesley. In particular, thanks to Debra Williams Cauley, for guiding us to Addison-Wesley and steering us the entire way; Michael Thurston, for editing our earliest drafts and reshaping them to be much, much better; Kim Boedigheimer, who coordinated and assisted us calmly even when we were panicking; Lori Hughes, our LaTeX wizard; Julie Nahil, our production manager; Jill Hobbs, our copyeditor; and John Fuller and Mark Taub, for putting up with all our special requests!
Chapter 2: Designing for Operations: Features software should have to enable smooth operations.
Chapter 3: Selecting a Service Platform: Physical and virtual machines, private and public clouds.
Chapter 4: Application Architectures: Building blocks for creating web and other applications.
Chapter 5: Design Patterns for Scaling: Building blocks for growing a service.
Chapter 6: Design Patterns for Resiliency: Building blocks for creating systems that survive failure.
Chapter Upgrading Live Services: How to upgrade services without downtime.
Chapter Automation: Creating tools and automating operational work.
Chapter Design Documents: Communicating designs and intentions in writing.
Chapter Oncall: Handling exceptions.
Chapter Disaster Preparedness: Making systems stronger through planning and practice.
Chapter Monitoring Architecture and Practice: The components and practice of monitoring.
Chapter Capacity Planning: Planning for and providing additional resources before we need them.
Chapter Operational Excellence: Strategies for constant improvement.
Epilogue: Some final thoughts.
Thomas A. Limoncelli is an internationally recognized author, speaker, and system administrator. His hobbies include grassroots activism, for which his work has been recognized at state and national levels. He lives in New Jersey.
Strata R. Chalup has been leading and managing complex IT projects for many years, serving in roles ranging from project manager to director of operations. Strata has authored numerous articles on management and working with teams and has applied her management skills on various volunteer boards, including BayLISA and SAGE.
She lives in Santa Clara County, California. Christina J. Hogan has twenty years of experience in system administration and network engineering, from Silicon Valley to Italy and Switzerland. She has gained experience in small startups, mid-sized tech companies, and large global corporations.
She worked as a security consultant for many years and her customers included site, Silicon Graphics, and SystemExperts. She has a bachelor's degree in mathematics, a master's degree in computer science, a doctorate in aeronautical engineering, and a diploma in law.
She also worked for six years as an aerodynamicist in a Formula 1 racing team and represented Ireland in the Chess Olympiad. She lives in Switzerland.
What is the ideal environment that we seek to create?
Business Objectives
Simply stated, the end result of our ideal environment is that business objectives are met. That may sound a little boring but actually it is quite exciting to work where the entire company is focused and working together on the same goals. To achieve this, we must understand the business objectives and work backward to arrive at the system we should build.
Meeting business objectives means knowing what those objectives are, having a plan to achieve them, and working through the roadblocks along the way. Well-defined business objectives are measurable, and such measurements can be collected in an automated fashion. A dashboard is automatically generated so everyone is aware of progress. This transparency enhances trust. Here are some sample business objectives:
- Sell our products via a web site
- Provide service percent of the time
- Process x million downloads per month, growing 10 percent monthly
- Introduce new features twice a week
- Fix major bugs within 24 hours
In our ideal environment, business and technical teams meet their objectives and project goals predictably and reliably.
Because of this, both types of teams trust that other teams will meet their future objectives. As a result, teams can plan better. They can make more aggressive plans because there is confidence that external dependencies will not fail.
This permits even more aggressive planning. Such an approach creates an upward spiral that accelerates progress throughout the company, benefiting everyone.
The ideal system architecture meets the requirements of the service today and provides an obvious path for growth as the system becomes more popular and receives more traffic.
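To make the growth requirement concrete, here is a minimal Python sketch of the arithmetic behind a sample objective such as "growing 10 percent monthly". The function names `projected_load` and `months_to_double` are illustrative assumptions, not something from the book, and the model assumes growth compounds at a constant monthly rate.

```python
import math


def projected_load(current: float, monthly_growth: float, months: int) -> float:
    """Return the load after `months` of compounding growth.

    `monthly_growth` is a fraction, so 0.10 means 10 percent per month.
    """
    return current * (1.0 + monthly_growth) ** months


def months_to_double(monthly_growth: float) -> float:
    """Return how many months it takes for the load to double."""
    return math.log(2.0) / math.log(1.0 + monthly_growth)


if __name__ == "__main__":
    # Hypothetical starting point: 1.0 (for example, one million downloads per month).
    print(round(projected_load(1.0, 0.10, 12), 2))
    print(round(months_to_double(0.10), 1))
```

Under these assumptions, 10 percent monthly growth roughly triples the load within a year and doubles it in a little over seven months, which is why the architecture needs an obvious path for growth rather than a one-time sizing.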
The system is resilient to failure. Rather than being surprised by failures and treating them as exceptions, the architecture accepts that hardware and software failures are a part of the physics of information technology (IT).
As a result, the architecture includes redundancy and resiliency features that work around failures. Components fail but the system survives. Each subsystem that makes up our service is itself a service. All subsystems are programmable via an application programming interface (API). Thus, the entire system is an ecosystem of interconnected subservices. This is called a service-oriented architecture (SOA).
Because all these systems communicate over the same underlying protocol, there is uniformity in how they are managed. Because each subservice is loosely coupled to the others, all of these services can be independently scaled, upgraded, or replaced.
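The idea that components fail but the system survives can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each replica of a subservice is modeled as a plain callable with the same interface, and the names `call_with_failover`, `AllReplicasFailed`, `healthy_replica`, and `broken_replica` are hypothetical, not taken from the book or any library.

```python
from typing import Callable, Sequence


class AllReplicasFailed(Exception):
    """Raised when every replica of a subservice has failed."""


def call_with_failover(replicas: Sequence[Callable[[str], str]], request: str) -> str:
    """Try each redundant replica in order and return the first successful reply."""
    errors = []
    for replica in replicas:
        try:
            return replica(request)
        except Exception as exc:
            # A failed component is expected; record it and move on to the next replica.
            errors.append(exc)
    raise AllReplicasFailed(f"{len(errors)} replicas failed for request {request!r}")


def healthy_replica(request: str) -> str:
    """Stand-in for a working replica (hypothetical)."""
    return f"ok:{request}"


def broken_replica(request: str) -> str:
    """Stand-in for a replica that is down (hypothetical)."""
    raise ConnectionError("replica unavailable")


if __name__ == "__main__":
    print(call_with_failover([broken_replica, healthy_replica], "ping"))
```

Because every replica exposes the same interface, the caller does not need to know which one answered. That uniformity is also what lets a loosely coupled subservice be scaled, upgraded, or replaced independently.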
The geometry of the infrastructure is described electronically. This electronic description is read by IT automation systems, which then build the production environment without human intervention. Because of this automation, the entire infrastructure can be re-created elsewhere. Software engineers use the automation to make micro-versions of the environment for their personal use. Quality and test engineers use the automation to create environments for system tests.
This infrastructure as code can be achieved whether we use physical machines or virtual machines, and whether they are in datacenters we run or are hosted by a cloud provider.
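A rough Python sketch of the infrastructure-as-code idea follows: the environment is described as data, and automation compares that description with what is running to decide what to build. All names here (`MachineSpec`, `plan`, `scaled_copy`, `PRODUCTION`) and the role counts are assumptions made for illustration; real automation systems are far richer.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class MachineSpec:
    """One line of the electronic description: a role and how many machines it needs."""
    role: str
    count: int


def plan(desired: List[MachineSpec], running: Dict[str, int]) -> List[str]:
    """Return the actions needed to make `running` match `desired`.

    Roles that are running but absent from `desired` are ignored in this sketch.
    """
    actions = []
    for spec in desired:
        have = running.get(spec.role, 0)
        if have < spec.count:
            actions.append(f"create {spec.count - have} x {spec.role}")
        elif have > spec.count:
            actions.append(f"destroy {have - spec.count} x {spec.role}")
    return actions


def scaled_copy(desired: List[MachineSpec], factor: int) -> List[MachineSpec]:
    """Derive a micro-version of the environment, keeping at least one machine per role."""
    return [MachineSpec(s.role, max(1, s.count // factor)) for s in desired]


# Hypothetical production description.
PRODUCTION = [MachineSpec("web", 4), MachineSpec("db", 2)]

if __name__ == "__main__":
    print(plan(PRODUCTION, {}))                 # building from nothing
    print(plan(scaled_copy(PRODUCTION, 4), {}))  # a personal micro-environment
```

The same description drives production, a developer's micro-version, and a test environment. Only the target changes, which is what makes re-creating the infrastructure elsewhere a routine operation.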