Tag: Databases

Advanced Analytics with Spark, 2nd Edition

Authors: Josh Wills, Sandy Ryza, Sean Owen, Uri Laserson

In the second edition of this practical book, four Cloudera data scientists present a set of self-contained patterns for performing large-scale data analysis with Spark. The authors bring Spark, statistical methods, and real-world data sets together to teach you how to approach analytics problems by example. Updated for Spark 2.1, this edition acts as an introduction to these techniques and other best practices in Spark programming. You’ll start with an introduction to Spark and its ecosystem, a

Cassandra: The Definitive Guide, 2nd Edition

Authors: Eben Hewitt, Jeff Carpenter

Imagine what you could do if scalability wasn't a problem. With this hands-on guide, you’ll learn how the Cassandra database management system handles hundreds of terabytes of data while remaining highly available across multiple data centers. This expanded second edition—updated for Cassandra 3.0—provides the technical details and practical examples you need to put this database to work in a production environment. Authors Jeff Carpenter and Eben Hewitt demonstrate the advantages of Cassandra’s

High Performance Spark

Authors: Holden Karau, Rachel Warren

Apache Spark is amazing when everything clicks. But if you haven’t seen the performance improvements you expected, or still don’t feel confident enough to use Spark in production, this practical book is for you. Authors Holden Karau and Rachel Warren demonstrate performance optimizations to help your Spark queries run faster and handle larger data sizes, while using fewer resources. Ideal for software engineers, data engineers, developers, and system administrators working with large-scale data

Usage-Driven Database Design

Author: George Tillmann

Design great databases―from logical data modeling through physical schema definition. You will learn a framework that finally cracks the problem of merging data and process models into a meaningful and unified design that accounts for how data is actually used in production systems. Key to the framework is a method for taking the logical data model that is a static look at the definition of the data, and merging that static look with the process models describing how the data will be used i

Beginning Data Science in R

Author: Thomas Mailund

Discover best practices for data analysis and software development in R and start on the path to becoming a fully-fledged data scientist. This book teaches you techniques for both data manipulation and visualization and shows you the best way for developing new software packages for R. Beginning Data Science in R details how data science is a combination of statistics, computational science, and machine learning. You’ll see how to efficiently structure and mine data to extract useful patterns

DevOps, DBAs, and DBaaS

Author: Michael S. Cuppett

Learn how DBAs in a DevOps environment manage data platforms and change requests to support and optimize continuous integration, delivery, testing, and deployment in the application development life cycle. On the Dev side, DBAs evaluate change requests to ensure compliance with organizational best practices and guard against degradation of database performance and the validity of dependent objects. On the Ops side, DBAs perform release and troubleshooting activities in support of the applicatio

Pro Tableau

Authors: Seema Acharya, Subhashini Chellappan

Leverage the power of visualization in business intelligence and data science to make quicker and better decisions. Use statistics and data mining to make compelling and interactive dashboards. This book will help those familiar with Tableau software chart their journey to being a visualization expert. Pro Tableau demonstrates the power of visual analytics and teaches you how to: Connect to various data sources such as spreadsheets, text files, relational databases (Microsoft SQL Server, MySQL,

Power Pivot and Power BI

Authors: Avichal Singh, Rob Collie

Microsoft PowerPivot is a free add-on to Excel from Microsoft that allows users to produce new kinds of reports and analyses that were simply impossible before, and this book is the first to tackle DAX formulas, the core capability of PowerPivot, from the perspective of the Excel audience. Written by the world’s foremost PowerPivot blogger and practitioner, the book’s concepts and approach are introduced in a simple, step-by-step manner tailored to the learning style of Excel users everywhere. T

Database Systems, 2nd Edition

Authors: Elvis C. Foster, Shripad Godbole

This book provides a comprehensive, yet concise introduction to database systems, with special emphasis on the relational database model. The book discusses the database as an essential component of a software system, as well as a valuable, mission critical corporate resource. New in this second edition is updated SQL content covering the latest release of the Oracle Database Management System along with a reorganized sequence of the topics which is more useful for teaching. Also included are re

Spark 2.0 for Beginners

Author: Rajanarayanan Thottuvaikkatumana

Spark is one of the most widely used large-scale data processing engines and runs extremely fast. It is a framework with tools that are equally useful to application developers and data scientists. SparkR or “R on Spark” in the Spark framework opened the door of Spark data processing capability to R users. This book starts with the fundamentals of Spark 2.0 and covers the core data processing framework and API, installation, and application development setup. Then the Spark

Spark for Data Science

Authors: Bikramaditya Singhal, Srinivas Duvvuri

This is the era of Big Data and Internet of Things! Big Data implies big innovation and enables a competitive advantage for businesses. Apache Spark was designed to perform Big Data analytics at scale, and so Spark is equipped with the necessary algorithms and supports multiple programming languages. Whether you are a technologist, a data scientist, or a beginner to Big Data analytics, this book will provide you with all the skills necessary to perform statistical data analysis, data visualizati

Cassandra 3.x High Availability, 2nd Edition

Author: Robbie Strickland

Apache Cassandra is a massively scalable, peer-to-peer database designed for 100 percent uptime, with deployments in the tens of thousands of nodes, all supporting petabytes of data. This book offers a practical insight into building highly available, real-world applications using Apache Cassandra. The book starts with the fundamentals, helping you to understand how Apache Cassandra’s architecture allows it to achieve 100 percent uptime when other systems struggle to do so. You’ll get an excelle
