Cassandra - Introduction

Apache Cassandra is a highly scalable, high-performance distributed database designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. It is a type of NoSQL database. Let us first understand what a NoSQL database does.

NoSQL Database

A NoSQL database (sometimes read as Not Only SQL) is a database that provides a mechanism to store and retrieve data modeled by means other than the tabular relations used in relational databases. These databases are typically schema-free, support easy replication, have simple APIs, are eventually consistent, and can handle huge amounts of data.

The primary objective of a NoSQL database is to have

simplicity of design,

horizontal scaling, and

finer control over availability.

NoSQL databases use different data structures compared to relational databases, which makes some operations faster in NoSQL. The suitability of a given NoSQL database depends on the problem it must solve.
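To make the difference in data structures concrete, here is a minimal Python sketch. It is only an illustration under simplifying assumptions: the names `relational_rows`, `wide_rows`, `find_by_id_scan`, and `find_by_key` are hypothetical, and real relational databases use indexes rather than plain scans. It contrasts a fixed-schema list of rows with a keyed store in which each row may carry its own set of columns.

```python
from collections import defaultdict

# Relational style: every row carries the same fixed set of columns.
relational_rows = [
    {"id": 1, "name": "Ada", "email": "ada@example.com"},
    {"id": 2, "name": "Linus", "email": None},  # unused column is still present
]


def find_by_id_scan(rows, wanted_id):
    """Look a row up by scanning, as an unindexed table would require."""
    for row in rows:
        if row["id"] == wanted_id:
            return row
    return None


# Keyed, schema-flexible style: row key -> {column name -> value}.
# Each row may have a different set of columns.
wide_rows = defaultdict(dict)
wide_rows["user:1"].update({"name": "Ada", "email": "ada@example.com"})
wide_rows["user:2"].update({"name": "Linus", "last_login": "2024-01-01"})


def find_by_key(store, key):
    """Look a row up directly by its key; returns None if the key is absent."""
    return store.get(key)
```

Lookups by key are direct in the second structure, but questions that do not start from the key (for example, "all users with a given email") need extra structures, which is one reason suitability depends on the problem.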

NoSQL vs. Relational Database

The following table lists the points that differentiate a relational database from a NoSQL database.

Relational Database	NoSQL Database
Supports a powerful query language.	Supports a very simple query language.
It has a fixed schema.	No fixed schema.
Follows ACID (Atomicity, Consistency, Isolation, and Durability).	It is often only “eventually consistent”.
Supports transactions.	Typically offers limited or no transaction support.

Besides Cassandra, we have the following NoSQL databases that are quite popular −

Apache HBase − HBase is an open source, non-relational, distributed database modeled after Google’s Bigtable and written in Java. It is developed as a part of the Apache Hadoop project and runs on top of HDFS, providing Bigtable-like capabilities for Hadoop.

MongoDB − MongoDB is a cross-platform, document-oriented database system that avoids the traditional table-based relational database structure in favor of JSON-like documents with dynamic schemas, making the integration of data in certain types of applications easier and faster.

What is Apache Cassandra?

Apache Cassandra is an open source, distributed, and decentralized storage system (database) for managing very large amounts of structured data spread out across the world. It provides a highly available service with no single point of failure.

Listed below are some of the notable points of Apache Cassandra −

It is scalable, fault-tolerant, and offers tunable consistency.

It is a column-oriented (wide-column) database.

Its distribution design is based on Amazon’s Dynamo and its data model on Google’s Bigtable.

Created at Facebook, it differs sharply from relational database management systems.

Cassandra implements a Dynamo-style replication model with no single point of failure, but adds a more powerful “column family” data model.

Cassandra is used by some of the biggest companies, such as Facebook, Twitter, Cisco, Rackspace, eBay, Netflix, and more.
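The Dynamo-style replication mentioned above can be sketched with a small consistent-hashing ring. This is a simplified illustration, not Cassandra's implementation: the `Ring` class, the `token` helper, and the node names are hypothetical, MD5 is used only as a convenient hash, and features such as virtual nodes and rack or data-center awareness are left out. Each key hashes to a position on the ring, and its replicas are the next nodes found walking clockwise.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a ring position (MD5 chosen only for illustration)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """A toy consistent-hashing ring with one token per node."""

    def __init__(self, nodes, replication_factor=3):
        # Never ask for more replicas than there are nodes.
        self.replication_factor = min(replication_factor, len(nodes))
        # Sort nodes by their position on the ring.
        self.entries = sorted((token(node), node) for node in nodes)
        self.tokens = [position for position, _ in self.entries]

    def replicas(self, partition_key: str):
        """Return the nodes that would hold copies of this key."""
        # First node clockwise from the key's position (wraps around).
        start = bisect.bisect_right(self.tokens, token(partition_key))
        return [
            self.entries[(start + i) % len(self.entries)][1]
            for i in range(self.replication_factor)
        ]


ring = Ring(["node-a", "node-b", "node-c", "node-d"], replication_factor=3)
owners = ring.replicas("user:42")  # three distinct nodes, same on every call
```

Because any node can compute the same placement from the key alone, no central coordinator is needed, which is how this style of design avoids a single point of failure.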

Features of Cassandra

Cassandra has become so popular because of its outstanding technical features. Given below are some of the features of Cassandra:

Elastic scalability − Cassandra is highly scalable; it allows you to add more hardware to accommodate more customers and more data as required.

Always-on architecture − Cassandra has no single point of failure and is continuously available for business-critical applications that cannot afford a failure.

Fast linear-scale performance − Cassandra is linearly scalable, i.e., throughput increases as you increase the number of nodes in the cluster. It can therefore maintain a quick response time as load grows.

Flexible data storage − Cassandra accommodates a wide range of data formats, including structured, semi-structured, and unstructured data. It can dynamically accommodate changes to your data structures according to your needs.

Easy data distribution − Cassandra provides the flexibility to distribute data where you need it by replicating data across multiple data centers.

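Transaction support − Cassandra does not provide full ACID (Atomicity, Consistency, Isolation, and Durability) transactions in the relational sense. Instead, writes are atomic, isolated, and durable at the level of a single partition, consistency is tunable per operation, and lightweight transactions provide compare-and-set semantics.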

Fast writes − Cassandra was designed to run on cheap commodity hardware. It performs very fast writes and can store hundreds of terabytes of data without sacrificing read efficiency.
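Several of the features above depend on how many replicas must acknowledge a read or a write. As a hedged sketch of the usual quorum arithmetic (the function names `quorum` and `read_overlaps_write` are hypothetical and are not part of any Cassandra API), a read is guaranteed to contact at least one replica that holds the latest acknowledged write when the number of write acknowledgements plus read acknowledgements exceeds the replication factor.

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of replicas for a given replication factor."""
    return replication_factor // 2 + 1


def read_overlaps_write(replication_factor: int, write_acks: int, read_acks: int) -> bool:
    """True when every read set must intersect every write set (W + R > N)."""
    return write_acks + read_acks > replication_factor


rf = 3
q = quorum(rf)                                # 2 of 3 replicas
strong = read_overlaps_write(rf, q, q)        # quorum writes + quorum reads overlap
eventual = read_overlaps_write(rf, 1, 1)      # single-replica reads and writes may not
```

Requiring fewer acknowledgements lowers latency and keeps operations available when replicas are down, at the cost of reads that may briefly return older data.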

History of Cassandra

Cassandra was developed at Facebook for inbox search.

It was open-sourced by Facebook in July 2008.

Cassandra was accepted into the Apache Incubator in March 2009.