Hadoop – Wikipedia, la enciclopedia libre

Platform.

SQL or NoSQL technologies such as Hadoop or Cassandra. We do use some less-than-conventional storage technologies such as CouchDB and Redis.

A strong recommendation is that you master the fundamentals and prove out your thesis in a slightly less complex environment first before migrating to an inherently more complex distributed system, and then be ready to make major adjustments to your algorithms to make them performant once data access is no longer local. A good option to investigate if you want to go this route is Dumbo. Stay tuned to this book’s Twitter account (@SocialWebMining) for extended examples that involve Dumbo.

MySQL, NoSQL, Hadoop or Cassandra, CouchDB and Redis

 

NoSQL

From Wikipedia, the free encyclopedia

In computing, NoSQL is a class of database management system identified by its non-adherence to the widely used relational database management system (RDBMS) model:

  • It does not use SQL as its query language
NoSQL database systems rose alongside major internet companies, such as Google, Amazon, and Facebook, which had significantly different challenges in dealing with huge quantities of data that the traditional RDBMS solutions could not cope with. NoSQL database systems are developed to manage large volumes of data that do not necessarily follow a fixed schema. Data is partitioned among different machines (for performance reasons and size limitations) so JOIN operations are not usable and ACID guarantees are not given.
  • It may not give full ACID guarantees
Usually only eventual consistency is guaranteed or transactions limited to single data items. This means that given a sufficiently long period of time over which no changes are sent, all updates can be expected to propagate eventually through the system.
  • It has a distributed, fault-tolerant architecture
Several NoSQL systems employ a distributed architecture, with the data held in a redundant manner on several servers. In this way, the system can easily scale out by adding more servers, and failure of a server can be tolerated. This type of database typically scales horizontally and is used for managing large amounts of data when performance and real-time response are more important than consistency (such as indexing a large number of documents, serving pages on high-traffic websites, and delivering streaming media).
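
The three properties above (partitioning, redundant copies, and eventual consistency) can be illustrated with a toy model. This is only a sketch under stated assumptions: the class `TinyCluster`, its methods, and the ring-style replica placement are hypothetical names and choices made for illustration, not the design of any particular NoSQL system.

```python
import hashlib
from collections import deque


class TinyCluster:
    """Toy model of a partitioned, replicated key-value store (illustrative only)."""

    def __init__(self, node_count=4, replicas=2):
        self.nodes = [dict() for _ in range(node_count)]  # one dict per "server"
        self.replicas = replicas
        self.pending = deque()  # updates not yet copied to secondary replicas

    def _owners(self, key):
        # Hash the key to choose a primary node, then take the following
        # nodes in ring order as secondary replicas.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        first = int(digest, 16) % len(self.nodes)
        return [(first + i) % len(self.nodes) for i in range(self.replicas)]

    def put(self, key, value):
        # The primary is updated immediately; secondaries are updated later.
        primary, *secondaries = self._owners(key)
        self.nodes[primary][key] = value
        for node in secondaries:
            self.pending.append((node, key, value))

    def get(self, key, replica=0):
        # replica=0 reads the primary copy; replica=1 reads the first secondary.
        node = self._owners(key)[replica]
        return self.nodes[node].get(key)

    def propagate(self):
        # Deliver every queued update; afterwards all replicas agree.
        while self.pending:
            node, key, value = self.pending.popleft()
            self.nodes[node][key] = value


cluster = TinyCluster()
cluster.put("user:42", "alice")
stale = cluster.get("user:42", replica=1)  # secondary has not seen the write yet
cluster.propagate()
fresh = cluster.get("user:42", replica=1)  # now matches the primary
```

Reading from the secondary before `propagate()` returns nothing, and reading afterwards returns the written value, which is the "given enough time without new changes, all updates propagate" behavior described in the list.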

NoSQL database systems are often highly optimized for retrieve and append operations and often offer little functionality beyond record storage (e.g. key-value stores). The reduced run time flexibility compared to full SQL systems is compensated by significant gains in scalability and performance for certain data models.

In short, NoSQL database management systems are useful when working with a huge quantity of data whose nature does not require a relational model for the data structure. The data could be structured, but that is of minimal importance; what really matters is the ability to store and retrieve great quantities of data, not the relationships between the elements. Examples include storing millions of key-value pairs in one or a few associative arrays, or storing millions of data records. This is particularly useful for statistical or real-time analyses of growing lists of elements (such as Twitter posts or the Internet server logs from a big group of users).
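
As a rough illustration of the associative-array idea, the following sketch appends posts to a growing list per user and keeps a running statistic as each post arrives. The function `append_post`, the variable names, and the sample posts are all hypothetical and exist only for this example.

```python
from collections import Counter, defaultdict

posts_by_user = defaultdict(list)  # key: user name, value: that user's posts
hashtag_counts = Counter()         # running statistic, updated on every append


def append_post(user, text):
    """Append a post and update the running hashtag counts."""
    posts_by_user[user].append(text)
    for word in text.split():
        if word.startswith("#") and len(word) > 1:
            hashtag_counts[word.lower()] += 1


append_post("ana", "Trying out #Hadoop today")
append_post("luis", "#hadoop and #Cassandra compared")
append_post("ana", "Back to #Redis")

# The most frequent hashtag so far, without any joins or fixed schema.
top_tag, top_count = hashtag_counts.most_common(1)[0]
```

No relationship between records is modeled here: each operation is either an append or a lookup by key, which is the access pattern the paragraph above describes.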

 

Hadoop

Apache Hadoop

Developer: Apache Software Foundation
http://hadoop.apache.org/

General information
Latest stable release: 1.0.0 (December 27, 2011; 5 months ago)
Type: Distributed file system
Written in: Java
Operating system: Cross-platform
Platform: Java
License: Apache License 2.0
Current status: Active
Languages: English

In Spanish

Apache Hadoop is a software framework that supports distributed applications under a free license.[1] It allows applications to work with thousands of nodes and petabytes of data. Hadoop was inspired by Google's papers on MapReduce and the Google File System (GFS).
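
The MapReduce model mentioned above can be sketched in a few lines of plain Python. This is a single-process illustration of the map, shuffle, and reduce steps for a word count, not Hadoop's API; the function names `map_phase`, `shuffle`, and `reduce_phase` and the sample documents are assumptions made for this example.

```python
from collections import defaultdict


def map_phase(document):
    # Emit a (word, 1) pair for every word in one input document.
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    # Group all emitted values by key, as the framework would do between phases.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    # Combine all values for one key into a single result.
    return key, sum(values)


documents = ["Hadoop stores data", "Hadoop processes data"]
pairs = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

In a real cluster the map and reduce calls would run on different nodes, close to where the data is stored; here everything runs locally so that the data flow is easy to follow.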

Hadoop is a top-level Apache project that is being built and used by a global community of contributors,[2] using the Java programming language. Yahoo! has been the largest contributor to the project,[3] and uses Hadoop extensively in its business.[4]

via Hadoop – Wikipedia, la enciclopedia libre.

 

CASSANDRA

Welcome to Apache Cassandra

The Apache Cassandra database is the right choice when you need scalability and high availability without compromising performance. Linear scalability and proven fault-tolerance on commodity hardware or cloud infrastructure make it the perfect platform for mission-critical data. Cassandra’s support for replicating across multiple datacenters is best-in-class, providing lower latency for your users and the peace of mind of knowing that you can survive regional outages.

http://cassandra.apache.org/

 

Redis

Redis is an in-memory database engine based on hash-table (key, value) storage, which can optionally be used as a durable or persistent database. It is written in ANSI C by Salvatore Sanfilippo, who is sponsored by VMware,[1][2] and is released under the BSD license, so it is considered open source software.
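
The idea of an in-memory key-value store with optional persistence can be sketched as follows. This is not Redis and does not use its protocol or file formats; the class `SnapshotStore`, its methods, and the JSON snapshot file are hypothetical choices made only to illustrate the concept.

```python
import json
import os
import tempfile


class SnapshotStore:
    """In-memory key-value store that can optionally save itself to disk."""

    def __init__(self, path=None):
        self.path = path  # None means purely in-memory, nothing is persisted
        self.data = {}
        if path and os.path.exists(path):
            # Reload the last snapshot, as a store would after a restart.
            with open(path, "r", encoding="utf-8") as handle:
                self.data = json.load(handle)

    def set(self, key, value):
        self.data[key] = value

    def get(self, key, default=None):
        return self.data.get(key, default)

    def save(self):
        # Write a full snapshot of the data; returns False if not persistent.
        if self.path is None:
            return False
        with open(self.path, "w", encoding="utf-8") as handle:
            json.dump(self.data, handle)
        return True


snapshot_path = os.path.join(tempfile.mkdtemp(), "store.json")
store = SnapshotStore(snapshot_path)
store.set("greeting", "hola")
store.save()

reloaded = SnapshotStore(snapshot_path)  # simulates a restart of the process
```

All reads and writes touch only the in-memory dictionary, so they are fast; durability is an opt-in step, and anything written after the last `save()` would be lost on a crash in this sketch.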

 

 

COUCHDB

Apache CouchDB, commonly referred to as CouchDB, is an open source database that focuses on ease of use and on being “a database that completely embraces the web”.[1] It is a NoSQL database that uses JSON to store data, JavaScript as its query language using MapReduce, and HTTP for an API.[1] One of its distinguishing features is easy replication. CouchDB was first released in 2005 and later became an Apache project in 2008.
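
Because CouchDB stores JSON and exposes an HTTP API, creating a document amounts to an HTTP PUT with a JSON body. The sketch below only builds such a request and does not send it. The helper `build_put_document`, the database name `tweets`, the document ID, and the assumption of a server listening locally on port 5984 are illustrative.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5984"  # assumed address of a local CouchDB server


def build_put_document(database, doc_id, document):
    """Build (but do not send) an HTTP PUT that would store a JSON document."""
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/{database}/{doc_id}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


request = build_put_document(
    "tweets", "tweet-001", {"user": "ana", "text": "Hello CouchDB"}
)
# Sending it would be urllib.request.urlopen(request), which needs a running
# server and an existing "tweets" database.
```

The point of the example is that no special client library is required: any tool that can speak HTTP and produce JSON can talk to the database.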

CouchDB is used in certain applications for Android like “SpreadLyrics”, in applications for Facebook like “Will you Kissme” or “Birthday Greeting Cards”, and in websites like “Friendpaste”.

Meebo, for their social platform (web and applications)

http://en.wikipedia.org/wiki/CouchDB
