January 2018 | A Term A Day

January 31, 2018

Kafka

Apache Kafka is an open source distributed streaming platform. It lets applications publish and subscribe to data feeds (streams), stores the data in a fault-tolerant way, and lets consumers process the data according to their own needs.

Key concepts in Kafka

Kafka runs as a cluster, with configurable replication across the nodes.

Topics – These can be thought of as message queues where publishers post their messages and consumers pick them up.
Record – Each record consists of a key, a value and a timestamp.
Producer API – It allows publishers (aka producers) to post messages, or streams of messages, to one or more topics in Kafka.
Consumer API – It allows consumers to poll topics and fetch messages. Consumers can subscribe to one or more topics.
Streams API – It consumes the messages posted by publishers, processes them and posts the output messages back to Kafka topics. Essentially, it transforms the messages.
Connector API – It connects Kafka topics to various applications and/or data sources.
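
To make the concepts above a bit more concrete, here is a minimal sketch in plain Python. It does not use Kafka or any Kafka client library; the class names (Record, Topic, Consumer) and the topic name "page-clicks" are made up for illustration. It only models the idea of a topic as an append-only list of records that a consumer polls while remembering its own position.

```python
import time
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Record:
    """A record as described above: a key, a value and a timestamp."""
    key: str
    value: str
    timestamp: float


@dataclass
class Topic:
    """Toy stand-in for a topic: an append-only list of records."""
    name: str
    log: list = field(default_factory=list)

    def append(self, key, value):
        # The returned offset is the record's position in the log.
        self.log.append(Record(key, value, time.time()))
        return len(self.log) - 1


class Consumer:
    """Polls a topic and remembers how far it has read (its offset)."""

    def __init__(self, topic):
        self.topic = topic
        self.offset = 0

    def poll(self, max_records=10):
        # Return the next unread records and advance the offset past them.
        batch = self.topic.log[self.offset:self.offset + max_records]
        self.offset += len(batch)
        return batch


clicks = Topic("page-clicks")
clicks.append("user-1", "/home")
clicks.append("user-2", "/pricing")

consumer = Consumer(clicks)
values = [record.value for record in consumer.poll()]
print(values)  # ['/home', '/pricing']
```

Because the consumer keeps its own offset, a second poll returns nothing until a producer appends more records, and a second consumer could read the same topic from the start without affecting the first one.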

History of Kafka

This platform was originally developed at LinkedIn and was later made open source. It is written in Scala and Java. It can be used as a storage system, a messaging system or simply for stream processing. It achieves high throughput and low latency by batching messages to reduce network overhead and by using Java NIO. At its core, Kafka uses a simple binary protocol over TCP for communication among its components.

Guarantees given by the platform:

Within a topic partition, records will always appear in the same sequence as they were sent by the producer
A consumer will see the records in the same sequence as they are stored in the partition
A topic with replication factor N will tolerate up to N-1 server failures without losing any committed data
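
The guarantees can be illustrated with a small sketch in plain Python (again, no Kafka involved). In Kafka, ordering is tracked per partition of a topic, so the sketch sends all records with the same key to the same partition. The partition count, the key names and the use of crc32 as the hash are assumptions made for this demo only.

```python
import zlib

NUM_PARTITIONS = 3  # hypothetical partition count for this demo
partitions = [[] for _ in range(NUM_PARTITIONS)]


def partition_for(key):
    # Stand-in hash chosen for this sketch; it only needs to be stable per key.
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def send(key, value):
    # Records with the same key are appended to the same partition, in send order.
    partitions[partition_for(key)].append((key, value))


def max_tolerated_failures(replication_factor):
    # With N copies on N different servers, N-1 of them can fail.
    return replication_factor - 1


for step in ["login", "view-cart", "checkout"]:
    send("user-42", step)
send("user-7", "login")

seen = [value for key, value in partitions[partition_for("user-42")] if key == "user-42"]
print(seen)                       # ['login', 'view-cart', 'checkout']
print(max_tolerated_failures(3))  # 2
```

The order is only meaningful inside one partition: nothing here says whether the "user-7" record was sent before or after the "user-42" records if they land in different partitions.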

Sample use cases

Message broker – can be compared with ActiveMQ or RabbitMQ
Website activity tracking – pushing various data points and then writing consumers to process the data
Log aggregation across various nodes

Related Keywords

ActiveMQ, RabbitMQ, AMQP, Broker

January 30, 2018 (updated January 31, 2018)

Vagrant

You are part of a development team and you want to try out the latest version of the VM that your application uses. Or you want to make some configuration changes and test them out in your development environment. However, you are not sure whether your changes will be useful or successful, so you want the ability to roll them back or, better still, version control them. How do you do it? Vagrant comes to the rescue!

What is Vagrant?

Vagrant is an open source tool for building and maintaining portable development environments. With the rise of complex architectures involving several different servers and technology stacks, Vagrant simplifies the task of creating and maintaining the required stack of software/libraries etc.

Vagrant stores the configuration in one or more text files, and these files can be put under your favorite version control system such as git. If changes don’t work out, you can easily roll back to the earlier working state. This capability improves development productivity a lot, and hence Vagrant has become a darling of many development teams.

This tool uses the concepts of “provisioners” and “providers”. Provisioners are tools that let you customize/modify the environments; examples are Chef and Puppet. Providers, on the other hand, are the platforms that supply the machines, such as AWS, Docker, VMware etc.

Vagrant workflow [Source: https://objectcomputing.com/resources/publications/sett/march-2015-docker-vs-vagrant]

Vagrant abstracts the machine. It sits as a wrapper on top of the underlying virtualization layer. You can throw away a Vagrant environment and create a new one very easily, simply by changing the config files. It provides a command line interface that works the same way irrespective of the underlying VM or OS. This additional layer provides portability, simplicity, and interoperability. A developer can easily share the environment she has created with others, and they can quickly set up a replica of it.

One might be tempted to compare Vagrant with other configuration management tools such as Chef/Puppet/Ansible. However, Vagrant is commonly used along with one of these tools as they serve different purposes.

Related Keywords

Configuration Management, Docker, Chef, Puppet, Ansible, VMware, AWS, DevOps

January 29, 2018

BAN / WBAN

We have all heard of LAN/WLAN (Wireless Local Area Network), MAN (Metropolitan Area Network), and WAN (Wide Area Network). Maybe you have also heard about PAN (Personal Area Network). Today, let’s learn more about BAN / WBAN – Wireless Body Area Network.

What is BAN?

This is a network of multiple interconnected devices worn on or implanted in the human body. These include monitoring devices such as pacemakers or BP monitors. Such a network typically includes a smartphone, which acts as a mobile hub that collects data from the wearables and implants and pushes it to a central repository for further processing and analysis.

Now you can probably guess one of the biggest beneficiaries of BAN/WBAN – the medical field! Patients can be equipped with wearable monitoring devices and a smartphone. Then the medical team can monitor the patients’ health from a single location. This technology comes in handy for remote monitoring and could also be extremely useful during a medical emergency where a small group of medics needs to monitor a large number of patients.
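
The hub idea can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class names, device ids and metric names are invented, and no real wireless protocol or medical device API is involved. It shows a hub that accepts readings only from devices paired with it, buffers them, and hands them over in a batch as if uploading to the central repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    device_id: str  # hypothetical identifier of a wearable or implant
    metric: str     # e.g. "heart_rate"
    value: float


class Hub:
    """Toy model of the smartphone hub in a BAN."""

    def __init__(self, paired_devices):
        self.paired_devices = set(paired_devices)
        self.buffer = []

    def receive(self, reading):
        # Ignore devices that were never paired with this hub.
        if reading.device_id not in self.paired_devices:
            return False
        self.buffer.append(reading)
        return True

    def flush(self):
        """Hand the buffered readings over, as if uploading them."""
        batch, self.buffer = self.buffer, []
        return batch


hub = Hub(paired_devices=["bp-monitor-1", "pacemaker-1"])
hub.receive(Reading("bp-monitor-1", "systolic_mmHg", 121.0))
hub.receive(Reading("stranger-watch", "heart_rate", 88.0))  # not paired, dropped
batch = hub.flush()
print(len(batch))  # 1
```

Filtering on paired devices is one simple way to keep readings from a nearby person's wearables out of the patient's data.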

BAN – source: https://www.waves.intec.ugent.be/files/images/WBAN.img_assist_custom.jpg

So is this a standard terminology?

Yes, there’s an IEEE standard, 802.15.6, defined for WBAN from a healthcare point of view.

It is a standard for short-range, low power, and highly reliable wireless communication in, on and around the human body.

[Source: IEEE website – http://ieeexplore.ieee.org/document/7581523/]

A BAN can not only monitor the human body but also initiate action, e.g. insulin can be injected automatically into a diabetic patient’s body if the BAN detects low levels of insulin in the body.

In some other use cases, a sportsperson can set up a BAN to gather data about her performance and subsequently make improvements to her game.

Some concerns about BAN/WBAN

Security – since this is a mobile network that moves along with the human, security of the data becomes an important consideration. A WBAN also needs to ensure that the data is collected from the correct human even if there are multiple humans in the vicinity.
Privacy – this could be treated as an invasion of privacy, and it is of utmost importance to obtain consent from the human before creating and using a BAN around his/her body.
Data management – the data collected through such close monitoring is going to be huge and hence needs to be managed well.

Related Keywords

LAN, WAN, MAN, WLAN, RFID

January 28, 2018

Graph Database

A Graph Database is a database that uses graph structures. OK, let me explain a bit more. A graph is made up of nodes, edges, and properties, which together represent data. Nodes represent entities such as persons, businesses or any other object to be tracked. An edge is a relation between two nodes, and each node can have more than one relation. A property is a relevant piece of information about a node. A database that makes use of such structures is referred to as a Graph Database.
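
Here is a minimal sketch of those three building blocks in plain Python. It is not how any particular graph database stores its data; the class names, node ids and the WORKS_AT relation are made up for illustration. Edges also carry properties here, which is common in property graph models.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                                      # kind of entity, e.g. "Person"
    properties: dict = field(default_factory=dict)  # information about the node


@dataclass
class Edge:
    source: str    # id of the node the edge starts from
    target: str    # id of the node the edge points to
    relation: str  # name of the relation
    properties: dict = field(default_factory=dict)


nodes = {
    "alice": Node("Person", {"name": "Alice", "age": 34}),
    "acme": Node("Business", {"name": "Acme Corp"}),
}
edges = [Edge("alice", "acme", "WORKS_AT", {"since": 2015})]


def relations_of(node_id):
    # Follow the outgoing edges of a node.
    return [(edge.relation, edge.target) for edge in edges if edge.source == node_id]


print(relations_of("alice"))  # [('WORKS_AT', 'acme')]
```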

So, what are the advantages of Graph Database?

It stores the relationships between records as edges, along with the records themselves and their properties. This allows applications to retrieve connected data faster than they typically could from a relational database. It avoids much of the complexity of the “join” statements required in a relational database, as the data is already linked through edges. For relationship-heavy queries, this can also improve the performance of the overall database and application.

Pictorial representation of Graph Database – Property Graph (Source: Wikipedia)

You can find a good comparison of Graph Database and Relational Database here.

The underlying implementation of Graph Databases may vary. Some use a relational engine and store the “graph” data in tables. Others use a key-value store or a document database (i.e. NoSQL stores) for storage. As a result, standard SQL is not a good fit, and a separate query language is needed to reap the benefits of the graph structure. Some of the available query languages are Gremlin, SPARQL, and Cypher. Note that GraphQL, despite its name, is not a query language for Graph Databases.

Graph Database is a good fit for highly connected data such as social networks or recommendations in e-commerce. E.g. a user of a social networking site – represented by a Node – can be a member of various groups and can have several friends. Each of those friends, in turn, would have similar connections – relationships – represented by Edges. There would also be attributes like birthdate, college etc. – represented by Properties.
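
The social network example maps nicely to a small sketch. In a relational database, finding "friends of friends" would typically need a self-join on a friendship table; in a graph it is just a matter of following edges twice. The names and the friendship data below are invented, and a plain Python dictionary stands in for the graph store.

```python
# Each key is a person (node); each value is the set of their friends (edges).
friends = {
    "asha": {"ben", "chen"},
    "ben": {"asha", "dara"},
    "chen": {"asha", "dara", "eli"},
    "dara": {"ben", "chen"},
    "eli": {"chen"},
}


def friends_of_friends(person):
    direct = friends[person]
    reachable = set()
    for friend in direct:
        # Second hop: everyone this friend is connected to.
        reachable |= friends[friend]
    # Drop the person and the people who are already direct friends.
    return reachable - direct - {person}


print(sorted(friends_of_friends("asha")))  # ['dara', 'eli']
```

A recommendation feature ("people you may know") is essentially this traversal, usually with some ranking on top.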

Related Keywords

GraphQL, Relational Databases, Neo4j, MySQL, OLTP, NoSQL

January 27, 2018

ASIC

ASIC stands for Application-Specific Integrated Circuit. Almost everyone is aware of what an integrated circuit (IC) is – it is a set of electronic circuits placed on a single small flat piece of semiconductor material. By making an IC specific to a given application, one can reap the benefits of enhanced performance and optimized power consumption. A common example is a chip designed to run a digital voice recorder.

What are the latest uses of ASIC?

Bitcoin uses blockchain technology and needs a lot of computational power to mine new bitcoins. Multiple ASICs have been developed for mining bitcoins, with varying performance. Some have also been developed to mine Litecoin as well as Bitcoin. You can check this list for Bitcoin mining ASICs.
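
To get a feel for why mining needs so much computational power, here is a toy proof-of-work loop in Python. Bitcoin mining repeatedly applies a double SHA-256 hash to a block header while varying a nonce, until the hash falls below a target. The sketch below is heavily simplified: the header is just a made-up byte string, the difficulty is tiny, and details such as the real header layout and byte order are ignored.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    # SHA-256 applied twice.
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def mine(header: bytes, difficulty_bits: int, max_nonce: int = 2_000_000):
    """Search for a nonce whose hash has `difficulty_bits` leading zero bits."""
    target = 1 << (256 - difficulty_bits)
    for nonce in range(max_nonce):
        digest = double_sha256(header + nonce.to_bytes(4, "little"))
        if int.from_bytes(digest, "big") < target:
            return nonce, digest.hex()
    return None  # no suitable nonce found within max_nonce attempts


result = mine(b"toy block header", difficulty_bits=16)
print(result)
```

Each extra bit of difficulty doubles the expected number of attempts. A general-purpose CPU runs this loop one hash at a time, whereas a mining ASIC is built to do nothing but this hash, many times in parallel.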

More Information about ASIC

In general, you will find such ICs in almost every electronic device. Typically, an ASIC does one thing and does it well! These circuits embed custom logic and hence are considered proprietary devices. An extension of ASIC is ASSP – Application-Specific Standard Product. An ASSP implements a specific function that is in itself a product.

Since ASICs are custom made, the design team focuses on optimizing speed while also ensuring the lowest possible power consumption. This usually involves a lot of R&D work, which pushes the cost up. However, you tend to gain through mass production and the long-term benefit of retaining market position.

Reasons why it is used:

Compact size
Power and performance
IP Protection
Speed
Reliability

Types of ASIC

Gate-Array Design
Full-custom
Semi-custom
Platform

Related Keyword

Hardware
