Amazon Kinesis collects and processes large amounts of streaming data economically. It can ingest data from sources such as video, audio, application logs, website clickstreams, and IoT telemetry for machine learning, analytics, and other applications. The service also lets you monitor data as it streams in and respond quickly. Customers benefit from accurate, up-to-date insights because the data is collected, stored, and analyzed as it arrives. Kinesis reduces workload and cost by removing the need for expensive software and the corresponding infrastructure: it provides high-capacity pipelines that collect and analyze data quickly, so systems only have to stream their data to Kinesis to have it analyzed. Kinesis integrates easily with storage services such as DynamoDB, Redshift, and S3.
AWS Kinesis helps to ingest, buffer, and process streaming data as
needed. Insights can be derived very quickly.
Ease of Management
The underlying infrastructure is fully managed, so the service is easy to use and maintain:
Amazon Kinesis can ingest and process significant volumes of data from many sources at low latency.
Minimize the number of disconnected management tools in use, implement common processes and fully automate error-prone manual processes
Connected devices can securely stream video data to AWS for analytics, machine learning, and other processing.
Users can build applications that process data streams in real time using popular stream-processing frameworks.
The following diagram illustrates the high-level architecture of Kinesis Data Streams. The producers continually push data to Kinesis Data Streams, and the consumers process the data in real time. Consumers (such as a custom application running on Amazon EC2 or an Amazon Kinesis Data Firehose delivery stream) can store their results using an AWS service such as Amazon DynamoDB, Amazon Redshift, or Amazon S3.
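Kinesis Data Streams routes each record to a shard by taking the MD5 hash of its partition key as a 128-bit integer and finding the shard whose hash key range contains it. As a rough illustrative sketch (not the SDK itself), assuming the shards split the hash key space evenly, that routing can be approximated in Python:

```python
import hashlib

def hash_key(partition_key: str) -> int:
    """128-bit MD5 hash of the partition key, as Kinesis computes it."""
    return int.from_bytes(hashlib.md5(partition_key.encode("utf-8")).digest(), "big")

def shard_for_key(partition_key: str, num_shards: int) -> int:
    """Map a partition key to one of num_shards evenly sized hash key ranges."""
    range_size = (2 ** 128) // num_shards
    return min(hash_key(partition_key) // range_size, num_shards - 1)
```

Because the mapping is deterministic, all records with the same partition key land on the same shard and are processed in order by the same consumer.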
An Amazon Kinesis stream can reliably maintain a real-time audit trail of every financial transaction, generate real-time metrics and reports, optimize marketing spend, and improve responsiveness to clients; data producers can publish data within seconds.
Data streams can be easily loaded into AWS data stores using Kinesis Data Firehose, which can also capture and transform the data.
Kinesis Data Analytics is the easiest way to process streaming data: with just a knowledge of SQL, users can process streams without having to learn a new programming language or framework.
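Kinesis Data Analytics queries typically aggregate over time windows (for example, a tumbling window GROUP BY in SQL). As a local analogy only, not the service's SQL dialect, the same tumbling-window count can be sketched in Python:

```python
from collections import Counter

def tumbling_window_counts(events, window_seconds):
    """Group (timestamp, key) events into fixed, non-overlapping windows
    and count occurrences per key, mimicking a tumbling-window GROUP BY."""
    counts = Counter()
    for ts, key in events:
        window_start = (ts // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)
```

Each event falls into exactly one window, which is what distinguishes a tumbling window from a sliding one.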
Video Analysis Applications
Video data from homes, offices, factories, and public places can be easily streamed to AWS. It can be used for playback, security monitoring, face detection, machine learning, and other analytics.
Time-period analytics can be performed on historical data using batch processing in data warehouses or with distributed processing frameworks. Other common use cases for storing and processing real-time data are data lakes, data science, and machine learning. Large amounts of streaming data can also be loaded into S3 data lakes using the Firehose service. As new data flows through the streams, machine learning models can be refreshed for accurate and consistent output.
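When loading large volumes of records through Firehose, the `PutRecordBatch` API accepts at most 500 records per call, so producers commonly chunk their records before sending. A minimal chunking helper (the 500-record limit is documented; everything else here is an illustrative assumption) might look like:

```python
def batch_records(records, max_batch=500):
    """Split a list of records into batches no larger than
    Firehose's PutRecordBatch per-call record limit."""
    for i in range(0, len(records), max_batch):
        yield records[i:i + max_batch]
```

Each yielded batch would then be passed to a single `put_record_batch` call, retrying any records the response reports as failed.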
IoT Data analytics
Kinesis can process streaming data from IoT devices such as embedded sensors, television set-top boxes, and security cameras. The data can then be used to send time-based alerts, or further action can be taken programmatically whenever any metric exceeds its operating threshold.
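A consumer of such a stream might evaluate each sensor reading against configured operating thresholds and emit alerts for out-of-range values. This is a minimal sketch under assumed data shapes (a reading as a metric-to-value dict, thresholds as low/high pairs), not a prescribed implementation:

```python
def check_thresholds(reading, thresholds):
    """Return alert messages for any metric in the reading that falls
    outside its configured (low, high) operating range."""
    alerts = []
    for metric, value in reading.items():
        if metric in thresholds:
            low, high = thresholds[metric]
            if not (low <= value <= high):
                alerts.append(f"{metric}={value} outside [{low}, {high}]")
    return alerts
```

In practice the alert list would be published to a notification service or written back to another stream for downstream action.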
A Real Estate Listing Company: Find a
Ensure that data integrity is maintained and that the latest updates related to agent and profile information are reflected within a short span of time.
Due to the on-premises setup, performance was a challenge and the response time was ~100 ms.
Provided an API (FAR-BACKEND API) that offers a search facility for agent/team/office for any city/state/postal-code.
After collecting data from various APIs, the data is stored in DynamoDB and Elasticsearch to make search faster. Below are the different components and their purpose:
Created a DynamoDB table and stored agent/team/office details and their properties in it.
Created an Elasticsearch cluster to store and query the data, making search faster.
Created a Lambda function that is invoked by the DynamoDB stream: whenever any data is added, updated, or deleted in DynamoDB, this Lambda runs and applies the corresponding add/update/delete in Elasticsearch.
Created a Lambda function that listens to the Kinesis stream and calls the Document-Builder (an API that performs CRUD operations on agent/team/office records in DynamoDB) to add, update, or delete data in DynamoDB.
Created an S3 bucket to store the code for the Lambda that updates Elasticsearch.
Created an ECS service (Scheduler) that runs continuously to update agent/team/office information in DynamoDB/Elasticsearch.
Created an ECS service (final FAR-Backend API) that provides a search facility for agent/team/office for any city/state/postal-code/name.
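The DynamoDB-stream-to-Elasticsearch Lambda described above essentially translates stream records into index or delete actions. A simplified sketch of that translation step (the index name, a string primary key called "id", and the action format are all illustrative assumptions, not the company's actual code):

```python
def stream_to_es_actions(event, index="agents"):
    """Translate DynamoDB stream records into Elasticsearch-style actions:
    INSERT/MODIFY -> index the new image; REMOVE -> delete by key."""
    actions = []
    for record in event.get("Records", []):
        op = record["eventName"]
        doc_id = record["dynamodb"]["Keys"]["id"]["S"]  # assumed string key "id"
        if op in ("INSERT", "MODIFY"):
            actions.append({"op": "index", "_index": index, "_id": doc_id,
                            "doc": record["dynamodb"]["NewImage"]})
        elif op == "REMOVE":
            actions.append({"op": "delete", "_index": index, "_id": doc_id})
    return actions
```

Keeping the translation pure (returning actions instead of calling Elasticsearch directly) makes the Lambda easy to unit-test; a thin wrapper would then send the actions via the Elasticsearch bulk API.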
With this solution, all APIs have been migrated from the on-premises data center to the cloud.
The existing API response time was ~100 ms; the new API on AWS responds in around ~25 ms. This drastic improvement in performance has sped up search-page results for end users.
The solution leverages AWS services, with a primary focus on autoscaling and maintainability.
A Major Pharmaceutical Company: Log Standardization
In an Enterprise handling sensitive clinical information with disparate sources spread across different divisions, enforcing measures of security and governance is a challenge.
In the current landscape,
A unified view of the security and utilization aspects of resources is lacking.
Accessing logs spread across multiple systems, including on-premises VMs and AWS components, is a challenge.
Separate monitoring and alerting systems are required.
Capability to store a large number of logs is limited.
Getting information in real time is difficult.
A framework to collect, organize, and enable analysis of logs.
Data Streams Ingestion
Streaming data from on-premises and cloud VMs is collected through the Fluentd agent.
Event Logs subscribed to get a real time feed of log events
Kinesis Streams to ingest, process in shards & trigger Lambda function
Kinesis Analytics to process streaming data in real time with standard SQL
Scales automatically to match the volume and throughput rate of your incoming data
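A Lambda function triggered by a Kinesis stream receives each record's payload base64-encoded, so the first step in such a handler is to decode it. A minimal sketch of that decoding (assuming JSON log events, which is an assumption about this pipeline's payload format):

```python
import base64
import json

def decode_kinesis_event(event):
    """Decode the base64-encoded payload of each record delivered
    to a Kinesis-triggered Lambda, assuming JSON log events."""
    payloads = []
    for record in event.get("Records", []):
        raw = base64.b64decode(record["kinesis"]["data"])
        payloads.append(json.loads(raw))
    return payloads
```

The decoded events would then be filtered, enriched, or forwarded to Firehose for delivery into the S3 data lake.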
Gather data from various sources and drop it in S3 data lake
Create Tables on the Glue Catalog with the metadata information on the data
Create data formats, including Parquet, for faster query results
Create partitions and organize the data so it can be analyzed efficiently
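Partitioning in S3 is usually done by embedding key=value segments in the object key prefix (Hive-style), which lets Glue and Athena prune partitions at query time. A small illustrative helper (the "logs" base prefix and date granularity are assumptions, not the client's actual layout):

```python
from datetime import datetime

def partition_prefix(ts: datetime, base: str = "logs") -> str:
    """Build a Hive-style S3 key prefix (year=/month=/day=) so that
    Glue and Athena can prune partitions when querying by date."""
    return f"{base}/year={ts.year}/month={ts.month:02d}/day={ts.day:02d}/"
```

Objects written under these prefixes can then be registered as partitions in the Glue catalog, so a query filtered on year/month/day scans only the matching prefixes.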
Analytics & Monitoring
CloudWatch collects & keeps track of all performance metrics & generates alerts
It takes in data from Lambda & Firehose to enable the customer to go from raw data to actionable insights quickly
Business users can get analytical insights from AWS Athena and can also query Elasticsearch directly.
With logs available in real time in a single location, the client is able to enhance its analysis capability in the aspects below: