Blog

Kafka security via data encryption

19 Jun 2017 · by Antonios Chalkiopoulos · Read in about 3 min

Secure cluster, but what happens with your data?

When it comes to security, Apache Kafka, like every distributed system, provides mechanisms to transfer data securely between the components involved. Depending on your setup, this might involve different services such as Kerberos, multiple TLS certificates and an advanced ACL setup in brokers and Zookeeper. In many cases, enabling encryption features also carries a performance penalty.

Read more →

Kafka Topics UI (rest proxy v2)

12 May 2017 · by Christina Daskalaki · Read in about 1 min

The new version of Kafka Topics UI is now available!

The new version implements the REST Proxy v2 API, so make sure you upgrade to the right version of REST Proxy.

Read more →

Fast Avro Write

3 May 2017 · by Stefan Bocutiu · Read in about 5 min

This article presents how the Avro library writes to files and how we can achieve significant performance improvements by parallelizing the write. A (JVM) library has been implemented and is available on GitHub: fast-avro-write. The reason we proceeded with this implementation was a project that required writing multiple millions of Avro messages from Kafka onto a star DW (data warehouse) in Hive (HDFS). You might have heard about (or even dealt with) the challenges of working with HDFS.

Read more →

Kafka Connect Pipelines, sink to Elasticsearch

4 Apr 2017 · by Christina Daskalaki · Read in about 9 min

Introduction: In this mini tutorial we will explore how to create a Kafka Connect pipeline using the Kafka Development Environment (fast-data-dev) in order to move real-time telemetry data into Elasticsearch and finally visualize the positions in a Kibana Tile Map, without writing any code!

Kafka Connect: You have most probably come across Kafka Connect when you needed to move large amounts of data between datastores. In case you haven’t, Kafka Connect is one of the core Kafka APIs that allows you to create custom connectors, or find one for your case, and run it in an easily scalable distributed mode.

Read more →

Kafka connect for FTP data

20 Feb 2017 · by Antonios Chalkiopoulos · Read in about 8 min

An FTP server, together with a pair of credentials, is a common way for data providers to expose data as a service. In this article we are going to implement custom file transformers to efficiently load files over FTP and, using Kafka Connect, convert them to meaningful events in Avro format. Depending on the data subscription, we might get access to FTP locations with files updated daily, weekly or monthly. File structures might be positional, CSV, JSON, XML or even binary.
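To make the idea of a file transformer concrete, here is a minimal sketch in Python. It is not the connector's actual API: the function names and the `layout` format are made up for illustration, the file content is assumed to be already downloaded as text, and the Avro conversion step is left out. It only shows how CSV and positional (fixed-width) files could each be turned into one event per line.

```python
import csv
import io


def csv_to_events(text):
    """Turn CSV text (first line = header) into a list of event dicts."""
    reader = csv.DictReader(io.StringIO(text))
    return [dict(row) for row in reader]


def positional_to_events(text, layout):
    """Turn fixed-width text into event dicts.

    `layout` is a hypothetical description of the file structure:
    a list of (field_name, start, end) column offsets.
    """
    events = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        events.append(
            {name: line[start:end].strip() for name, start, end in layout}
        )
    return events


if __name__ == "__main__":
    print(csv_to_events("id,price\n1,9.5\n2,3.0\n"))
    print(positional_to_events("0001Alice \n0002Bob   \n",
                               [("id", 0, 4), ("name", 4, 10)]))
```

Each resulting dict stands in for one event; in a real pipeline it would be mapped to a schema before being sent to Kafka.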

Read more →

From MQTT to Kafka with Connect and Stream Reactor

28 Jan 2017 · by Marios Andreopoulos · Read in about 8 min

MQTT stands for MQ Telemetry Transport. It is a lightweight messaging protocol, designed for embedded hardware, low-power or limited-network applications and microcontrollers with limited RAM and/or CPU. It is a protocol that drives the IoT expansion. On the other hand, large numbers of small devices that produce frequent readings lead to big data and the need for analysis in both the time and space domains (spatio-temporal analysis). Kafka can be the highway that connects your IoT with your backend analytics and persistence.

Read more →

Apache Kafka London Meetup - by Landoop

18 Jan 2017 · by Antonios Chalkiopoulos · Read in about 1 min

How to simplify your ETL process using Kafka Connect for (E) and (L).
Introducing KCQL - the Kafka Connect Query Language for fast-data pipelines.
Using KCQL to set up Kafka Connectors for popular in-memory and analytical systems (live demos) such as HazelCast, Redis and InfluxDB.
Use fast-data-dev docker for your Kafka development environment.
Enhancing your existing Cloudera (Hadoop) clusters with fast-data capabilities.

Demos:
http://schema-registry-ui.landoop.com
http://kafka-topics-ui.landoop.com
http://kafka-connect-ui.landoop.com
https://fast-data-dev.demo.landoop.com/

Code: https://github.com/landoop/

Connectors

Read more →

Time-Series with Kafka, Kafka Connect & InfluxDB

1 Dec 2016 · by Christina Daskalaki · Read in about 6 min

Time-series datastores are of particular interest these days, and InfluxDB is a popular open source distributed time-series database. In this tutorial we will integrate Kafka with InfluxDB using Kafka Connect and implement a Scala Avro message producer to test the setup. The steps we are going to follow are: set up a Docker development environment; run an InfluxDB Sink Kafka Connector; create a Kafka Avro producer in Scala (using the schema registry); generate some messages in Kafka. Finally, we will verify the data in InfluxDB and visualise it in Chronograf.

Read more →

Coyote Testing Tool

21 Aug 2016 · by Marios Andreopoulos · Read in about 7 min

A few days ago we open-sourced Coyote, a tool we created to automate testing of our Landoop Boxes, which feature a wide range of environments for Big Data and Fast Data (see Kafka). Coyote does one simple thing: it takes a .yml file with a list of commands to set up and run, and checks their exit code and/or output. It has some other functionality too, but this is its essence.
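The core idea of running a list of commands and checking their exit code and/or output can be sketched in a few lines of Python. This is not Coyote's implementation or its .yml format: the `TESTS` list stands in for a parsed configuration file, and its field names (`name`, `command`, `expect_exit`, `expect_stdout`) are assumptions made up for this illustration.

```python
import subprocess
import sys

# Hypothetical test spec. In a real tool this would be loaded from a .yml
# file; the field names here are invented for the sketch.
TESTS = [
    {
        "name": "echo works",
        "command": [sys.executable, "-c", "print('hello')"],
        "expect_exit": 0,
        "expect_stdout": "hello",
    },
    {
        "name": "failing exit code",
        "command": [sys.executable, "-c", "import sys; sys.exit(3)"],
        "expect_exit": 3,
    },
]


def run_test(test):
    """Run one command and check its exit code and, optionally, its output."""
    result = subprocess.run(
        test["command"], capture_output=True, text=True, timeout=30
    )
    exit_ok = result.returncode == test.get("expect_exit", 0)
    expected = test.get("expect_stdout")
    # If no expected output is given, only the exit code is checked.
    output_ok = expected is None or expected in result.stdout
    return exit_ok and output_ok


def run_all(tests):
    """Return a mapping of test name to pass/fail."""
    return {t["name"]: run_test(t) for t in tests}


if __name__ == "__main__":
    for name, passed in run_all(TESTS).items():
        print(f"{'PASS' if passed else 'FAIL'}  {name}")
```

The commands invoke the current Python interpreter so the sketch does not depend on any particular shell being available.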

Read more →

Confluent Platform 3.0.0 CSD

8 Aug 2016 · by Marios Andreopoulos · Read in about 2 min

This is now in General Availability. See it here and request a trial today!

Today we release our first beta CSD for Confluent Platform 3.0.0. It is robust enough to use in production, but we want to add at least a few small touches before the final release, which we expect to be fully compatible with the beta: a drop-in replacement and upgrade.

Read more →

Kafka Topics UI

7 Aug 2016 · by Antonios Chalkiopoulos · Read in about 3 min

Hey, check out the new version of Kafka Topics UI here.

Kafka is now the de facto platform for streaming architectures, and its ecosystem is maturing, but it is not yet as enterprise-ready as many people in Big | Fast Data would like it to be. Landoop is a London-based start-up that wants to drive Kafka faster into the future, and thus...

We are announcing kafka-topics-ui, a user interface that allows browsing data from Kafka topics, and a lot more.

Read more →

Schema Registry UI for Kafka

6 Aug 2016 · by Antonios Chalkiopoulos · Read in about 3 min

If you are looking for a safe way to interchange messages while using a fast streaming architecture such as Kafka, you need look no further than Confluent’s schema-registry. This simple, stateless microservice uses the _schemas topic to hold schema versions, can run in a single-master, multiple-slave architecture and supports multi-datacenter deployments.

We are happy to announce a UI, the schema-registry-ui, a fully featured tool for your underlying schema registry that allows visualization and exploration of registered schemas, and a lot more…

Read more →

Our Argos and Accenture presentation on Big and Fast Data

14 Jul 2016 · by Antonios Chalkiopoulos · Read in about 1 min

We want to thank @Argos, the third largest retailer in the UK, for inviting Landoop, and @Accenture for hosting our presentation in one of the most beautiful theaters in the world, the IMAX theater at the Science Museum, London.

View our presentation on Big to Fast Data: how #kafka and #kafka-connect can redefine your ETL, and a bit about #stream-processing.

Read more →

Confluent Platform 2.0.1 on Cloudera (CSD)

8 Jul 2016 · by Marios Andreopoulos · Read in about 5 min

Important: our Confluent CSD is deprecated and replaced by our most complete solution yet for a managed Kafka stack through Cloudera Manager, including monitoring, alerts and our exclusive UIs. See it here and request a trial today!

We are happy to announce the first version of our Confluent CSD.

Utilizing Landoop’s Confluent CSD you can create a Kafka Cluster with support services such as REST Proxy, Schema Registry and Kafka Connect in a few clicks.

Read more →

Confluent's Kafka Sink Connector to JDBC

10 Jun 2016 · by Antonios Chalkiopoulos · Read in about 5 min

In this article we show how Apache Kafka and the Confluent Platform can be used to stream data from Kafka into MySQL, PostgreSQL, Oracle or MS SQL Server. The new Kafka JDBC Sink Connector, built and certified in collaboration with Datamountaineer, provides powerful schematics and capabilities.

Read more →

Ansible Nginx Let's Encrypt Automation

7 Feb 2016 · by Marios Andreopoulos · Read in about 5 min

Automatic SSL certificate issuance and renewal with Ansible and Let’s Encrypt. Here at Landoop we prototype fast, and new (sub)domains are frequently added to complement our back-end and front-end services. Since the beginning, our specifications have included “SSL everywhere”. The journey into providing fully secure and encrypted services is a long one; hence we need an adventure in SSL land. The tools, the needs: we use Ansible to manage our servers.

Read more →