StreamAnalytix (now known as Gathr) now includes new capabilities to drive further enterprise adoption of Apache Spark; Impetus also presenting a tech talk at the Spark + AI Summit

LOS GATOS, Calif., June 5, 2018 – Impetus Technologies, a big data software products and services company, today unveiled a number of enhancements to StreamAnalytix (now known as Gathr), a powerful visual platform for unified streaming and batch data processing based on best-of-breed open source technologies. Numerous Fortune 500 companies are making their real-time enterprise architecture a reality with StreamAnalytix (now known as Gathr).

StreamAnalytix (now known as Gathr) simplifies the use of Apache Spark, and is designed to help organizations address the rising demand for Spark development talent. With an intuitive visual integrated development environment (IDE), StreamAnalytix (now known as Gathr) enables even those with limited development experience to build and operationalize Spark applications end to end.

Developers are invited to try StreamAnalytix (now known as Gathr) by downloading StreamAnalytix (now known as Gathr) Lite, a free development tool and compact version of the platform, on their desktop. StreamAnalytix (now known as Gathr) Lite is less than 1 gigabyte on disk. Developers can also sign up for a free 7-day trial of an Apache Spark-only version of StreamAnalytix (now known as Gathr) in the cloud.

“Apache Spark is already the de facto standard for stream processing. Advancements like Structured Streaming have made Spark-streaming even more powerful,” said Anand Venugopal, head of StreamAnalytix at Impetus. “Now all of these capabilities are included in, and supported by, StreamAnalytix (now known as Gathr) along with a visual drag-and-drop interface, an exhaustive set of pre-built Spark operators, and full application lifecycle support – enabling developers to realize the full potential of enhancements to Apache Spark with unprecedented ease.”

The most recent enhancements to StreamAnalytix (now known as Gathr) include:

  • Full support for Spark Structured Streaming: With Structured Streaming, StreamAnalytix (now known as Gathr) now enables continuous applications by exposing a single API for writing both streaming and batch queries. It handles streaming complexities by ensuring exactly-once semantics, performing incremental aggregations, and providing data consistency across sources and sinks.
  • Late data handling and watermarking: Allows handling of delayed data by maintaining intermediate calculations; as new data arrives, aggregates are updated based on the time windows specified. These time windows can be defined by watermarking specific time intervals.
  • 5X faster performance: Enables significantly faster processing with Spark Structured Streaming as the underlying technology.
  • Auto Schema Detection: Automates the creation of schema within pre-built operators. Data can be accessed from a data storage system, or configured from a source such as Kafka, JDBC and more. StreamAnalytix (now known as Gathr) then automatically examines each field and assigns a data type to that field based on the values within the data to enable the identification of columns.
  • Auto pipeline inspect: Allows the use of a data inspect feature during pipeline development, and before and after the use of every individual operator for an end-to-end view of data transformation at every step.
  • Real-time event monitoring: Users now receive all performance metrics in real-time, and can keep a continuous watch on their application data pipelines, as well as the cluster infrastructure on both local and cloud-based environments.
  • Support for Apache Spark 2.2 and Hadoop 2.7.3
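
To illustrate the late data handling and watermarking item above, here is a minimal plain-Python sketch of the general idea. It is not StreamAnalytix or Spark code: the class name, the 60-second window, and the 120-second allowed delay are hypothetical choices for illustration, and the watermark here advances on every event, whereas a real engine typically advances it between micro-batches.

```python
from collections import defaultdict


class WindowedCounter:
    """Counts events per tumbling window and rejects events that arrive too late.

    Hypothetical illustration only; times are integer seconds.
    """

    def __init__(self, window=60, delay=120):
        self.window = window            # tumbling window length (assumed value)
        self.delay = delay              # how long to wait for late events (assumed value)
        self.max_event_time = 0         # latest event time seen so far
        self.counts = defaultdict(int)  # window start -> running count
        self.dropped = 0                # events rejected as too late

    def watermark(self):
        # Event time before which windows are considered final.
        return self.max_event_time - self.delay

    def add(self, event_time):
        window_start = event_time - (event_time % self.window)
        # A window that ended at or before the watermark no longer accepts data.
        if window_start + self.window <= self.watermark():
            self.dropped += 1
            return False
        # Otherwise update the intermediate aggregate, even if the event is late.
        self.counts[window_start] += 1
        self.max_event_time = max(self.max_event_time, event_time)
        return True


counter = WindowedCounter()
for t in (10, 70, 300, 200, 20):
    counter.add(t)
```

In this run, the event at time 200 arrives after the event at time 300 but is still counted, because its window closes after the watermark; the event at time 20 is rejected because its window closed before the watermark.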

Impetus also announced that it will present a tech talk titled “Leveraging Spark Machine Learning for Real-time Credit Card Approvals” at the Spark + AI Summit. The session will describe how StreamAnalytix (now known as Gathr) leveraged Spark Streaming and Spark Machine Learning (ML) models to build and operationalize real-time credit card approvals for a major bank. It will include a deep dive into the Spark-based ML capabilities used, as well as how a typical ML pipeline looks. The 30-minute session will be presented by Impetus’ Anand Venugopal, product head, and Saurabh Dutta, technical architect, on Tuesday, June 5, at 3:30 p.m. Pacific time in room 2000.

Visit the StreamAnalytix (now known as Gathr) booth #209 at Moscone West Convention Center for more information on how StreamAnalytix (now known as Gathr) can help accelerate the development of Spark applications. Impetus’ experts will showcase how users with very little Spark development experience can use the visual IDE and drag-and-drop features of StreamAnalytix (now known as Gathr) to build and operationalize a Spark pipeline in minutes.

About Impetus Technologies

Impetus Technologies is focused on creating big business impact through big data solutions for Fortune 1000 enterprises. The company offers a unique mix of software products, consulting services, data science capabilities and technology expertise. It offers full life-cycle services for big data technology implementations, including technology strategy, solution architecture, proof of concept, production implementation and ongoing support to its clients. To learn more, follow us on Twitter and LinkedIn.