Customer Story

Real-time multi-lingual sentiment analysis for a top U.S. telecom provider


A major telecom company providing nationwide services wanted a system that performs real-time, multi-lingual classification and sentiment analysis of text data. The client was looking for a solution that could store, index, and query petabytes (PB) of data at very high throughput. The critical requirements were:

  • Ingest and parse a high volume of data [250M records (15 TB) per day] of varied types (for example, weblogs, email, chat, and files)
  • Apply real-time, multi-lingual classification and sentiment analysis with very high accuracy (four nines, or 99.99%)
  • Store metadata and raw binary data for querying
  • Meet a query SLA of 5 s on cold data
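
To put the ingest requirement in perspective, the daily figures can be converted into sustained per-second rates. The short sketch below does that arithmetic. It assumes decimal terabytes and perfectly uniform traffic across the day, neither of which is stated in the requirements, so real peak load would likely be higher. The function and variable names are illustrative only.

```python
# Figures quoted in the requirements above.
RECORDS_PER_DAY = 250_000_000
BYTES_PER_DAY = 15 * 10**12      # assumption: decimal terabytes (1 TB = 10^12 bytes)
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_load(records_per_day, bytes_per_day):
    """Average rates, assuming traffic is spread evenly over 24 hours."""
    return {
        "records_per_sec": records_per_day / SECONDS_PER_DAY,
        "mb_per_sec": bytes_per_day / SECONDS_PER_DAY / 10**6,
        "avg_record_kb": bytes_per_day / records_per_day / 10**3,
    }


load = sustained_load(RECORDS_PER_DAY, BYTES_PER_DAY)
for name, value in load.items():
    print(f"{name}: {value:,.1f}")
```

Under these assumptions the system must sustain roughly 2,900 records and 174 MB per second, with an average record size of about 60 KB.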


Gathr provided a solution with three modules:

  • Analytics Module: Performs text categorization and sentiment analysis. It implements a matrix decomposition-based text-classification algorithm in which each incoming test document passes through a series of pre-processing steps and numerical computations. Impetus designed the classifier to achieve very low latency.
  • Event Store/Indexer Abstraction Layer: Stores and indexes the information according to the configuration.
  • Publish Module: Publishes the analytical results or event data to external systems.
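
The story does not say which matrix decomposition the Analytics Module uses. As a rough illustration of the general idea, the sketch below uses truncated SVD (latent semantic analysis) as an assumed stand-in: training documents are pre-processed into a term-document matrix, the matrix is decomposed, and an incoming document is projected into the resulting low-dimensional space and assigned to the nearest class centroid. The toy corpus, labels, and function names are all hypothetical and are not taken from the customer deployment.

```python
import numpy as np

# Hypothetical labelled corpus, for illustration only.
TRAIN = [
    ("billing", "wrong charge on my bill"),
    ("billing", "refund my bill charge please"),
    ("network", "network signal keeps dropping calls"),
    ("network", "weak network signal dropping data"),
]


def tokenize(text):
    """Pre-processing step: lowercase and split on whitespace."""
    return text.lower().split()


VOCAB = sorted({w for _, doc in TRAIN for w in tokenize(doc)})
INDEX = {w: i for i, w in enumerate(VOCAB)}


def vectorize(text):
    """Bag-of-words count vector; words outside the vocabulary are ignored."""
    v = np.zeros(len(VOCAB))
    for w in tokenize(text):
        if w in INDEX:
            v[INDEX[w]] += 1.0
    return v


# Term-document matrix: one column per training document.
A = np.column_stack([vectorize(doc) for _, doc in TRAIN])

# Truncated SVD: keep the top-K left singular vectors as the latent basis.
K = 2
U, S, Vt = np.linalg.svd(A, full_matrices=False)
U_k = U[:, :K]


def project(text):
    """Map a document into the K-dimensional latent space."""
    return U_k.T @ vectorize(text)


# One centroid per class in the latent space.
LABELS = sorted({label for label, _ in TRAIN})
CENTROIDS = {
    label: np.mean([project(d) for lab, d in TRAIN if lab == label], axis=0)
    for label in LABELS
}


def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0


def classify(text):
    """Return the label whose centroid is most similar to the document."""
    z = project(text)
    return max(LABELS, key=lambda label: cosine(z, CENTROIDS[label]))


print(classify("refund bill charge"))
print(classify("weak signal dropping calls"))
```

One reason an approach like this can suit low-latency use is that the expensive decomposition happens offline. Classifying a new document needs only one small matrix-vector product and a handful of similarity comparisons.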


The solution delivered the following:

  • Rapid and accurate real-time text categorization and sentiment analysis
  • Adjustable text categorization for domain-specific classes
  • Multi-lingual support
  • Enhanced sentiment analysis to focus on feature-specific opinion mining
  • Linear scalability as nodes are added to the cluster
  • Provision to add custom components for additional functionality



      Meet Gathr.

      The only all-in-one data pipeline platform

      • One platform to do it all - ETL, ELT, ingestion, CDC, ML
      • Self-service, zero-code, drag-and-drop interface
      • Built-in DataOps, MLOps, and DevOps tools
      • Cloud-agnostic and interoperable

      Expert Opinion

      Gathr is an end-to-end, unified data platform that handles ingestion, integration/ETL (extract, transform, load), streaming analytics, and machine learning. It offers strengths in usability, data connectors, tools, and extensibility.

      Customer Speak

      Gathr helped us build “in-the-moment” actionable insights from massive volumes of complex operational data to effectively solve multiple use cases and improve the customer experience.

