Customer Story

Real-time multi-lingual sentiment analysis for a top U.S. telecom provider


A major telecom company providing nationwide services wanted a system that performs real-time, multi-lingual classification and sentiment analysis of text data. The client needed a solution that could store, index, and query petabytes (PB) of data at very high throughput. The critical requirements were:

  • Ingest and parse a high volume of data [250M records (15 TB) per day] of varied types (for example, weblogs, email, chat, and files)
  • Apply real-time multi-lingual classification and sentiment analysis with very high accuracy (four nines, that is, 99.99%)
  • Store metadata and raw binary data for querying
  • Meet a query SLA of 5 seconds on cold data
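
To put the ingest and accuracy requirements above in per-second terms, here is a rough back-of-envelope calculation. It is only a sketch: it assumes decimal units (1 TB = 10^12 bytes) and a perfectly even arrival rate across the day, neither of which the requirements state, and the variable names are illustrative.

```python
# Back-of-envelope sizing from the stated requirement: 250M records (15 TB) per day.
# Assumptions (not from the source): decimal units and a uniform arrival rate.
RECORDS_PER_DAY = 250_000_000
BYTES_PER_DAY = 15 * 10**12
SECONDS_PER_DAY = 24 * 60 * 60

records_per_second = RECORDS_PER_DAY / SECONDS_PER_DAY
megabytes_per_second = BYTES_PER_DAY / SECONDS_PER_DAY / 10**6
avg_record_kb = BYTES_PER_DAY / RECORDS_PER_DAY / 10**3

# "Four nines" accuracy leaves an error budget of 1 record in 10,000.
max_errors_per_day = RECORDS_PER_DAY // 10_000

print(f"{records_per_second:,.0f} records/s")        # 2,894 records/s
print(f"{megabytes_per_second:,.1f} MB/s")           # 173.6 MB/s
print(f"{avg_record_kb:,.0f} KB per record")         # 60 KB per record
print(f"{max_errors_per_day:,} misclassified/day")   # 25,000 misclassified/day
```

Real traffic is rarely uniform, so peak rates would likely be well above these averages.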


Gathr provided a solution with three modules:

  • Analytics Module: Responsible for performing text categorization and sentiment analysis. It implements a matrix decomposition-based text-classification algorithm. Each incoming document passes through a series of pre-processing and numerical computation steps. Impetus designed the classifier to achieve very low latency.
  • Event Store/Indexer Abstraction Layer: Responsible for storing and indexing the information based on the configuration
  • Publish Module: Responsible for publishing analytical results or event data to external systems
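
The story does not say which matrix decomposition the Analytics Module uses. As a minimal sketch of the general idea, the example below assumes a truncated SVD (latent semantic analysis) followed by nearest-centroid matching in the latent space. The training texts, class labels, stop-word list, and all function names are hypothetical and chosen only for illustration; this is not the production classifier.

```python
import numpy as np

# Hypothetical toy training set; labels and texts are invented for illustration.
TRAIN = [
    ("billing", "my bill is wrong and the invoice charge is too high"),
    ("billing", "refund the charge on my invoice"),
    ("network", "no signal and the network keeps dropping calls"),
    ("network", "slow network and weak signal at home"),
]
STOPWORDS = {"a", "and", "at", "is", "my", "on", "the", "too"}

def tokenize(text):
    """Pre-processing step: lowercase, split on whitespace, drop stop words."""
    return [tok for tok in text.lower().split() if tok not in STOPWORDS]

vocab = sorted({tok for _, text in TRAIN for tok in tokenize(text)})
index = {tok: i for i, tok in enumerate(vocab)}

def vectorize(text):
    """Bag-of-words count vector; out-of-vocabulary tokens are ignored."""
    vec = np.zeros(len(vocab))
    for tok in tokenize(text):
        if tok in index:
            vec[index[tok]] += 1.0
    return vec

# Term-document matrix: one column per training document.
A = np.column_stack([vectorize(text) for _, text in TRAIN])

# Truncated SVD: keep only the k strongest latent dimensions.
k = 2
U, s, Vt = np.linalg.svd(A, full_matrices=False)
U_k = U[:, :k]

def project(text):
    """Fold a document into the k-dimensional latent space."""
    return U_k.T @ vectorize(text)

# One centroid per class, computed in the latent space.
labels = sorted({label for label, _ in TRAIN})
centroids = {
    label: np.mean([project(t) for l, t in TRAIN if l == label], axis=0)
    for label in labels
}

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def classify(text):
    """Return the label whose centroid is most similar to the document."""
    z = project(text)
    return max(labels, key=lambda label: cosine(z, centroids[label]))

print(classify("weak signal on the network"))  # network
print(classify("refund my invoice charge"))    # billing
```

Because the decomposition is computed once, offline, classifying a new document costs one small matrix-vector product plus a few cosine comparisons, which is one plausible way such a design could keep latency low. The same pattern could be applied to sentiment labels instead of topic labels.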


The solution delivered the following capabilities:

  • Rapid and accurate real-time text categorization and sentiment analysis
  • Adjustable text categorization for domain-specific classes
  • Multi-lingual support
  • Enhanced sentiment analysis to focus on feature-specific opinion mining
  • Linear scalability as nodes are added to the cluster
  • Provision to add custom components for additional functionality

