Customer Story

Real-time multi-lingual sentiment analysis for a top U.S. telecom provider

Challenges

A major telecom company providing nationwide services wanted a system that performs real-time, multi-lingual classification and sentiment analysis of text data. The client needed a solution that could store, index, and query petabytes (PB) of data at very high throughput. The critical requirements included:

  • Ingest and parse a high volume of data [250M records (15 TB) per day] of varied types (for example, weblogs, email, chat, and files)
  • Apply real-time multi-lingual classification and sentiment analysis with very high accuracy (four nines, that is, 99.99%)
  • Store metadata and raw binary data for querying
  • Meet a query SLA of 5 seconds on cold data
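To put the stated volumes in perspective, the short calculation below converts the daily figures into per-second rates. It is a back-of-the-envelope sketch that assumes decimal terabytes (1 TB = 10^12 bytes) and a perfectly even arrival rate across the day; real traffic would peak well above these averages. The variable names are illustrative only.

```python
# Figures taken from the requirements above.
RECORDS_PER_DAY = 250_000_000
BYTES_PER_DAY = 15 * 10**12        # 15 TB, assuming decimal terabytes
SECONDS_PER_DAY = 24 * 60 * 60

# Average sustained rates (assumes uniform arrival, which is optimistic).
records_per_second = RECORDS_PER_DAY / SECONDS_PER_DAY
megabytes_per_second = BYTES_PER_DAY / SECONDS_PER_DAY / 10**6

# Average size of a single record.
avg_record_kilobytes = BYTES_PER_DAY / RECORDS_PER_DAY / 10**3

# Days of ingestion needed to accumulate one petabyte (10^15 bytes).
days_per_petabyte = 10**15 / BYTES_PER_DAY

# "Four nines" leaves an error budget of 1 record in 10,000.
max_misclassified_per_day = RECORDS_PER_DAY / 10_000

print(f"{records_per_second:,.0f} records/s")
print(f"{megabytes_per_second:,.1f} MB/s")
print(f"{avg_record_kilobytes:,.0f} KB per record on average")
print(f"{days_per_petabyte:,.1f} days to reach 1 PB")
print(f"{max_misclassified_per_day:,.0f} misclassified records/day at 99.99%")
```

Under these assumptions the system has to sustain roughly 2,900 records per second, and even at 99.99% accuracy about 25,000 records per day may be misclassified.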

Solution

Gathr provided a solution with three modules:

  • Analytics Module: Responsible for performing text categorization and sentiment analysis. It implements a matrix decomposition-based text-classification algorithm. Each incoming document to be classified passes through a series of pre-processing steps and numerical computations. Impetus designed the classifier to achieve very low latency.
  • Event Store/Indexer Abstraction Layer: Responsible for storing and indexing the information based on the configuration.
  • Publish Module: Responsible for publishing the analytical result or event data to the external system.
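The way the three modules fit together can be sketched as follows. This is a minimal illustration, not the delivered implementation: every name in it (Event, analyze, InMemoryStore, make_store, run_pipeline) is hypothetical, the keyword lexicon is a trivial stand-in for the matrix decomposition-based classifier (which the case study does not detail), and the in-memory store stands in for whatever storage and indexing backend the configuration would select.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Protocol


@dataclass
class Event:
    """One analyzed record: the raw text plus the labels attached to it."""
    text: str
    category: str
    sentiment: str


# --- Analytics Module (stand-in) -------------------------------------
# Hypothetical keyword lexicon, used only to keep the sketch runnable.
POSITIVE = {"great", "good", "love"}
NEGATIVE = {"bad", "slow", "dropped"}


def analyze(text: str) -> Event:
    """Attach a category and a sentiment label to one document."""
    words = set(text.lower().split())
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    sentiment = "positive" if score > 0 else "negative" if score < 0 else "neutral"
    category = "billing" if "bill" in words else "general"
    return Event(text, category, sentiment)


# --- Event Store/Indexer Abstraction Layer ---------------------------
class EventStore(Protocol):
    """Interface every storage/indexing backend is assumed to satisfy."""
    def save(self, event: Event) -> None: ...
    def query(self, sentiment: str) -> List[Event]: ...


class InMemoryStore:
    """Toy backend that indexes events by sentiment label."""
    def __init__(self) -> None:
        self._by_sentiment: Dict[str, List[Event]] = {}

    def save(self, event: Event) -> None:
        self._by_sentiment.setdefault(event.sentiment, []).append(event)

    def query(self, sentiment: str) -> List[Event]:
        return list(self._by_sentiment.get(sentiment, []))


# The backend is chosen by configuration, not hard-coded in the pipeline.
STORE_BACKENDS = {"memory": InMemoryStore}


def make_store(config: Dict[str, str]) -> EventStore:
    return STORE_BACKENDS[config["store"]]()


# --- Publish Module --------------------------------------------------
def run_pipeline(texts: Iterable[str], store: EventStore,
                 publish: Callable[[Event], None]) -> None:
    """Analyze each document, persist the result, then publish it."""
    for text in texts:
        event = analyze(text)
        store.save(event)
        publish(event)


if __name__ == "__main__":
    store = make_store({"store": "memory"})
    run_pipeline(["I love the new plan", "My call dropped again"], store, print)
```

Keeping the store behind an interface is what lets the storage and indexing technology change through configuration alone, and passing the publisher in as a callable does the same for the external system that receives the results.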

Results

  • Rapid and accurate real-time text categorization and sentiment analysis
  • Adjustable text categorization for domain-specific classes
  • Multi-lingual support
  • Enhanced sentiment analysis to focus on feature-specific opinion mining
  • Linear scalability as nodes are added to the cluster
  • Provision to add custom components for additional functionality




