Problem background:

The client is an online cyber security startup based in Australia. Its goal is to make the internet a more civil, inclusive and open space where everyone can express themselves freely.

Underlying tech:

Computational linguistics is a branch of linguistics that covers the analysis and computational modelling of natural language. In this use case we focus on one of its sub-branches, sentiment analysis. Sentiment analysis, also known as opinion mining, applies methods from natural language processing, text analysis, computational linguistics and occasionally biometrics to systematically identify, extract, quantify and report on the opinions and attitudes that users express. The most common classification is positive, negative or neutral. In this project, however, the models provide deeper insight by identifying the specific type of toxicity, e.g. racist, sexist, disability-related or religion-based abuse.

Project objective:

Toxicity detection, Part 1: detect abuse and insults, and classify the type of toxicity, such as racism, gender bias or abuse targeting neurodiversity. This breakdown then informs the design of actions, alerts and automated blocking of offending users.
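
A minimal sketch of how per-category scores could be turned into the actions, alerts and blocking mentioned above. The category names follow the examples in this write-up; the score values, the thresholds and the `decide_action` helper are hypothetical and are not taken from the delivered system.

```python
from typing import Dict, List, Tuple

# Hypothetical per-category thresholds; real values would be tuned on validation data.
THRESHOLDS: Dict[str, float] = {
    "racism": 0.5,
    "gender_bias": 0.5,
    "neurodiversity": 0.5,
}
BLOCK_SCORE = 0.9  # hypothetical cut-off for automated blocking


def decide_action(scores: Dict[str, float]) -> Tuple[List[str], str]:
    """Return the flagged toxicity types and a suggested action.

    `scores` maps a category name to a model probability in [0, 1].
    """
    flagged = sorted(
        name for name, score in scores.items()
        if score >= THRESHOLDS.get(name, 0.5)
    )
    if not flagged:
        return flagged, "allow"
    if max(scores[name] for name in flagged) >= BLOCK_SCORE:
        return flagged, "block"
    return flagged, "alert"


if __name__ == "__main__":
    # Made-up scores for illustration only.
    print(decide_action({"racism": 0.95, "gender_bias": 0.10, "neurodiversity": 0.62}))
```

Keeping the thresholds in one table means the alerting policy can be adjusted without retraining the model.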

How AI can help with cyber security:

  • Automated comment moderation for public forums and social media.
  • Dating apps can flag one-sided obscene messaging and ban the creeps.
  • Companies can ensure healthy and inclusive communication channels.

How can you use these deep learning models for sentiment analysis?

  • A. Pretrained models are wrapped as microservices and callable through an API, ready to deploy and integrate into your existing apps, websites and systems.
  • B. Available with a basic UI where a user can upload text and see the filtered version, for demos or testing before integration.
  • C. Customizable for deeper insights. The output can also be integrated into dashboards that the TotemX team can help you ideate and conceptualize.
  • D. Flexible machine learning structure that is easily retrainable, so models can be updated for different classifications or analytical viewpoints.
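
As an illustration of option A, a client application might call the microservice roughly as follows. This is only a sketch: the endpoint URL, the `{"text": ...}` request body and the `{"scores": {...}}` response shape are assumptions made for the example, not the documented interface of the service.

```python
import json
import urllib.request
from typing import Dict

# Hypothetical endpoint; the real service URL and JSON schema are not given here.
API_URL = "http://localhost:5000/classify"


def build_request(text: str, url: str = API_URL) -> urllib.request.Request:
    """Build a JSON POST request carrying the text to be classified."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_response(body: bytes) -> Dict[str, float]:
    """Extract per-category scores from an assumed {"scores": {...}} response."""
    data = json.loads(body.decode("utf-8"))
    return {name: float(score) for name, score in data["scores"].items()}


def classify(text: str, url: str = API_URL, timeout: float = 5.0) -> Dict[str, float]:
    """Send the text to the service and return its per-category scores."""
    with urllib.request.urlopen(build_request(text, url), timeout=timeout) as resp:
        return parse_response(resp.read())
```

Separating request building and response parsing from the network call lets an integrating team test both without a running service.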

What is included:

  • A ready-to-deploy web API service plus a basic test UI
  • Fully documented functionality in a GitHub repo
  • Option to extract data for analysis and further investigation

AI Tech stack:

  • State-of-the-art (SOTA) deep-learning-based NLP models, custom trained for the use case
  • Optimized response times for edge devices
  • Flask API that scales with traffic
  • Deployment on AWS EC2, using async processing for faster response times and easy traffic-driven scaling
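
One way async processing can shorten response times is to run blocking model inference off the event loop, so several texts are scored concurrently. The sketch below uses only Python's standard `asyncio`; how the deployed service actually implements this is not described here, and `score_text` is a hypothetical stand-in for the real model call.

```python
import asyncio
from typing import Dict, List


def score_text(text: str) -> Dict[str, float]:
    """Stand-in for blocking model inference; a real model call would go here."""
    # Hypothetical placeholder: flags a text only if it contains the word "idiot".
    return {"insult": 1.0 if "idiot" in text.lower() else 0.0}


async def score_batch(texts: List[str]) -> List[Dict[str, float]]:
    """Score texts concurrently, keeping the event loop free for other requests."""
    loop = asyncio.get_running_loop()
    # run_in_executor moves each blocking call onto the default thread pool.
    tasks = [loop.run_in_executor(None, score_text, text) for text in texts]
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    print(asyncio.run(score_batch(["Have a nice day", "You idiot"])))
```

`asyncio.gather` returns results in the same order as the input texts, so each score can be matched back to its message.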

Future development:

  • Monitoring underlying trends with analytical dashboards.
  • Generative AI to produce automated counter-responses to offending users, suggesting how to realign their communication with accepted standards and policies.

Are you interested in:

  • Wider applications of toxicity detection, sentiment analysis or NLP?
  • Unlocking business value from a large unstructured text corpus, natural language text or voice data? TotemX can streamline your text data pipelines and distill structured knowledge via NLP analysis. We have a wide spectrum of tools, from simple statistical measures to cutting-edge deep learning techniques.
  • Designing more efficient pipelines for your text data?
  • Building an MVP for web or mobile?
  • Deploying on AWS or GCP?

Get in touch!