Thematic Analysis: How Machines Learn to Understand Semantics

Thematic Analysis focuses on collecting and analyzing text, document, image or audio data to understand concepts, opinions, or experiences. This is sometimes called Classification, Entity Extraction or Topic Modeling although in practice these are separate functions (service) that make use of different algorithms for quantitative analysis. Classification relies on clustering techniques, that can automatically identify key terms or phrases (features) that describe a document or text. Classified texts may them be used as data dictionaries or labeled sets to train other analytic services. Topic Modeling and related services make use of neural networks and deep learning techniques to perform specialized text analysis that can discover similar terms, related phrases (engrams) and semantic relationships between words.

Thematic Analysis is often used to gain deeper insight into a problem, identify themes, synonyms and key phrases or to generate new ideas for research. Understanding human semantics cognitive AI can discover meaning and context by analyzing large volumes of related text, its language and sentence structure to build Semantic Graphs (semagraphs). Such specialized knowledge graphs may be generated using a cognitive automation process without human interaction. Re-computing of the semagraphs may also be triggered by changes in critical information or the arrival of new data; a process known as unsupervised learning.

However, given the highly specialized nature of thematic analysis, it may be desirable to create knowledge graphs that are domain-specific. For example using medical research documents specific to cancer or vaccine efficacy trials can produce very specialized semagraphs linked to a focused area of research, that understand the vocabulary of a given domain. In cancer research, for instance the term "life saving" means something very different than in vaccine research. The primary goal of cancer treatment at this time is to extend human life by some significant measure. Whereas the goal of vaccines is to reduce symptoms and eliminate fatal outcomes. Yet the machine can be trained to understand such subtle, contextual differences quite easily. And it can also be trained to easily sort out documents and texts related to vaccines from those that discuss cancer research, compare such documents to similar texts; and even identify document with overlapping interests, such those theoretical works that discuss cancer vaccines. How is this done?

Feature Engineering


Synonymy + Hyponymy


The Future of Thematic Analytics

4 views0 comments

Recent Posts

See All

Contact Us


  • Google+ Social Icon
  • Facebook Social Icon
  • LinkedIn Social Icon
  • Twitter Social Icon

© 2021 StreamScape Technologies