
Using NLP and machine learning to classify companies by sector

Justin Holm

Updated: Aug 12, 2020

Starting with a data set of companies listed on the TSX Venture Exchange in Toronto, the following outlines an analysis that uses natural language processing and machine learning, in the hope of getting a better understanding of which industries these companies operate in.


For each company we have the description and also the sector of the economy that it belongs to. A model trained to predict the sector from the description performed at 86% accuracy, which shouldn't be very surprising. The defined sectors were Basic Materials, Communication Services, Consumer Cyclical, Consumer Defensive, Energy, Financial Services, Healthcare, Industrials, Real Estate, Technology and Utilities.
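As a rough illustration of this kind of setup, here is a minimal sketch of a text classifier that maps a company description to a sector. The choice of TfidfVectorizer and LogisticRegression is an assumption made for the example, not necessarily what was used in this analysis, and the descriptions below are invented toy data rather than real listings.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy data: the real data set has one description
# and one sector label per listed company.
descriptions = [
    "Explores and develops gold and copper mineral properties",
    "Junior mining company drilling for lithium and zinc deposits",
    "Develops a cloud software platform for enterprise data analytics",
    "Provides mobile software and machine learning tools to businesses",
    "Clinical stage company developing drugs for cancer treatment",
    "Develops medical devices and therapies for patients",
]
sectors = [
    "Basic Materials", "Basic Materials",
    "Technology", "Technology",
    "Healthcare", "Healthcare",
]

# Assumed model: TF-IDF word features fed into a linear classifier.
model = make_pipeline(
    TfidfVectorizer(stop_words="english"),
    LogisticRegression(max_iter=1000),
)
model.fit(descriptions, sectors)

# Predict the sector of an unseen (also invented) description.
new_description = ["Exploration company with gold mining claims"]
predicted = model.predict(new_description)
print(predicted[0])
```

On a real data set the accuracy would be measured on descriptions held out from training, for example with a train/test split or cross-validation.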


The interesting results appeared where the predicted classification did not match the company's stated industry. Here are some examples:

  • Noble Iron: is listed under Industrials as a business services company. The prediction was Technology, which is actually more appropriate given that it provides a cloud technology platform to industrial companies.

  • Voyageur Pharmaceuticals Ltd: is listed as a Healthcare company; however, the prediction was Basic Materials, as its core business is to extract minerals which are made into health products.

  • Else Nutrition Holdings: is listed as Consumer Defensive; however, the prediction was Healthcare. The differentiation is likely between a nutrition product and a medicine.

  • DIAGNOS Inc: is listed as a Healthcare company; however, the algorithm classifies it as a Technology company. The company builds software which can help detect anomalies in images.

  • A.I.S. Resources Limited: is listed as a Basic Materials company, but it is really a venture capital firm.
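Disagreements like the ones listed above can be surfaced by comparing each company's listed sector with a prediction made by a model that did not see that company during training. The sketch below does this with cross_val_predict; the model choice, the company names and the descriptions are all hypothetical and only illustrate the idea.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.pipeline import make_pipeline

# Hypothetical companies: (name, description, listed sector).
companies = [
    ("Alpha Gold", "Explores gold and copper mineral properties", "Basic Materials"),
    ("Beta Lithium", "Mining company drilling lithium deposits", "Basic Materials"),
    ("Gamma Zinc", "Develops zinc and silver mining projects", "Basic Materials"),
    ("Delta Cloud", "Cloud software platform for data analytics", "Technology"),
    ("Epsilon Apps", "Mobile software and machine learning tools", "Technology"),
    ("Zeta Vision", "Software that detects anomalies in images", "Technology"),
    ("Eta Pharma", "Develops drugs for cancer treatment", "Healthcare"),
    ("Theta Med", "Medical devices and therapies for patients", "Healthcare"),
    ("Iota Health", "Extracts minerals used in health products", "Healthcare"),
]
names = [name for name, _, _ in companies]
descriptions = [text for _, text, _ in companies]
sectors = [sector for _, _, sector in companies]

# Assumed model, chosen for illustration only.
model = make_pipeline(
    TfidfVectorizer(stop_words="english"),
    LogisticRegression(max_iter=1000),
)

# Each prediction comes from a model fitted on the other folds,
# so no company is scored by a model that was trained on it.
predicted = cross_val_predict(model, descriptions, sectors, cv=3)

# Keep only the companies whose predicted sector differs from the listing.
mismatches = [
    (name, listed, guess)
    for name, listed, guess in zip(names, sectors, predicted)
    if listed != guess
]
for name, listed, guess in mismatches:
    print(f"{name}: listed as {listed}, predicted {guess}")
```

The resulting list is a starting point for manual review: some mismatches are model errors, while others point to a listing that no longer describes the business well.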

It is very difficult to define companies cleanly by industry: there is significant overlap between industries, and the boundaries shift as the economy evolves.


Next steps:

  • Further analysis at a more granular level

  • Try with other markets

  • Incorporate financial data

SciKit Learn packages used:




