NLP is a branch of data science that enables automated processes to analyze and extract meaningful insights from human language. It can supplement manual processing, surface insights that might otherwise be missed, and eliminate low-value manual work. Done well, it can reduce the cost of many operations while improving the quality of results.
Our white paper, Gaining Greater Insights from Public Consultations with Data Science and Natural Language Processing, explores how data science techniques can be applied to public consultations, with a particular focus on how they can help analysts extract more information from a time-consuming task while reducing bias in the analysis. In practice, techniques such as topic modeling and named entity recognition help to extract the key features found in a text and highlight the context in which each was used. Entity linking then connects specific entities to a knowledge graph to obtain additional information such as definitions, aliases, and conceptual categories. This also places entities in context by creating connections and associations, taking permutations and synonyms into account. Bias reduction follows from this process: contributions from individuals who use less frequent terms or keywords are still taken into account in the analysis. We can learn from opinions and texts that are usually excluded because they fall below an exact frequency threshold, and make sure they are addressed where appropriate. This ensures that more voices are heard, yielding fairer and deeper results than were previously possible with traditional techniques.
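To make the entity-linking idea above concrete, here is a minimal sketch, assuming a toy hand-built knowledge graph and hypothetical consultation text (a real system would use a trained NER model and a large knowledge base; the entities, aliases, and categories below are illustrative only):

```python
# Illustrative sketch: link surface mentions in a consultation response to a
# tiny "knowledge graph" that records aliases and conceptual categories.
# All data here is hypothetical and hand-built for demonstration.

# Toy knowledge graph: canonical entity -> aliases and category.
KNOWLEDGE_GRAPH = {
    "NHS": {"aliases": ["NHS", "National Health Service"], "category": "organization"},
    "Manchester": {"aliases": ["Manchester"], "category": "location"},
}

def link_entities(text: str) -> dict:
    """Match known aliases in the text and return canonical entities with context."""
    links = {}
    for entity, meta in KNOWLEDGE_GRAPH.items():
        for alias in meta["aliases"]:
            if alias in text:
                links[entity] = {"matched_alias": alias, "category": meta["category"]}
                break  # first matching alias is enough
    return links

response = "The National Health Service should expand services in Manchester."
print(link_entities(response))
```

Because matching happens against aliases rather than exact keywords, a respondent who writes "National Health Service" is counted alongside those who write "NHS", which is exactly the synonym-aware counting that helps less frequent phrasings contribute to the analysis.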
The keywords, organizations, and people an audience cites, along with the general sentiment in responses to open-ended questions, can vary across demographic, economic, and geographic groups. By knowing the data better, it is possible to account for any potential under-representation when building a model and testing the algorithm, and to verify that minority groups are not adversely affected. Standardizing text and reducing human bias can help increase diversity in consultation responses while improving government policies and responses.
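One way to check for variation across groups, as described above, is to aggregate a sentiment score per group. The sketch below uses a tiny hypothetical word lexicon and made-up survey data; a real analysis would use a proper sentiment model and genuine consultation responses:

```python
# Illustrative sketch: compare average sentiment of open-ended responses across
# groups, so differences between groups can be spotted before modelling.
# The lexicon and survey data are hypothetical.
from collections import defaultdict

POSITIVE = {"good", "helpful", "support", "improve"}
NEGATIVE = {"bad", "unfair", "worse", "oppose"}

def sentiment_score(text: str) -> int:
    """Count positive minus negative lexicon words in the text."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def sentiment_by_group(responses):
    """responses: iterable of (group, text) pairs -> {group: mean score}."""
    totals, counts = defaultdict(int), defaultdict(int)
    for group, text in responses:
        totals[group] += sentiment_score(text)
        counts[group] += 1
    return {g: totals[g] / counts[g] for g in totals}

survey = [
    ("18-30", "the proposal is good and will improve transport"),
    ("18-30", "this seems unfair to students"),
    ("65+", "bad idea because services will get worse"),
]
print(sentiment_by_group(survey))  # e.g. {'18-30': 0.5, '65+': -2.0}
```

Comparing per-group averages like this, alongside per-group response counts, makes under-represented groups and divergent sentiment visible early, before any model is trained on the data.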
NLP has also been used to reduce bias in other areas, for example in skills matching during the recruitment process in the United States. While this application may be controversial, the goal is to use NLP to standardize the skills listed in resumes and successfully match people from different backgrounds to job postings. Testing the algorithm is a vital part of this process, ensuring the tool is not used to make automatic decisions about candidates.
As the example above shows, ethics play an important role in the field of AI. I am a strong advocate of using AI to augment human processes rather than replace them, allowing for more scrutiny of feedback rather than less.
Felicia Zepparro is a Principal Data Scientist with Methods Analytics and a finalist in the Team Leader of the Year category at the upcoming Women in Technology Excellence Awards.