Text mining and machine learning

InfinIT invites you to a seminar where we will look at how we use machine learning and natural language processing today, and what the future holds.

Companies like Google, Apple, Microsoft, Amazon and IBM already apply machine learning and text mining on a large scale as part of their core business. At this meeting we will present the potential of machine learning and natural language processing.

The digital transformation has changed the way we process data and has enabled us to develop better products. With machine learning and other data-driven methods we can extract information from data and use it to automate processes that would otherwise have to be performed manually, and we can develop products that better meet customer needs.

Natural language processing (NLP) in particular has had a major impact on search engines, machine translation, and sentiment analysis, for example to determine whether reviews are positive or negative. In the future, these technologies will not only be improved; we will also be able to design more intelligent systems that better “understand” the content of texts.

For example, we can extract the people mentioned in a text and identify whether they are mentioned elsewhere (e.g. in newspaper articles). Another example is a chat function that interacts directly with users and answers their questions.

Target audience
The target audience includes, for example, law firms, tax authorities, intelligence agencies, patent-holding industries (e.g. pharma), insurance companies, recruitment agencies and news agencies.

NLP - a lifebuoy in a sea of bytes

Anders Søgaard, Professor with special responsibilities, Centre for Language Technology, University of Copenhagen

Natural language processing (NLP) is, essentially, software for mapping bytes to meaning. NLP can provide us with more adequate aggregates of information and help us detect events more robustly. Without NLP, information retrieval, speech technology and the like will be heavily biased. We introduce NLP as a machine learning problem, with a focus on state-of-the-art developments in knowledge extraction (fishing facts) and fraud detection (fighting fakes). Both have countless applications in law, insurance, and media monitoring.

Automatic Text Analysis – a practical view on document classification with today's solutions 

Brian Jacobsen, CTO, Taxon ApS

Everybody wants to do machine learning, but not everybody has the resources of Google, Apple, Facebook or Microsoft. So, what to do? When it comes to classifying texts such as emails and web or intranet pages with respect to a given structure such as KLE or FORM, there are quite a few considerations to take into account when choosing a solution. Do you already have a taxonomy? How many resources does a solution require from you? How do you get started? Errors are bound to happen; how do you correct them? There are always exceptions to the rules; how do you handle them? Once the documents have been classified, how can you use the classification to generate value?


Finding sensitive information in text data

Jan Neerbek, IT Solutions Architect, Data Science and Engineering Lab, the Alexandra Institute

In this talk we take a look at finding secret or sensitive data in natural language text. Sensitive information may take different forms: it can be personal information (e.g. health history or various kinds of interaction logs) or business secrets (e.g. research or processes). The talk deals with state-of-the-art natural language processing methods for finding sensitive information and how we apply these methods in one of our partner projects.


Time and place

Date:  2 June 2016
Time:  14:00 - 16:00
Venue:  IT University of Copenhagen, Rued Langgaards Vej 7, 2300 København S, Auditorium 4
Price:  Free of charge. Please note that you will be charged a no-show / same-day cancellation fee of DKK 200 excl. Danish VAT.
Contact name:  Merete Carlson
Contact e-mail:
Registration deadline:  1 June 2016


InfinIT is funded by a grant from Styrelsen for Forskning og Innovation and is run by a consortium consisting of:
Alexandra Instituttet . BrainsBusiness . CISS . Datalogisk Institut, Københavns Universitet . DELTA . DTU Compute, Danmarks Tekniske Universitet . Institut for Datalogi, Aarhus Universitet . IT-Universitetet . Knowledge Lab, Syddansk Universitet . Væksthus Hovedstadsregionen . Aalborg Universitet