Work on AI/NLP-based applications to support users of the Integrated Qualifications Register has been in progress for the past two years. The applications integrate data from various sources to help users search and explore qualifications and create personalised development paths. We talked to Marcin Będkowski (IBE) about defining goals, dangerous flexibility, the latest technology, high expectations, and overcoming barriers.
Interview with Marcin Będkowski, Lead Expert at the Educational Research Institute, Warsaw
Over the past two years, you have been involved in developing applications* aimed at supporting users of the Integrated Qualifications Register in making educational choices. Could you give us a brief introduction to these apps and comment on the background to their development?
Marcin Będkowski: Let me start by saying that I came to the Educational Research Institute (IBE) and joined the IQR (Integrated Qualifications Register) project in February 2019, when the project had already been running for a good few months. The general design guidelines of the applications were created much earlier and were a realization of the vision of the project leader, Marek Kopyt. At the beginning of 2019, the project was plagued by various delays, and probably the only person working on the topic at the time was Wojtek Stęchły with the support of Bartek Iwańczak and Andrzej Walczak.
The project application set a general direction for our work; we were looking for inspiration from other countries - including the Danish compass - and we also had a lot of operational freedom, especially in relation to what specific functions these apps would offer. This flexibility was quite convenient, but also dangerous. Convenient, because the applications were to use so-called artificial intelligence (AI) and natural language processing (NLP), which have been booming over the past few years, and as a result, the frontier of what is possible is shifting by the day. Dangerous, because various extremely interesting and creative ideas needed to be confronted with the range of available data and the capabilities offered by AI algorithms. And all this under a lot of time pressure due to the project realities.
It was certain that the applications would offer the ability to group qualifications - a very useful feature that would allow one to divide tens of thousands of qualifications into groups of similar elements, e.g. by industry: Education, IT, Agriculture, etc. Together with outstanding experts from the Institute of Computer Science of the Polish Academy of Sciences (IPI PAN) and Wrocław University of Science and Technology (PWr): Łukasz Kobyliński, Michał Marcińczuk, Tomasz Walkowiak, Ryszard Tuora, Mateusz Gniewkowski, Grzegorz Wojdyga, as contractors, we tested numerous methods and came up with quite an interesting and satisfactory final outcome.
In July 2019, my brother, Leopold, joined the project, and in September - Joanna Rabiega-Wiśniewska. They both contributed a lot of energy, knowledge and experience to our team, but even in this expanded group we would not have been able to carry out some of the operations. Much was accomplished with the support of other members of the IQR team who had been involved, for example, in collecting the descriptions of learning outcomes from universities. We were also very lucky with our contractors. I've already mentioned IPI PAN and PWr, but I must also mention Emplocity and Whiteaster - two companies that we worked with on the expert opinion and the preparation of the final applications themselves.
We sometimes jokingly refer to the applications as only semi-intelligent - meaning that automated artificial intelligence methods are only part of their value - and they owe much to the manual labour of many people. It is worth mentioning that they integrate data from many sources: regulations, occupations barometers, occupation cards, the Register of Schools and Educational Institutions, POL-on, ELA, etc. To some extent this data shows in the applications, and to some extent it was used as input for artificial intelligence algorithms. Many of these sources and ideas were suggested to us by our contractors - including Aleksander Nowak, who also developed an impressive micro-services architecture that combines all of these sources into a coherent whole.
Let me add that the term "semi-intelligent" is not, of course, meant to imply that the applications are half dumb. It's just about breaking away from a certain misconception that's associated with the popular notion of artificial intelligence, namely that it just happens: that there's some machine that works better than a human. This notion is dangerous insofar as it plays down the amount of human work it takes to train a model: collecting, processing and cleaning data, and then inspecting the results produced by the algorithms and refining them. Working with artificial intelligence is very often more like a painstaking process of teaching a clumsy infant than giving orders to an army of robots that will do everything for you.
In the context of the work on the applications, how would you define the weaknesses of the existing systems for browsing/searching qualifications? What are the potential benefits for the user? Who is the target audience of the applications?
MB: Metaphorically speaking, the integration of the qualifications system is about reconciling different worlds, bringing them, so to speak, to a common denominator and making their elements comparable. In the case of the Integrated System/Register of Qualifications these worlds are e.g. sectoral vocational education, higher education or qualifications awarded outside the formal education path. It is very difficult to create a mechanism that would collect up-to-date data and at the same time present information useful for users - and, still at the same time, would do it in an accessible and attractive form. This requires action on multiple fronts.
On the one hand, it seems perfectly natural and feasible to gather information about qualifications in one place: after all, there are databases of courses of study, for example those offered at a given university. On the other hand, however, it requires a lot of integration effort - concerning both the legal environment and the real flow of information.
So is this type of effort worth it? Gathering information from various sources in one place seems valuable in that it allows users of the system to get an idea of their educational options. The term "users of the system" may sound abstract, but we are talking about very real students, prospective students, their parents or career counselors, and ultimately, people who want to change their educational or career paths. Some of them want to find out, for example, which universities offer computer science or psychology degrees; others want to acquire competences in a new field as quickly as possible through courses or validation of existing skills.
It is to this end that various efforts are being undertaken to promote the idea of Lifelong Learning - not as an imposed requirement (although it is increasingly common for us to have to retrain), but rather as a palette of possibilities from which we can choose. The language of learning outcomes is one of the ways to make the skills associated with a certain level of education more tangible. This language, standardized in some ways, is only a step away from artificial intelligence mechanisms that turn the general idea of comparing qualifications into reality.
The clustering mechanism I mentioned earlier is arguably one of the most tangible realizations of the idea of comparing qualifications, of reducing them to a common denominator. Based on descriptions, such as learning outcomes, we can create interpretable groups of qualifications. One advantage of this mechanism, for example, is that it allows us to see which courses are similar to a given course. Because this mechanism does not take into account whether a course is offered by the same institution or whether courses belong to the same field of study, the results can be surprising. In a double sense: both revealing and, shall we say, intriguing.
Working on applications of this type was, and probably still is, a major technological challenge. Could you tell us a few words about the tools you used?
MB: Yes, it's true - I don't even know where or with whom to start. I will focus on three points, however.
First, gathering descriptions of the qualifications was a major challenge. We searched, collected and processed thousands of multi-page files to extract, for example, learning outcomes. We processed the 2019 core curriculum for sectoral vocational education, resolutions of university senates, etc. Generally we only had access to PDF files, and the information was contained in tables with variable and complex layouts.
Second, the selection of methods for grouping qualifications was also challenging. Grouping requires specifying a number of parameters, and consequently it can generate a great many combinations. As grouping - also known as clustering - falls into the unsupervised learning category, it is not easy to evaluate and interpret the results, that is, to determine which method produces the best outcome. Solving this problem required a lot of patience, commitment and ingenuity from us and our contractors from IPI PAN and PWr. A separate problem was the computational complexity of working with artificial intelligence methods - even using powerful computers, some operations could take dozens of hours. Conducting a series of experiments or training models could take a really long time.
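To illustrate why the choice of parameters matters so much in clustering, here is a minimal Python sketch. It is not the method used in the project: it assumes a simple single-link grouping with one parameter, a similarity threshold, and the qualification names and similarity scores are invented for the example.

```python
# Hypothetical pairwise similarities (0..1) between five toy qualifications.
# The numbers are invented for illustration; they are not project data.
SIMILARITY = {
    ("hairdresser", "hairdressing technician"): 0.90,
    ("furrier", "leather tanner"): 0.70,
    ("hairdresser", "furrier"): 0.45,
    ("hairdressing technician", "furrier"): 0.40,
    ("hairdresser", "leather tanner"): 0.30,
    ("hairdressing technician", "leather tanner"): 0.25,
    ("programmer", "hairdresser"): 0.05,
    ("programmer", "hairdressing technician"): 0.05,
    ("programmer", "furrier"): 0.05,
    ("programmer", "leather tanner"): 0.05,
}

ITEMS = ["hairdresser", "hairdressing technician", "furrier",
         "leather tanner", "programmer"]


def cluster(items, similarity, threshold):
    """Single-link grouping: two items share a group if a chain of pairs
    with similarity >= threshold connects them."""
    parent = {item: item for item in items}

    def find(x):
        # Follow parent links to the group representative (union-find).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in similarity.items():
        if score >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for item in items:
        groups.setdefault(find(item), []).append(item)
    return sorted(sorted(group) for group in groups.values())


# The same data yields a different grouping for each threshold value.
for threshold in (0.8, 0.6, 0.4):
    groups = cluster(ITEMS, SIMILARITY, threshold)
    print(threshold, len(groups), groups)
```

With these toy numbers the three thresholds give four, three and two groups respectively, and nothing in the data itself says which grouping is the right one. That is the evaluation problem in miniature.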
Third, a non-trivial task was to design the architecture of the application - especially of the so-called backend. Since our activities were going on in parallel with an upgrade of the registry, we were not able to use the data coming from this very source (through the so-called API). It became necessary to build databases from scratch and combine information from different sources. It is also worth noting that the number of links between qualifications needed to determine possible development paths is enormous, and our contractors had to use a graph database and the latest search engines to optimize this process.
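As a rough illustration of how development paths can be derived from links between qualifications, here is a minimal Python sketch. It uses an in-memory dictionary as a stand-in for a graph database, and the qualification names, levels and links are hypothetical; it is not the architecture built by the contractors.

```python
from collections import deque

# Hypothetical qualifications with illustrative framework levels.
LEVEL = {"A": 4, "B": 5, "C": 5, "D": 6, "E": 7}

# Hypothetical directed links: "holding X, one can go on to Y".
LINKS = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "E": []}


def development_paths(start, links, levels, max_steps=2):
    """Breadth-first enumeration of paths from `start` that never move to a
    lower level and contain at most `max_steps` links."""
    paths = []
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        if len(path) > 1:
            paths.append(path)
        if len(path) - 1 == max_steps:
            continue  # step limit reached, do not extend this path
        for nxt in links.get(path[-1], []):
            if nxt not in path and levels[nxt] >= levels[path[-1]]:
                queue.append(path + [nxt])
    return paths


for path in development_paths("A", LINKS, LEVEL):
    print(" -> ".join(path))
```

Even in this tiny graph the number of paths grows quickly as `max_steps` increases, which hints at why the full set of links between thousands of qualifications called for dedicated tooling.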
To what extent were you able to use off-the-shelf solutions, and to what extent would you describe your work as innovative?
MB: I think the novelty of our work could be judged using different criteria. Those could include: the technologies used, the focus on qualifications, the type of final product, i.e. a simple recommender system, or the process itself and the institutional environment in which we worked.
Although we tried to use state-of-the-art computer technology and latest artificial intelligence algorithms/models, I would be hard pressed to say with any confidence that our work was innovative, at least not in the sense of blazing new and completely unknown trails.
However, if we take all these criteria together, that is, the technology together with the domain and the institutional environment, I think we could be tempted to say that we have done innovative work on a European scale. We were able to see this during several study visits and pilot experiments we participated in. Work on comparing qualifications is taking place in various European institutions, and I think that the most interesting thing - achieving comparability of qualifications at the international level - is still ahead of us. This opinion, perhaps not humble enough, is also due to my assessment of the contractors we were lucky to work with. They were internationally recognized experts, involved in numerous innovative projects and providing solutions for the Polish language. Solutions that are truly innovative.
In your opinion, could your work on the applications "retroactively" affect the way qualifications are described? Could better refinement of qualification descriptions be a byproduct?
MB: Yes, definitely. It is, of course, difficult for me to say whether this will happen - i.e. whether the way of describing qualifications will be redefined - but several directions, in which the work on improving the methodology of describing qualifications could go, seem fairly obvious.
In the course of our work, we felt quite strongly the consequences of the fact that universities and lecturers do not give descriptions of learning outcomes due attention. This manifests itself, for example, in the fact that certain elements of the descriptions are repeated on a "copy-paste" basis. One effect of this is that qualifications from the same institution tend to cluster together - they are similar because of "stylistic" similarity, that is, because the same template wording is used.
But there are certainly more areas where a redefinition of how qualifications are described would be beneficial. For example, there is a disproportion in the number of learning outcomes used to describe qualifications. The range is from 4 to over 1,500, and the difference does not necessarily translate into the level of sophistication of the qualification and related skills.
The above observations do not have to immediately result in the introduction of substantial standardization and pushing descriptions into a template with specific limits on the number of learning outcomes or the number of characters. It seems necessary, however, to link descriptions more strongly to reality, to break with "templateism" and to link the features of the description with the features of the qualification itself. One such feature could be the number of sentences used to describe the learning outcomes which should correspond to the number of learning outcomes, or in simple terms, the skills.
Within the IQR2 project, we expect to gain more knowledge concerning the techniques of describing qualifications and processing texts. The project schedule includes the development of an application to support “privileged” users, such as the institutions describing qualifications and experts assigning the PQF (Polish Qualifications Framework) levels. Some of these processes can be automated, and some can be supported in one way or another: by offering prompts or giving access to a suitably adapted database of descriptions.
In terms of plans for the near future - how would you describe the next steps?
MB: Keeping the products updated and complete is of critical importance. Although we have collected a lot of ideas for further development of the applications, it is really hard to comment on the next steps at the moment. New project, new challenges.
On a lighter note, did any situations arise in the course of your work that you would describe as anecdotal?
MB: During one of the presentations, when we showed preliminary versions of the applications, a colleague noticed that the algorithm for determining the similarity of qualifications and grouping them produced strange “companions” for the qualification of a hairdresser - e.g. furrier and leather tanner. In the heat of the presentation, I didn't quite know how to explain this fact... I was convinced that there were better candidates in the qualifications system in terms of similarity to the hairdresser, e.g. hairdressing assistant or hairdressing technician. It was difficult to explain this fact based on the content of the qualifications system - everything seemed to indicate a malfunction of the algorithms used, e.g. setting the wrong value for the similarity threshold.
On review, it turned out that these better candidates were indeed among the selections, which reassured me as to how the algorithms work - apparently this result was influenced by the similarity of words such as hair, fur, skin, etc. found in the descriptions of all of the listed qualifications. The content of the qualifications database and the adopted parameters resulted in furrier and tanner being identified as similar to hairdresser.
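The effect described here, shared vocabulary pulling otherwise distant qualifications together, can be reproduced with a minimal Python sketch. It assumes a plain TF-IDF representation and cosine similarity, which may differ from the configuration actually used, and the short descriptions are invented for the example rather than taken from the register.

```python
import math
from collections import Counter

# Invented toy descriptions; not taken from the qualifications register.
DESCRIPTIONS = {
    "hairdresser": "washes cuts and styles hair cares for skin of the head",
    "furrier": "cuts and sews fur and skin cares for hair of the fur",
    "leather tanner": "cleans and tans skin removes hair from skin",
    "programmer": "writes and tests code designs software",
}


def tfidf_vectors(docs):
    """Map each document to a {word: tf * idf} dictionary."""
    tokenized = {name: text.split() for name, text in docs.items()}
    n_docs = len(tokenized)
    # Document frequency: in how many descriptions does each word occur?
    df = Counter(w for words in tokenized.values() for w in set(words))
    return {
        name: {w: count * math.log(n_docs / df[w])
               for w, count in Counter(words).items()}
        for name, words in tokenized.items()
    }


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dictionaries."""
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(name, vectors):
    """Other documents ranked by similarity to `name`, best first."""
    scores = [(other, cosine(vectors[name], vec))
              for other, vec in vectors.items() if other != name]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


VECTORS = tfidf_vectors(DESCRIPTIONS)
for other, score in most_similar("hairdresser", VECTORS):
    print(f"{other}: {score:.2f}")
```

In this toy data the furrier and the leather tanner score above zero against the hairdresser only because words such as "hair" and "skin" recur in their descriptions, while the programmer, who shares nothing but the word "and" (which carries zero weight because it occurs in every description), scores exactly zero.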
During one of the subsequent presentations, I referred to this example as an illustration of the relationship between the results and the structure of the database, and the algorithms used, pointing out the arbitrariness of some decisions related to the adopted representation of texts or the parameters we select. Much to my surprise, a group of career counselors said that the result was not really unexpected to them; rather, it was consistent with some theory.
Martin Achtnich, a Swiss psychologist and the creator of the Vocational Picture Test (VPT), identified a "W factor" associated with softness, sensitivity, and the need for physical contact, and represented, at the linguistic level, by verbs such as touch or comb. Consequently, according to his view, occupations such as furrier and hairdresser belong to the same group ...
Ania, Slawek, Jerzy, Magda - thanks!
* Note on the applications
- Sectoral Education Compass - an application that collects data on full qualifications from the 2019 vocational education core curriculum. In addition to the information from the Integrated Qualifications Register, it combines information about educational institutions, the projected demand for qualifications, and occupation descriptions from VET materials. Qualifications can be browsed using a compass-like graphical interface.
- Qualifications Compass - an application that collects information on full qualifications from sectoral and higher education, as well as on market partial qualifications. In addition to searching for information on qualifications, the application also helps users to explore hierarchical groups of qualifications created using artificial intelligence techniques (see p. 4). The application integrates data from POL-on (The Integrated System of Information on Science and Higher Education) and ELA (Polish graduate tracking system).
- Development Paths - an application that allows users to create their own, personalised development paths. The user can specify a starting point (e.g. a qualification from the 4th level of the Polish Qualifications Framework) and explore the possibilities of gaining further qualifications (e.g. from higher PQF levels). Two basic modules are available: one selects the qualifications available from a given point onward based on content similarity, the other based on formal requirements. The application also allows users to filter these qualifications by various criteria and description elements.
- An application for clustering qualifications - an experimental, technical application that allows users to create text corpora, e.g. descriptions of qualifications, to query them for linguistic features, and then to apply selected artificial intelligence techniques to the corpora. At the heart of the application is a clustering function taking various input parameters, e.g. text representation method (TF-IDF, fastText, WordNet etc.) and similarity measures (cosine, Euclidean etc.), and visualizing the results.