Deep Learning
Indaba X - Zambia 2021
Lighton Phiri <lighton.phiri@unza.zm>
Department of Library & Information Science
University of Zambia
http://lis.unza.zm/~lightonphiri
Using Machine Learning Techniques
for Solving
Locally Relevant Problems
2
May 25, 2021
About The DataLab Research Group at The
University of Zambia
● The DataLab research group
at The University of Zambia is
composed of faculty staff and
students—undergraduate
and postgraduate—working
in three main areas
○ Data Mining
○ Digital Libraries
○ Technology-Enhanced
Learning
http://datalab.unza.zm
Outline
● Part I. Data-Driven Problem Solving
○ Introduction
○ Data Mining Pipelines
○ Data Mining Models
● Part II. Past and Current Projects
● Part III. Potential Problems
Machine Learning 101 [...]
https://commons.wikimedia.org/
● Artificial Intelligence encompasses
a broad spectrum of sub-fields
○ Traditional machine learning
techniques and approaches
○ Deep Learning approaches
Data is Key to ML-Centric Problem Solving
Data Mining Pipelines
● Fundamentally,
machine learning
aims to extract
knowledge from
data
○ Historical data is
used to
infer/predict
outcomes
associated with
new observations
● Input features
identified during
feature engineering
are used to train
models
○ Features correlated
with the outcome need
to be identified
● The ML inference
model is used to
predict future
patterns
○ Models can then be
deployed as Web
services and/or
standalone
applications
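The pipeline above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the tooling used in the talk: the feature names (hours_studied, shoe_size), the numbers, and the choice of a least-squares line as the model are all made up for the example. It ranks candidate features by their correlation with the outcome, trains on the historical data, and predicts the outcome for a new observation.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between a feature column and the outcome."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


# Hypothetical historical observations (made-up numbers).
history = {
    "hours_studied": [1, 2, 3, 4, 5],
    "shoe_size": [7, 9, 6, 8, 7],
}
outcome = [52, 54, 56, 58, 60]  # e.g. an exam score

# Feature engineering step: keep the feature most correlated with the outcome.
best = max(history, key=lambda name: abs(pearson(history[name], outcome)))

# Training step: fit the model on historical data.
slope, intercept = fit_line(history[best], outcome)


def predict(value):
    """Inference step: predict the outcome for a new observation."""
    return slope * value + intercept


print(best, predict(6))
```

Deployment as a Web service or standalone application is outside the scope of this sketch; the predict function is the piece that such a service would wrap.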
Data Mining Models (1/5)
https://doi.org/10.1017/S0269888910000032
● Numerous data
mining models and
frameworks have
been proposed
○ Most trace their
roots to the
KDD Process
proposed by
Fayyad et al.
Data Mining Models (2/5)
https://doi.org/10.1017/S0269888910000032
Data Mining Models (3/5)
https://doi.org/10.1017/S0269888906000737
Data Mining Models (4/5)
https://www.kdnuggets.com
● The CRISP-DM model is one
of the most widely
used data mining
models
● Data understanding
and preparation are
the most time-consuming
phases
Data Mining Models (5/5)
https://arxiv.org/abs/2003.05155
Outline
● Part I. Data-Driven Problem Solving
● Part II. Past and Current Projects
○ Scholarly Research Output in Zambia
○ Predicting Learning Outcome at UNZA
○ Medical Imaging Workflows in Zambia
○ Automatic Weather Prediction in Zambia
● Part III. Potential Problems
Project #1: Online Visibility of Research in
Zambia—Problem (1/4)
https://worldmapper.org
Project #1: Online Visibility of Research in
Zambia—Problem (2/4)
Project #1: Online Visibility of Research in
Zambia—Problem (3/4)
http://www.webometrics.info
Project #1: Online Visibility of Research in
Zambia—Problem (4/4)
Phiri, L. (2018)
“Towards Increased Online Visibility of Scholarly Research Output in Zambia”.
URL: http://lis.unza.zm/archive/handle/123456789/227
Project #1: Online Visibility of Research in
Zambia—Multipronged Approach
Project #1: Online Visibility of Research in
Zambia—ETDs Automatic Classification (1/7)
● Implementation of classification models to
automatically classify IR digital objects
using the minimum possible input from
graduate students: “The ETD Manuscript”
○ The ETD manuscript bitstream is considered
the “single source of truth”
○ Metadata prepared by staff who work with the IR
may contain inconsistencies
Phiri, L. (2021)
“Automatic Classification of Digital Objects for Improved Metadata Quality of ETDs”
URL: https://doi.org/10.1504/IJMSO.2020.112804
Project #1: Online Visibility of Research in
Zambia—ETDs Automatic Classification (2/7)
● Text features extracted from a set of core
bitstream portions—ETD Title, ETD
Abstract, ETD Title Page and ETD pages—to
classify ETD manuscripts by
○ ETD Type
○ ETD Subjects
○ IR Collection
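To make the idea of classifying from text features concrete, here is a minimal sketch of a bag-of-words Naive Bayes classifier in plain Python. The slides do not say which algorithm or features the actual models use, so the technique, the training snippets and the labels ("masters", "doctoral") are assumptions chosen only for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesText:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            score = math.log(n_docs / total_docs)  # class prior
            for word in tokenize(text):
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log(
                        (counts[word] + 1) / (total_words + len(self.vocab))
                    )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical title-page snippets and ETD Type labels (made up).
texts = [
    "A dissertation submitted in partial fulfilment of the degree of Master of Science",
    "Dissertation submitted for the degree of Master of Arts",
    "A thesis submitted for the degree of Doctor of Philosophy",
    "Thesis presented for the award of Doctor of Philosophy in Education",
]
labels = ["masters", "masters", "doctoral", "doctoral"]

model = NaiveBayesText().fit(texts, labels)
print(model.predict("Thesis for the degree of Doctor of Philosophy in Physics"))
```

The same structure would apply to the other two targets (ETD Subjects and IR Collection) by swapping the labels, although subject classification may need a multi-label approach.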
Project #1: Online Visibility of Research in
Zambia—ETDs Automatic Classification (3/7)
● Textual content mined from PDF
manuscripts
○ Cover/title pages
○ Preliminary pages
● Textual content mined from
metadata for training
● PDF document metadata
● Curated datasets from external
repositories
Project #1: Online Visibility of Research in
Zambia—ETDs Automatic Classification (4/7)
● OAI-PMH used to
harvest all ETD
descriptive metadata
elements
● OAI-ORE used to
harvest all ETD PDF
documents
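As a rough illustration of the metadata harvesting step, the sketch below builds an OAI-PMH ListRecords request URL and parses Dublin Core fields out of a response. The verb, the metadataPrefix and resumptionToken arguments and the XML namespaces come from the OAI-PMH 2.0 protocol; the base URL, the sample record and the helper names are hypothetical, and the sample response is embedded as a string so the example runs without network access.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request; a resumption token replaces other arguments."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_records(xml_text):
    """Return (records, next_token) from a ListRecords response body."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for record in root.iterfind(".//oai:record", NS):
        records.append(
            {
                "identifier": record.findtext(
                    "oai:header/oai:identifier", namespaces=NS
                ),
                "titles": [el.text for el in record.iterfind(".//dc:title", NS)],
                "subjects": [el.text for el in record.iterfind(".//dc:subject", NS)],
            }
        )
    token = root.findtext(".//oai:resumptionToken", namespaces=NS)
    return records, (token or None)


# Made-up response for illustration; a real harvester would fetch this over HTTP.
SAMPLE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:123</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A made-up thesis title</dc:title>
          <dc:subject>Data mining</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>
"""

records, next_token = parse_records(SAMPLE)
print(list_records_url("https://repository.example.org/oai"))
print(records, next_token)
```

A full harvest would repeat the request with each returned resumption token until none is returned. Harvesting the PDF documents via OAI-ORE is not covered here.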
Project #1: Online Visibility of Research in
Zambia—ETDs Automatic Classification (5/7)
● ETD Type: 98.1%
● ETD Collection: 81.1%
● ETD Subjects: 81.7%
● The models would still
need to be
incorporated into an
application that
requires “some”
human intervention
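One possible way to combine the models with human intervention is to accept only confident predictions automatically and queue the rest for a person to review. The sketch below is an assumption about how such an application could work, not a description of the actual system; the 0.9 threshold, the tuple layout and the function name are hypothetical.

```python
def route_predictions(predictions, threshold=0.9):
    """Split predictions into auto-accepted items and items needing review.

    predictions: iterable of (item_id, predicted_label, confidence) tuples,
    where confidence is a score between 0 and 1 reported by the model.
    """
    accepted, needs_review = [], []
    for item_id, label, confidence in predictions:
        target = accepted if confidence >= threshold else needs_review
        target.append((item_id, label))
    return accepted, needs_review


# Made-up predictions for three ETDs.
sample = [
    ("etd-1", "Masters", 0.97),
    ("etd-2", "Doctoral", 0.62),
    ("etd-3", "Masters", 0.90),
]
accepted, needs_review = route_predictions(sample)
print(accepted)
print(needs_review)
```

Lowering the threshold reduces the manual workload at the cost of letting more misclassified records through, so the value would need to be tuned against the accuracy of each model.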
Project #1: Online Visibility of Research in
Zambia—ETDs Automatic Classification (6/7)
https://github.com/lightonphiri/etd_autoclassifier
Project #1: Online Visibility of Research in
Zambia—ETDs Automatic Classification (7/7)
https://datalab-apis.herokuapp.com/api/collection
Project #1: Online Visibility of Research in
Zambia—Current Work (1/3)
M’sendo R. (2019—Present)
MSc Computer Science, University of Zambia
“Multi-Faceted Automatic Classification of Institutional Repository Objects”
Project #1: Online Visibility of Research in
Zambia—Current Work (2/3)
Chisale A. (2021—Present)
MLIS, University of Zambia
“Automatic Generation of Electronic Theses and Dissertations Metadata”
Project #1: Online Visibility of Research in
Zambia—Current Work (3/3)
http://lis.unza.zm/portal
Project #2: Predicting Student Learning
Outcomes—Problem (1/2)
● ICT 1110 performance is an issue. The poor performance
cuts across all assessments: quizzes, tests and practical
programming questions.
Project #2: Predicting Student Learning
Outcomes—Problem (2/2)
● Potential solution: implement a prediction model aimed at
identifying at-risk students.
○ Initiate interventions for at-risk students.
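A prediction model of this kind can be illustrated with a small k-nearest-neighbours classifier. This is a sketch under assumptions: the slides do not name the model used in the project, and the features (quiz average, logins per week), the historical rows and the value of k are made up.

```python
def knn_at_risk(history, query, k=3):
    """Flag a student as at-risk if most of the k nearest past students failed.

    history: list of (features, failed) pairs from previous cohorts.
    query: feature tuple for a current student, in the same order.
    """
    nearest = sorted(
        history,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], query)),
    )[:k]
    votes = sum(1 for _, failed in nearest if failed)
    return votes > k / 2


# Hypothetical past students: (quiz average, logins per week) and whether they failed.
history = [
    ((35, 1), True),
    ((40, 2), True),
    ((45, 1), True),
    ((70, 5), False),
    ((80, 6), False),
    ((65, 4), False),
]

print(knn_at_risk(history, (42, 2)))
print(knn_at_risk(history, (75, 5)))
```

In practice the features would need to be scaled to comparable ranges before computing distances, and students flagged by the model would be the ones targeted for interventions.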
Project #2: Predicting Student Learning
Outcomes—Data Sources (1/5)
● Demographics information
● LMS interaction logs
● Course workload
● Subject responses
Project #2: Predicting Student Learning
Outcomes—Data Sources (2/5)
● Assessment results
broken down by question
○ Concepts associated with
question
○ Topics associated with
question
Project #2: Predicting Student Learning
Outcomes—Data Sources (3/5)
● Assessment results broken
down by question
○ Concepts associated with
question
○ Topics associated with
question
Project #2: Predicting Student Learning
Outcomes—Data Sources (4/5)
● LMS interaction logs
○ How often do
students access
Moodle (login
attempts)
○ Which Moodle
features are being
accessed (GradeBook,
Messaging)
○ Time spent on Moodle
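The three log-derived measures above could be computed along these lines. The record layout (student, component, timestamp) is a simplified, hypothetical stand-in for real Moodle log exports, and estimating time spent by summing gaps of at most 30 minutes between consecutive events is an assumption, not Moodle's own definition.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=30)  # assumed cut-off between sessions


def summarise_logs(events):
    """Aggregate raw log events into per-student features."""
    per_student = defaultdict(list)
    for student, component, stamp in events:
        per_student[student].append((datetime.fromisoformat(stamp), component))

    summary = {}
    for student, rows in per_student.items():
        rows.sort()  # chronological order
        logins = sum(1 for _, component in rows if component == "login")
        components = Counter(c for _, c in rows if c != "login")
        active = timedelta()
        for (prev, _), (curr, _) in zip(rows, rows[1:]):
            gap = curr - prev
            if gap <= SESSION_GAP:  # longer gaps are treated as time away
                active += gap
        summary[student] = {
            "logins": logins,
            "components": dict(components),
            "minutes_active": active.total_seconds() / 60,
        }
    return summary


# Made-up events for two students.
events = [
    ("s1", "login", "2021-03-01T08:00:00"),
    ("s1", "gradebook", "2021-03-01T08:05:00"),
    ("s1", "messaging", "2021-03-01T08:20:00"),
    ("s1", "login", "2021-03-02T09:00:00"),
    ("s1", "gradebook", "2021-03-02T09:10:00"),
    ("s2", "login", "2021-03-01T10:00:00"),
]

summary = summarise_logs(events)
print(summary)
```

Each student's summary can then be used as a row of input features for the prediction model.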
Project #2: Predicting Student Learning
Outcomes—Data Sources (5/5)
● ICT 1110 information survey to
capture information not available
in SIS
○ Experience with computers
○ Motivation for taking the course
○ Specific location where student lives
(although this can be inferred from
next of kin address perhaps?)
Project #2: Predicting Student Learning
Outcomes—Current Work
Chaibela, M., Chisha, I., Pungwa, D., Siabbaba D. and Simukoko B. (2021)
“Performance Predictor: Machine Learning Tool for Student Performance Outcomes”.
Work-in-Progress
Project #3: Medical Imaging Workflows in
Zambia—Problem
https://mjz.co.zm/index.php/mjz/article/view/560
Project #3: Medical Imaging Workflows in
Zambia—Current Work (1/2)
Project #3: Medical Imaging Workflows in
Zambia—Current Work (2/2)
Project #4: Automatic Forecasting of
Seasonal Rainfall—Current Work
Outline
● Part I. Data-Driven Problem Solving
● Part II. Past and Current Projects
● Part III. Potential Problems
○ Exemplar Projects in Zambia
○ Potential Locally Relevant Problems
Agriculture: Automatic Identification and
Early Warning of Fall Armyworms
http://dspace.unza.zm/handle/123456789/7141
Telecommunications: Automatic Customer
Segmentation
http://dspace.unza.zm/handle/123456789/7069
Banking: Automatic Data Mining for Fraud
Detection
https://bit.ly/3wxJICk
Potential Locally Relevant Problems in
Zambia (1/6)
● Impact-driven
research/studies
○ Education
○ Health
○ So-called ICT for
development perhaps?
Potential Locally Relevant Problems in
Zambia (2/6)
● Impact-driven
research/studies
○ Education
○ Health
○ So-called ICT for
development perhaps?
Zambia Daily Mail | August 18, 2019 | Volume 22 No. 033
Potential Locally Relevant Problems in
Zambia (3/6)
● Impact-driven
research/studies
○ Education
○ Health
○ So-called ICT for
development perhaps?
Potential Locally Relevant Problems in
Zambia (4/6)
● Impact-driven
research/studies
○ Education
○ Health
○ So-called ICT for
development perhaps?
Potential Locally Relevant Problems in
Zambia (5/6)
● Impact-driven
research/studies
○ Education
○ Health
○ So-called ICT for
development
perhaps?
Potential Locally Relevant Problems in
Zambia (6/6)
● Education
● Health
● So-called ICT for
development
perhaps?
Q & A Session
● Comments, concerns and complaints?
Bibliography
[1] Phiri, L. (2018). Research Visibility in the Global South: Towards
Increased Online Visibility of Scholarly Research Output in
Zambia. IEEE International Conference in Information and
Communication Technologies.
[2] Phiri, L. (2020). A Multi-Faceted Multi-Stakeholder Approach for
Increased Visibility of ETDs in Zambia. Cadernos BAD, (1).
https://doi.org/10.1017/S0269888910000032
[3] Phiri, L. (2020). Automatic classification of digital objects for
improved metadata quality of electronic theses and dissertations
in institutional repositories. International Journal of Metadata,
Semantics and Ontologies, 14(3), 234-248.
lighton.phiri@unza.zm
http://datalab.unza.zm
http://lis.unza.zm/~lightonphiri
