FM1AZDP05 Data Pre-Processing

  • Course description
    • Course Code
      FM1AZDP05
    • Level of Study
      5.1
    • Program of Study
      Applied Machine Learning
    • Credits
      5
    • Study Plan Coordinator
      Leon Grobbelaar
Teaching Term(s)
2025 Autumn
About the Course

The course provides knowledge of and skills in data pre-processing, data standardisation, feature engineering and feature selection for modelling. Candidates gain the knowledge needed to study and understand data quality issues, and the skills to relate pre-processing choices to the data at hand. The course also covers statistical techniques that allow the candidate to prepare and format data in a structured way.

This course is relevant to the program because data pre-processing is an essential step in any machine-learning project. The course will provide students with the knowledge and skills required to standardise data so that it is in the right format for the machine learning model, create new features to best leverage the information in the dataset and select the best features to improve the machine learning model output.

Course Learning Outcomes
Knowledge

The candidate:

  • has knowledge of concepts and processes that are used to review and understand data quality issues
  • has knowledge of methods and tools that are used to standardise data
  • has insight into relevant standards and requirements for optimal data and the relation between data and data pre-processing
  • can update his/her knowledge of feature engineering and feature selection
Skills

The candidate:

  • can apply knowledge of feature selection and large-scale data to solve machine-learning tasks
  • masters relevant tools and techniques to format and structure data suitable for machine learning
  • masters relevant tools and statistical techniques to clean data and remove incomplete variables
  • can study data sets and identify issues and the need for data preparation to clean and expose the information content
General Competence

The candidate:

  • can carry out data pre-processing based on the needs of an overall machine learning methodology
  • can develop statistics and basic data visualisations
Learning Activities

Digital Learning Resources
The learning management system (LMS) is the primary learning platform where students access most of their course materials. The content is presented in various formats, such as text, images, models, videos or podcasts. Each course follows a progression plan, designed to lead students through weekly modules at their own pace. Exercises and assignments (individual or in groups) are embedded throughout the courses to support continuous practice and assessment of the learning outcomes.

Campus Resources
In addition to the digital learning resources, campus students participate in physical learning activities led by teachers as part of the overall delivery.

Guidance
Guidance and feedback from teachers support students' learning journeys, and may be provided synchronously or asynchronously, individually or in groups, via text, video or in-person feedback.

Assessments
  • Form of assessment
    Course Assignment
  • Grading scale
    Pass / Fail
  • Grouping
    Group/Individual
  • Duration of assessment
    4 weeks
Reading List

Teaching materials, reading lists and essential resources will be shared in the learning platform, along with software user manuals where applicable.