Advanced Analytics and Machine Learning (AAML)

Summer semester 2025

This lecture will be held in English.

This lecture will be offered as a block lecture ('Blockvorlesung').

Enrollment is via Moodle (enrollment key: AAML). Any remaining seats will be assigned during the initial event on Saturday, March 1, 2025.

The rapid digitalization of science, industry, and society has led to an unprecedented increase in data generation. This growth calls for scalable and efficient data processing, storage, and analytics solutions, and it also enables AI and machine learning. The exponential growth of machine-generated data (such as sensor readings, server logs, and transactional records) poses significant computational challenges. The proliferation of the Internet of Things (IoT) continues to accelerate this data explosion, reinforcing the need for advanced methodologies in high-performance and distributed computing.

This course explores the foundations and practical implementations of large-scale data processing, distributed machine learning, and scalable AI solutions. We will investigate computational frameworks designed for high-throughput data analytics, deep learning, and generative AI, with a focus on modern transformer-based architectures and foundation models such as LLaMA and DeepSeek. The curriculum also covers emerging trends in quantum machine learning and AI ethics, addressing both technological advances and their broader implications. Students will work with state-of-the-art distributed computing frameworks such as Spark, Dask, and Ray, and with deep learning libraries such as PyTorch and Transformers, to develop scalable AI solutions with applications spanning computer vision, natural language processing (NLP), and generative AI.

This class will cover the following topics:

  • Data applications in industry and sciences
  • Data-intensive methods in high-performance computing
  • Large-scale data processing using Spark, Dask, Flink and Ray
  • SQL for unstructured data: Hive, Spark SQL, Presto
  • Stream processing: Kafka, Spark Streaming, Flink
  • Data science and machine learning: unsupervised and supervised methods, tools (NumPy, pandas, scikit-learn)
  • Deep learning for computer vision: convolutional neural networks (PyTorch)
  • Natural language processing: word embeddings, language models (RNNs, LSTMs, Transformers), including recent developments (reasoning models such as DeepSeek R1)
  • Quantum machine learning
  • AI ethics and responsible AI

Recommended prerequisites:
Prior attendance of the lectures on computer networks and distributed systems, operating systems, and computer architecture, or comparable knowledge, is required.
Programming experience in Python and familiarity with the Linux command line are required.

Contact via email.