Coding of Audiovisual Signals

  • Type: Lecture (V)
  • Chair: Institut für Nachrichtentechnik
  • Semester: Summer semester (SS) 2026
  • Time: Tue. 21.04.2026
    09:45 - 11:15, weekly
    30.28 Seminarraum 1 (R220)
    30.28 Lernzentrum 2012 (2. OG)

    Tue. 28.04.2026
    09:45 - 11:15, weekly
    30.28 Seminarraum 1 (R220)
    30.28 Lernzentrum 2012 (2. OG)

    Tue. 05.05.2026
    09:45 - 11:15, weekly
    30.28 Seminarraum 1 (R220)
    30.28 Lernzentrum 2012 (2. OG)

    Tue. 12.05.2026
    09:45 - 11:15, weekly
    30.28 Seminarraum 1 (R220)
    30.28 Lernzentrum 2012 (2. OG)

    Tue. 19.05.2026
    09:45 - 11:15, weekly
    30.28 Seminarraum 1 (R220)
    30.28 Lernzentrum 2012 (2. OG)

    Tue. 02.06.2026
    09:45 - 11:15, weekly
    30.28 Seminarraum 1 (R220)
    30.28 Lernzentrum 2012 (2. OG)

    Tue. 09.06.2026
    09:45 - 11:15, weekly
    30.28 Seminarraum 1 (R220)
    30.28 Lernzentrum 2012 (2. OG)

    Tue. 16.06.2026
    09:45 - 11:15, weekly
    30.28 Seminarraum 1 (R220)
    30.28 Lernzentrum 2012 (2. OG)

    Tue. 23.06.2026
    09:45 - 11:15, weekly
    30.28 Seminarraum 1 (R220)
    30.28 Lernzentrum 2012 (2. OG)

    Tue. 30.06.2026
    09:45 - 11:15, weekly
    30.28 Seminarraum 1 (R220)
    30.28 Lernzentrum 2012 (2. OG)

    Tue. 07.07.2026
    09:45 - 11:15, weekly
    30.28 Seminarraum 1 (R220)
    30.28 Lernzentrum 2012 (2. OG)

    Tue. 14.07.2026
    09:45 - 11:15, weekly
    30.28 Seminarraum 1 (R220)
    30.28 Lernzentrum 2012 (2. OG)

    Tue. 21.07.2026
    09:45 - 11:15, weekly
    30.28 Seminarraum 1 (R220)
    30.28 Lernzentrum 2012 (2. OG)

    Tue. 28.07.2026
    09:45 - 11:15, weekly
    30.28 Seminarraum 1 (R220)
    30.28 Lernzentrum 2012 (2. OG)

  • Lecturer: Prof. Dr.-Ing. Laurent Schmalen
  • Weekly hours (SWS): 2
  • Course no. (LVNr.): 2310565
  • Note: In person
  • Lecture language: English

Coding of Audiovisual Signals

The course covers the procedures and methods of source coding. Source coding is an indispensable tool in communications engineering: it provides a compact representation of multimedia signals, both to prepare them for transmission and to use storage capacity efficiently and economically. It is the direct link between the user of a communication system and the actual data transmission. The lecture deals with various signal processing methods and examines their use in modern source coding methods such as MP3, JPEG or H.264. Many of the applications are illustrated with example implementations in software (Python).

The largest part of the lecture is dedicated to the source coding of multimedia signals, in particular audio and image signals. First, different methods for quantizing multimedia signals are discussed. In addition to uniform quantizers, we consider non-uniform quantizers, optimal quantizers, companding quantizers, adaptive quantizers and vector quantizers, and evaluate them in terms of signal quality. For example, the following figure shows the reconstruction quality (signal-to-noise ratio) after quantization of a speech signal with a uniform quantizer and with an adaptive quantizer with forward adaptation (AQF).
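
As a minimal sketch of how such a comparison can be set up (not taken from the lecture material), the following Python code quantizes a synthetic test signal with a uniform quantizer and with a mu-law companding quantizer and reports the signal-to-noise ratio of each. The Laplacian test signal, the function names and the choice mu = 255 are assumptions made for illustration; a real speech recording and an adaptive quantizer would behave differently.

```python
import numpy as np

def uniform_quantize(x, bits, x_max=1.0):
    """Midrise uniform quantizer with 2**bits levels on [-x_max, x_max]."""
    levels = 2 ** bits
    step = 2 * x_max / levels
    idx = np.floor(x / step)
    idx = np.clip(idx, -levels // 2, levels // 2 - 1)  # saturate at the range limits
    return (idx + 0.5) * step                           # reconstruct at cell centers

def mu_law_quantize(x, bits, mu=255.0):
    """Companding quantizer: compress, quantize uniformly, expand."""
    compressed = np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)
    q = uniform_quantize(compressed, bits)
    return np.sign(q) * np.expm1(np.abs(q) * np.log1p(mu)) / mu

def snr_db(x, x_hat):
    """Signal-to-noise ratio of the reconstruction in dB."""
    return 10 * np.log10(np.sum(x ** 2) / np.sum((x - x_hat) ** 2))

# Assumed stand-in for speech: low-level Laplacian samples clipped to [-1, 1].
rng = np.random.default_rng(0)
x = np.clip(rng.laplace(scale=0.05, size=50_000), -1.0, 1.0)

for bits in (4, 6, 8):
    print(f"{bits} bit: uniform {snr_db(x, uniform_quantize(x, bits)):.1f} dB, "
          f"mu-law {snr_db(x, mu_law_quantize(x, bits)):.1f} dB")
```

For a low-level signal like this one, the companding quantizer spends more of its levels near zero, which is why it is expected to reach a higher SNR than the uniform quantizer at the same word length.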

Following the description of quantization, the lecture deals with signal processing methods that remove existing correlation and temporal dependencies from the signal. This technique, known as linear prediction, prepares the signal for subsequent quantization. In the lecture, students learn how linear prediction can be adapted automatically to the signal and applied to practical, non-stationary signals. The following figure illustrates the linear prediction of a short speech signal and shows how the two adaptive filters, linear prediction and long-term prediction, flatten the spectrum of the signal and thus prepare it for quantization.
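
The idea of linear prediction described above can be sketched in a few lines of Python. The example below is an illustrative assumption, not the lecture's code: it generates a synthetic second-order autoregressive signal, estimates the predictor coefficients by solving the normal equations of the autocorrelation method, and measures the prediction gain. Long-term prediction and block-wise adaptation to non-stationary signals are not covered here.

```python
import numpy as np

def lpc_coefficients(x, order):
    """Estimate predictor coefficients by solving R a = r (autocorrelation method)."""
    n = len(x)
    r = np.array([np.dot(x[: n - k], x[k:]) for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:])

def prediction_error(x, a):
    """Residual e[n] = x[n] - sum_k a[k] * x[n-1-k]."""
    pred = np.zeros_like(x)
    for k in range(len(a)):
        pred[k + 1:] += a[k] * x[: len(x) - k - 1]
    return x - pred

# Assumed test signal: x[n] = 1.5 x[n-1] - 0.8 x[n-2] + w[n] with white noise w.
rng = np.random.default_rng(1)
w = rng.standard_normal(20_000)
x = np.zeros_like(w)
for n in range(2, len(x)):
    x[n] = 1.5 * x[n - 1] - 0.8 * x[n - 2] + w[n]

a_est = lpc_coefficients(x, order=2)
e = prediction_error(x, a_est)
gain_db = 10 * np.log10(np.var(x) / np.var(e))
print("estimated coefficients:", a_est)
print(f"prediction gain: {gain_db:.1f} dB")
```

The residual has a much smaller variance than the input, so a quantizer with the same number of levels can use a finer step size on it.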

Building on quantization and linear prediction, the lecture then discusses the overall system and different ways of combining prediction and quantization. Various quantizers are considered. Simple systems are described and demonstrated with Python code, followed by more complex methods such as CELP, which forms the basis for voice coding at extremely low data rates in modern telephony and video telephony services (e.g. Zoom). The block diagram of the CELP process is shown below.
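
One of the simplest ways to combine prediction and quantization is closed-loop DPCM, sketched below under stated assumptions. This is not CELP and not the lecture's own code: the fixed first-order predictor coefficient `a`, the step size and the sinusoidal test signal are hypothetical choices. The encoder predicts from its own reconstructed samples, so the decoder can track it exactly from the transmitted indices.

```python
import numpy as np

def dpcm_encode(x, a=0.9, step=0.05):
    """Closed-loop DPCM: quantize the difference to a prediction from past reconstructions."""
    indices = np.zeros(len(x), dtype=int)
    prev = 0.0                                   # last reconstructed sample
    for n in range(len(x)):
        pred = a * prev
        idx = int(np.round((x[n] - pred) / step))  # unbounded midtread quantizer
        indices[n] = idx
        prev = pred + idx * step                 # same reconstruction the decoder forms
    return indices

def dpcm_decode(indices, a=0.9, step=0.05):
    """Decoder: rebuild the signal from the quantizer indices alone."""
    x_hat = np.zeros(len(indices))
    prev = 0.0
    for n, idx in enumerate(indices):
        prev = a * prev + idx * step
        x_hat[n] = prev
    return x_hat

step = 0.05
x = np.sin(2 * np.pi * 0.01 * np.arange(500))    # assumed slowly varying test signal
indices = dpcm_encode(x, step=step)
x_hat = dpcm_decode(indices, step=step)
pcm_indices = np.round(x / step).astype(int)     # direct quantization for comparison

print("max reconstruction error:", np.max(np.abs(x - x_hat)))
print("index range DPCM:", indices.max() - indices.min(),
      " PCM:", pcm_indices.max() - pcm_indices.min())
```

Because the quantizer sits inside the prediction loop, the reconstruction error equals the quantization error of the residual and does not accumulate, while the index range that has to be coded is much smaller than for direct quantization.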

The lecture then focuses on the spectral analysis and coding of audio signals (e.g. music). Here, quantization usually takes place in the frequency domain, and special techniques such as psychoacoustic models are described. The following figure shows an example of the spectral analysis of a piece of music.
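
A short-time spectral analysis of the kind mentioned above can be sketched as follows. The frame length, hop size, Hann window and the two-tone test signal are assumptions for illustration; audio coders typically use other transforms and add a psychoacoustic model, neither of which is shown here.

```python
import numpy as np

def stft_magnitude_db(x, frame_len=1024, hop=512):
    """Magnitude spectrum in dB of overlapping, Hann-windowed frames."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop: i * hop + frame_len] * window
                       for i in range(n_frames)])
    spectrum = np.fft.rfft(frames, axis=1)         # one-sided spectrum per frame
    return 20 * np.log10(np.abs(spectrum) + 1e-12)  # small offset avoids log(0)

# Assumed test signal: two sinusoids at 440 Hz and 1760 Hz, one second at 16 kHz.
fs = 16_000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 1760 * t)

S = stft_magnitude_db(x)
freqs = np.fft.rfftfreq(1024, d=1 / fs)
peak_hz = freqs[np.argmax(S.mean(axis=0))]
print("spectrogram shape (frames, bins):", S.shape)
print("strongest component near", peak_hz, "Hz")
```

Each row of `S` is the spectrum of one frame; plotting the rows over time gives a spectrogram.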

Later in the lecture, students learn how to encode image and video signals efficiently. The focus is on a combination of multidimensional spectral transforms, quantization and entropy coding. A simple variant of the JPEG method is discussed and illustrated with Python code. For example, the following figure shows the coding of an image using this method and the subsequent reconstruction, in which the reconstruction artifacts are visible.
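
The transform and quantization stages of a JPEG-like coder can be sketched as below. This is an illustrative assumption, not the lecture's implementation and not the JPEG standard itself: the quantization step table `q` is a hypothetical one that simply grows with frequency, entropy coding is omitted, and the image dimensions are assumed to be multiples of 8.

```python
import numpy as np

N = 8  # block size

def dct_matrix(n=N):
    """Orthonormal DCT-II matrix C, so that C @ C.T is the identity."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)
    return C

def jpeg_like_codec(image, base_step=8.0):
    """Block DCT, uniform quantization per coefficient, and reconstruction."""
    C = dct_matrix()
    u, v = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
    q = base_step * (1 + u + v)          # hypothetical table: coarser at high frequencies
    h, w = image.shape
    recon = np.zeros((h, w))
    nonzero = 0
    for r in range(0, h, N):
        for c in range(0, w, N):
            block = image[r:r + N, c:c + N] - 128.0    # level shift
            coeff = C @ block @ C.T                    # 2-D DCT
            idx = np.round(coeff / q)                  # quantizer indices
            nonzero += np.count_nonzero(idx)
            recon[r:r + N, c:c + N] = C.T @ (idx * q) @ C + 128.0
    return recon, nonzero

def psnr_db(a, b, peak=255.0):
    return 10 * np.log10(peak ** 2 / np.mean((a - b) ** 2))

# Assumed smooth 64x64 test image with values in the 8-bit range.
yy, xx = np.mgrid[0:64, 0:64]
image = 128.0 + 100.0 * np.sin(xx / 10.0) * np.cos(yy / 14.0)

recon, nonzero = jpeg_like_codec(image)
print(f"PSNR: {psnr_db(image, recon):.1f} dB, "
      f"nonzero coefficients: {nonzero} of {image.size}")
```

Most quantized coefficients are zero for smooth image content, which is what the subsequent entropy coding stage exploits; increasing `base_step` lowers the count further at the cost of visible block artifacts.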

After the description of video coding methods and algorithms for motion compensation, the lecture concludes with some newer concepts for the transmission of multimedia signals in networks. Especially when packet losses occur, intelligent methods are needed that still allow the signals to be reconstructed. One example is multiple description coding (MDC), which is illustrated in the following figure. Here, the signal is transmitted using two (or more) packets. If both packets reach the receiver, the signal can be reconstructed with the highest quality. If only one packet reaches the receiver, it can still be used to reconstruct the signal at lower quality in a "side decoder".
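
A very simple form of multiple description coding can be sketched by splitting a signal into its even and odd samples. This is only one possible construction, chosen here as an assumption for illustration; the function names are hypothetical, and quantization of the descriptions is left out to keep the example short.

```python
import numpy as np

def mdc_encode(x):
    """Split the signal into two descriptions: even and odd samples."""
    return x[0::2].copy(), x[1::2].copy()

def central_decode(d_even, d_odd):
    """Both packets arrived: interleave the descriptions again."""
    x = np.empty(len(d_even) + len(d_odd))
    x[0::2] = d_even
    x[1::2] = d_odd
    return x

def side_decode(desc, n, offset):
    """Only one packet arrived: interpolate the missing samples linearly.

    offset is 0 for the even description and 1 for the odd one.
    """
    known = np.arange(offset, n, 2)
    return np.interp(np.arange(n), known, desc)

x = np.sin(2 * np.pi * 0.02 * np.arange(200))   # assumed smooth test signal
d_even, d_odd = mdc_encode(x)

x_central = central_decode(d_even, d_odd)
x_side = side_decode(d_even, len(x), offset=0)

mse_central = np.mean((x - x_central) ** 2)
mse_side = np.mean((x - x_side) ** 2)
print("central decoder MSE:", mse_central)
print("side decoder MSE:   ", mse_side)
```

The central decoder recovers the signal exactly in this sketch, while the side decoder returns an approximation whose quality depends on how well the missing samples can be interpolated from their neighbors.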