Demonstration of SWMUMDIS

Each link in the "signal name" column of the table below points to a directory containing wav-files (22.05 kHz, single channel) for audio demonstration and jpg-files (all 800x400, scales vary) for visual demonstration. If your browser is configured correctly, clicking a file name is enough to play or display the file. Some remarks follow:

Remarks on the wav-audio files

Remarks on the jpg-pictures
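
For readers who download the wav-files instead of playing them in the browser, the stated format (22.05 kHz, single channel) can be checked with a few lines of Python. This is only a sketch using the standard-library wave module; the function name check_demo_wav and its parameters are chosen here for illustration and are not part of SWMUMDIS.

```python
import wave


def check_demo_wav(path, expected_rate=22050, expected_channels=1):
    """Compare a wav-file header with the format stated on this page.

    Returns (ok, info): ok is True when sample rate and channel count
    match; info holds the header values that were read.
    """
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()
        info = {
            "sample_rate": rate,
            "channels": wf.getnchannels(),
            "sample_width_bytes": wf.getsampwidth(),
            "duration_s": wf.getnframes() / rate,
        }
    ok = (info["sample_rate"] == expected_rate
          and info["channels"] == expected_channels)
    return ok, info
```

Calling check_demo_wav on a downloaded file returns the header values together with a flag, so a file that was resampled or converted to stereo along the way is easy to spot.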

Abbreviations of analysis procedures

ZFKI     Time-Frequency-Contours, 4P1, B3dB = 0.5 Bark, delay compensation
ZFKII    Time-Frequency-Contours, 4P1, B3dB = 0.3 Bark, delay compensation
ZFKI+S   as ZFKI, visualized together with FTT-spectrogram
ZFKII+S  as ZFKII, visualized together with FTT-spectrogram
KTX      Contour/Texture-Representation, 4P1, B3dB = 0.3 Bark, delay compensation
KTXOZ    Contour/Texture-Representation, 4P1, B3dB = 0.3 Bark, delay compensation
M-TTZM   optimized Part-Tone-Time-Pattern, 4P1, B3dB = 0.3 Bark, delay compensation
SM-TTZM  improved Part-Tone-Time-Pattern, 2P1, B3dB = 0.25 Bark, time-smoothed spectrum
HB-TTZM  Heinbach's Part-Tone-Time-Pattern, P1, B3dB = 0.1 Bark, time-smoothed spectrum
AMS      FTT-Spectrogram (Auditory-Magnitude-Spectrogram), 4P1, B3dB = 0.3 Bark, delay compensation
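
To get a feeling for what a 3-dB bandwidth given in Bark means in Hz, the following sketch may help. It assumes that 1 Bark corresponds to one critical bandwidth at the analysis frequency and uses the Zwicker/Terhardt approximation of the critical bandwidth; whether the analysis procedures listed above use exactly this mapping is not stated on this page, and the function names are illustrative.

```python
def critical_bandwidth_hz(f_hz):
    """Critical bandwidth in Hz at frequency f_hz.

    Zwicker/Terhardt approximation (an assumption here, not taken
    from this page): 25 + 75 * (1 + 1.4 * (f/kHz)^2)^0.69
    """
    f_khz = f_hz / 1000.0
    return 25.0 + 75.0 * (1.0 + 1.4 * f_khz ** 2) ** 0.69


def b3db_in_hz(b3db_bark, f_hz):
    """Approximate 3-dB bandwidth in Hz for a bandwidth given in Bark,
    assuming 1 Bark equals one critical bandwidth at f_hz."""
    return b3db_bark * critical_bandwidth_hz(f_hz)


if __name__ == "__main__":
    # Bandwidth values from the list above, evaluated at 1 kHz
    for bark in (0.5, 0.3, 0.25, 0.1):
        print(bark, "Bark at 1 kHz is roughly", round(b3db_in_hz(bark, 1000.0), 1), "Hz")
```

Under these assumptions B3dB = 0.3 Bark comes out at roughly 50 Hz around 1 kHz, and the filters become wider towards higher frequencies.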

Abbreviations of reconstruction procedures

HORN-RS   Horn's spectrogram-resynthesis, N = 5
HORN-RS1  Horn's spectrogram-resynthesis, N = 1
RKHP      reconstruction from contours using phase-heuristic
RKHPTX    as RKHP, reconstruction from texture added
RKOP      reconstruction from contours using original phases
RKOPTX    as RKOP, reconstruction from texture added
TTSD      part-tone-resynthesis using triangular window
TTSR      Heinbach's part-tone-resynthesis using rectangular window
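
The comments in the signal table refer to groups of files by patterns such as *RKHP* or *ZFKI.RKOP*, where * stands for any string of characters. A downloaded directory can be filtered the same way with the fnmatch module from the Python standard library. The file names in the sketch below are invented for illustration only; the actual names in the directories may differ.

```python
from fnmatch import fnmatchcase


def select_files(file_names, pattern):
    """Return the names matching a pattern in which * matches any string."""
    return [name for name in file_names if fnmatchcase(name, pattern)]


# Hypothetical file names (assumption: analysis and reconstruction
# abbreviations joined by a dot, as the patterns in the comments suggest).
EXAMPLE_FILES = [
    "ZFKI.RKOP.wav",
    "ZFKI.RKHP.wav",
    "KTXOZ.RKHPTX.wav",
    "AMS.HORN-RS.wav",
    "M-TTZM.TTSD.wav",
]

if __name__ == "__main__":
    print(select_files(EXAMPLE_FILES, "*RKHP*"))
    print(select_files(EXAMPLE_FILES, "*TTZM*"))
```

Note that *RKHP* also matches names containing RKHPTX, which is consistent with the definition of * as "any string of characters".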

Abbreviations of speech codecs

HB-4k4   Heinbach's speech codec 4.4 kbit/s, based on Part-Tone-Time-Pattern
MUM-4k4  Speech codec 4.4 kbit/s, based on Contour/Texture-Representation
MUM-30k  Speech codec 30 kbit/s, based on Contour/Texture-Representation
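
As a rough orientation, the codec bit rates can be put next to the bit rate of the uncoded wav-files. The sketch below assumes 16-bit PCM samples and 1 kbit = 1000 bit; neither is stated on this page, and the function names are illustrative.

```python
def coded_size_bytes(bitrate_kbit_s, duration_s):
    """Size in bytes of a coded signal (assuming 1 kbit = 1000 bit)."""
    return bitrate_kbit_s * 1000.0 * duration_s / 8.0


def pcm_bitrate_kbit_s(sample_rate=22050, bits_per_sample=16, channels=1):
    """Bit rate of uncoded PCM audio in kbit/s.

    bits_per_sample=16 is an assumption; the page only states
    22.05 kHz and single channel.
    """
    return sample_rate * bits_per_sample * channels / 1000.0


def compression_ratio(codec_kbit_s, **pcm_kwargs):
    """Ratio of the PCM bit rate to the codec bit rate."""
    return pcm_bitrate_kbit_s(**pcm_kwargs) / codec_kbit_s


if __name__ == "__main__":
    for name, rate in (("HB-4k4", 4.4), ("MUM-4k4", 4.4), ("MUM-30k", 30.0)):
        print(name, "ratio about", round(compression_ratio(rate), 1),
              "bytes for 2 s:", round(coded_size_bytes(rate, 2.0)))
```

Under these assumptions a 2 s signal occupies about 1100 bytes at 4.4 kbit/s, compared with about 88 kB as 16-bit PCM.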

signal name / signal description / comment (* = any string of characters)

  • 1kwr
    description: sinusoidal burst 1 kHz, hard-switched, white noise superimposed, signal duration 0.2 s
    comment: the presence of white noise renders the phase heuristic for time contours unsuitable; clicks are therefore almost inaudible with *RKHP*
  • 2tb
    description: two-tone beat, both tones start at 1 kHz and move to 1040 Hz and 960 Hz respectively, signal duration 2 s
    comment: artefacts caused by the synthesis window with *TTSR* and *TTSD*; all reconstructions except *ZFKI.RKOP* have passages that sound like narrow-band noise, caused by phase incoherence or by tonal portions moving over into texture
  • dp20_200
    description: Dirac impulse train, impulse rate increasing from 20 to 200 Hz, signal duration 2 s
    comment: distinct change in sound with *TTZM*, which can be prevented by processing time contours or texture; yet artefacts may appear due to double representation, phase incoherence and/or time-localization jitter; texture can only be a coarse replacement for time contours
  • ea
    description: female speaker ("electroacoustics"), signal duration 1.5 s
    comment: sound proves quite uncritical
  • fm-3db
    description: frequency modulation, sinusoidal carrier 1 kHz, sinusoidal modulator moving from 0 to 100 Hz, frequency deviation +/- 100 Hz, signal duration 2 s
    comment: see two-tone beat; perceptible amplitude modulation even with *ZFKI.RKOP* due to double representation of signal portions by time and frequency contours
  • gser1kea
    description: 4 sinusoidal bursts 1 kHz, Gaussian-switched, Gaussian 3-dB bandwidths (B=2f) 50/100/500/infinity Hz, signal duration 0.8 s
    comment: clicks caused by increasing steepness of slopes are faithfully represented by time contours only; representing clicks via texture results in a perceptual approximation (noise bursts, with *KTXOZ.RKHPTX*); *AMS.HORN-RS* renders clicks weakened
  • job
    description: male speaker with music (German "Interessiert Sie ein neuer Job?", i.e. "Are you interested in a new job?", from a commercial), signal duration 2 s
    comment: sound to demonstrate robustness of the speech codecs against interfering sound sources
  • kalk
    description: male speaker (German "Kalk setzt sich bei jeder ...", from a commercial), signal duration 2.07 s
    comment: very critical sound because the pronunciation is over-articulated and accelerated, and because it is spoken by a male speaker (the dense harmonics of a male voice are prone to audible phase incoherence in reconstruction); listening with headphones is essential
  • repeated
    description: male speaker ("The demonstration is repeated once"), signal duration 2 s
    comment: processing of time contours helps to retain naturalness
  • wr
    description: white noise (sampled analog thermal noise source), signal duration 2 s
    comment: nasal, comb-filter-like tinge, swirling or rippling ("tonalization") caused by disregarding time contours and/or by phase incoherence within reconstruction
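
The synthetic test signals in the table are simple enough to approximate from their descriptions. The sketch below generates signals resembling 2tb and dp20_200 at 22.05 kHz. It assumes that frequency and impulse rate change linearly over time, that both tones have equal amplitude, and that all phases start at zero; none of these details are given in the table, so the results will not be identical to the original files.

```python
import math

FS = 22050  # sample rate of the demo wav-files in Hz


def linear_glide_sine(f_start, f_end, duration_s, fs=FS):
    """Sine whose frequency moves linearly (assumed) from f_start to f_end."""
    n = int(round(duration_s * fs))
    out = []
    for i in range(n):
        t = i / fs
        # phase = 2*pi * integral of f(t), f(t) = f_start + (f_end - f_start) * t / duration_s
        cycles = f_start * t + 0.5 * (f_end - f_start) * t * t / duration_s
        out.append(math.sin(2.0 * math.pi * cycles))
    return out


def two_tone_beat(duration_s=2.0, fs=FS):
    """Approximation of 2tb: two tones starting at 1 kHz, ending at 1040 and 960 Hz."""
    upper = linear_glide_sine(1000.0, 1040.0, duration_s, fs)
    lower = linear_glide_sine(1000.0, 960.0, duration_s, fs)
    return [0.5 * (a + b) for a, b in zip(upper, lower)]


def impulse_train(rate_start=20.0, rate_end=200.0, duration_s=2.0, fs=FS):
    """Approximation of dp20_200: unit impulses with linearly rising rate."""
    n = int(round(duration_s * fs))
    out = [0.0] * n
    emitted = 0
    for i in range(n):
        t = i / fs
        # number of impulse periods elapsed up to time t
        cycles = rate_start * t + 0.5 * (rate_end - rate_start) * t * t / duration_s
        if cycles >= emitted:
            out[i] = 1.0  # place the impulse on the first sample at or after the crossing
            emitted += 1
    return out
```

Because impulses are placed on the sample grid, the impulse train carries a small timing quantization of up to one sample period, which is unrelated to the time-localization jitter mentioned in the comments.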

    $Date: 1999/07/06 23:39:40 $