PulseAudio on Ubuntu

Sound on modern Linux distributions is difficult to understand and next to impossible to configure correctly and consistently. There are several reasons for this.

First, and I will try to make it much simpler than it really is, developers build sound systems and applications with different development toolkits. This nasty problem is like mixing apples and oranges: they are different, and yet they do the same job. Here are a couple of examples. The sound servers for the two foundation desktops, KDE and GNOME (I realize there are others), were developed with different toolkits, so they are not totally compatible. No fun. The KDE desktop has focused on aRts. The GNOME desktop uses the Enlightened Sound Daemon (ESD), with GStreamer handling the codecs. To make matters a lot worse, the Open Sound System (OSS), an older system, is also available as an option in almost all Linux distros.

The second problem lies with those who develop the applications people like to use for sound. Whether they are for listening to music, creating sound files, watching videos with sound or editing sound, these programs were built with different toolkits. Some projects created drivers for specific hardware and some created no hardware drivers at all. Some tried to be independent of other organizations for hardware drivers, while others just tried to do it all. Some developed for one sound system and some for another. The fact is that chaos has developed: the freedom of Linux has allowed developers to go in many different directions with no unifying force.

The third problem is that many of the sound projects in Linux have overlapping goals, so at times they work well together and at other times the result is a disgusting disaster, when things just don't work.

PulseAudio is trying to be the unifying force that brings all of these different threads together, so that each project keeps its freedom to develop while users get a more consistent result to enjoy. In addition, PulseAudio brings a lot of important possibilities to sound that will certainly lead to some huge changes.

Comparable to EsounD, aRts and NAS, PulseAudio is a sound daemon that does more than any other free sound server out there. PulseAudio is a replacement for EsounD: you can load a compatibility module that implements an EsounD-compatible protocol, which lets you keep using most EsounD-compatible programs.

PulseAudio, previously known as Polypaudio, is now included in Ubuntu 8.04 Hardy Heron and offers many advanced sound server features. PulseAudio allows you to apply operations to your sound data as it passes between your application and the hardware in your computer.

PulseAudio offers many plugins, support for static linking of modules, module autoloading, support for more than one sink/source, good low-latency behavior, client-side latency interpolation, implicit sample type conversion and resampling, and a "zero-copy" architecture. It is flexible and embeddable into other software, and it can combine multiple sound cards, fully synchronize multiple playback streams and more.


PulseAudio Purpose

The goal of PulseAudio is to help all of the different sound applications, libraries, hardware, drivers, etc. work together by uniting them under one additional layer, as shown in simplified form in the illustration.

[Illustration: PulseAudio]


Configuring PulseAudio
If you go to Preferences/Sound you will see this window for setting up sound. One of your biggest frustrations will be that some applications have hardware drivers that only work with one of OSS, EsounD or ALSA. Because of this, record your settings before you change anything, so that you have a baseline to return to until all of your applications are working correctly. The other principle to note is that Autodetect is there as the default because it will hopefully make things work more easily. If you have problems, do not give up. I have found that if you keep working on it you can often find a solution. For answers when you get stuck, try the Live Training Network or the Forum.



Digital Audio
Digital audio is great fun to play with and is becoming much easier to work with as new programs are developed for Linux. Digital audio is like the series of still images that make up a movie: each image is a sample, and played in sequence the samples recreate the original. Basically, the more samples that are taken, the better the re-creation. A typical example is an audio CD, whose tracks are commonly stored on a computer in the WAV format. CD audio takes 44,100 samples per second. Each sample is made up of 16 bits, which is the resolution or depth of the sample. These samples are stored as Pulse Code Modulation, or PCM.

PCM devices were dramatically modified and expanded in the 2.6 kernel under the Advanced Linux Sound Architecture, or ALSA. In ALSA, a PCM device is the digital audio interface to the sound card. Two major PCM types, hw and plughw, allow the user to modify the way that ALSA relates to the sound card. A PCM is opened with specific settings for sample format, sample frequency, number of channels, number of periods (or fragments) and the size of the periods. When the sound card does not support the settings the PCM was opened with, there is a problem. ALSA solves it by letting the user choose plughw, which makes ALSA automatically convert the data in its plugin layer to a format that the sound card supports. This makes sound available to far more applications in Linux. However, the conversion can change the sample format, frequency and number of channels, and so alter the quality of the sound. If the hw type is used instead, ALSA will attempt to open the PCM device directly with the settings of the application that is running.
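To make the sample idea concrete, here is a minimal Python sketch that builds one second of CD-style PCM data (44,100 samples per second, 16 bits per sample) and wraps it in a WAV container using only the standard library. The function names, the 440 Hz tone and the mono channel count are my own choices for illustration. Nothing here talks to ALSA or PulseAudio; it only shows what the samples look like.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 44100   # samples per second, as on an audio CD
SAMPLE_WIDTH = 2      # bytes per sample (16 bits)
CHANNELS = 1          # mono, an assumption to keep the sketch short


def sine_pcm(freq_hz, seconds, amplitude=0.5):
    """Return signed 16-bit little-endian PCM bytes for a sine tone."""
    n_samples = int(SAMPLE_RATE * seconds)
    peak = 32767  # largest value a signed 16-bit sample can hold
    samples = (
        int(amplitude * peak * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE))
        for i in range(n_samples)
    )
    return b"".join(struct.pack("<h", s) for s in samples)


def write_wav(fileobj, pcm_bytes):
    """Wrap raw PCM bytes in a WAV container written to fileobj."""
    with wave.open(fileobj, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(pcm_bytes)


def describe_wav(fileobj):
    """Return (channels, bits per sample, sample rate, frame count)."""
    with wave.open(fileobj, "rb") as wav:
        return (
            wav.getnchannels(),
            wav.getsampwidth() * 8,
            wav.getframerate(),
            wav.getnframes(),
        )


if __name__ == "__main__":
    buffer = io.BytesIO()
    write_wav(buffer, sine_pcm(440, 1.0))
    buffer.seek(0)
    print(describe_wav(buffer))
```

The channel count, sample width and frame rate set here are the same kind of settings an application asks for when it opens a PCM device.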

Several common file formats carry PCM data. WAV, the Windows audio format, stores 8, 16 and 24 bit PCM data at sample rates from 2 kHz to 192 kHz. AIFF, a format used on the Apple Macintosh, is very similar to WAV and uses the same bit depths and sample rates.

Often analog audio needs to be converted to digital. The sound card has a built-in analog-to-digital converter, called an ADC, for this. In the other direction, when a digital CD is played the sound card must convert the digital data to analog, which is the job of the card's digital-to-analog converter, or DAC.

Audio Compression
The real problem with audio is that, uncompressed, it consumes about 10 MB per minute. That much data is difficult to transport, over a network for example, and also difficult to store. As a result several compression techniques were developed. One is MP3, which was developed by Fraunhofer IIS and patented. Because of the patent, anyone distributing an MP3 encoder must pay a license fee. In response, the Ogg Vorbis audio compression format was developed. This format is now supported by most audio players.
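The figure of about 10 MB per minute can be checked with a little arithmetic. The sketch below assumes CD-quality stereo (44,100 samples per second, 16 bits per sample, two channels). The function name is hypothetical and only there for illustration.

```python
def pcm_bytes_per_minute(sample_rate=44100, bits_per_sample=16, channels=2):
    """Size in bytes of one minute of uncompressed PCM audio."""
    bytes_per_sample = bits_per_sample // 8
    bytes_per_second = sample_rate * bytes_per_sample * channels
    return bytes_per_second * 60


if __name__ == "__main__":
    size = pcm_bytes_per_minute()
    # 44100 * 2 bytes * 2 channels * 60 seconds
    print(size, "bytes, or about", round(size / 1_000_000, 1), "MB per minute")
```

With the default values this comes to 10,584,000 bytes, which is where the rough figure of 10 MB per minute comes from.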

ALSA also provides a way for Linux to use MIDI. Many sound cards have MIDI ports that take input from synthesizers, keyboards or sound modules. Some sound cards even convert MIDI events into audible sounds with a WaveTable synthesizer. Virtual MIDI keyboards use the computer's own keyboard to create sound.

Mixing is the process of adjusting the volume and balance of sound output and input on the computer sound system. Here is an example of a mixer.
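In code, volume and balance come down to multiplying each sample by a gain. The following is only a sketch of that idea on 16-bit stereo samples, not how ALSA or PulseAudio implement their mixers. The function names and the simple linear balance rule (-1.0 is fully left, 0.0 is centered, +1.0 is fully right) are my own assumptions.

```python
def clamp16(value):
    """Round to an integer and clip it to the signed 16-bit range."""
    return max(-32768, min(32767, int(round(value))))


def apply_mixer(frames, volume=1.0, balance=0.0):
    """Scale (left, right) sample pairs by a volume and a balance setting.

    balance runs from -1.0 (left only) through 0.0 (centered)
    to +1.0 (right only). This linear rule is an illustrative assumption.
    """
    left_gain = volume * min(1.0, 1.0 - balance)
    right_gain = volume * min(1.0, 1.0 + balance)
    return [
        (clamp16(left * left_gain), clamp16(right * right_gain))
        for left, right in frames
    ]


if __name__ == "__main__":
    frames = [(1000, 1000), (-2000, 2000)]
    print(apply_mixer(frames, volume=0.5))
    print(apply_mixer(frames, balance=1.0))
```

The clipping step matters: if the volume pushes a sample past what 16 bits can hold, the value has to be limited or it will wrap around and distort badly.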

Players are programs that play back common formats such as MP3, WAV or Ogg Vorbis. Linux supports quite a number of easy-to-use players. XMMS is one of the most popular.

Buffering and Latencies
On a computer the CPU must appear to perform multiple tasks at the same time; this is called multitasking. These tasks include system tasks as well as the programs users are running. The problem is that the CPU can only perform one task at a time, so it must give a time slice to each of the tasks that need to be performed.

These time slices are very short and most of the time the user does not notice them. However, during audio playback occasional clicks may be heard. They occur when the CPU is busy with other tasks and the audio program does not get its time slice soon enough to keep the sound flowing. This problem was addressed by providing buffers large enough to span the longest interruption caused by the CPU switching tasks.

A second problem then appeared: the latency, or reaction time, of a program that uses a buffer. In other words, if the buffer is too big, there is a built-in delay in the program. So the solution was to keep the buffers as small as possible by increasing the priority of the audio program or by using a real-time scheduler.
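The trade-off between buffer size and latency is easy to put into numbers. This is a rough sketch that assumes a 44,100 Hz sample rate and a buffer measured in frames (one sample per channel). Both function names are hypothetical, and real sound servers have additional sources of delay that are ignored here.

```python
import math


def frames_to_span(gap_ms, sample_rate=44100):
    """Smallest buffer, in frames, that keeps playing across a gap of gap_ms."""
    return math.ceil(sample_rate * gap_ms / 1000)


def buffer_latency_ms(frames, sample_rate=44100):
    """Delay, in milliseconds, added by a full buffer of the given size."""
    return 1000 * frames / sample_rate


if __name__ == "__main__":
    for gap_ms in (5, 20, 100):
        frames = frames_to_span(gap_ms)
        print(gap_ms, "ms gap needs", frames, "frames, adding about",
              round(buffer_latency_ms(frames), 1), "ms of delay")
```

A buffer sized to survive a 100 ms interruption also adds roughly 100 ms of delay, which is why raising the priority of the audio program (so the interruptions get shorter) lets the buffer shrink.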