Meeting minimal InterCom

Vicente González Ruiz & Savins Puertas Martín & Marcos Lupión Lorente

March 23, 2025

1 Description
1.1 Loop-based algorithm
1.2 Timer-based algorithm
2 Deliverables
3 Resources

1 Description

InterCom is an application that captures and plays audio and therefore, in Linux, runs on top of one of the following audio services:

ALSA.
PulseAudio (in the case of Xubuntu, this is the audio server that comes with Xfce).
JACK.
PipeWire.

To abstract InterCom from the available audio resources, InterCom uses PortAudio [1] (through sounddevice) to capture and play the audio. Using sounddevice, we have two alternatives for implementing InterCom (and in general, for any real-time audio-processing application):

1.1 Loop-based algorithm

Roughly, InterCom can be divided into 6 steps:

# Loop-based algorithm (to be called in a loop)
def record_IO_and_play(chunk_size):
    chunk = record(chunk_size)      # (1)
    packed_chunk = pack(chunk)      # (2)
    send(packed_chunk)              # (3)
    packed_chunk = receive()        # (4)
    chunk = unpack(packed_chunk)    # (5)
    play(chunk)                     # (6)

where:

The record(chunk_size) method captures a chunk (a fragment) of frames¹. In sounddevice, this operation is carried out by the read() method. As can be seen in wire4.py² and in the sounddevice documentation, if we read only the frames that are available in the soundcard’s buffer, the operation is non-blocking and the chunk size depends on the instant at which the method is called. Otherwise, if we request a number of frames different from the number of frames available, the operation can³ be blocking and I/O-bound (the calling process sleeps until the requested number of frames is available).
pack(chunk) processes the chunk to create a packet (or a sequence of packets), a structure that can be transmitted through the Internet using the Datagram Model. In general, this is a CPU-bound (CPU-intensive) operation because the payload of the packet can be compressed to minimize the transmission bit-rate.
send(packed_chunk) sends the packet to our interlocutor. When datagrams are used, this step is neither blocking nor CPU-bound (the CPU usage is very low), as long as the number of packets per second and the payload sizes are small, as is expected in InterCom.
receive() waits (blocking the calling process) for an incoming packet, and therefore this operation is I/O-bound. However, most socket APIs [2] offer a non-blocking option: if no packet is available in the kernel’s buffer associated with the corresponding socket after a predetermined amount of time, some kind of exception is raised. In that case, it is the responsibility of the programmer to generate an “alternative” chunk (in our case, for example, a chunk filled with zeros that will not produce any sound when it is played).
unpack(packed_chunk) is (like the method pack(chunk)) a CPU-intensive step that transforms a packed chunk into a chunk of audio.
play(chunk) renders the chunk. In general, this is an I/O-bound blocking action. However, if play() is called at the same pace as record(), and the record and play parameters are exactly the same (as usually happens in InterCom), the playing of the chunk should return immediately because the time that the play() method needs to complete would match exactly the time that the record() method requires (see wire4b.py).
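Steps (2) to (5) can be sketched with the standard library alone. The following is a minimal sketch under stated assumptions, not the code of minimal.py: a chunk is taken to be a bytes object of interleaved 16-bit stereo frames, zlib stands in for whatever compressor is actually used, and the datagram is looped back through localhost. The helper names open_socket() and receive_chunk(), the chunk size, and the 0.1 s timeout are hypothetical choices made for illustration.

```python
import socket
import zlib

FRAMES_PER_CHUNK = 1024                    # assumed chunk size, in frames
BYTES_PER_FRAME = 4                        # stereo frame: 16 + 16 bits
CHUNK_BYTES = FRAMES_PER_CHUNK * BYTES_PER_FRAME
SILENCE = bytes(CHUNK_BYTES)               # chunk of zeros: no sound when played
MAX_DATAGRAM = 65536                       # large enough for any UDP payload


def pack(chunk):
    """(2) CPU-bound: compress the chunk to reduce the bit-rate."""
    return zlib.compress(chunk)


def unpack(packed_chunk):
    """(5) CPU-bound: recover the chunk of audio from the payload."""
    return zlib.decompress(packed_chunk)


def open_socket(timeout=0.1):
    """Create a UDP socket whose receive operations give up after `timeout` s."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))            # port 0: the OS picks a free port
    sock.settimeout(timeout)
    return sock


def receive_chunk(sock):
    """(4) + (5): wait for a packet; if none arrives in time, return silence."""
    try:
        packed_chunk, _sender = sock.recvfrom(MAX_DATAGRAM)
    except socket.timeout:
        return SILENCE                     # the "alternative" chunk
    return unpack(packed_chunk)


if __name__ == "__main__":
    sock = open_socket()
    chunk = bytes(range(256)) * (CHUNK_BYTES // 256)   # stand-in for recorded audio
    sock.sendto(pack(chunk), sock.getsockname())       # (3) send to ourselves
    print(receive_chunk(sock) == chunk)                # the packet arrived intact
    print(receive_chunk(sock) == SILENCE)              # nothing pending: zeros
    sock.close()
```

Returning a chunk of zeros on timeout keeps the loop running at a constant pace even when the interlocutor sends nothing, at the cost of a short silence.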

1.2 Timer-based algorithm

The to-be-called-in-a-loop implementation described in the previous section works fine, but InterCom ultimately uses a to-be-called-by-an-interrupt implementation because it allows another task to run concurrently at no extra cost.

In this algorithm, the task dedicated to recording and playing the chunks of audio is called periodically (probably using some timer provided by the sound hardware). This procedure guarantees glitch-free audio I/O when constant chunk sizes are used, because the timer interrupt coincides exactly with the instant at which the record() and play() methods can be used without blocking.

The current implementation of InterCom uses the timer-based (interrupt-based) algorithm.
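As a rough illustration of the timer-based approach, the sketch below assumes sounddevice's RawStream, which calls a user function once per chunk and hands it the captured chunk and the buffer to be played. It only wires the input to the output (steps (2) to (5) are left out), and the parameter values and the run() helper are illustrative assumptions, not those of minimal.py.

```python
FRAMES_PER_SECOND = 44100   # assumed sampling rate
FRAMES_PER_CHUNK = 1024     # assumed (constant) chunk size, in frames
CHANNELS = 2                # stereo


def record_IO_and_play(indata, outdata, frames, time, status):
    """Called by sounddevice each time a chunk has been captured.

    indata holds the recorded chunk (1); outdata must be filled with
    the chunk to play (6) before the function returns.
    """
    if status:
        print(status)       # reports input overflows / output underflows
    outdata[:] = indata     # pack, send, receive and unpack would go here


def run():
    """Open a full-duplex stream; requires PortAudio and an audio device."""
    import sounddevice as sd
    with sd.RawStream(samplerate=FRAMES_PER_SECOND,
                      blocksize=FRAMES_PER_CHUNK,
                      dtype="int16",
                      channels=CHANNELS,
                      callback=record_IO_and_play):
        # The callback runs in the background; this thread is free
        # to do other work (here it simply waits for the user).
        input("Press Enter to quit\n")
```

Calling run() starts the stream. Since the callback is driven by the audio system, the calling thread never blocks on record() or play() and stays available for another task.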

2 Deliverables

None. However, you should understand the meaning, purpose and use of each of the parameters of minimal.py. For this, try the following:

Run minimal.py using localhost as the destination for your chunks of audio (notice that this is the default configuration for the parameters --destination_address and --destination_port). Check that you can hear yourself after some delay, and that the quality of the sound is good (using a headset is recommended). If you think that the quality is not high enough, discuss this with your teacher. Remember that the parameters --show_stats, --show_samples, and --show_spectrum can provide some information about what is happening. Can you determine the delay?
Always using the command line, modify the parameter --frames_per_chunk. Do you notice any changes in latency or audio quality?
Modify the parameter --frames_per_second. Again, do you notice any changes in latency or audio quality? You should.
With one of your group mates, try to communicate using the same LAN (you will need to modify the parameter --destination_address). Notice that some LANs, such as the one provided by the UAL, could filter your InterCom traffic. Good alternatives are your LAN at home or an ad-hoc LAN created with a mobile device.
Repeat the last experiment when two (or more) group mates send you their chunks. What do you think is happening?
Finally, are you able to communicate with an interlocutor who is on a different LAN (for example, when two InterCom instances are running in different home networks)?
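When reasoning about the effect of --frames_per_chunk and --frames_per_second, it can help to compute how long one chunk lasts, since a chunk cannot be sent before it has been completely recorded. The helpers below are an illustrative sketch (the function names are hypothetical, and the bit-rate figure assumes 16 + 16 bit stereo frames); they give only the contribution of one chunk to the delay, not the total end-to-end latency.

```python
def chunk_duration_ms(frames_per_chunk, frames_per_second):
    """Time needed to record (or play) one chunk, in milliseconds."""
    return 1000 * frames_per_chunk / frames_per_second


def raw_bit_rate_kbps(frames_per_second, bits_per_frame=32):
    """Uncompressed bit-rate of the audio stream, in kilobits per second."""
    return frames_per_second * bits_per_frame / 1000


if __name__ == "__main__":
    for frames_per_chunk in (256, 1024, 4096):
        duration = chunk_duration_ms(frames_per_chunk, 44100)
        print(frames_per_chunk, "frames:", round(duration, 1), "ms")
    print(raw_bit_rate_kbps(44100), "kbps")
```

At 44100 frames per second, a chunk of 1024 frames lasts about 23.2 ms, so larger chunks mean fewer packets per second but a longer minimum delay.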

Mark: 0 points.

3 Resources

[1] PortAudio.

[2] The Python Foundation. The Python Website.

¹A stereo sample, usually 16 + 16 bits.

²curl https://raw.githubusercontent.com/Tecnologias-multimedia/intercom/master/test/sounddevice/wire4.py > wire4.py

³The reading will be blocking if the number of requested frames is larger than the number of available frames.