Meeting minimal InterCom
November 18, 2024
1 Description
InterCom is an application that captures and plays audio and therefore, in Linux,
runs on top of one of the following audio services:
- ALSA.
- PulseAudio (in the case of Xubuntu, this is the audio server that comes
with Xfce).
- JACK.
- PipeWire.
To abstract InterCom from the available audio resources, InterCom uses
PortAudio [1] (through sounddevice) to capture and play the audio. Using
sounddevice, we have two alternatives for implementing InterCom (and in general, for
any real-time audio-processing application):
1.1 Loop-based algorithm
Roughly, InterCom can be divided into 6 steps:
    # Loop-based algorithm (to be called in a loop)
    def record_IO_and_play(chunk_size):
        chunk = record(chunk_size)    # (1)
        packed_chunk = pack(chunk)    # (2)
        send(packed_chunk)            # (3)
        packed_chunk = receive()      # (4)
        chunk = unpack(packed_chunk)  # (5)
        play(chunk)                   # (6)
where:
- The record(chunk_size) method captures a chunk (a fragment) of frames.
In sounddevice, this operation is carried out by the read() method. As can
be seen in wire4.py and in the documentation of sounddevice, if we read only
the frames that are already available in the sound card’s buffer, the
operation does not block, and the chunk size depends on the instant at which
the method is called. Otherwise, if we request more frames than are
available, the operation is blocking and I/O-bound (the calling process
sleeps until the requested number of frames can be returned).
- pack(chunk) processes the chunk to create a packet (or a sequence of
packets), a structure that can be transmitted through the Internet using
the datagram model. In general, this is a CPU-bound (CPU-intensive)
operation because the payload of the packet can be compressed in order
to minimize the transmission bit-rate.
- send(packed_chunk) sends the packet to our interlocutor. When
datagrams are used, this step is neither blocking nor CPU-bound (the CPU
usage is very low), as long as the number of packets per second and the
sizes of the payloads are small, as is expected in InterCom.
- receive() waits (blocking the calling process) for an incoming packet
and is therefore an I/O-bound operation. However, most socket APIs [2]
offer a timeout (or non-blocking) option: when no packet is available in
the kernel’s buffer associated with the corresponding socket after a
predetermined amount of time, some kind of exception is generated.
In this case, it is the responsibility of the programmer to generate an
“alternative” chunk (in our case, for example, a chunk filled with zeros
that will not produce any sound when it is played).
- unpack(packed_chunk) is, like pack(chunk), a CPU-intensive step that
transforms a packed chunk into a chunk of audio.
- play(chunk) renders the chunk. In general, this is an I/O-bound blocking
action. However, if play() is called at the same pace as record(), and
the record and play parameters are exactly the same (as usually happens
in InterCom), the playing of the chunk should return immediately because
the time that the play() method needs to complete would match exactly
the time that the record() method requires (see wire4b.py).
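The six steps above can be sketched with the Python standard library alone. This is a minimal sketch under stated assumptions, not the code of InterCom: record() and play() are stand-ins that fabricate and count samples instead of using sounddevice, zlib is just one possible compressor, and CHUNK_SIZE, BYTES_PER_FRAME, make_socket() and the loopback destination are hypothetical choices made for the example.

```python
import socket
import struct
import zlib

CHUNK_SIZE = 1024        # frames per chunk (assumed value)
BYTES_PER_FRAME = 2      # one channel of 16-bit samples (assumed format)
SILENCE = bytes(CHUNK_SIZE * BYTES_PER_FRAME)  # a chunk filled with zeros

def record(chunk_size):
    """(1) Stand-in for the sound card: returns a ramp, not real audio."""
    samples = [(i % 256) - 128 for i in range(chunk_size)]
    return struct.pack(f"<{chunk_size}h", *samples)

def pack(chunk):
    """(2) CPU-bound: compress the payload to reduce the bit-rate."""
    return zlib.compress(chunk)

def receive(sock):
    """(4) Wait for a packet; fall back to silence if the timeout expires."""
    try:
        packed_chunk, _ = sock.recvfrom(65536)
    except socket.timeout:
        return pack(SILENCE)  # the "alternative" chunk
    return packed_chunk

def unpack(packed_chunk):
    """(5) CPU-bound: recover the chunk of audio."""
    return zlib.decompress(packed_chunk)

def play(chunk):
    """(6) Stand-in for the sound card: only counts the frames."""
    return len(chunk) // BYTES_PER_FRAME

def record_IO_and_play(sock, destination, chunk_size=CHUNK_SIZE):
    chunk = record(chunk_size)               # (1)
    packed_chunk = pack(chunk)               # (2)
    sock.sendto(packed_chunk, destination)   # (3)
    packed_chunk = receive(sock)             # (4)
    chunk = unpack(packed_chunk)             # (5)
    return play(chunk)                       # (6)

def make_socket(timeout=0.5):
    """UDP socket bound to a free loopback port, with a receive timeout."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))
    sock.settimeout(timeout)
    return sock

if __name__ == "__main__":
    sock = make_socket()
    destination = sock.getsockname()  # send the chunks to ourselves
    for _ in range(3):
        print(record_IO_and_play(sock, destination), "frames played")
    sock.close()
```

Here the timeout makes receive() return after at most half a second, so a lost datagram delays one iteration of the loop but never stalls it.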
1.2 Timer-based algorithm
The to-be-called-in-a-loop implementation described in the previous section works
fine, but InterCom ultimately uses a to-be-called-by-an-interruption implementation
because it allows another task to run at the same time at no extra cost.
In this algorithm, the task dedicated to recording and playing the chunks of audio is
called periodically (probably using some timer provided by the sound hardware).
This procedure guarantees glitch-free audio I/O when constant chunk sizes are
used, because the timer interruption coincides exactly with the instant of time at
which the record() and the play() methods can be used without blocking.
The current implementation of InterCom uses the Timer-based (interruption-based)
algorithm.
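A callback-driven version of the same idea could look as follows. This is a sketch under assumptions, not the implementation of minimal.py: it relies on sounddevice.RawStream (whose callback receives raw byte buffers), sends the chunks uncompressed, and the names make_callback() and run(), the port 4444 and the default parameter values are hypothetical.

```python
import socket

def make_callback(sock, destination):
    """Build the function that the audio stream calls once per chunk."""
    def callback(indata, outdata, frames, time, status):
        if status:
            print(status)  # report overflows/underflows, if any
        # indata and outdata are byte buffers of the same size.
        sock.sendto(bytes(indata), destination)  # send the recorded chunk
        try:
            packet, _ = sock.recvfrom(65536)     # non-blocking socket
        except BlockingIOError:
            packet = b""                         # nothing has arrived yet
        if len(packet) != len(outdata):
            packet = bytes(len(outdata))         # play silence instead
        outdata[:] = packet                      # play the chunk
    return callback

def run(seconds=5, frames_per_second=44100, frames_per_chunk=1024):
    """Open a full-duplex stream and let the callback do the audio work."""
    import sounddevice as sd  # third-party; needs a working sound card
    address = ("127.0.0.1", 4444)  # hypothetical port; we talk to ourselves
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(address)
    sock.setblocking(False)
    with sd.RawStream(samplerate=frames_per_second,
                      blocksize=frames_per_chunk,
                      dtype="int16", channels=1,
                      callback=make_callback(sock, address)):
        sd.sleep(int(seconds * 1000))  # the main thread is free meanwhile
    sock.close()
```

Calling run() requires the sounddevice package and audio hardware. While the stream is open, the main thread only sleeps, which is the place where another task could run.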
2 Deliverables
None. However, you should understand the meaning, purpose and use of each of the
parameters of minimal.py. For this, try the following:
- Run minimal.py using localhost as the destination for your chunks
of audio (notice that this is the default configuration for the parameters
--destination_address and --destination_port). Check that you can
hear yourself after some delay, and that the quality of the sound is
good (using a headset is recommended). If you think that the quality
is not high enough, discuss this with your teacher. Remember that the
parameters --show_stats, --show_samples, and --show_spectrum can
provide some information about what is happening. Can you determine
the delay?
- Always using the command line, modify the parameter --frames_per_chunk.
Do you notice any changes in latency or audio quality?
- Modify the parameter --frames_per_second. Again, do you notice any
changes in latency or audio quality? You should.
- With one of your group mates, try to communicate using the same LAN
(you will need to modify the parameter --destination_address). Notice
that some LANs, such as the one provided by the UAL, could filter your
InterCom traffic. Good alternatives are your LAN at home or an
ad-hoc LAN created with a mobile device.
- Repeat the last experiment when two (or more) group mates send you
their chunks. What do you think is happening?
- Finally, are you able to communicate with an interlocutor who is on a
different LAN from yours (for example, when two InterCom instances are
running in different home networks)?
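When experimenting with --frames_per_chunk and --frames_per_second, it may help to estimate beforehand how long a chunk lasts, since a chunk cannot be sent before all of its frames have been captured. The following sketch does this arithmetic. The function chunk_stats() is hypothetical, and the number of channels and the sample size are assumptions that may not match the defaults of minimal.py.

```python
def chunk_stats(frames_per_chunk, frames_per_second,
                channels=2, bytes_per_sample=2):
    """Return (chunk duration in seconds, chunks per second,
    uncompressed payload in bytes, uncompressed bit-rate in kbit/s)."""
    chunk_time = frames_per_chunk / frames_per_second
    chunks_per_second = frames_per_second / frames_per_chunk
    payload_bytes = frames_per_chunk * channels * bytes_per_sample
    kbps = payload_bytes * 8 * chunks_per_second / 1000
    return chunk_time, chunks_per_second, payload_bytes, kbps

if __name__ == "__main__":
    # Assumed example values, not necessarily those of minimal.py.
    for frames_per_chunk in (256, 1024, 4096):
        t, n, b, k = chunk_stats(frames_per_chunk, 44100)
        print(f"{frames_per_chunk} frames: {t * 1000:.1f} ms per chunk, "
              f"{n:.1f} chunks/s, {b} bytes per chunk, {k:.1f} kbit/s")
```

Under these assumptions, a chunk of 1024 frames at 44100 frames per second lasts about 23 ms. Larger chunks add buffering delay but reduce the number of packets per second, while the uncompressed bit-rate depends only on the sampling parameters.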
3 Resources