Class Opus

java.lang.Object
org.lwjgl.util.opus.Opus

public class Opus extends Object
Native bindings to the Opus library.

The Opus codec is designed for interactive speech and audio transmission over the Internet. It was designed by the IETF Codec Working Group and incorporates technology from Skype's SILK codec and Xiph.Org's CELT codec.

The Opus codec is designed to handle a wide range of interactive audio applications, including Voice over IP, videoconferencing, in-game chat, and even remote live music performances. It can scale from low bit-rate narrowband speech to very high quality stereo music. Its main features are:

  • Sampling rates from 8 to 48 kHz
  • Bit-rates from 6 kb/s to 510 kb/s
  • Support for both constant bit-rate (CBR) and variable bit-rate (VBR)
  • Audio bandwidth from narrowband to full-band
  • Support for speech and music
  • Support for mono and stereo
  • Support for multichannel (up to 255 channels)
  • Frame sizes from 2.5 ms to 60 ms
  • Good loss robustness and packet loss concealment (PLC)
  • Floating-point and fixed-point implementations

Opus Encoder

This section describes the process and functions used to encode Opus.

Since Opus is a stateful codec, the encoding process starts with creating an encoder state. This can be done with:


 int error;
 OpusEncoder *enc;
 enc = opus_encoder_create(Fs, channels, application, &error);

From this point, enc can be used for encoding an audio stream. An encoder state must not be used for more than one stream at the same time. Similarly, the encoder state must not be re-initialized for each frame.

While encoder_create allocates memory for the state, it's also possible to initialize pre-allocated memory:


 int size;
 int error;
 OpusEncoder *enc;
 size = opus_encoder_get_size(channels);
 enc = malloc(size);
 error = opus_encoder_init(enc, Fs, channels, application);

where encoder_get_size returns the required size for the encoder state. Note that future versions of this code may change the size, so no assumptions should be made about it.

The encoder state is always contiguous in memory, so a shallow copy (e.g. memcpy()) is sufficient to copy it.
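As a minimal sketch of such a shallow copy, the helper below duplicates an opaque state buffer with malloc() and memcpy(). The name clone_state is hypothetical and not part of libopus; in real use the size argument would be the value returned by opus_encoder_get_size(channels).

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of libopus): duplicate an opaque codec
 * state of `size` bytes. Returns NULL if the allocation fails; the caller
 * owns the copy and releases it with free(). */
static void *clone_state(const void *state, size_t size)
{
    void *copy = malloc(size);
    if (copy != NULL)
        memcpy(copy, state, size); /* shallow copy of the whole state */
    return copy;
}
```

The copy is an independent state, so it could be used, for example, to try an encode and then discard the result without disturbing the original.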

It is possible to change some of the encoder's settings using the encoder_ctl interface. All these settings already default to the recommended value, so they should only be changed when necessary. The most common settings one may want to change are:


 opus_encoder_ctl(enc, OPUS_SET_BITRATE(bitrate));
 opus_encoder_ctl(enc, OPUS_SET_COMPLEXITY(complexity));
 opus_encoder_ctl(enc, OPUS_SET_SIGNAL(signal_type));

where

  • bitrate is in bits per second (b/s),
  • complexity is a value from 0 to 10, where 0 is the lowest complexity and 10 is the highest, and
  • signal_type is either AUTO (default), SIGNAL_VOICE, or SIGNAL_MUSIC.
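Because the bitrate is expressed in bits per second, it can help to translate it into an average payload size per packet. The following is a rough sketch only: avg_packet_bytes is a hypothetical helper, not a libopus function, and with VBR the size of any individual packet will vary around this average.

```c
#include <stdint.h>

/* Hypothetical helper (not part of libopus): average payload size in bytes
 * of one packet, for a target bitrate in b/s and a frame duration given in
 * microseconds (2500, 5000, 10000, 20000, 40000 or 60000). */
static int32_t avg_packet_bytes(int32_t bitrate_bps, int32_t frame_us)
{
    /* bits per frame = bitrate * duration; 8 bits per byte and
     * 1,000,000 microseconds per second give the divisor. */
    int64_t bits_times_us = (int64_t)bitrate_bps * frame_us;
    return (int32_t)(bits_times_us / 8000000);
}
```

For instance, 64000 b/s with 20 ms frames averages 160 bytes per packet.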

See Encoder related CTLs and Generic CTLs for a complete list of parameters that can be set or queried. Most parameters can be set or changed at any time during a stream.

To encode a frame, encode or encode_float must be called with exactly one frame (2.5, 5, 10, 20, 40 or 60 ms) of audio data:


 len = opus_encode(enc, audio_frame, frame_size, packet, max_packet);

where

  • audio_frame is the audio data in short (or float for encode_float),
  • frame_size is the duration of the frame in samples (per channel),
  • packet is the byte array to which the compressed data is written, and
  • max_packet is the maximum number of bytes that can be written in the packet (4000 bytes is recommended). Do not use max_packet to control the VBR target bitrate; use the SET_BITRATE_REQUEST CTL instead.
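Since frame_size is counted in samples per channel, it depends on both the sampling rate and the chosen frame duration. The sketch below shows one way to compute it; frame_size_samples is a hypothetical helper, and taking the duration in microseconds is an assumption made here so that 2.5 ms can be expressed as an integer.

```c
/* Hypothetical helper (not part of libopus): number of samples per channel
 * in one frame. `frame_us` is the frame duration in microseconds and must be
 * one of the durations listed above; otherwise -1 is returned. The sampling
 * rate is used as given and is not validated here. */
static int frame_size_samples(int fs, int frame_us)
{
    static const int durations_us[] = { 2500, 5000, 10000, 20000, 40000, 60000 };
    int i, supported = 0;

    for (i = 0; i < 6; i++)
        supported |= (durations_us[i] == frame_us);
    if (!supported)
        return -1;

    /* samples = rate (samples/s) * duration (s) */
    return (int)((long long)fs * frame_us / 1000000);
}
```

At 48 kHz a 20 ms frame is 960 samples per channel, so a stereo frame passed to the encoder would hold 1920 interleaved values.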

encode and encode_float return the number of bytes actually written to the packet. The return value can be negative, which indicates that an error has occurred. If the return value is 2 bytes or less, then the packet does not need to be transmitted (DTX).
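The three cases for the return value can be captured in a small helper. This is only an illustrative sketch: the enum and the function name classify_encode_result are hypothetical and not part of the library.

```c
/* Hypothetical classification of the value returned by the encode call. */
typedef enum {
    PACKET_ERROR,    /* negative return value: an error code */
    PACKET_SKIP_DTX, /* 2 bytes or less: nothing needs to be transmitted */
    PACKET_SEND      /* a regular packet of `len` bytes */
} packet_action;

static packet_action classify_encode_result(int len)
{
    if (len < 0)
        return PACKET_ERROR;
    if (len <= 2)
        return PACKET_SKIP_DTX;
    return PACKET_SEND;
}
```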

Once the encoder state is no longer needed, it can be destroyed with


 opus_encoder_destroy(enc);

If the encoder was created with encoder_init rather than encoder_create, then no action is required aside from potentially freeing the memory that was manually allocated for it (calling free(enc) for the example above).

Opus Decoder

This section describes the process and functions used to decode Opus.

The decoding process also starts with creating a decoder state. This can be done with:


 int error;
 OpusDecoder *dec;
 dec = opus_decoder_create(Fs, channels, &error);

where

  • Fs is the sampling rate and must be 8000, 12000, 16000, 24000, or 48000,
  • channels is the number of channels (1 or 2),
  • error will hold the error code in case of failure (or OK on success), and
  • the return value is a newly created decoder state to be used for decoding.
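An application may prefer to reject unsupported arguments before creating a decoder rather than interpreting the error code afterwards. The check below is a sketch of that idea; decoder_args_valid is a hypothetical helper, not a library function.

```c
/* Hypothetical pre-check (not part of libopus): returns 1 if the sampling
 * rate and channel count are among the values listed above, 0 otherwise. */
static int decoder_args_valid(int fs, int channels)
{
    switch (fs) {
    case 8000:
    case 12000:
    case 16000:
    case 24000:
    case 48000:
        break;
    default:
        return 0; /* e.g. 44100 is not accepted */
    }
    return channels == 1 || channels == 2;
}
```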

While decoder_create allocates memory for the state, it's also possible to initialize pre-allocated memory:


 int size;
 int error;
 OpusDecoder *dec;
 size = opus_decoder_get_size(channels);
 dec = malloc(size);
 error = opus_decoder_init(dec, Fs, channels);

where decoder_get_size returns the required size for the decoder state. Note that future versions of this code may change the size, so no assumptions should be made about it.

The decoder state is always contiguous in memory, so a shallow copy (e.g. memcpy()) is sufficient to copy it.

To decode a frame, decode or decode_float must be called with a packet of compressed audio data:


 frame_size = opus_decode(dec, packet, len, decoded, max_size, 0);

where

  • packet is the byte array containing the compressed data,
  • len is the exact number of bytes contained in the packet,
  • decoded is the decoded audio data in opus_int16 (or float for decode_float), and
  • max_size is the maximum duration of the frame in samples (per channel) that can fit into the decoded array.

decode and decode_float return the number of samples (per channel) decoded from the packet. If that value is negative, then an error has occurred. This can occur if the packet is corrupted or if the audio buffer is too small to hold the decoded audio.
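One way to avoid the "buffer too small" error is to size the output array for the longest packet the format allows, which this reference gives as 120 ms in the Repacketizer section. The helpers below are a hypothetical sketch of that sizing, not part of the library.

```c
#include <stddef.h>

/* Hypothetical helper: the value to pass as max_size, i.e. the number of
 * samples per channel in 120 ms of audio at sampling rate `fs`. */
static int max_frame_samples(int fs)
{
    return fs * 120 / 1000;
}

/* Hypothetical helper: number of array elements (opus_int16 or float) the
 * decoded array needs for interleaved output with `channels` channels. */
static size_t decode_buffer_len(int fs, int channels)
{
    return (size_t)max_frame_samples(fs) * (size_t)channels;
}
```

At 48 kHz stereo this gives a max_size of 5760 and an array of 11520 elements.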

Opus is a stateful codec with overlapping blocks and as a result Opus packets are not coded independently of each other. Packets must be passed into the decoder serially and in the correct order for a correct decode. Lost packets can be replaced with loss concealment by calling the decoder with a null pointer and zero length for the missing packet.
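To know when concealment is needed, the receiver has to detect gaps in the packet sequence. The sketch below assumes the transport attaches a wrapping 16-bit sequence number to every packet; that is an assumption about the application's own framing and not something Opus provides, and lost_packets_before is a hypothetical helper.

```c
#include <stdint.h>

/* Hypothetical helper: given the sequence number the receiver expected next
 * and the one that actually arrived, return how many packets were lost in
 * between. Returns -1 if the packet is older than expected (late or
 * duplicate) and should not be fed to the decoder. */
static int lost_packets_before(uint16_t expected, uint16_t received)
{
    /* Unsigned 16-bit subtraction handles wraparound from 65535 to 0. */
    uint16_t gap = (uint16_t)(received - expected);
    if (gap >= 0x8000u)
        return -1;
    return (int)gap;
}
```

For each lost packet the application would call the decoder once with a null pointer and zero length, and only then decode the packet that arrived.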

A single codec state may only be accessed from a single thread at a time and any required locking must be performed by the caller. Separate streams must be decoded with separate decoder states and can be decoded in parallel unless the library was compiled with NONTHREADSAFE_PSEUDOSTACK defined.

Repacketizer

The repacketizer can be used to merge multiple Opus packets into a single packet or alternatively to split Opus packets that have previously been merged. Splitting valid Opus packets is always guaranteed to succeed, whereas merging valid packets only succeeds if all frames have the same mode, bandwidth, and frame size, and when the total duration of the merged packet is no more than 120 ms. The 120 ms limit comes from the specification and limits decoder memory requirements at a point where framing overhead becomes negligible.
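The merge conditions can be checked up front by inspecting the first byte of each packet (the TOC byte). The sketch below derives the frame duration from the TOC layout defined in RFC 6716 and applies the same 0xFC mask that the merging example in this section uses. Both function names are hypothetical, and the repacketizer performs its own validation regardless.

```c
/* Hypothetical helper: duration of one frame in microseconds, derived from
 * the TOC byte layout of RFC 6716 (the top five bits select the
 * configuration). */
static int toc_frame_us(unsigned char toc)
{
    if (toc & 0x80)                  /* CELT-only: 2.5, 5, 10 or 20 ms */
        return 2500 << ((toc >> 3) & 0x3);
    if ((toc & 0x60) == 0x60)        /* Hybrid: 10 or 20 ms */
        return (toc & 0x08) ? 20000 : 10000;
    /* SILK-only: 10, 20, 40 or 60 ms */
    return ((toc >> 3) & 0x3) == 3 ? 60000 : (10000 << ((toc >> 3) & 0x3));
}

/* Hypothetical helper: returns 1 if a packet with TOC byte `next_toc`
 * holding `next_frames` frames can be appended to packets that started with
 * `prev_toc` and already hold `merged_us` microseconds of audio. */
static int can_append(unsigned char prev_toc, int merged_us,
                      unsigned char next_toc, int next_frames)
{
    if ((prev_toc & 0xFC) != (next_toc & 0xFC))
        return 0; /* configuration (or stereo flag) differs */
    return merged_us + next_frames * toc_frame_us(next_toc) <= 120000;
}
```

Note that the 0xFC mask also covers the stereo flag, so this check is slightly stricter than comparing mode, bandwidth, and frame size alone.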

The repacketizer currently only operates on elementary Opus streams. It will not manipulate multistream packets successfully, except in the degenerate case where they consist of data from a single stream.

The repacketizing process starts with creating a repacketizer state, either by calling repacketizer_create or by allocating the memory yourself, e.g.,


 OpusRepacketizer *rp;
 rp = (OpusRepacketizer*)malloc(opus_repacketizer_get_size());
 if (rp != NULL)
     opus_repacketizer_init(rp);

Then the application should submit packets with repacketizer_cat, extract new packets with repacketizer_out or repacketizer_out_range, and then reset the state for the next set of input packets via repacketizer_init.

For example, to split a sequence of packets into individual frames:


 unsigned char *data;
 int len;
 while (get_next_packet(&data, &len))
 {
   unsigned char out[1276];
   opus_int32 out_len;
   int nb_frames;
   int err;
   int i;
   err = opus_repacketizer_cat(rp, data, len);
   if (err != OPUS_OK)
   {
     release_packet(data);
     return err;
   }
   nb_frames = opus_repacketizer_get_nb_frames(rp);
   for (i = 0; i < nb_frames; i++)
   {
     out_len = opus_repacketizer_out_range(rp, i, i+1, out, sizeof(out));
     if (out_len < 0)
     {
        release_packet(data);
        return (int)out_len;
     }
     output_next_packet(out, out_len);
   }
   opus_repacketizer_init(rp);
   release_packet(data);
 }

Alternatively, to combine a sequence of frames into packets that each contain up to TARGET_DURATION_MS milliseconds of data:


 // The maximum number of packets with duration TARGET_DURATION_MS occurs
 // when the frame size is 2.5 ms, for a total of (TARGET_DURATION_MS*2/5)
 // packets.
 unsigned char *data[(TARGET_DURATION_MS*2/5)+1];
 opus_int32 len[(TARGET_DURATION_MS*2/5)+1];
 int nb_packets;
 unsigned char out[1277*(TARGET_DURATION_MS*2/2)];
 opus_int32 out_len;
 int prev_toc;
 nb_packets = 0;
 while (get_next_packet(data+nb_packets, len+nb_packets))
 {
   int nb_frames;
   int err;
   nb_frames = opus_packet_get_nb_frames(data[nb_packets], len[nb_packets]);
   if (nb_frames < 1)
   {
     release_packets(data, nb_packets+1);
     return nb_frames;
   }
   nb_frames += opus_repacketizer_get_nb_frames(rp);
   // If adding the next packet would exceed our target, or it has an
   // incompatible TOC sequence, output the packets we already have before
   // submitting it.
   // N.B., The nb_packets > 0 check ensures we've submitted at least one
   // packet since the last call to opus_repacketizer_init(). Otherwise a
   // single packet longer than TARGET_DURATION_MS would cause us to try to
   // output an (invalid) empty packet. It also ensures that prev_toc has
   // been set to a valid value. Additionally, len[nb_packets] > 0 is
   // guaranteed by the call to opus_packet_get_nb_frames() above, so the
   // reference to data[nb_packets][0] should be valid.
   if (nb_packets > 0 && (
       ((prev_toc & 0xFC) != (data[nb_packets][0] & 0xFC)) ||
       opus_packet_get_samples_per_frame(data[nb_packets], 48000)*nb_frames >
       TARGET_DURATION_MS*48))
   {
     out_len = opus_repacketizer_out(rp, out, sizeof(out));
     if (out_len < 0)
     {
        release_packets(data, nb_packets+1);
        return (int)out_len;
     }
     output_next_packet(out, out_len);
     opus_repacketizer_init(rp);
     release_packets(data, nb_packets);
     data[0] = data[nb_packets];
     len[0] = len[nb_packets];
     nb_packets = 0;
   }
   err = opus_repacketizer_cat(rp, data[nb_packets], len[nb_packets]);
   if (err != OPUS_OK)
   {
     release_packets(data, nb_packets+1);
     return err;
   }
   prev_toc = data[nb_packets][0];
   nb_packets++;
 }
 // Output the final, partial packet.
 if (nb_packets > 0)
 {
   out_len = opus_repacketizer_out(rp, out, sizeof(out));
   release_packets(data, nb_packets);
   if (out_len < 0)
     return (int)out_len;
   output_next_packet(out, out_len);
 }

An alternate way of merging packets is to simply call repacketizer_cat unconditionally until it fails. At that point, the merged packet can be obtained with opus_repacketizer_out(), and the input packet for which opus_repacketizer_cat() failed needs to be re-added to a newly reinitialized repacketizer state.