Rawaudio dev #3653

Draft
dingodoppelt wants to merge 40 commits into jamulussoftware:main from dingodoppelt:rawaudio-dev

Conversation

@dingodoppelt
Contributor

@dingodoppelt dingodoppelt commented Apr 17, 2026

Add a new "raw" audio quality setting

This PR adds uncompressed ("raw") audio to the quality settings, so there is no Opus compression along the way.
Discussion in #3654

This feature improves latency as well: I gained 2 ms by using uncompressed audio, while also getting better audio quality.

This is a work in progress; please help me test it.

Checklist

  • I've verified that this Pull Request follows the general code principles
  • I tested my code and it does what I want
  • My code follows the style guide
  • I waited some time after this Pull Request was opened and all GitHub checks completed without errors.
  • I've filled all the content above

@dingodoppelt dingodoppelt marked this pull request as ready for review April 19, 2026 06:54
@ann0see ann0see added this to the Release 4.0.0 milestone Apr 20, 2026
@ann0see ann0see added this to Tracking Apr 20, 2026
@github-project-automation github-project-automation Bot moved this to Triage in Tracking Apr 20, 2026
Comment thread src/clientsettingsdlg.cpp Outdated
Comment thread src/util.h
Comment thread src/client.cpp
// free audio modes
opus_custom_mode_destroy ( OpusMode );
opus_custom_mode_destroy ( Opus64Mode );
}
Contributor Author

This seems to cause issues from time to time. After closing the client I sometimes get an error from free() complaining about wrong sizes. This needs further investigation; some help here would be appreciated.

@ann0see
Member

ann0see commented Apr 20, 2026

I'd prefer to check for capabilities rather than for the Jamulus version number - we don't have 4.0.0 out yet and a version check might break during the dev process.

@dingodoppelt
Contributor Author

> I'd prefer to check for capabilities rather than for the Jamulus version number - we don't have 4.0.0 out yet and a version check might break during the dev process.

I wanted to reuse as much of the already available information as possible, so I just added the code where version checks were already implemented (for the sequence number and pan features).
Capabilities would be nice, but they would also require more changes to the client, channel, server and protocol, and I don't really know how to make those backwards compatible. We should rather replace all version checks with a capabilities struct that client and server can agree upon, so everything lands in one place. I just don't feel like the right person to take on that challenge and would rather pursue my hacky approach, as long as it works for everybody.
The version check against 4.0.0 could be replaced by one against a point release, 3.11.1, and would work right away.
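
A capabilities-based check of the kind discussed here could look roughly like the sketch below. It is only an illustration: `ECapability`, its flag names, `NegotiateCapabilities` and `HasCapability` are hypothetical and not part of the Jamulus protocol, and the sketch does not address how such a field would be exchanged in a backwards-compatible way.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical capability flags; none of these exist in the Jamulus protocol.
enum ECapability : uint32_t
{
    CAP_NONE       = 0,
    CAP_SEQ_NUMBER = 1u << 0, // packet sequence numbers
    CAP_PAN        = 1u << 1, // pan feature
    CAP_RAW_AUDIO  = 1u << 2  // uncompressed ("raw") audio
};

// A feature may be used only if both sides advertise it.
inline uint32_t NegotiateCapabilities ( uint32_t iLocalCaps, uint32_t iRemoteCaps )
{
    return iLocalCaps & iRemoteCaps;
}

// Query a single flag in a negotiated capability set.
inline bool HasCapability ( uint32_t iCaps, ECapability eCap )
{
    assert ( eCap != CAP_NONE ); // asking for "no capability" is meaningless
    return ( iCaps & eCap ) != 0;
}
```

Under this scheme a peer that advertises nothing (for example a legacy server, treated as `CAP_NONE`) simply negotiates down to the existing behaviour.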

@ann0see
Member

ann0see commented Apr 20, 2026

Tested it and yes, the noise would be unacceptable. What is our fallback if max is selected but the server doesn't support it?

@dingodoppelt
Contributor Author

dingodoppelt commented Apr 20, 2026

> Tested it and yes, the noise would be unacceptable. What is our fallback if max is selected but the server doesn't support it?

I just noticed that if you connect to a server with Max selected you get the noise unless you switch audio quality again while connected. The server code is fine and doesn't need changes, I misplaced the check for my introduced bRawAudioSupported in the client code. I'll have a closer look
Edit: Funny, the noise doesn't happen on legacy servers, only on rawaudio :D

@softins
Member

softins commented Apr 23, 2026

I've just tried a build of rawaudio-dev here, between two separate hosts: server on a pi, client on a PC. It doesn't seem to be an MTU or fragmentation issue. The UDP packets are only 1068 bytes in size, and not fragmented.

Using a buffer size of 10.67ms (256) results in each packet containing two frames of audio, each with its own sequence number. In that setting, I was seeing one packet every 10.67ms coming from the Windows client, but still one packet every 5.33ms coming back from the server. They alternated between having zeros in the first frame and zeros in the second frame. So it could possibly be some issue in server.cpp that doesn't exist in client.cpp

Note that the client will encode according to the settings in the Client Settings dialog, but the server will encode according to the information received in the NETW_TRANSPORT_PROPS message from the client.

Talking of which, the codec field in the NETW_TRANSPORT_PROPS message should specify a different value for RAW, rather than still saying OPUS, like this:

jamulus/src/util.h

Lines 484 to 492 in 849e823

    // Audio compression type enum -------------------------------------------------
    enum EAudComprType
    {
        // used for protocol -> enum values must be fixed!
        CT_NONE   = 0,
        CT_CELT   = 1,
        CT_OPUS   = 2,
        CT_OPUS64 = 3 // using OPUS with 64 samples frame size
    };

So when sending props for raw encoding, it should either use CT_NONE or define a new CT_RAW=4.
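
The second option could be sketched as follows. `CT_RAW` and the helper `UsesOpusCodec` are hypothetical and only illustrate the suggestion; the existing values are copied from the quoted `util.h` excerpt and stay unchanged because they are used on the wire.

```cpp
// Audio compression type enum as quoted above, extended with a
// hypothetical CT_RAW value (not present in the Jamulus source).
enum EAudComprType
{
    // used for protocol -> enum values must be fixed!
    CT_NONE   = 0,
    CT_CELT   = 1,
    CT_OPUS   = 2,
    CT_OPUS64 = 3, // using OPUS with 64 samples frame size
    CT_RAW    = 4  // hypothetical: uncompressed audio
};

// Hypothetical helper: does the announced type need the Opus decoder?
inline bool UsesOpusCodec ( EAudComprType eType )
{
    return eType == CT_OPUS || eType == CT_OPUS64;
}
```

A single extra value cannot also express whether 64- or 128-sample framing is in use, so a real design would need to carry that information separately.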

@dingodoppelt
Contributor Author

> I've just tried a build of rawaudio-dev here, between two separate hosts: server on a pi, client on a PC. It doesn't seem to be an MTU or fragmentation issue. The UDP packets are only 1068 bytes in size, and not fragmented.

This build is not taking into account iSndCrdFrameSizeFactor. From what I understood it should be mostly 1 and my code seems to only work when it is. iCeltNumCodedBytes should be multiplied by iSndCrdFrameSizeFactor. On 256 samples buffer size it will create packets that get fragmented. Wireshark shows the fragmentation. Should I push these changes for you to test? I might have gotten something fundamentally wrong here, but I'd say the problem is mainly in the client since the server happily plays back everything you throw at it.

@softins
Member

softins commented Apr 23, 2026

Ah, so the issue is that the client is not sending enough data to satisfy the server, and the server is therefore adding in packets of zeros to maintain the data rate.

Fragmentation should not be an issue, at least with IPv4, as fragmentation and re-assembly happens transparently at the IP layer. In fact, I don't think it will occur anyway, as the traffic from the server is not fragmented. We should just get packets from the client at 5.33ms instead of 10.67ms.

In fact, I've been doing some tests with Wireshark of all the various data rates, qualities and mono/stereo, and it seems that the packet interval is normally half the buffer time specified in the Client Settings. Except when "Small buffers" is not checked, and then 2.67 (64) is exactly the same as 5.33 (128).

@softins
Member

softins commented Apr 23, 2026

> This build is not taking into account iSndCrdFrameSizeFactor. From what I understood it should be mostly 1 and my code seems to only work when it is. iCeltNumCodedBytes should be multiplied by iSndCrdFrameSizeFactor. On 256 samples buffer size it will create packets that get fragmented. Wireshark shows the fragmentation. Should I push these changes for you to test?

Yes please - I'm building directly from your rawaudio-dev branch.

@softins
Member

softins commented Apr 23, 2026

I think in client.cpp around line 1486, you also need a loop similar to the one a few lines above:

for ( i = 0, j = 0; i < iSndCrdFrameSizeFactor; i++, j += iNumAudioChannels * iOPUSFrameSizeSamples )

I don't have any more time today to try it...
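
To illustrate what such a loop would do for raw audio, here is a minimal standalone sketch. It is not the code from the PR: `PackRawFrames` is a hypothetical helper, the 16-bit sample width and low-byte-first order are assumptions, and the real client would hand each sub-frame to the network layer rather than concatenate them. Only the loop header and the parameter names come from the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: serialise interleaved 16-bit samples, one sub-frame
// per sound card frame size factor, into a single byte buffer.
std::vector<uint8_t> PackRawFrames ( const std::vector<int16_t>& vecsAudio,
                                     int                         iSndCrdFrameSizeFactor,
                                     int                         iNumAudioChannels,
                                     int                         iOPUSFrameSizeSamples )
{
    const int iSamplesPerFrame = iNumAudioChannels * iOPUSFrameSizeSamples;

    // the input must hold every sub-frame, not just the first one
    assert ( vecsAudio.size() >= static_cast<size_t> ( iSndCrdFrameSizeFactor ) * iSamplesPerFrame );

    std::vector<uint8_t> vecbyOut;
    vecbyOut.reserve ( static_cast<size_t> ( iSndCrdFrameSizeFactor ) * iSamplesPerFrame * 2 );

    // same loop shape as suggested above: j is the sample offset of sub-frame i
    for ( int i = 0, j = 0; i < iSndCrdFrameSizeFactor; i++, j += iNumAudioChannels * iOPUSFrameSizeSamples )
    {
        for ( int k = 0; k < iSamplesPerFrame; k++ )
        {
            const uint16_t uSample = static_cast<uint16_t> ( vecsAudio[j + k] );
            vecbyOut.push_back ( static_cast<uint8_t> ( uSample & 0xFF ) ); // low byte first (assumed order)
            vecbyOut.push_back ( static_cast<uint8_t> ( uSample >> 8 ) );
        }
    }

    return vecbyOut;
}
```

With a factor of 1 the outer loop runs once, which would explain why code without the loop only works in that case.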

Comment thread src/server.cpp
}

const int iOffset = iB * SYSTEM_FRAME_SIZE_SAMPLES * vecNumAudioChannels[iChanCnt];
// Recognise a raw audio packet by its size
Member

I think it would be better to recognise the audio frame by a sentinel byte. Protocol frames begin with 00 00 and must have a good checksum. Otherwise they are considered to be audio. Opus frames always begin with 00 for mono and 04 for stereo. So maybe for raw audio, the audio data could be prepended with a byte of f0 for mono and f4 for stereo? Then it could be recognised unambiguously. Both client and server need to recognise the format of a received frame correctly without relying on an out-of-band context.
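
The sentinel idea could be sketched like this. The byte values are the ones named in the comment above; `EFrameKind` and `ClassifyFrame` are hypothetical names, and the protocol check (leading 00 00 plus a good checksum) is assumed to be done by existing code and passed in as a flag.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical classification result for a received UDP payload.
enum class EFrameKind
{
    Protocol,
    OpusMono,
    OpusStereo,
    RawMono,
    RawStereo,
    Unknown
};

// bIsValidProtocolFrame: outcome of the existing protocol frame check
// (starts with 00 00 and has a good checksum), supplied by the caller.
inline EFrameKind ClassifyFrame ( const uint8_t* pData, size_t iLen, bool bIsValidProtocolFrame )
{
    if ( pData == nullptr || iLen == 0 )
    {
        return EFrameKind::Unknown;
    }

    // must come first: Opus mono frames also start with 00
    if ( bIsValidProtocolFrame )
    {
        return EFrameKind::Protocol;
    }

    switch ( pData[0] )
    {
    case 0x00:
        return EFrameKind::OpusMono;
    case 0x04:
        return EFrameKind::OpusStereo;
    case 0xF0:
        return EFrameKind::RawMono; // proposed sentinel
    case 0xF4:
        return EFrameKind::RawStereo; // proposed sentinel
    default:
        return EFrameKind::Unknown;
    }
}
```

Because the decision depends only on the received bytes, client and server would classify a frame the same way without relying on out-of-band state.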

Comment thread src/client.cpp
@dingodoppelt
Contributor Author

I had misunderstood the packet size calculation and it seems fixed with the last commit.

@dingodoppelt
Contributor Author

> Note that the client will encode according to the settings in the Client Settings dialog, but the server will encode according to the information received in the NETW_TRANSPORT_PROPS message from the client.
>
> Talking of which, the codec field in the NETW_TRANSPORT_PROPS message should specify a different value for RAW, rather than still saying OPUS, like this:

I think OPUS and OPUS64 only refer to the small network buffers setting. They aren't related to the actual Opus coding.

@softins
Member

softins commented Apr 24, 2026

> > Note that the client will encode according to the settings in the Client Settings dialog, but the server will encode according to the information received in the NETW_TRANSPORT_PROPS message from the client.
> > Talking of which, the codec field in the NETW_TRANSPORT_PROPS message should specify a different value for RAW, rather than still saying OPUS, like this:
>
> I think OPUS and OPUS64 only refer to the small network buffers setting. They aren't related to the actual Opus coding.

Maybe - I hadn't got around to examining how the value was used in the code. It just felt wrong for the message to state OPUS when it wasn't, and maybe a specific value could also be useful to the server.

@dingodoppelt
Contributor Author

> Maybe - I hadn't got around to examining how the value was used in the code. It just felt wrong for the message to state OPUS when it wasn't, and maybe a specific value could also be useful to the server.

The server isn't aware of the Opus quality setting. It is only sent the packet size, which it feeds into the Opus codec. Currently the server has no way to determine the codec (or none) other than by the expected packet sizes for raw audio. OPUS and OPUS64 refer to 128 or 64 samples of internal buffering and have no relation to the audio quality settings.

@softins
Member

softins commented Apr 24, 2026

Just tried the latest build. It's looking good in Wireshark and sounding good too. No fragmentation either, as the max packet size is only 1068 for stereo, max quality, 10.67ms(256).
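
One way to arrive at the 1068-byte figure is sketched below. Every constant is an assumption rather than something confirmed in this thread: 16-bit samples, a one-byte sequence number per frame, 128-sample frames with two frames per packet, and a reported size that includes the Ethernet, IPv4 and UDP headers. `RawPayloadBytes` is a hypothetical helper.

```cpp
// Assumed sizes (not confirmed in the thread).
constexpr int iBytesPerSample      = 2;           // 16-bit PCM
constexpr int iSeqNumBytes         = 1;           // one sequence number per frame
constexpr int iEthIpUdpHeaderBytes = 14 + 20 + 8; // Ethernet + IPv4 + UDP

// Bytes of UDP payload for one packet of uncompressed audio.
constexpr int RawPayloadBytes ( int iFrameSamples, int iChannels, int iFramesPerPacket )
{
    return iFramesPerPacket * ( iFrameSamples * iChannels * iBytesPerSample + iSeqNumBytes );
}

// stereo, two 128-sample frames: 2 * ( 128 * 2 * 2 + 1 ) = 1026 payload bytes
static_assert ( RawPayloadBytes ( 128, 2, 2 ) == 1026, "payload size under the stated assumptions" );

// adding the assumed 42 header bytes gives the size seen in Wireshark
static_assert ( RawPayloadBytes ( 128, 2, 2 ) + iEthIpUdpHeaderBytes == 1068, "on-wire size under the stated assumptions" );
```

Under these assumptions the packet stays well below a typical 1500-byte Ethernet MTU, which is consistent with the absence of fragmentation reported here.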

@JoshuaDodds

Have you guys also tested with small network buffers? This seems to work as well for me and with a huge improvement on latency as well (as expected). Tested on arm64 ubuntu server side and win11 for the client.

@dingodoppelt
Contributor Author

dingodoppelt commented Apr 25, 2026

> Have you guys also tested with small network buffers? This seems to work as well for me and with a huge improvement on latency as well (as expected). Tested on arm64 ubuntu server side and win11 for the client.

Yes. I tried all possible combinations except for buffer sizes that are not a power of two. I know those work with the Opus audio quality settings in Jamulus. JACK on Linux supports buffer sizes other than powers of two, but I don't know about other platforms. PipeWire on Linux is fixed to powers of two.

@ann0see
Member

ann0see commented Apr 25, 2026

Is the stream rate in the UI still accurate?

@foobarth

> Is the stream rate in the UI still accurate?

2004 kbps on maxed out settings.

@ann0see
Member

ann0see commented Apr 25, 2026

Ok. For me it didn't show any change - but I used an older version.

Definitely https://jamulus.io/wiki/Server-Bandwidth must be updated.

@ann0see ann0see added the needs documentation PRs requiring documentation changes or additions label Apr 25, 2026
@dingodoppelt
Contributor Author

> Ok. For me it didn't show any change - but I used an older version.
>
> Definitely https://jamulus.io/wiki/Server-Bandwidth must be updated.

It should show higher rates anyway. Are you sure you were on a 3.11.1 server? This is code I haven't touched, and it worked automagically from the beginning. The bitrate indicates whether the server supports raw audio: if your setting is "Max" but you are shown the old Opus bitrate, the client fell back to Opus. This is why you can have your setting at "Max" and join legacy servers without noticing, except for the audio quality not being raw.
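
The fallback described here amounts to a small decision, sketched below. `bRawAudioSupported` is the flag named earlier in this thread; the enum `EQualitySetting` and the function `UseRawAudio` are hypothetical stand-ins, and how the client learns the flag is left out.

```cpp
// Hypothetical stand-in for the client's audio quality setting.
enum EQualitySetting
{
    QS_LOW,
    QS_NORMAL,
    QS_HIGH,
    QS_MAX // the new "raw" / "Max" setting
};

// Raw audio is sent only if the user selected "Max" AND the server
// supports it; in every other case the client keeps using Opus.
inline bool UseRawAudio ( EQualitySetting eSelected, bool bRawAudioSupported )
{
    return eSelected == QS_MAX && bRawAudioSupported;
}
```

Evaluating this once the server's support is known, rather than only when the setting changes, is the kind of placement the earlier noise report points at.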

@dingodoppelt
Contributor Author

> Definitely https://jamulus.io/wiki/Server-Bandwidth must be updated.

I don't think the numbers were correct in the first place. For me at 64 samples I get 906 kbps, not 894 as stated in the docs.

Labels

needs documentation PRs requiring documentation changes or additions

Projects

Status: Triage

5 participants