8000 Error when handling datagram · Issue #206 · pion/turn · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Error when handling datagram #206

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
jamesand opened this issue Mar 9, 2021 · 17 comments · Fixed 8000 by #243
Closed

Error when handling datagram #206

jamesand opened this issue Mar 9, 2021 · 17 comments · Fixed by #243

Comments

@jamesand
Copy link
jamesand commented Mar 9, 2021

Your environment.

Windows Server 2019
Pion Turn server 2.0.2

What did you do?

I executed "turn-server-simple.exe" to create a Turn server for a WebRTC project Im developing

What did you expect?

To work ok. It does sometimes. Others it get an error.

What happened?

When haviong a videocall using WebRTC, the console throws a lots of errors. For example:

turn ERROR: 2021/03/09 09:47:34 error when handling datagram: failed to handle Refresh-request from 94.73.44.131:30866: no allocation found for 94.73.44.131:30866:0.0.0.0:3478
turn ERROR: 2021/03/09 09:47:34 error when handling datagram: failed to handle Refresh-request from 94.73.44.131:30858: no allocation found for 94.73.44.131:30858:0.0.0.0:3478
turn ERROR: 2021/03/09 09:47:34 error when handling datagram: failed to handle Refresh-request from 94.73.44.131:30860: no allocation found for 94.73.44.131:30860:0.0.0.0:3478
turn ERROR: 2021/03/09 09:52:12 error when handling datagram: failed to create stun message from packet: unexpected EOF: not enough bytes to read header
turn ERROR: 2021/03/09 09:52:12 error when handling datagram: failed to create stun message from packet: unexpected EOF: not enough bytes to read header
turn ERROR: 2021/03/09 09:52:25 error when handling datagram: failed to create stun message from packet: unexpected EOF: not enough bytes to read header

Others, I get:

turn ERROR: 2021/03/09 12:44:52 error when handling datagram: failed to handle Allocate-request from 176.83.80.192:7363: relay already allocated for 5-TUPLE
turn ERROR: 2021/03/09 12:44:52 error when handling datagram: failed to handle Allocate-request from 176.83.80.192:7543: relay already allocated for 5-TUPLE
turn ERROR: 2021/03/09 12:44:52 error when handling datagram: failed to handle Allocate-request from 176.83.80.192:7370: relay already allocated for 5-TUPLE
turn ERROR: 2021/03/09 12:44:52 error when handling datagram: failed to handle Allocate-request from 176.83.80.192:7381: relay already allocated for 5-TUPLE

Sometimes the videocall works ok althought the errors; sometimes not.

@mohammadVatandoost
Copy link

I have the same issue in CentOS Linux 7.

@mehrvarz
Copy link

I am also seeing "relay already allocated for 5-TUPLE".
It seems to happen only with some client devices (WebRTC web app on Firefox 91 Ubuntu + Android).
With other clients (same app on Chromium 91 Ubuntu) this error never shows up.
If the error is thrown, it always happens after the client has hit auth handler 3 times in short order. Client may be able to continue despite the err, sometimes with a delay. But plenty of times service is aborted.

I am seeing this more often lately. Maybe it is related to newer Firefox releases?
Hah! I just tried with an older FF (v65) on Android and here the error is not thrown.

@mehrvarz
Copy link

This log is showing the "already allocated for 5-TUPLE" issue. The client is FF 91 on Android. The server is running on Linux with Go 1.17. Comments below.

server.go:36: received 20 bytes of udp from s.s.s.s:31505 on 0.0.0.0:3739
server.go:61: handleTURNPacket
server.go:67: handleTURNPacket Binding request l=0 attrs=0 id=x0airCDAcjULiKrD 20 Binding
stun.go:9: received BindingRequest from s.s.s.s:31505
stun.go:22: received BindingRequest -> buildAndSend
server.go:36: received 20 bytes of udp from c.c.c.c:47289 on 0.0.0.0:3739
server.go:61: handleTURNPacket
server.go:67: handleTURNPacket Binding request l=0 attrs=0 id=RploSUJXVvEFh58y 20 Binding
stun.go:9: received BindingRequest from c.c.c.c:47289
stun.go:22: received BindingRequest -> buildAndSend
server.go:36: received 44 bytes of udp from c.c.c.c:47289 on 0.0.0.0:3739
server.go:61: handleTURNPacket
server.go:67: handleTURNPacket Allocate request l=24 attrs=3 id=Qf4vu6nmbZcMyIuu 44 Allocate
turn.go:15: received AllocateRequest from c.c.c.c:47289
util.go:85: authenticateRequest call respondWithNonce
turn.go:24: received AllocateRequest -> authenticateRequest !hasAuth
server.go:36: received 20 bytes of udp from c.c.c.c:47329 on 0.0.0.0:3739
server.go:61: handleTURNPacket
server.go:67: handleTURNPacket Binding request l=0 attrs=0 id=h3N9rFd+/+m2nZg+ 20 Binding
stun.go:9: received BindingRequest from c.c.c.c:47329
stun.go:22: received BindingRequest -> buildAndSend
server.go:36: received 44 bytes of udp from c.c.c.c:47329 on 0.0.0.0:3739
server.go:61: handleTURNPacket
server.go:67: handleTURNPacket Allocate request l=24 attrs=3 id=iMVa7J13O6r18InB 44 Allocate
turn.go:15: received AllocateRequest from c.c.c.c:47329
util.go:85: authenticateRequest call respondWithNonce
turn.go:24: received AllocateRequest -> authenticateRequest !hasAuth
server.go:36: received 44 bytes of udp from c.c.c.c:47289 on 0.0.0.0:3739
server.go:61: handleTURNPacket
server.go:67: handleTURNPacket Allocate request l=24 attrs=3 id=Qf4vu6nmbZcMyIuu 44 Allocate
turn.go:15: received AllocateRequest from c.c.c.c:47289
util.go:85: authenticateRequest call respondWithNonce
turn.go:24: received AllocateRequest -> authenticateRequest !hasAuth
server.go:36: received 20 bytes of udp from c.c.c.c:47329 on 0.0.0.0:3739
server.go:61: handleTURNPacket
server.go:67: handleTURNPacket Binding request l=0 attrs=0 id=h3N9rFd+/+m2nZg+ 20 Binding
stun.go:9: received BindingRequest from c.c.c.c:47329
stun.go:22: received BindingRequest -> buildAndSend
server.go:36: received 140 bytes of udp from c.c.c.c:47289 on 0.0.0.0:3739
server.go:61: handleTURNPacket
server.go:67: handleTURNPacket Allocate request l=120 attrs=7 id=Sr/cN8gRXBBMakWo 140 Allocate
turn.go:15: received AllocateRequest from c.c.c.c:47289
util.go:88: authenticateRequest not call respondWithNonce

turn auth for client (c.c.c.c) SUCCESS

turn.go:27: received AllocateRequest -> authenticateRequest hasAuth
turn.go:40: fiveTuple &{0 c.c.c.c:47289 0.0.0.0:3739}
allocation_manager.go:103: listening on relay addr: s.s.s.s:37937
server.go:36: received 140 bytes of udp from c.c.c.c:47329 on 0.0.0.0:3739
server.go:61: handleTURNPacket
server.go:67: handleTURNPacket Allocate request l=120 attrs=7 id=3AG+CTVPZEao85A9 140 Allocate
turn.go:15: received AllocateRequest from c.c.c.c:47329
util.go:88: authenticateRequest not call respondWithNonce

turn auth for client (c.c.c.c) SUCCESS

turn.go:27: received AllocateRequest -> authenticateRequest hasAuth
turn.go:40: fiveTuple &{0 c.c.c.c:47329 0.0.0.0:3739}
allocation_manager.go:103: listening on relay addr: s.s.s.s:21344
server.go:36: received 140 bytes of udp from c.c.c.c:47289 on 0.0.0.0:3739
server.go:61: handleTURNPacket
server.go:67: handleTURNPacket Allocate request l=120 attrs=7 id=Sr/cN8gRXBBMakWo 140 Allocate
turn.go:15: received AllocateRequest from c.c.c.c:47289
util.go:88: authenticateRequest not call respondWithNonce

turn auth for client (c.c.c.c) SUCCESS

turn.go:27: received AllocateRequest -> authenticateRequest hasAuth
turn.go:40: fiveTuple &{0 c.c.c.c:47289 0.0.0.0:3739}
turn ERROR: 2021/08/25 10:58:25 error when handling datagram: failed to handle Allocate-request from c.c.c.c:47289: relay already allocated for 5-TUPLE

comments:

  • s.s.s.s is my server
  • c.c.c.c is my client
  • "turn auth for client (c.c.c.c) SUCCESS" is my authHandler
  • "turn.go:40: fiveTuple" is a log-statement I have added to your code

There are three such "fiveTuple" lines before the error (why?).
The 1st and the 3rd do indeed show the same ip's and ports:

turn.go:40: fiveTuple &{0 c.c.c.c:47289 0.0.0.0:3739}
turn.go:40: fiveTuple &{0 c.c.c.c:47329 0.0.0.0:3739}
turn.go:40: fiveTuple &{0 c.c.c.c:47289 0.0.0.0:3739}

@mehrvarz
Copy link

For comparsion, here a log without the 5-TUPLE issue. The client in this case is Chromium 91 on Ubuntu. There are again 3 calls to AuthHandler (is this expected?). But unlike in the FF 91 case above, you can see CreatePermission being called. There is also less activity before the first call to AuthHandler than in the previous log.

server.go:36: received 20 bytes of udp from s.s.s.s:45996 on 0.0.0.0:3739
server.go:61: handleTURNPacket
server.go:67: handleTURNPacket Binding request l=0 attrs=0 id=TCAbc2bgBWe7YpE3 20 Binding
stun.go:9: received BindingRequest from s.s.s.s:45996
stun.go:22: received BindingRequest -> buildAndSend
server.go:36: received 20 bytes of udp from c.c.c.c:47282 on 0.0.0.0:3739
server.go:61: handleTURNPacket
server.go:67: handleTURNPacket Binding request l=0 attrs=0 id=b2ZzRlBTQ1VyZEk0 20 Binding
stun.go:9: received BindingRequest from c.c.c.c:47282
stun.go:22: received BindingRequest -> buildAndSend
server.go:36: received 28 bytes of udp from c.c.c.c:47282 on 0.0.0.0:3739
server.go:61: handleTURNPacket
server.go:67: handleTURNPacket Allocate request l=8 attrs=1 id=RURhdmRRRVo5MWNX 28 Allocate
turn.go:15: received AllocateRequest from c.c.c.c:47282
util.go:85: authenticateRequest call respondWithNonce
turn.go:24: received AllocateRequest -> authenticateRequest !hasAuth
server.go:36: received 124 bytes of udp from c.c.c.c:47282 on 0.0.0.0:3739
server.go:61: handleTURNPacket
server.go:67: handleTURNPacket Allocate request l=104 attrs=5 id=ZzI5UHhuU01VYTdP 124 Allocate
turn.go:15: received AllocateRequest from c.c.c.c:47282
util.go:88: authenticateRequest not call respondWithNonce

turn auth for client (c.c.c.c) SUCCESS

turn.go:27: received AllocateRequest -> authenticateRequest hasAuth
turn.go:40: fiveTuple &{0 c.c.c.c:47282 0.0.0.0:3739}
allocation_manager.go:103: listening on relay addr: s.s.s.s:48310
server.go:36: received 128 bytes of udp from c.c.c.c:47282 on 0.0.0.0:3739
server.go:61: handleTURNPacket
server.go:67: handleTURNPacket CreatePermission request l=108 attrs=5 id=OUZFQWJUd3JtdDdJ 128 CreatePermission
turn.go:208: received CreatePermission from c.c.c.c:47282
util.go:88: authenticateRequest not call respondWithNonce

turn auth for client (c.c.c.c) SUCCESS

turn.go:232: adding permission for s.s.s.s:29120
server.go:36: received 128 bytes of udp from c.c.c.c:47282 on 0.0.0.0:3739
server.go:61: handleTURNPacket
server.go:67: handleTURNPacket CreatePermission request l=108 attrs=5 id=eDFSMjN1V2RHdkdX 128 CreatePermission
turn.go:208: received CreatePermission from c.c.c.c:47282
util.go:88: authenticateRequest not call respondWithNonce

turn auth for client (c.c.c.c) SUCCESS

turn.go:232: adding permission for s.s.s.s:45996

@mehrvarz
Copy link

Hey @Sean-Der

If you want to ask more questions I am in Slack and always ready to help

Sorry, I am having difficulties getting on to Slack. In any case, there seems to be an issue with Pion Turn and newer Firefox releases. Would you be able to take a look at my logs and share a thought? If you can help me a bit I may be able to fix this and provide a patch. Using FF with Coturn does not appear to throw any errors. Nor does using Chromium with Pion Turn. But when using FF 91 (on Ubuntu and Android) with Pion Turn, I always see the "already allocated for 5-TUPLE" error. And in some cases the WebRTC service can then not be used or there is a seconds long delay (however, sometimes service does work despite the error msg). Crucial question is whether FF or Pion Turn needs fixing. Thank you.

@mehrvarz
Copy link

Think I found a solution. It involves swapping step 1 and 2 in the handleAllocateRequest() function of internal/server/turn.go. In other words, I check for 5-tuple before invoking authenticateRequest(). Please let me know if doing this is wrong for obvious reasons. This is the code I am using. It seems to work well with both browsers, including on Android. FF 91 still does this, but now it appears to be harmless and all sessions work without delay or abort.

@cnderrauber
Copy link
Member
cnderrauber commented Feb 17, 2022

As RFC said, The server MUST require that the request be authenticated first. So there might be another reason for this.

https://datatracker.ietf.org/doc/html/rfc5766#section-6.2

6.2. Receiving an Allocate Request

When the server receives an Allocate request, it performs the
following checks:

  1. The server MUST require that the request be authenticated. This
    authentication MUST be done using the long-term credential
    mechanism of [RFC5389] unless the client and server agree to use
    another mechanism through some procedure outside the scope of
    this document.

  2. The server checks if the 5-tuple is currently in use by an
    existing allocation. If yes, the server rejects the request with
    a 437 (Allocation Mismatch) error.

@mehrvarz
Copy link

As RFC said, The server MUST require that the request be authenticated first. So there might be another reason for this.

No, the word "MUST" relates to "be authenticated". It does not relate to the order of steps. Five things need to be done. RFC does not explicitly say that a specific order is required.

Swapping step 1 and 2 seems to work well for me in practice and for 6 month now. Checking "is allocations in use" before doing "authenticateRequest()" also make full sense to me. With the modified order, no authentication takes place if the allocation is already in use.

The original order was causing issues with some UA's. Not with all. The modified order works with all UA's. Can you please do a code analysis?

@cnderrauber
Copy link
Member
cnderrauber commented Feb 17, 2022

As RFC said, The server MUST require that the request be authenticated first. So there might be another reason for this.

No, the word "MUST" relates to "be authenticated". It does not relate to the order of steps. Five things need to be done. RFC does not explicitly say that a specific order is required.

Swapping step 1 and 2 seems to work well for me in practice and for 6 month now. Checking "is allocations in use" before doing "authenticateRequest()" also make full sense to me. With the modified order, no authentication takes place if the allocation is already in use.

The original order was causing issues with some UA's. Not with all. The modified order works with all UA's. Can you please do a code analysis?

We should follow the RFC's step to check these.
I don't know the detail of your problem, but I can reproduce the already allocated case. That's because turn server's response sent to client may lost, and the behavior of client's retry and pion process the retry request cause the problem.

  1. client sent allocate request
  2. server received request, process it correctly and response to client
  3. response lost, and client retry and got 437 allocation mismatch result because server found 5 tuple already exist
    a. some client retry with same 5 tuple, firefox does this, it use resiprocate's nICEr lib, and I checked the code
    b. some client like chrome retry allocation with other transport when received 437 , so it success.

client's behavior like chrome is follow RFC, and nICEr not implement this.

The client SHOULD try three different client transport addresses before giving up on this server

RFC also has mentioned server could implement 'stateless stack approach' to process allocation retransmit, but pion not implement this.

So the '5 tuple already exist' problem is caused by both ff and pion. I can reproduce it if I always drop first allocation response, and ff can't establish connection. Then I add the retransmit process, the problem is gone, ff works fine even allocation response is lost/dropped. I will create a PR tomorrow for this

@mehrvarz
Copy link

Wow. This is a fairly extensive patch. When do you think will a release tag become available?

@cnderrauber
Copy link
Member

@mehrvarz Have another test case for packet lost, if all things working, will release today

@mehrvarz
Copy link

I am now testing v2.0.8. As before, I am seeing "No Permission or Channel exists..." log entries. About 4-6 per client connect, even for pure p2p (none-TURN) connections. These msgs come from /internal/allocation/allocation.go, very close to code you just looked at. Did you not notice them fill the log?

Note: If something is wrong (say, turn auth fail), I see more of these msgs. But if everything is working right, I still get about 4-6 of these msgs.

@mehrvarz
Copy link

@cnderrauber I am not asking you to solve a problem. I am only asking if you have seen unexpected "No Permission or Channel exists..." log msgs.

@cnderrauber
Copy link
Member

I think they are exist but not sure.

@mehrvarz
Copy link
mehrvarz commented Feb 24, 2022

Findings:

  • 1 additional, unnecessary, gratious allocation is allocated per session
  • this allocation has indeed no permission or channel assigned to it
  • this empty allocation keeps being invoked at runtime, causing the msg
  • issue occurs on public servers (domain name?)
  • issue may be harmless, but it sure is aggravating
  • analyzing the logs is impeded by the many false warning messages

Just let me mute this log statement.

Ed: I previously reported this issue. It sometimes goes away. But it keeps coming back.

@hy-net
Copy link
hy-net commented Mar 17, 2022

I setup a new pion server, put up for a few days (not even using it yet), and this error appears from no where.

Seems like someone is sending random datagram to my server.

@dadagov125
Copy link

I have the same problem when connecting from ios 15.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging a pull request may close this issue.

6 participants
0