FreeSWITCH Audio Bypass Solution
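Among the bypass approaches currently available, the open-source option is mod_audio_stream.
This module sends the caller's audio stream to a server over WebSocket (ws); once the server receives it, it can run ASR on the audio.
The FreeSWITCH version is 1.10.7, and the mod_audio_stream version tested here is 1.0.3.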
Build
- Download the mod_audio_stream module
    git clone https://github.com/amigniter/mod_audio_stream.git
- Build the mod_audio_stream module
    cd mod_audio_stream && ./build-mod-audio-stream.sh
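Note that mod_audio_stream depends on libwsc, which can be downloaded ahead of time.
After a successful build, mod_audio_stream.so is generated in the mod_audio_stream/build directory.
- Install the mod_audio_stream module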
    cp build/mod_audio_stream.so /usr/local/freeswitch/mod/
Just copy mod_audio_stream.so into the same directory as FreeSWITCH's other modules; the default is /usr/local/freeswitch/mod/.
- Load the mod_audio_stream module
Add the following to FreeSWITCH's conf/autoload_configs/modules.conf.xml:
    <load module="mod_audio_stream"/>
With this in place the module loads automatically at startup. To load mod_audio_stream manually instead, use:
    fs_cli -x "load mod_audio_stream"
Testing
I use ESL. Once the inbound call is answered, I run:
    uuid_audio_stream <uuid> start ws://172.16.4.111:8080/ws mono 8k
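When the command is issued from an ESL program rather than typed by hand, it has to be assembled from the call's UUID. The following is a minimal sketch of that step; `buildStartCommand` is a hypothetical helper name, and the UUID in `main` is only a placeholder for the one taken from the answered channel.

```go
package main

import (
	"fmt"
	"strings"
)

// buildStartCommand assembles the uuid_audio_stream "start" command in the
// same argument order as the command above:
//   uuid_audio_stream <uuid> start <ws-url> <mix> <rate>
// The helper name is illustrative, not part of mod_audio_stream.
func buildStartCommand(uuid, wsURL, mix, rate string) string {
	return strings.Join([]string{"uuid_audio_stream", uuid, "start", wsURL, mix, rate}, " ")
}

func main() {
	// Placeholder UUID; in practice it comes from the answered channel's event.
	cmd := buildStartCommand("622e69e4-cfcb-4a08-a9d7-fa45a9cefb88",
		"ws://172.16.4.111:8080/ws", "mono", "8k")
	// Over ESL the string is sent as an API command, i.e. prefixed with "api ".
	fmt.Println("api " + cmd)
}
```

Keeping the command construction in one function makes it easy to swap the URL or sampling rate per call without touching the ESL connection code.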
The ws server code is as follows:
    package main

    import (
        "log"
        "net/http"
        "time"

        "github.com/gorilla/websocket"
    )

    var upgrader = websocket.Upgrader{
        CheckOrigin: func(r *http.Request) bool {
            return true // allow all origins; production should be stricter
        },
    }

    func handleWebSocket(w http.ResponseWriter, r *http.Request) {
        // Upgrade the HTTP connection to WebSocket.
        conn, err := upgrader.Upgrade(w, r, nil)
        if err != nil {
            log.Printf("WebSocket upgrade failed: %v", err)
            return
        }
        defer conn.Close()
        log.Printf("new WebSocket connection established: %s", r.RemoteAddr)

        // Set the read timeout, and extend it whenever a pong arrives.
        conn.SetReadDeadline(time.Now().Add(60 * time.Second))
        conn.SetPongHandler(func(string) error {
            conn.SetReadDeadline(time.Now().Add(60 * time.Second))
            return nil
        })

        for {
            // Read the next message.
            messageType, message, err := conn.ReadMessage()
            if err != nil {
                if websocket.IsUnexpectedCloseError(err, websocket.CloseGoingAway, websocket.CloseAbnormalClosure) {
                    log.Printf("read error: %v", err)
                }
                break
            }
            // Extend the deadline after every frame; this server sends no pings,
            // so without this the connection would time out 60 seconds after it opened.
            conn.SetReadDeadline(time.Now().Add(60 * time.Second))

            // Print what was received. Audio arrives as binary frames (type 2),
            // so the content line is not human-readable.
            log.Printf("received message from %s:", r.RemoteAddr)
            log.Printf("  message type: %d", messageType)
            log.Printf("  content: %s", string(message))
            log.Printf("  length: %d bytes", len(message))
            log.Println("  ---")
        }
        log.Printf("WebSocket connection closed: %s", r.RemoteAddr)
    }

    func main() {
        http.HandleFunc("/ws", handleWebSocket)
        host := "172.16.4.111:8080"
        log.Printf("WebSocket server listening on http://%s", host)
        log.Printf("WebSocket endpoint: ws://%s/ws", host)
        if err := http.ListenAndServe(host, nil); err != nil {
            log.Fatal("server failed to start: ", err)
        }
    }
The FreeSWITCH log shows a "connection error":
    2025-11-05 15:47:01.031389 54.17% [DEBUG] sofia.c:7499 Channel sofia/internal/1000@172.16.4.111 entering state [completed][200]
    2025-11-05 15:47:01.031389 54.17% [DEBUG] mod_audio_stream.c:150 mod_audio_stream cmd: 622e69e4-cfcb-4a08-a9d7-fa45a9cefb88 start ws://172.16.4.111:8080/ws mono 8k
    2025-11-05 15:47:01.031389 54.17% [DEBUG] mod_audio_stream.c:81 calling stream_session_init.
    2025-11-05 15:47:01.051398 54.17% [INFO] audio_streamer_glue.cpp:170 connection error
    2025-11-05 15:47:01.051398 54.17% [DEBUG] audio_streamer_glue.cpp:357 (622e69e4-cfcb-4a08-a9d7-fa45a9cefb88) no resampling needed for this call
    2025-11-05 15:47:01.051398 54.17% [DEBUG] audio_streamer_glue.cpp:360 (622e69e4-cfcb-4a08-a9d7-fa45a9cefb88) stream_data_init
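After some investigation, the cause turned out to be that the connect function in libwsc's WebSocketClient.cpp always goes through DNS resolution.
I deploy FreeSWITCH in a container, and no DNS resolver was configured inside it, so the connection failed. There are three ways to fix this:
- Inside the container, run echo "nameserver 8.8.8.8" >> /etc/resolv.conf.
- In the connect function of libwsc's WebSocketClient.cpp, change evdns_base_new(base, 1); to evdns_base_new(base, 0);.
- Add the --dns 8.8.8.8 flag when starting the container.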
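Before choosing one of these fixes, it can help to confirm that the container really has no resolver configured. The sketch below is one way to check, assuming the standard /etc/resolv.conf format; `nameservers` is a hypothetical helper and is not part of libwsc or FreeSWITCH.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// nameservers returns the addresses listed on "nameserver" lines of
// resolv.conf-style text. Comment lines and other directives are ignored.
func nameservers(conf string) []string {
	var out []string
	sc := bufio.NewScanner(strings.NewReader(conf))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) >= 2 && fields[0] == "nameserver" {
			out = append(out, fields[1])
		}
	}
	return out
}

func main() {
	data, err := os.ReadFile("/etc/resolv.conf")
	if err != nil {
		fmt.Println("cannot read /etc/resolv.conf:", err)
		return
	}
	if ns := nameservers(string(data)); len(ns) == 0 {
		fmt.Println("no nameserver configured; a DNS-based connect is likely to fail")
	} else {
		fmt.Println("nameservers:", ns)
	}
}
```

Run inside the container, an empty result matches the failure described above.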
I went with the --dns 8.8.8.8 option, which solved the problem. The FreeSWITCH log then looks like this:
    2025-11-06 10:30:14.450220 53.23% [DEBUG] sofia.c:7499 Channel sofia/internal/1000@172.16.4.111 entering state [completed][200]
    EXECUTE [depth=0] sofia/internal/1000@172.16.4.111 park()
    2025-11-06 10:30:14.470242 53.23% [DEBUG] mod_audio_stream.c:150 mod_audio_stream cmd: 174679ed-d156-4223-9940-81a92c93f3de start ws://172.16.4.111:8080/ws mono 8k
    2025-11-06 10:30:14.470242 53.23% [DEBUG] mod_audio_stream.c:81 calling stream_session_init.
    2025-11-06 10:30:14.470242 53.23% [DEBUG] audio_streamer_glue.cpp:357 (174679ed-d156-4223-9940-81a92c93f3de) no resampling needed for this call
    2025-11-06 10:30:14.470242 53.23% [DEBUG] audio_streamer_glue.cpp:360 (174679ed-d156-4223-9940-81a92c93f3de) stream_data_init
    2025-11-06 10:30:14.470242 53.23% [DEBUG] mod_audio_stream.c:87 adding bug.
    2025-11-06 10:30:14.470242 53.23% [DEBUG] switch_core_media_bug.c:978 Attaching BUG to sofia/internal/1000@172.16.4.111
    2025-11-06 10:30:14.470242 53.23% [DEBUG] mod_audio_stream.c:91 setting bug private data.
    2025-11-06 10:30:14.470242 53.23% [DEBUG] mod_audio_stream.c:94 exiting start_capture.
    2025-11-06 10:30:14.490223 53.23% [DEBUG] switch_rtp.c:7331 Correct audio RTCP ip/port confirmed.
    2025-11-06 10:30:14.490223 53.23% [DEBUG] sofia.c:7499 Channel sofia/internal/1000@172.16.4.111 entering state [ready][200]
    2025-11-06 10:30:14.510238 53.23% [DEBUG] switch_rtp.c:1982 rtcp_stats_init: audio ssrc[1139674539] base_seq[2624]
    2025-11-06 10:30:14.550238 53.23% [DEBUG] switch_rtp.c:7934 Correct audio ip/port confirmed.
    2025-11-06 10:30:14.550238 53.23% [DEBUG] switch_core_io.c:448 Setting BUG Codec PCMA:8
The ws server received data as well:
    2025/11/06 10:30:22 received message from 172.16.4.111:15173:
    2025/11/06 10:30:22   message type: 2
    2025/11/06 10:30:22   content: <binary audio, not printable>
    2025/11/06 10:30:22   length: 320 bytes
    2025/11/06 10:30:22   ---
    2025/11/06 10:30:22 received message from 172.16.4.111:15173:
    2025/11/06 10:30:22   message type: 2
    2025/11/06 10:30:22   content: <binary audio, not printable>
    2025/11/06 10:30:22   length: 320 bytes
    2025/11/06 10:30:22   ---
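Each binary message above is 320 bytes. If the frames are 16-bit linear PCM at 8 kHz mono, that is 160 samples, or 20 ms of audio per message. The sketch below shows how such a frame could be turned into samples before handing it to an ASR engine. The sample format and the little-endian byte order are assumptions that should be checked against the module's documentation; `decodePCM16LE` and `frameDuration` are hypothetical helper names.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"time"
)

// decodePCM16LE converts a binary frame into signed 16-bit samples.
// Assumption: 16-bit little-endian linear PCM; a trailing odd byte is dropped.
func decodePCM16LE(frame []byte) []int16 {
	samples := make([]int16, len(frame)/2)
	for i := range samples {
		samples[i] = int16(binary.LittleEndian.Uint16(frame[2*i:]))
	}
	return samples
}

// frameDuration returns the playback time of n mono samples at sampleRate Hz.
func frameDuration(n, sampleRate int) time.Duration {
	return time.Duration(n) * time.Second / time.Duration(sampleRate)
}

func main() {
	frame := make([]byte, 320) // same size as the messages in the log above
	samples := decodePCM16LE(frame)
	fmt.Println(len(samples), frameDuration(len(samples), 8000)) // prints "160 20ms"
}
```

In the server shown earlier, `decodePCM16LE(message)` would be called only when `messageType` equals `websocket.BinaryMessage`.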
One more thing to note: mod_audio_stream also emits events, of the following kinds:
    mod_audio_stream::json
    mod_audio_stream::connect
    mod_audio_stream::disconnect
    mod_audio_stream::error
    mod_audio_stream::play
These arrive as CUSTOM events, with the event data carried in the body. Examples:
- connect (connected to the ws server)
    event: CUSTOM
    Event-Subclass: []string{"mod_audio_stream%3A%3Aconnect"}
    Core-Uuid: []string{"55191760-1543-4551-ae35-46c8579b88a3"}
    Unique-Id: []string{"310ec632-94a8-4985-915b-5126faeadc96"}
    Variable_number_alias: []string{"1000"}
    Variable_dtmf_type: []string{"rfc2833"}
    ...
    Event-Name: []string{"CUSTOM"}
    Content-Length: []string{"22"}
    Variable_event-Name: []string{"REQUEST_PARAMS"}
    {"status":"connected"}
- disconnect (the ws server connection was closed)
    event: CUSTOM
    Variable_sip_network_port: []string{"62170"}
    Variable_local_media_port: []string{"26832"}
    Content-Length: []string{"84"}
    ...
    {"status":"disconnected","message":{"code":1000,"reason":"Connection closed (EOF)"}}
- error (raised after the connection to the ws server is lost)
    event: CUSTOM
    Event-Calling-File: []string{"mod_audio_stream.c"}
    Event-Name: []string{"CUSTOM"}
    Content-Length: []string{"68"}
    ...
    {"status":"error","message":{"code":6,"error":"Connection timeout"}}
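An ESL consumer needs two things from these events: the subclass name, which is URL-encoded in the header (`%3A%3A` stands for `::`), and the JSON body. The sketch below decodes both, using field names taken only from the three sample bodies above. The struct and function names are hypothetical, and other event kinds such as mod_audio_stream::json may carry a different body shape.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// streamMessage mirrors the "message" object seen in the disconnect and
// error samples; fields absent from a given event stay at their zero value.
type streamMessage struct {
	Code   int    `json:"code"`
	Reason string `json:"reason"`
	Error  string `json:"error"`
}

// streamEvent mirrors the top-level body of the sample events.
type streamEvent struct {
	Status  string         `json:"status"`
	Message *streamMessage `json:"message"`
}

// parseStreamEvent URL-decodes the Event-Subclass header value and
// unmarshals the JSON body.
func parseStreamEvent(subclass, body string) (string, streamEvent, error) {
	var ev streamEvent
	name, err := url.PathUnescape(subclass)
	if err != nil {
		return "", ev, err
	}
	if err := json.Unmarshal([]byte(body), &ev); err != nil {
		return name, ev, err
	}
	return name, ev, nil
}

func main() {
	name, ev, err := parseStreamEvent("mod_audio_stream%3A%3Aerror",
		`{"status":"error","message":{"code":6,"error":"Connection timeout"}}`)
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Println(name, ev.Status, ev.Message.Code, ev.Message.Error)
}
```

`Message` is a pointer so that the connect event, whose body has no `message` object, can be told apart from one with an empty message.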
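Summary
- This approach gives access to the caller's RTP audio data and makes ASR integration flexible. The bot's audio can be produced by playing files with playback or uuid_displace; to interrupt the bot's prompt, simply run uuid_break <uuid> or uuid_displace <uuid> stop.
- There are two ways to capture keypresses (rfc2833/inbound):
  - Listen for DTMF events over ESL, which delivers the keypresses directly.
  - Parse the caller's RTP data. Under RFC 2833 the keypress data is carried in the RTP payload, so the RTP packets have to be parsed to obtain the key.
- Call recording still has to be handled by FreeSWITCH.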
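For the second keypress option, the RTP payload of an RFC 2833 telephone-event packet is four bytes: the event code, a byte holding the end bit (E), a reserved bit and a 6-bit volume, then a 16-bit duration in network byte order. The sketch below decodes that payload. It assumes the caller has already stripped the RTP header and selected packets with the negotiated telephone-event payload type; `dtmfEvent` and `parseTelephoneEvent` are hypothetical names.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// dtmfEvent holds the decoded fields of one telephone-event payload.
type dtmfEvent struct {
	Digit    byte   // '0'-'9', '*', '#', 'A'-'D'
	End      bool   // E bit: set on the final packets of a keypress
	Volume   uint8  // power level as a 6-bit value
	Duration uint16 // in RTP timestamp units
}

// Event codes 0-15 map to these DTMF characters.
const dtmfDigits = "0123456789*#ABCD"

// parseTelephoneEvent decodes a 4-byte RFC 2833 telephone-event payload.
func parseTelephoneEvent(payload []byte) (dtmfEvent, error) {
	if len(payload) < 4 {
		return dtmfEvent{}, errors.New("payload shorter than 4 bytes")
	}
	code := int(payload[0])
	if code >= len(dtmfDigits) {
		return dtmfEvent{}, fmt.Errorf("event code %d is not a DTMF digit", code)
	}
	return dtmfEvent{
		Digit:    dtmfDigits[code],
		End:      payload[1]&0x80 != 0,
		Volume:   payload[1] & 0x3f,
		Duration: binary.BigEndian.Uint16(payload[2:4]),
	}, nil
}

func main() {
	// Event 5, E bit set, volume 10, duration 0x0320 = 800 timestamp units.
	ev, err := parseTelephoneEvent([]byte{0x05, 0x8a, 0x03, 0x20})
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Printf("%c end=%v volume=%d duration=%d\n", ev.Digit, ev.End, ev.Volume, ev.Duration)
	// prints "5 end=true volume=10 duration=800"
}
```

A single keypress spans several packets with the same RTP timestamp, so a consumer would typically report the digit once, for example when it first sees the E bit for that timestamp.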