[m3u8] Parsing and playing videos from a video site

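An m3u8 "video" is really a text file: the request returns a playlist that lists a series of ts video segments together with their durations, and some playlists also reference the key used to encrypt the video.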

#EXTM3U
#EXT-X-VERSION:3
#EXT-X-MEDIA-SEQUENCE:0
#EXT-X-ALLOW-CACHE:YES
#EXT-X-KEY:METHOD=AES-128,URI="http://hls.videocc.net/8b0a2fa267/0/8b0a2fa267c027bc31a2675e51780d00_2.key?pid=1529814332630X1997867&ts=1529814332000&sign=98271bf3deabebb179b9d0476a30f123&ms=5d88e9afe50d7c8248b52b26c15ef328",IV=0xae98961dd802f860ae9b67dd75136a18
#EXT-X-TARGETDURATION:13
#EXTINF:5.080000,
http://ab-mts.videocc.net/8b0a2fa267/48/1527235156000/0/78/0d/00_2/8b0a2fa267c027bc31a2675e51780d00_2_0.ts?pid=1529814332630X1997867&ts=1529814332000&sign=5d88e9afe50d7c8248b52b26c15ef328
#EXTINF:5.000000,
http://ab-mts.videocc.net/8b0a2fa267/48/1527235156000/0/78/0d/00_2/8b0a2fa267c027bc31a2675e51780d00_2_1.ts?pid=1529814332630X1997867&ts=1529814332000&sign=5d88e9afe50d7c8248b52b26c15ef328
#EXTINF:10.000000,
http://ab-mts.videocc.net/8b0a2fa267/48/1527235156000/0/78/0d/00_2/8b0a2fa267c027bc31a2675e51780d00_2_2.ts?pid=1529814332630X1997867&ts=1529814332000&sign=5d88e9afe50d7c8248b52b26c15ef328
#EXTINF:10.000000,
http://ab-mts.videocc.net/8b0a2fa267/48/1527235156000/0/78/0d/00_2/8b0a2fa267c027bc31a2675e51780d00_2_3.ts?pid=1529814332630X1997867&ts=1529814332000&sign=5d88e9afe50d7c8248b52b26c15ef328
#EXTINF:10.000000,
http://ab-mts.videocc.net/8b0a2fa267/48/1527235156000/0/78/0d/00_2/8b0a2fa267c027bc31a2675e51780d00_2_4.ts?pid=1529814332630X1997867&ts=1529814332000&sign=5d88e9afe50d7c8248b52b26c15ef328
#EXTINF:10.000000,
http://ab-mts.videocc.net/8b0a2fa267/48/1527235156000/0/78/0d/00_2/8b0a2fa267c027bc31a2675e51780d00_2_5.ts?pid=1529814332630X1997867&ts=1529814332000&sign=5d88e9afe50d7c8248b52b26c15ef328
#EXTINF:12.760000,
http://ab-mts.videocc.net/8b0a2fa267/48/1527235156000/0/78/0d/00_2/8b0a2fa267c027bc31a2675e51780d00_2_6.ts?pid=1529814332630X1997867&ts=1529814332000&sign=5d88e9afe50d7c8248b52b26c15ef328
#EXTINF:0.200000,
http://ab-mts.videocc.net/8b0a2fa267/48/1527235156000/0/78/0d/00_2/8b0a2fa267c027bc31a2675e51780d00_2_7.ts?pid=1529814332630X1997867&ts=1529814332000&sign=5d88e9afe50d7c8248b52b26c15ef328
#EXT-X-ENDLIST
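
A playlist like the one above can be read with a few lines of Python. The following is only a minimal sketch that handles the tags seen in this sample (`#EXT-X-KEY`, `#EXTINF`, and segment URLs); `parse_playlist` and the `example.invalid` URLs are made up for illustration, and a real parser would need to cover more of the HLS specification.

```python
import re


def parse_playlist(text):
    """Split an m3u8 media playlist into its key attributes and segment list."""
    key = None
    segments = []
    duration = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("#EXT-X-KEY:"):
            # Attributes look like METHOD=AES-128,URI="...",IV=0x...
            attrs = line[len("#EXT-X-KEY:"):]
            pairs = re.findall(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)', attrs)
            key = {name: value.strip('"') for name, value in pairs}
        elif line.startswith("#EXTINF:"):
            # "#EXTINF:5.080000," -> 5.08 (duration in seconds)
            duration = float(line[len("#EXTINF:"):].split(",")[0])
        elif line and not line.startswith("#"):
            # Any non-empty, non-tag line is a segment URL
            segments.append((duration, line))
            duration = None
    return key, segments


# Shortened, hypothetical playlist in the same shape as the sample above
SAMPLE = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-KEY:METHOD=AES-128,URI="http://example.invalid/v.key?pid=1&ts=2",IV=0xae98961dd802f860ae9b67dd75136a18
#EXT-X-TARGETDURATION:13
#EXTINF:5.080000,
http://example.invalid/v_0.ts?pid=1
#EXTINF:5.000000,
http://example.invalid/v_1.ts?pid=1
#EXT-X-ENDLIST
"""

key, segments = parse_playlist(SAMPLE)
print(key["METHOD"], len(segments))
```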

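A certain video site only offers online playback, and only through its own JS player. The player requests a ts and sign pair and checks parent.origin, so playback is restricted to the domains that the video owner has configured.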

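The original plan was to get the video's vid, request ts and sign directly, and then tweak the local hosts mapping so that the player would run;
On second thought, m3u8 playback is an open standard, so downloading the key and the ts segments should be enough to play the video from a self-hosted server.
So, let's give it a try.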

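1. First, a look around GitHub turns up hls.js:
https://video-dev.github.io/hls.js/docs/html/
Oh, and there is even a Python parser for m3u8:
https://github.com/globocom/m3u8
That takes care of both playback and rewriting the URLs inside the m3u8.
2. The plan:
Use mitmproxy as a proxy to capture the list of svideoid values, which comes back as JSON;
Use the svideoid to obtain sign and ts;
Build the m3u8 download URL, which also requires generating a pid: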

    e.getPid = function t() {
        var e = new Date;
        var t = e.getTime() + "";
        var n = parseInt(Math.random() * 1e6 + 1e6) + "";
        var i = t + "X" + n;
        if (typeof updatePid == "function") {
            updatePid(i)
        }
        return i
    }
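
Since the download tooling is written in Python, the pid has to be generated on that side as well. A possible port of `getPid` is sketched below; `get_pid` and its `now_ms` parameter are names chosen here, and the `updatePid` callback from the JS version is left out on the assumption that it only matters inside the player.

```python
import random
import time


def get_pid(now_ms=None):
    """Build a pid of the form <millisecond timestamp>X<7-digit random number>."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)  # same as (new Date).getTime()
    # Math.random() * 1e6 + 1e6 truncated to an integer: 1000000..1999999
    suffix = int(random.random() * 1e6 + 1e6)
    return f"{now_ms}X{suffix}"


print(get_pid())
```

The result has the same shape as the `pid=1529814332630X1997867` value visible in the sample playlist URLs.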

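After downloading the original m3u8 file, parse it with the tool above, replace the URLs, and save it under your own server's path;
Download every ts file and save it to a custom path;
Here the video's unique ID can serve as the folder name, with the quality level as the file name prefix followed by the ts segment index;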
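
One way to do the URL replacement with only the standard library is sketched here. It assumes segment file names follow the `<vid>_<quality>_<index>.ts` pattern seen in the sample playlist; `local_segment_path`, `rewrite_playlist`, `base_url`, and `key_path` are hypothetical names, and the `<vid>/<quality>_<index>.ts` layout is just one reading of the folder scheme described above.

```python
import re
from urllib.parse import urlsplit

# Assumed file name pattern, e.g. 8b0a...0d00_2_0.ts -> vid, quality 2, index 0
SEGMENT_NAME = re.compile(r"^(?P<vid>[0-9a-f]+)_(?P<quality>\d+)_(?P<index>\d+)\.ts$")


def local_segment_path(url):
    """Map a remote segment URL to <vid>/<quality>_<index>.ts."""
    name = urlsplit(url).path.rsplit("/", 1)[-1]  # drops the query string
    m = SEGMENT_NAME.match(name)
    if m is None:
        raise ValueError(f"unexpected segment name: {name}")
    return f"{m['vid']}/{m['quality']}_{m['index']}.ts"


def rewrite_playlist(text, base_url, key_path=None):
    """Point every segment (and optionally the key) at the self-hosted server."""
    out = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#EXT-X-KEY:") and key_path is not None:
            # Replace only the URI attribute; METHOD and IV stay untouched
            line = re.sub(r'URI="[^"]*"',
                          lambda _: f'URI="{base_url}/{key_path}"', stripped)
        elif stripped and not stripped.startswith("#"):
            line = f"{base_url}/{local_segment_path(stripped)}"
        out.append(line)
    return "\n".join(out) + "\n"
```

The same `local_segment_path` result can double as the on-disk path when saving each downloaded ts file, so the rewritten playlist and the stored files stay in sync.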
3. Once the data is in place, write the HTML playback page:

<script src="https://cdn.jsdelivr.net/npm/hls.js@latest"></script>
<!-- Or if you want a more recent canary version -->
<!-- <script src="https://cdn.jsdelivr.net/npm/hls.js@canary"></script> -->
<video id="video"></video>
<script>
  var video = document.getElementById('video');
  if(Hls.isSupported()) {
    var hls = new Hls();
    hls.loadSource('https://video-dev.github.io/streams/x36xhzz/x36xhzz.m3u8');
    hls.attachMedia(video);
    hls.on(Hls.Events.MANIFEST_PARSED,function() {
      video.play();
    });
  }
 // hls.js is not supported on platforms that do not have Media Source Extensions (MSE) enabled.
 // When the browser has built-in HLS support (check using `canPlayType`), we can provide an HLS manifest (i.e. .m3u8 URL) directly to the video element through the `src` property.
 // This is using the built-in support of the plain video element, without using hls.js.
 // Note: it would be more normal to wait on the 'canplay' event below however on Safari (where you are most likely to find built-in HLS support) the video.src URL must be on the user-driven
 // white-list before a 'canplay' event will be emitted; the last video event that can be reliably listened-for when the URL is not on the white-list is 'loadedmetadata'.
  else if (video.canPlayType('application/vnd.apple.mpegurl')) {
    video.src = 'https://video-dev.github.io/streams/x36xhzz/x36xhzz.m3u8';
    video.addEventListener('loadedmetadata',function() {
      video.play();
    });
  }
</script>

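Testing on the self-hosted server confirmed that the contents of the key file and the ts files do not change, so they can be downloaded and played back;
That leaves one question: how do we get from a videoId to the m3u8 download URL? Time to study the site's video player JS file. The suspicious-looking code, filtered out, is as follows: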

        function Y(t) {
            var n = (new Date).getTime();
            var r = o.vid + n + "polyv";
            var a = CryptoJS.MD5(r);
            var s = "";
            if (o.hlstest) {
                s = i + "hlstest.videocc.net/event/switch_bitrate?sign=" + a + "&ts=" + n + "&vid=" + o.vid + "&mt=" + o.ts + "&ms=" + o.sign
            } else {
                s = i + "hls.videocc.net/event/switch_bitrate?sign=" + a + "&ts=" + n + "&vid=" + o.vid + "&mt=" + o.ts + "&ms=" + o.sign
            }
            e.ajax({
                url: s,
                dataType: "text",
                success: function(e) {
                    if (e == "success") {
                        var n = J(function(e) {
                            t.src = e;
                            t.play()
                        })
                    }
                },
                error: function() {}
            })
        }

Here i is the URL scheme prefix (http:// or https://):

            var i = "http://";
            if (window.location.protocol == "https:") {
                i = "https://"
            }
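
The URL construction in `Y` can be reproduced in Python roughly as follows. This is a sketch based only on the snippet above: it assumes that concatenating the `CryptoJS.MD5` result into a string yields the hex digest, and `switch_bitrate_url` together with its parameter names (`mt` for `o.ts`, `ms` for `o.sign`) is invented here for illustration.

```python
import hashlib
import time


def switch_bitrate_url(vid, mt, ms, now_ms=None, https=False, hlstest=False):
    """Rebuild the switch_bitrate request URL from the player code."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)  # n = (new Date).getTime()
    # sign = MD5(vid + n + "polyv"), as a hex string (assumed encoding)
    sign = hashlib.md5(f"{vid}{now_ms}polyv".encode("utf-8")).hexdigest()
    scheme = "https://" if https else "http://"  # the variable i in the JS
    host = "hlstest.videocc.net" if hlstest else "hls.videocc.net"
    return (f"{scheme}{host}/event/switch_bitrate"
            f"?sign={sign}&ts={now_ms}&vid={vid}&mt={mt}&ms={ms}")


print(switch_bitrate_url("8b0a2fa267c027bc31a2675e51780d00_2",
                         mt="1529814332000",
                         ms="5d88e9afe50d7c8248b52b26c15ef328"))
```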

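Sending the constructed m3u8 URL through Postman showed that the video file URLs all share the same path hierarchy. OK, so there is no need to hunt for the real m3u8 address; changing the videoId is enough.
Time to write the Python code:
1. Parse the svideoId field from VideoList
2. Get the current ts and sign for that svideoId
3. Build the m3u8 download URL, download the m3u8 text, rewrite the URLs, and save it to the chosen path
4. Download the ts segments listed in the m3u8 and save them to the chosen path
5. Create a template HTML page with a link from each title to its svideoId, so that clicking one plays the video.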
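
Step 5 could be sketched as below. The page name `play.html`, the `vid` query parameter, and the `build_index` helper are all assumptions made for this example; the player page itself would read the parameter and pass the matching local m3u8 to hls.js.

```python
from html import escape
from urllib.parse import quote


def build_index(videos, player_page="play.html"):
    """Render (title, svideoId) pairs as a list of links to the player page."""
    items = []
    for title, svideo_id in videos:
        # URL-encode the id for the query string, then HTML-escape the attribute
        href = escape(f"{player_page}?vid={quote(svideo_id)}", quote=True)
        items.append(f'<li><a href="{href}">{escape(title)}</a></li>')
    return "<ul>\n" + "\n".join(items) + "\n</ul>\n"


# Hypothetical input: titles paired with the svideoId parsed in step 1
print(build_index([("Lesson 1", "8b0a2fa267c027bc31a2675e51780d00_2")]))
```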