Sohu Education exclusive: A master's thesis titled "Audio-Assisted Recognition of Pornographic Video" (色情视频音频辅助识别) recently went viral online. After gathering material from multiple sources, our reporter located the thesis on China National Knowledge Infrastructure (CNKI, 中国知网), described as the world's largest digital library. It comes from Beijing University of Posts and Telecommunications, was published in 2011, and was written by Ji Pengyu (姬鹏宇).


An excerpt from the thesis follows:

As network and multimedia technology continue to develop, people encounter more and more multimedia information in daily life, and digital video is an important part of it. While digital video makes life more convenient, it also brings problems: violent and pornographic videos spread through the same channels and become a source of social disharmony. Identifying and detecting such videos is therefore work of practical value. Video files, however, usually contain a very large amount of data, so processing them places high demands on storage and computation, and extracting high-level semantic information directly from an ordinary video stream remains difficult. Other routes to a solution are worth considering. In a video file, audio is an excellent complement to the visual information and carries a large amount of feature information of its own. In pornographic videos in particular, the audio in certain scenes has distinctive characteristics. This thesis therefore takes audio as the entry point for detecting and recognizing pornographic video.

The audio in pornographic video does not differ from ordinary audio in its physical properties, so conventional audio processing techniques can be applied to it. This thesis uses the Gaussian mixture model (GMM) and the hidden Markov model (HMM) to build the classification and recognition models. The main work includes training both models, with the emphasis on building and implementing the framework of the recognition system.

First, the audio is extracted from the video file and converted into WAV format (16 bit, 22 kHz, mono) to produce the audio under test. This audio is windowed with a Hamming window and split into short-time processing frames of 0.02 seconds. From each frame the system extracts 26-dimensional MFCC coefficients, a 1-dimensional zero-crossing rate, a 1-dimensional short-time energy, 4-dimensional subband energies, and 4-dimensional subband energy ratios, forming a 36-dimensional feature vector. During recognition, short-time energy is first used to separate the frames into silent and non-silent frames. A GMM then classifies the non-silent frames into four categories: music, speech, mixed music and speech, and environmental sound. Finally, an HMM picks out, from the remaining speech frames and mixed music-and-speech frames, the frames that may contain pornographic audio.

The whole algorithm is implemented on the VC6.0 platform. According to the thesis, test results show that the system works effectively and serves well as an auxiliary means of recognition.
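
The front end of the pipeline described in the excerpt (Hamming-windowed 0.02-second frames, short-time energy, zero-crossing rate, and the silent/non-silent split) can be illustrated with a minimal Python sketch. This is not the thesis code, which was written for VC6.0. The 22050 Hz sample rate (one reading of "22 kHz"), the non-overlapping frames, the silence threshold, and all function names are assumptions made for illustration. The MFCC, subband, GMM, and HMM stages are omitted.

```python
import math

SAMPLE_RATE = 22050   # assumption: "22 kHz" read as 22050 Hz
FRAME_SECONDS = 0.02  # frame length given in the excerpt
FRAME_LEN = int(round(SAMPLE_RATE * FRAME_SECONDS))  # samples per frame


def hamming(n):
    """Return a Hamming window of length n."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def split_frames(samples, frame_len=FRAME_LEN):
    """Cut the signal into non-overlapping frames (overlap is not stated
    in the excerpt, so none is assumed); a trailing partial frame is dropped."""
    count = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(count)]


def short_time_energy(frame, window):
    """Mean squared amplitude of the windowed frame."""
    return sum((s * w) ** 2 for s, w in zip(frame, window)) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def label_silence(samples, threshold=1e-4):
    """Return one (is_silent, energy, zcr) tuple per frame.
    The threshold is a hypothetical value, not one taken from the thesis."""
    window = hamming(FRAME_LEN)
    results = []
    for frame in split_frames(samples):
        energy = short_time_energy(frame, window)
        results.append((energy < threshold, energy, zero_crossing_rate(frame)))
    return results


# Synthetic input: one frame of silence followed by one frame of a 1 kHz tone.
silence = [0.0] * FRAME_LEN
tone = [0.5 * math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE) for n in range(FRAME_LEN)]
for is_silent, energy, zcr in label_silence(silence + tone):
    print(is_silent, energy, zcr)
```

In a fuller system, the two scalars computed here would be concatenated with the MFCC and subband features to form the 36-dimensional vector (26 + 1 + 1 + 4 + 4), and only the frames labelled non-silent would be passed on to the GMM stage.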
