Lin Xiaojun: The Important Role of Online Reviews in the Word-of-Mouth Ranking -- 亲稳舆论引导监测室
2013-04-28
Lin Xiaojun (CEO of 慧评网, Huiping): In February this year, 慧评网 joined the word-of-mouth ranking project as its data provider. Throughout the work on the 2012 China High-End Hotel Word-of-Mouth Ranking, our contribution was supplying online review data. Let me briefly introduce this solution to the leaders and media friends here.
Why did the ranking's data requirements lead us to online reviews? Let me explain the reasons and the history. This project ranks China's high-end hotels by word of mouth, and "word of mouth" here really means customer satisfaction, so the ranking is based on customer satisfaction. How do we know whether customers are satisfied? Assessing satisfaction requires a large volume of data fed back by customers. You have to ask guests how they actually experienced the hotel they stayed in; only with that data can satisfaction indicators be computed truthfully and objectively. As President Dai and Chairman Zhang have said repeatedly, traditional research collects this data mainly through questionnaires. Why questionnaires? Before the internet became as widespread as it is today, if we wanted to understand guests, then apart from formal interviews with guests in the lobby or elsewhere in the hotel, or interviews with experts, the main method was to hand guests a form like the one shown just now and collect their feedback from it.
The drawbacks are obvious. First, a questionnaire is a fixed template. Its questions are locked in before the guest ever looks back on the stay; a form like this one fixes 13 questions. Guests cannot express whatever they want to say about their experience and feelings; they can only answer within the template. Such a questionnaire inevitably gathers data from the surveyor's point of view rather than the guest's.
Second, questionnaires have guests look back on a stay by giving scores. Everyone here has seen the survey form at a hotel front desk: you spend a minute or two and give a score, either a tick or a cross for satisfied or not satisfied, or a rating from 1 to 5 running from very dissatisfied to very satisfied. I consider this approach rather subjective. The scores swing with the guest's mood at that moment, and it is hard to say whether there is any clear difference between a 3, a 4, and a 5. For objective assessment it is not a very good instrument.
Third, we are not evaluating the satisfaction of one particular hotel; we are producing a ranking. The published result lists 300 five-star and 300 four-star hotels, 600 in all, but to arrive at those 600 we have to analyze data for every hotel taking part. From the figures just presented, 758 listed five-star hotels and 4,422 four-star hotels took part, which means collecting customer feedback for more than 5,000 hotels. Done by questionnaire, that would take hundreds of thousands or even millions of forms: 100 forms per hotel is at least 500,000, and 200 per hotel is over one million. The workload of collecting as many as three million questionnaires is unacceptable in both cost and time.
Fortunately, technology and the times have moved on. As everyone knows, Web 1.0 was an era of one-way publishing: the web was a medium like any other, and people received information from it in one direction, reading news on Sohu or Sina. That is Web 1.0. In the Web 2.0 era we live in now, each of us is a media outlet of our own. Two-way communication over the network is easy, and anyone can publish their views online. Against this background, more and more guests are very willing to share their hotel experiences online, because everyone has a need to express themselves, whether pleased or displeased. Whether I booked on Ctrip (携程) or eLong (艺龙), I feel a strong urge to share my experience of the hotel online for others to see. If I did not book through an OTA, I may share it on Weibo or WeChat instead. Everyone has this urge and is willing to share experiences and feedback online.
That means a huge volume of customer feedback already exists on the internet in the form of reviews. There is no need to run a questionnaire survey, collect everything from scratch, and track down people who really stayed at each hotel; doing that for millions of users is not realistic. With the data accumulated online over the past few years, this feedback is ready-made, and we only need to use it well. How should each review be used? From the customer's side, when I am a consumer choosing a hotel on Ctrip, each review is an endorsement of that hotel, and whether it is good or bad decides whether I stay there. That is only the traditional use. From the standpoint of scientific research or satisfaction assessment, each review is in effect a questionnaire the guest has filled in, probably without knowing it, and that unawareness is exactly what makes it objective. The guest has no need to please a hotel or a surveyor and simply shares real opinions, good or bad, in words. What a guest writes in a review is what they most want to say, after some thought. So every review is an objective questionnaire answered from the guest's own point of view, and if we can abstract the questionnaire content out of it, we can do this job. Based on this judgment, 慧评网 carried out the work on the basis of CHM, replacing questionnaires with a massive body of online reviews. This is a major innovation of this ranking.
Within this project 慧评网 supports the ranking in two ways: data collection and data analysis. Data collection is easy to understand. Reviews are scattered across many websites, so which sites should we collect from, and how do we collect efficiently? This is not a few hundred reviews for one hotel but millions of reviews for all candidate hotels, and gathering them quickly without omissions is a big job. Second, is collection enough? It is not. Each collected review is just text written by a user, like the passage the Secretary-General quoted earlier. Between that text and a questionnaire result there is a missing link: language analysis. We can read the passage and know which aspects of the hotel it discusses, but getting a computer to understand what it says, which opinions the guest expresses, and with what sentiment, angry and displeased or satisfied and accepting, takes real work. In the end the reviews are converted into a computable opinion database, which feeds the comprehensive evaluation model that produces the satisfaction index.
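The questionnaire volumes quoted in this part of the talk can be checked with a few lines of arithmetic. The sketch below uses only the hotel counts given by the speaker (758 five-star, 4,422 four-star); the function name is an illustrative choice, not something from the talk.

```python
# Hotel counts quoted in the talk.
FIVE_STAR_HOTELS = 758
FOUR_STAR_HOTELS = 4422


def questionnaires_needed(hotels: int, per_hotel: int) -> int:
    """Total paper questionnaires if every hotel gets `per_hotel` forms."""
    return hotels * per_hotel


total_hotels = FIVE_STAR_HOTELS + FOUR_STAR_HOTELS

if __name__ == "__main__":
    for per_hotel in (100, 200):
        total = questionnaires_needed(total_hotels, per_hotel)
        print(f"{per_hotel} forms per hotel -> {total:,} questionnaires")
```

With 5,180 hotels, 100 forms per hotel already means 518,000 questionnaires and 200 per hotel means 1,036,000, which matches the "at least 500,000" and "over one million" figures in the talk.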
Let me take the two parts in turn: first the collection of online reviews, then the language analysis of reviews. The Secretary-General has already covered the basics, so I will add technical backing and detail. First, the mechanism. We use a focused crawler, which differs from search engines such as Baidu. A crawler is a small program that continuously fetches content from the web. The heaviest users of crawlers today are search engines like Baidu and Google, which fetch every page on the internet around the clock. For our application that kind of crawler is not enough: once a page is fetched, it does not tell which part of the page is a review and which is not, or which pages relate to reviews and which are, say, recruitment pages we do not want. When we crawl, we need to locate exactly which reviews a hotel has, and each review's title, body text, and timestamp. Pinpointing that information is the job of our focused crawler: during collection it identifies precisely which content to fetch and which to discard.
The source sites comprise the three most mainstream OTA sites, Ctrip (携程), eLong (艺龙), and Tongcheng (同程); the vertical search site Qunar (去哪儿); and the two most heavily used user-generated content sites, Daodao (道道网) and Dianping (大众点评网). Other larger sites mostly copy data from these: Mangocity (芒果网) takes its reviews from Daodao, and Lvping (驴评网) takes its reviews from Ctrip. Smaller sites have essentially no reviews, because with so few bookings hardly any are written. By our assessment, these six sites cover the great majority of original reviews on the internet.
The collection window runs from January 1, 2005 to December 31, 2012. We use reviews over this long span mainly to derive the weight coefficient of each factor, that is, of each aspect of a hotel that guests care about. For the satisfaction score of each hotel, since this is the 2012 assessment, we use the full-year 2012 data. The update cycle is currently 10 minutes. This minimizes reviews lost to removal or deletion: whenever a review appears online, we collect it within 10 minutes, so that none are missed.
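To make the contrast between a general crawler and a focused one concrete, here is a minimal sketch of the extraction step only. It assumes a hypothetical page layout in which each review sits in `<div class="review">` with child elements whose classes are `title`, `body`, and `posted`; real sites use their own markup, and the talk does not describe 慧评网's actual implementation. Everything outside a review block (for example a recruitment notice) is ignored, and reviews outside the 2005-2012 window mentioned in the talk are dropped.

```python
from dataclasses import dataclass
from datetime import date
from html.parser import HTMLParser

# Collection window quoted in the talk.
WINDOW = (date(2005, 1, 1), date(2012, 12, 31))


@dataclass
class Review:
    title: str = ""
    body: str = ""
    posted: str = ""  # ISO date string, e.g. "2012-06-18"


class ReviewExtractor(HTMLParser):
    """Keep only text inside <div class="review"> blocks (hypothetical markup)."""

    FIELDS = {"title", "body", "posted"}

    def __init__(self):
        super().__init__()
        self.reviews = []
        self._current = None  # Review being filled, or None outside a block
        self._field = None    # field currently receiving text
        self._depth = 0       # <div> nesting depth inside the review block

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class", "")
        if self._current is None:
            if tag == "div" and cls == "review":
                self._current = Review()
                self._depth = 1
            return  # anything outside a review block is ignored
        if tag == "div":
            self._depth += 1
        if cls in self.FIELDS:
            self._field = cls

    def handle_endtag(self, tag):
        if self._current is None:
            return
        self._field = None
        if tag == "div":
            self._depth -= 1
            if self._depth == 0:  # the review block just closed
                self.reviews.append(self._current)
                self._current = None

    def handle_data(self, data):
        if self._current is not None and self._field:
            old = getattr(self._current, self._field)
            setattr(self._current, self._field, old + data.strip())


def extract_reviews(html: str, window=WINDOW):
    """Return the reviews on a page whose date falls inside the window."""
    parser = ReviewExtractor()
    parser.feed(html)
    parser.close()
    start, end = window
    return [r for r in parser.reviews
            if start <= date.fromisoformat(r.posted) <= end]


PAGE = """
<div class="jobs"><p class="body">We are hiring front-desk staff</p></div>
<div class="review"><h3 class="title">Great stay</h3>
<p class="body">Friendly service, clean room.</p>
<span class="posted">2012-06-18</span></div>
<div class="review"><h3 class="title">Old review</h3>
<p class="body">Noisy.</p><span class="posted">2004-11-02</span></div>
"""

if __name__ == "__main__":
    for review in extract_reviews(PAGE):
        print(review)
```

On the sample page the recruitment block and the 2004 review are both skipped, leaving one review. Running such an extractor on a schedule and storing every review it has ever seen would be one way to approximate the 10-minute update cycle described above, though the talk gives no detail on how that is done.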
The second part is language analysis, introduced a moment ago; here I will go into where the difficulty lies and how to handle it. The object of analysis is the original review on the left: a passage of text in natural language. The result we want is on the right: the opinions the guest holds about the hotel, together with the sentiment attached to each opinion. Review analysis therefore demands deep understanding. As President Dai said, that is very hard, because people vary how they express themselves; a person can write however they like, and the computer still has to understand it. President Dai gave the example of 坑爹 (slang for being cheated or let down). If a hotel review contains that word and nobody has defined in advance what sentiment it carries, the sentence is hard to interpret.
Suppose I enter 坑爹 into the system and tell the computer it is a negative word expressing dissatisfaction. Is that enough? Take another example. A user writes: "I got out of the taxi, and on the way to the hotel the sky was gray. Beijing's weather is really 坑爹. The service at the hotel I stayed in was quite good." Set the slang aside and use a standard word instead, such as 恶劣 (awful): "Beijing's weather is awful." Whether the word is 恶劣 or 坑爹, most existing systems struggle with a review like this. Looking only at the word, they call it a bad review, a negative comment, because the user expressed dissatisfaction. But the dissatisfaction is not directed at the hotel. It is about Beijing's weather and has nothing to do with the hotel. How do we pull that sentence out of the review and say it is about the weather, not the hotel, so it must not count as a negative hotel review?
This is a deeper problem. Language consists of more than words; there are also logical relations. We all learned sentence structure in primary and middle school, and we have to teach it to the computer too, so that it can split a sentence into subject, predicate, and object and find the modifying relations between them. Parsing this sentence, the system understands that 坑爹 describes the weather, Beijing's weather, which is unrelated to the hotel, and ignores it. If the guest had written "the hotel's service in Beijing is really 坑爹," the word would attach to service, and specifically to this hotel's service, so we would extract it as an opinion in the hotel-service category. This is a typical language understanding process: judgments are made at both the sentence level and the word level to work out what the sentence means. Through language analysis, the original review on the left is mapped directly to the analysis result on the right: each aspect of the hotel that it touches, which we call a factor, and the sentiment expressed on that factor.
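The weather example can be illustrated with a deliberately crude sketch. The talk describes syntactic parsing into subject, predicate, and object; the code below does not do that. As a stand-in, it splits a review into clauses and attaches a sentiment word to a hotel factor only when both occur in the same clause. The two lexicons are hypothetical and tiny, and negation is not handled.

```python
import re

# Hypothetical lexicons, for illustration only.
HOTEL_FACTORS = {"服务": "service", "房间": "room", "早餐": "breakfast"}
SENTIMENT_WORDS = {"坑爹": -1, "恶劣": -1, "不错": +1, "干净": +1}


def split_clauses(text: str):
    """Split on common Chinese and ASCII punctuation; drop empty pieces."""
    return [c for c in re.split(r"[,,。.!!;;]", text) if c.strip()]


def extract_opinions(review: str):
    """Return (factor, polarity) pairs for sentiment aimed at a hotel factor.

    Clause co-occurrence stands in for real syntactic analysis: a sentiment
    word counts only if a hotel factor term appears in the same clause.
    """
    opinions = []
    for clause in split_clauses(review):
        polarity = sum(v for w, v in SENTIMENT_WORDS.items() if w in clause)
        if polarity == 0:
            continue  # no sentiment expressed in this clause
        for term, factor in HOTEL_FACTORS.items():
            if term in clause:
                label = "positive" if polarity > 0 else "negative"
                opinions.append((factor, label))
    return opinions


WEATHER_REVIEW = "下了出租车在去酒店的路上天气灰蒙蒙的,北京的天气真坑爹,我入住的酒店服务还不错。"
SERVICE_REVIEW = "北京的酒店服务真坑爹"

if __name__ == "__main__":
    print(extract_opinions(WEATHER_REVIEW))
    print(extract_opinions(SERVICE_REVIEW))
```

For the first review the complaint about the weather is dropped because its clause names no hotel factor, and only the positive service opinion survives; for the second, 坑爹 lands in the same clause as 服务 and is recorded as a negative service opinion. A production system would need real parsing to handle clauses that mention several targets at once.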
Look at the parts marked in red. What do they mean? As I said, our work tries to turn user reviews into questionnaires, questionnaires that users fill in through their own spontaneous behavior. A questionnaire looks like the one on the right: the 13 questions the Secretary-General presented, with the user marking each one satisfied or not satisfied. Our job is to convert the review on the left into the questionnaire on the right. You can see where this leads: we crawl reviews from the internet and convert each review's text into a questionnaire, so by collecting reviews at scale we obtain the equivalent of millions of questionnaires at no cost and can use them for the ranking.
As data provider, 慧评网 supplied a large volume of customer feedback from online reviews. Two specific figures, limited to the 758 listed five-star and 4,422 four-star hotels just mentioned, about 5,000 hotels in total: we provided 2.7 million customer reviews for them. Since one review may contain several opinions, we extracted 9.1 million opinions from those 2.7 million reviews. This is data work at the scale of millions.
Why is 慧评网 able to do this? When Secretary-General Zhang invited us, we took on the work eagerly and with confidence, because we have the accumulated experience and foundation for it. Put simply, this is exactly what 慧评网 does. We have technical expertise and intellectual property in internet search, language understanding, and data mining, and I have worked in this area since my doctoral studies at Peking University. We eventually applied these technologies to the hotel sector, where we can say with some pride that we have built the world's most accurate hotel review database: not necessarily the largest, but at least the most accurate. For any review, we can obtain all of the opinions it contains and the corresponding questionnaire.
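The review-to-questionnaire conversion can be sketched as follows. The talk mentions 13 questionnaire items but does not list them, so the four question names here are hypothetical, as is the rule of settling mixed opinions on one item by simple majority. Input is a list of (factor, polarity) pairs of the kind produced by opinion extraction.

```python
# Hypothetical questionnaire items; the talk's 13 questions are not listed.
QUESTIONS = ["service", "room", "breakfast", "location"]


def to_questionnaire(opinions):
    """Fill one questionnaire from the opinions found in one review.

    Each item becomes "satisfied", "not satisfied", or None when the review
    says nothing about it (or is evenly split). Opinions about anything
    outside QUESTIONS are ignored.
    """
    tallies = {}
    for factor, polarity in opinions:
        if factor in QUESTIONS:
            step = 1 if polarity == "positive" else -1
            tallies[factor] = tallies.get(factor, 0) + step

    form = {question: None for question in QUESTIONS}
    for factor, score in tallies.items():
        if score > 0:
            form[factor] = "satisfied"
        elif score < 0:
            form[factor] = "not satisfied"
    return form


if __name__ == "__main__":
    sample = [("service", "positive"), ("room", "negative"),
              ("weather", "negative")]
    print(to_questionnaire(sample))
```

Unlike a paper form, most items stay unanswered for any single review; the figures in the talk (9.1 million opinions from 2.7 million reviews) work out to roughly 3.4 opinions per review, so coverage comes from volume rather than from each guest answering every question.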
On the left are a few figures that may answer President Dai's question. As of 2013, our tally of online reviews covers 160,000 hotels at home and abroad, 60,000 in China and 100,000 overseas. We have collected 16 million reviews and extracted 80 million opinions from them. Out of 55 dimensions, we selected 24 factors by frequency, that is, by how much or how little users choose to write about each in their reviews, and used them as the model factors for this ranking. If we get the chance, for example in next year's release, we could do the same work for all of the more than 15,000 listed star-rated hotels nationwide. These figures are current as of February this year. That is the data provision and data analysis work 慧评网 undertook for the word-of-mouth ranking.
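The talk says that 24 factors were chosen from 55 dimensions by how often guests mention them, and that long-span review data yields a weight per factor, but it gives neither the weighting formula nor the evaluation model. The sketch below is therefore only one plausible reading: keep the k most-mentioned dimensions, weight each by its share of mentions, and score a hotel as the weighted average of the positive share per factor. All names and the formula itself are assumptions.

```python
from collections import Counter


def select_factors(mentions, k):
    """Keep the k most frequently mentioned dimensions.

    Returns {factor: weight}, with weights proportional to mention counts
    and summing to 1 over the selected factors (an assumed scheme).
    """
    top = Counter(mentions).most_common(k)
    total = sum(count for _, count in top)
    return {factor: count / total for factor, count in top}


def satisfaction_index(weights, opinions):
    """Weighted average of the positive share per factor, in [0, 1].

    `opinions` is a list of (factor, polarity) pairs for one hotel.
    Factors with no opinions are left out and the remaining weights are
    renormalized; returns None if no selected factor is mentioned.
    """
    positive, total = Counter(), Counter()
    for factor, polarity in opinions:
        if factor in weights:
            total[factor] += 1
            positive[factor] += polarity == "positive"
    covered = sum(weights[f] for f in total)
    if covered == 0:
        return None
    score = sum(weights[f] * positive[f] / total[f] for f in total)
    return score / covered


if __name__ == "__main__":
    # Toy long-span mention log: one entry per opinion found in any review.
    history = ["service"] * 5 + ["room"] * 3 + ["pool"] + ["parking"]
    weights = select_factors(history, k=2)

    hotel_opinions = [("service", "positive"), ("service", "positive"),
                      ("service", "negative"), ("room", "positive"),
                      ("pool", "negative")]
    print(weights)
    print(satisfaction_index(weights, hotel_opinions))
```

In the toy data, service and room are selected with weights 0.625 and 0.375, and the opinion about the pool is ignored because that dimension did not make the cut. This mirrors the split described in the talk, where the 2005-2012 data sets the weights and the 2012 data alone scores each hotel.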
Thank you, everyone!