Tuesday, June 26, 2012

How to Install Windows and Ubuntu as a Fully Independent Dual Boot

http://www.ubuntuhome.com/windows-and-ubuntu-install.html

| Ubuntu Home

Posted by Snow on 2012/06/25

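When setting up a Windows and Ubuntu dual boot, many people install Windows first, then Ubuntu, and let the GRUB that ships with Ubuntu handle booting. With that setup, Ubuntu can no longer start after Windows is reinstalled, and repairing it takes a long time.

Today Ubuntu Home (Ubuntu之家) recommends an installation method that keeps the two systems completely independent, so that reinstalling either one does not affect the other. There is one requirement: the Windows version must be newer than XP. XP is not supported.

Step 1: Install Windows.

Step 2: Install liveusb-creator and build a bootable Ubuntu USB stick.

Step 3: Plug in the USB stick and install Ubuntu.

Step 4: Add an Ubuntu entry to the Windows Boot Manager.

Those are the four broad steps, and they look like what everyone already does. Here are the details, using my own installation as the example.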

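Step 1: Install Windows and leave about 30 GB of unallocated disk space. (Choose the size to suit yourself. PS: an MBR disk holds at most four primary partitions, so do not use up all four, or there will be no room to create the Linux partitions.)

Step 2: See http://www.ubuntuhome.com/liveusb-creator-v3-11-7-releases.html. Note that liveusb-creator must be installed on the C: drive; installing it elsewhere may cause problems. The tool has worked for me every time, except with Arch Linux.

Step 3: Install Linux. This is the key step: GRUB must be installed into the Linux partition. Download links for every Ubuntu release: http://www.ubuntuhome.com/ubuntu-download

This part is a little more advanced, but careful operation causes no problems, and you can always click "Revert" to undo your changes, so there is nothing to worry about.

In my case I gave Ubuntu only two partitions: a root partition (/dev/sda6) and a swap partition (/dev/sda7). A Linux installation needs a root partition, and a swap partition is normally created alongside it.

Further down is the option "Device for boot loader installation". Select the Linux root partition here, or the /boot partition if you have one. Then continue with the installation.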

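After the installation finishes, reboot. At this point nothing looks different from before Ubuntu was installed: the machine boots straight into Windows. We still need to add an Ubuntu entry to the Windows Boot Manager.

Step 4: Download and install EasyBCD.

Click "1", then choose "GRUB 2" under "2", enter any name you like below it, and click button "3". (The numbers refer to the callouts in the original post's screenshot.)

Reboot Windows. The Windows Boot Manager now shows one more entry (with the name you entered at "2"). Select it and press Enter to reach the familiar GRUB boot menu.

Because GRUB is installed inside the Ubuntu partition, if Windows breaks and has to be reinstalled, we only need to add the boot entry again with EasyBCD. Upgrading Ubuntu is more annoying: the upgrade writes GRUB into the MBR, so to keep this layout you have to reinstall Ubuntu rather than upgrade it. I once upgraded from one Ubuntu release to another and found that an upgraded system differs from a freshly installed one, so I now do a fresh install every time.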

Monday, June 18, 2012

Precision Ad Targeting Technology on the Internet

http://www.williamlong.info/archives/3125.html

- 月光博客 (Moonlight Blog)

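Precision ad targeting on the Internet means drawing on a search engine's vast database of user behavior to analyze nearly everything users do online at an individual level, locking onto the target audience an advertiser asks for, delivering messages one to one across multiple channels, and charging by results.

I wrote this article to consolidate my own knowledge, turning scattered, tacit know-how into something I can explain to others and that can help them. I learned a lot while summarizing, and I hope the material is of practical help to people just entering this industry. For easier reference, the content is gathered here in one place.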

I. Fundamentals:

1. HTTP header: User-Agent

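User Agent (UA) is part of the HTTP protocol, carried as a request header field. It is a special string that tells the site you visit which browser type and version, operating system and version, and rendering engine you are using. With it, a site can serve a different layout to give users a better experience, or collect statistics. For example, Google looks different on a phone than on a desktop computer because Google inspects the visitor's UA. The UA can be spoofed.

The standard format of a browser UA string is: browser token (OS token; encryption level; browser language) rendering engine token and version information. Individual browsers deviate from this.

Parts of the string:

1. Browser token

For compatibility and promotional reasons many browsers send the same token, so it does not reveal the real browser version. The real version information is found at the end of the UA string.

2. Operating system token

3. Encryption level

N: no encryption
I: weak encryption
U: strong encryption

4. Browser language

The language specified under Preferences > General > Language.

5. Rendering engine

Identifies the browser's rendering engine. Mainstream engines include Gecko, WebKit, KHTML, Presto, Trident, and Tasman. Format: engine/version

6. Version information

Shows the browser's real version. Format: browser/version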
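
The layout described above can be pulled apart with a few lines of Python. This is a minimal sketch, not a production UA parser: the sample string and the function name `parse_user_agent` are illustrative assumptions, and real UA strings deviate from the classic layout often enough that dedicated libraries exist for the job.

```python
import re

# Illustrative UA string in the classic layout (an assumption, not taken from the article).
SAMPLE_UA = ("Mozilla/5.0 (Windows; U; Windows NT 6.1; zh-CN) "
             "AppleWebKit/534.3 (KHTML, like Gecko) Chrome/6.0.472.33 Safari/534.3")

def parse_user_agent(ua):
    """Split a UA string into its leading token, platform fields, and product/version pairs."""
    # The first parenthesised group carries OS, encryption level, and language.
    platform = re.search(r"\(([^)]*)\)", ua)
    fields = [f.strip() for f in platform.group(1).split(";")] if platform else []
    # Tokens of the form name/version, e.g. "AppleWebKit/534.3" or "Chrome/6.0.472.33".
    products = re.findall(r"([A-Za-z][\w.-]*)/([\w.]+)", ua)
    return {
        "browser_token": products[0] if products else None,
        "platform_fields": fields,
        "products": products[1:],
    }
```

Calling `parse_user_agent(SAMPLE_UA)` separates the shared `Mozilla/5.0` token from the engine and real browser version at the end of the string, which is the part browser and OS targeting would actually key on.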

2. The foundation of user tracking: cookies

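Cookies are so important that the return-visitor targeting, visitor frequency control, and audience targeting covered later all depend on them. The third-party measurement tools we see in daily work, such as DoubleClick, 99click, and Miaozhen (秒针), rely on cookies, as do web analytics tools such as GA, Baidu Tongji (百度统计), and CNZZ. Without cookies the online advertising market would take a heavy blow, especially the precision advertising discussed here. Without cookies, web analytics would have nothing to start from, let alone optimization.

What a cookie is

In everyday English a cookie is a small sweet biscuit, but in computing a cookie is a small text file that a website stores on your computer when you visit it, and that travels between the web server and the browser with each request and response. It can record information such as your user ID, password, pages visited, and time spent, and it is used to identify the user. Cookie files are often named in the form user@domain (as in older versions of Internet Explorer), where user is your local user name and domain is the domain of the site visited.

Why cookies are needed

HTTP is stateless: for any request a browser sends, the server cannot tell whether it comes from the same source as an earlier one, or what the user did last time. Extra data is therefore needed to maintain a session. A cookie is exactly that: extra data sent along with HTTP requests to maintain the session between browser and server. Picture this: you shop on JD.com (京东) without logging in and put three items in the cart. At checkout, how does JD.com still know which three items they are? Cookies.

How cookies work

Cookies travel in HTTP headers, passed between the web server and the browser with each request and response. For example, when you type Amazon's URL into the address bar, the browser sends Amazon a request for the page and displays the result. Before sending, the browser looks on your computer for cookies that Amazon has set. If it finds any, it sends the cookie data to Amazon's server along with the URL. The server receives the cookie data, looks up your ID, purchase history, preferences, and so on in its database, records any new information, and updates both the database and the cookie. If no cookie is found, or the cookie does not match anything in the database, you are treated as a first-time visitor: the server's CGI program creates a new ID for you and saves it to the database.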
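
The exchange just described can be sketched with Python's standard `http.cookies` module. Everything specific here is an assumption made for illustration: the cookie name `uid`, the in-memory `PROFILES` dictionary standing in for the server's database, and the function name `handle_request`.

```python
import uuid
from http.cookies import SimpleCookie

# Hypothetical stand-in for the server-side database of visitor profiles.
PROFILES = {}

def handle_request(cookie_header):
    """Return (visitor_id, set_cookie_header_or_None) for one incoming request."""
    jar = SimpleCookie()
    jar.load(cookie_header or "")
    visitor = jar["uid"].value if "uid" in jar else None
    if visitor in PROFILES:
        # Known visitor: update the stored record, no new cookie needed.
        PROFILES[visitor]["visits"] += 1
        return visitor, None
    # No cookie, or an ID the database does not know: treat as a first visit.
    visitor = uuid.uuid4().hex
    PROFILES[visitor] = {"visits": 1}
    reply = SimpleCookie()
    reply["uid"] = visitor
    reply["uid"]["path"] = "/"
    return visitor, reply.output(header="Set-Cookie:")
```

The first call, with an empty `Cookie` header, produces a `Set-Cookie` line carrying a fresh ID; a later call that sends that ID back is recognised as the same visitor.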

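Some facts about cookies

1. Cookies are tied to the browser, so when a computer has several browsers installed, the server creates several cookies. The person is the same, but the server sees several users.
2. Cookies are tied to the browser, so when several people share one computer, the server creates only one cookie. There are several people, but the server sees one user.
3. Cookies cannot be set across devices. If we use one computer at work and another at home, even with the same browser in the same version, two cookies are created and the server sees two users. (PS: some browsers, such as Chrome and Firefox, can now sync data, which can mitigate this problem.)

Note: all the cookies discussed above are HTTP cookies. Another kind of cookie, the Flash cookie, can solve the multiple-browser problem.

About Flash cookies

A Flash cookie is client-side shared storage controlled by Flash Player. Because Flash is so widespread today and used by nearly every website, Flash cookies can play the same role as HTTP cookies. Technically, JavaScript and ActionScript together can bridge HTTP cookies and Flash cookies.

The advantages of Flash cookies are:

1. Cross-browser: no matter how many browsers or browser versions are installed on the user's computer, a Flash cookie lets them all share one cookie.

2. Hard to delete: every browser offers a quick way to clear HTTP cookies, but not Flash cookies, and they are stored in an obscure location that ordinary users struggle to find and remove.

3. Larger capacity: a Flash cookie can hold up to 100 KB of data by default, whereas a standard HTTP cookie holds only 4 KB.

For salespeople in the online advertising business, the above is more than enough. If you want to know more, read on.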

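Cookie size and count limits

1. Most browsers support cookies of up to 4096 bytes. It is therefore best to store only an identifier such as a user ID in the cookie, and read the user's details from a database or other data source by that ID.

2. Browsers also limit how many cookies a site may store on the user's computer. Most browsers allow only 20 cookies per site; when more are stored, the oldest are discarded. Some browsers also put an absolute cap on the total number of cookies they accept from all sites, usually 300.

Cookie expiry

1. The browser's cookie settings decide whether cookie data is saved. If the browser does not allow cookies to be saved, the data disappears when the browser is closed.

2. If the browser allows cookies to be saved, their lifetime is decided by the server. A cookie has an Expires attribute that determines how long it is kept, and the server changes the lifetime by setting its value. If the attribute is not set, the cookie is valid only for the browsing session and disappears when the browser closes; the vast majority of websites work this way. A cookie typically contains the fields Server, Expires, Name, and Value. Only Name and Value matter to the server; Expires and the other fields merely tell the browser how to handle the cookie.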
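
The difference between a session cookie and a persistent one comes down to whether the server attaches an expiry. A small sketch using Python's `http.cookies` follows; the function name `build_cookie` and the cookie names are assumptions for illustration. It sets both `Expires` and `Max-Age`, since the module accepts an integer number of seconds for either.

```python
from http.cookies import SimpleCookie

def build_cookie(name, value, lifetime_seconds=None):
    """Build a Set-Cookie header; without a lifetime it is a session cookie."""
    cookie = SimpleCookie()
    cookie[name] = value
    if lifetime_seconds is not None:
        # http.cookies turns an integer "expires" into a date that many seconds from now.
        cookie[name]["expires"] = lifetime_seconds
        cookie[name]["max-age"] = lifetime_seconds
    return cookie.output(header="Set-Cookie:")
```

`build_cookie("sid", "abc")` yields a header with no expiry, so the browser drops the cookie when it closes, while `build_cookie("uid", "42", 30 * 24 * 3600)` asks the browser to keep it for thirty days.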

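First-party and third-party cookies

Most third-party measurement tools and web analytics tools use third-party cookies. "First-party" and "third-party" describe who a cookie belongs to, meaning the domain recorded in the cookie. The only difference is whether that domain matches the domain of the site being visited: if it does, the cookie is first-party; if not, it is third-party. For example, if you visit www.chinawebanalytics.cn and the site sets a cookie on your computer whose recorded domain is also www.chinawebanalytics.cn, that cookie is first-party and belongs to the site you are visiting. If instead a cookie set during your visit to www.chinawebanalytics.cn carries the domain www.abc.com, it is a third-party cookie and belongs to www.abc.com.

So a first-party cookie does not have to be created by the site's own server; another party can create it on the site's behalf. Likewise, a first-party cookie is not necessarily read only by the site itself; a third party may well read it.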
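
The first-party test is just a comparison between the cookie's domain and the host of the page being visited. A rough sketch follows; the function name `cookie_party` is an assumption, and the suffix rule (a cookie for `.example.com` counts as first-party on `www.example.com`) is a simplification that ignores public-suffix handling in real browsers.

```python
def cookie_party(page_host, cookie_domain):
    """Classify a cookie as first-party or third-party relative to the visited host."""
    page = page_host.lower().rstrip(".")
    domain = cookie_domain.lower().strip(".")
    # Same host, or the page host is a subdomain of the cookie's domain.
    same_site = page == domain or page.endswith("." + domain)
    return "first-party" if same_site else "third-party"
```

With the article's own hosts, `cookie_party("www.chinawebanalytics.cn", "www.abc.com")` comes out as third-party, while a cookie whose domain is `www.chinawebanalytics.cn` is first-party.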

II. Targeting techniques:

1. Language targeting

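1. Where the language comes from

Put simply, the language is the user's browser language, taken from the Accept-Language field of the HTTP headers the browser sends.

2. The browser's Accept-Language is determined by the browser's language settings.

3. The browser's default language setting has nothing to do with the language of the browser's own interface; by default it is inherited from the operating system.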
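
To target on language, an ad server first has to rank the languages listed in `Accept-Language`. The sketch below handles the common `tag;q=value` form only; the function name `parse_accept_language` is an assumption, and malformed quality values are simply ranked last.

```python
def parse_accept_language(header):
    """Return language tags ordered by descending q-value (missing q means 1.0)."""
    ranked = []
    for position, part in enumerate(header.split(",")):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # unparseable weight: rank it last
        # Sort key: higher quality first, then original order for ties.
        ranked.append((-quality, position, tag.strip()))
    return [tag for _, _, tag in sorted(ranked)]
```

For a header such as `zh-CN,zh;q=0.8,en-US;q=0.6,en;q=0.4` the first element of the result is the user's preferred language, which is the value a language-targeting rule would test.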

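2. Browser targeting

Browser targeting likewise relies on the User-Agent in the HTTP headers each browser sends when opening a page.

3. Operating system targeting

Operating system targeting relies on the User-Agent in the HTTP headers each browser sends when opening a page.

4. Geographic targeting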

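Geographic targeting relies on identifying IP addresses. Since IP is the Internet's foundational protocol, geographic targeting has been possible since the network's first day.

Put simply, an IP address is a street number on the Internet, and every host connected to the Internet is a residence, some belonging to individuals and some to organizations. A private home has one street number per household, while the households in an organization share one. Because of planning decisions, some residences have several street numbers, and street numbers sometimes change. IP addresses behave the same way: one host can have several IP addresses, and several hosts can share one.

In the real world, however things are planned, a street number leads us to the residence we are looking for and tells us where it is. On the network, likewise, an IP address leads us to the host we want and tells us where that host is located. That is what makes geographic ad targeting possible.

Technically, geographic targeting works as follows:

When a request reaches the server, the server records data about it according to its configuration (for Apache, the configuration of Apache httpd) and writes it to a log file. A log entry typically includes the request time, client IP, requested URL, Referer, User-Agent, and other details. Comparing the collected IP address against an existing IP database then yields the requester's location, for example Taiyuan, Shanxi Province.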
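
The lookup step can be sketched with Python's `ipaddress` module. The `IP_REGIONS` table is a made-up stand-in for a real IP database (its ranges are reserved documentation addresses), and the function names are assumptions; the only fact relied on about Apache is that the client IP is the first field of its common and combined log formats.

```python
import ipaddress

# Hypothetical IP-range table standing in for a real IP database.
# The ranges are reserved documentation networks, mapped to regions for illustration only.
IP_REGIONS = [
    (ipaddress.ip_network("192.0.2.0/24"), "Taiyuan, Shanxi"),
    (ipaddress.ip_network("198.51.100.0/24"), "Beijing"),
]

def locate(ip_text):
    """Return the region whose network contains the address, or 'unknown'."""
    address = ipaddress.ip_address(ip_text)
    for network, region in IP_REGIONS:
        if address in network:
            return region
    return "unknown"

def region_of_log_line(line):
    """Take the client IP (first field of an Apache access-log line) and locate it."""
    return locate(line.split(" ", 1)[0])
```

A production system would replace the linear scan with a sorted range index, since real IP databases hold hundreds of thousands of ranges.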

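Among the free IP databases currently available in China is the QQ IP database, Chunzhen edition (纯真版), commonly called the Chunzhen IP database. It collects up-to-date IP address data from ISPs including China Telecom, China Netcom, Great Wall Broadband, Netcom Broadband, and Juyou Broadband (聚友宽带), along with the most complete data on Internet cafes. The database is updated every five days, and companies can use it after applying their own corrections.

Geographic targeting today mostly works at the level of provinces and prefecture-level cities. Targeting at the level of county-level cities or districts is generally very inaccurate.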

5. Return-visitor targeting

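As e-commerce sites boomed, a new form of targeting appeared in the online advertising industry from 2010 onward: return-visitor targeting. It grew out of the idea of precision marketing. As the name suggests, it targets users who have reached a particular point on the advertiser's site or have performed a particular action there.

The definition contains three essentials: (1) the user has been there; (2) a particular point or action; (3) targeted delivery. These three points are also what distinguish return-visitor targeting from audience targeting.

From a marketing standpoint, users who reached different depths or performed different actions may call for different strategies. Take the shopping flow of an e-commerce site as an example. The flow breaks into the following stages:

1. For people who have browsed products, analyze their browsing history, find the products they are interested in, and put those products in front of them through ads. (To do this perfectly, with a different ad shown to every user, what would be required? Leave a comment and let's discuss.)

2. For people who have added items to the cart, an e-coupon that nudges them to place the order probably matters more.

3. For people who reached the registration or login page but did not finish, a countdown showing that the item is about to sell out or rise in price is more likely to bring them back to order.

4. For people who reached the delivery-address page but did not submit the order, offering free shipping or simply telling them "You are one step away from completing your order" may work well.

5. People who have submitted an order are already customers. Recommend related products to encourage a second purchase.

A return-visitor campaign therefore always involves three steps:

1. Set up tracking for return-visitor segments. A system that supports return-visitor targeting must be able to track each point, so generating tracking code is a must. A good system uses a single tracking code and derives the return visitors for each tracking point through data analysis (how might that be done?). A weaker system offers per-point configuration and generates a separate tracking code for each point.

2. Work out the distinct marketing message for the users at each tracking point, and produce different creatives for different return visitors.

3. Use the ad-serving system to deliver targeted ads to the return visitors.

Generally, the more precise the targeting, the less volume you get, so a return-visitor campaign should not be narrowed further by media selection. Seen another way, return-visitor targeting is already the most precise form of target-user targeting, so media selection matters far less here.

The above describes return-visitor targeting in the strict sense. Because the concept's precision is so appealing and its traffic so meager, some people and companies weigh the trade-off and define "return visitor" very broadly: anyone who has visited the site, clicked an ad, or merely seen an ad counts. This is just one more case of "Chinese characteristics", and the more it happens, the worse it is for the precision advertising market.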

6. Audience targeting

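Audience targeting means targeting the intended audience. In marketing, product positioning and audience segmentation are core ideas that the market has long accepted, so every product has its target audience fixed when it is designed and produced. In ad delivery and promotion we naturally want to reach that target audience; anything spent outside it is wasted.

In traditional media, however, fully identifying users to decide whether they belong to the target audience is not easy, and in theory may be impossible. All one can do is get as close to the target audience as possible through different media tactics. (How do TV, radio, and magazines determine their audiences? Anyone care to discuss?) Even so, this gave rise to the most famous line in advertising: I know half my advertising budget is wasted, but I don't know which half.

In the Internet era, technology lets us get ever closer to an accurate judgment of each person's attributes and so serve an advertiser's target-audience needs. But the Internet only gets ever closer; it cannot pin down an individual's attributes with certainty. Today the closest are probably data sets from companies such as Roadway D&B (罗维邓白氏). (Incidentally, being exposed on CCTV's 3.15 Gala can only be free advertising for Roadway D&B, not a blow.)

Back to the topic: audience targeting on the Internet. What Internet companies call audience targeting covers not only demographic attributes (demographic) but also audience interest (interest), audience behavior (behavior), and purchasing behavior (purchasing).

Note: audience behavior here means behavior toward ads, such as viewing an ad, clicking it, or interacting with it by forwarding or downloading. Some companies on the market advertise behavioral targeting, but when asked to elaborate they only say that they target users' browsing behavior: a perfectly correct, unassailable statement, yet when pressed they just repeat it. That only shows such companies are all talk and no substance. (Why does it show that? Readers, discuss.)

For companies that genuinely offer targeting, whatever kinds of audience targeting they provide, all four types of attributes or behavior above are derived with cookie technology (see the cookie section above) by analyzing long-term data on users' browsing behavior. Because each company's resources differ, no company today has a complete data set.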

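Demographic attributes (demographic)

Demographic attributes include gender, age, education, region, marital status, household situation (whether there are children, their ages, and so on), income (personal and household), industry, and occupation. Browsing behavior alone cannot yield information this complete and accurate; for now the main approach is to find real samples and build models from them. iResearch (艾瑞) has the most accurate demographic data.

Audience interest (interest)

Each company defines audience interest differently. At present YOYI (悠易) has the best interest data; it is public and can be viewed through the YOYI audience engine (悠易受众引擎).

Audience behavior (behavior)

The behavior described in the note above is only one kind. With search-engine resources, search behavior can be tracked as well (for example Baidu's 搜客 targeting: people who searched Baidu for keywords the advertiser has added are shown creatives from the advertiser's ad group when they browse designated sites). With microblog data, following and being followed can be added (does Sina have such plans?). Audience behavior is therefore where company definitions differ most.

Purchasing behavior (purchasing)

Purchasing behavior means Internet users' consumption data in their role as consumers. Without question, if Taobao's shopping data ranks second, no one can claim first.

In an ad system, all of a user's attributes and behaviors should be freely combinable in targeting settings. But do the attributes and behaviors above give a complete picture of the user? No. This is an open-ended question on which everyone will have a different view. For instance, we could add the user's device (PC, tablet, mobile device, and so on) and describe users by the channel through which they go online. Are there other angles? Leave a comment and discuss.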

7. Concurrency count

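When ads were sold by the day or by time period, concurrency did not need to be considered. Only when selling by impressions (CPM) may we need to set ad concurrency.

Under CPM selling, delivery can run at two speeds: as fast as possible, or evenly paced. As-fast-as-possible delivery is self-explanatory: deliver the booked volume as quickly as possible. Evenly paced delivery spreads the booked volume uniformly over the scheduled time. For example, with 1000 CPM to deliver in one day, as-fast-as-possible delivery means the ad finishes in hour x and is not seen for the remaining (24 - x) hours, whereas even pacing means the ad must still be showing at 23:59. How is that achieved? By setting a concurrency count.

The concurrency count is the number of times an ad is served within a given time slot, and its purpose is to keep delivery evenly paced. It is calculated as: booked volume / delivery duration. Note that the duration can be measured in seconds, minutes, quarter-hours, or other units as needed. The rule has to be enforced by the ad-serving core: within a slot, the ad may be shown while it is below the concurrency count, and is no longer shown once the count is reached.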
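
The rule amounts to a per-slot counter capped at booked volume divided by the number of slots. A minimal sketch follows; the class name `PacingLimiter`, the rounding up of the quotient, and the integer slot index are all assumptions, not a description of any particular ad server.

```python
class PacingLimiter:
    """Evenly paced delivery: cap impressions per time slot at volume / slots."""

    def __init__(self, total_impressions, total_slots):
        # Concurrency count per slot = booked volume / delivery duration (rounded up).
        self.per_slot = -(-total_impressions // total_slots)
        self.served = {}

    def allow(self, slot):
        """Return True and count the impression if the slot is still under its cap."""
        count = self.served.get(slot, 0)
        if count >= self.per_slot:
            return False
        self.served[slot] = count + 1
        return True
```

For 2,880 impressions spread over the 1,440 minutes of a day, `PacingLimiter(2880, 1440)` permits two impressions per minute and refuses the third.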

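A question to think about: if an ad must deliver 1000 CPM in one day and the publisher's daily page views are exactly 1000 CPM, can as-fast-as-possible delivery complete the booked volume? Can evenly paced delivery? If not, what must we do to complete it?

8. Time-slot targeting

Every ad campaign and every promotion has a defined schedule. Once a campaign is planned, the delivery period on each medium and each outlet is fixed. TV, radio, newspapers, and magazines define the delivery period by program air times, ad order, and issue numbers. Online advertising defines it by start date, end date, and time slots (note that delivery times follow the server's clock).

9. Page targeting

Page targeting means targeting specific URLs so that ads are delivered on the designated URLs. It is one of the less frequently used targeting methods in online advertising.

Page targeting rests on two core techniques:

1. Getting the URL of the current page. Note that this is the current page, not the Referer in the HTTP headers. The current page's URL has to be obtained by JavaScript placed on the page, and the design must allow for the script being placed inside an iframe, possibly nested several levels deep.

2. URL matching in the targeting settings: prefix match, suffix match, contains, does not contain, wildcards, and so on. The matching rules have to be handled in the ad-serving core.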
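
The matching rules listed above map directly onto string operations. In this sketch the rule names (`prefix`, `suffix`, `contains`, `not_contains`, `wildcard`) and the function name `url_matches` are assumptions chosen for illustration; wildcard matching borrows shell-style patterns from Python's `fnmatch`, where `*` matches any run of characters and `?` matches exactly one.

```python
from fnmatch import fnmatchcase

def url_matches(url, rule, pattern):
    """Apply one targeting rule to the current page URL."""
    if rule == "prefix":          # left match
        return url.startswith(pattern)
    if rule == "suffix":          # right match
        return url.endswith(pattern)
    if rule == "contains":
        return pattern in url
    if rule == "not_contains":
        return pattern not in url
    if rule == "wildcard":        # shell-style: * and ? are wildcards
        return fnmatchcase(url, pattern)
    raise ValueError("unknown rule: " + rule)
```

Because `?` is a wildcard in shell-style patterns, a real system that needs to match query strings literally would want its own pattern syntax or explicit escaping.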

10. Visitor frequency

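Frequency is a very important concept in ad delivery. Frequency in online advertising means the same thing as in other media.

Frequency is the number of times an individual or household is exposed to an ad message. In traditional TV we cannot precisely control how many times each person is exposed; frequency can only be calculated by dividing gross rating points by reach. In online advertising, however, the maximum number of times a person can be exposed can be strictly controlled. The underlying technology is again the cookie, which shows how important cookies are to precision delivery.

In online advertising, frequency control covers more than in other media: it can apply to ad views, clicks, and complete views, and even to other actions such as forwarding or downloading the ad. Online frequency therefore means the maximum number of interactions between a visitor and an ad, and which interaction counts should be configurable in the ad system. In practice frequency is most often set on ad views, which is the example used here.

The principle of frequency control is simple. When a user opens a page in a browser, the ad-slot code placed on the page is requested and talks to the server, passing along the user's cookie data, which includes the number of times each ad has been seen (if there is no cookie, the server creates one). The server checks the frequency: an ad that has exceeded its cap is not served. After checking the other targeting conditions as well, the server returns a suitable ad to the browser, and with the response it increments that ad's view count in the user's cookie by 1. This is how online advertising achieves precise frequency control.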
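
The cap check and the increment can be sketched in a few lines with `http.cookies`. The cookie naming scheme (`freq_` plus the ad ID), the function name `pick_ad`, and the idea of keeping one counter cookie per ad are assumptions for illustration; real systems often keep the counters server-side, keyed by a visitor ID.

```python
from http.cookies import SimpleCookie

def pick_ad(cookie_header, candidates, cap=3):
    """Return (ad_id, Set-Cookie header) for the first ad still under its cap."""
    jar = SimpleCookie()
    jar.load(cookie_header or "")
    for ad_id in candidates:
        key = "freq_" + ad_id  # hypothetical per-ad counter cookie
        try:
            seen = int(jar[key].value) if key in jar else 0
        except ValueError:
            seen = 0  # tampered or corrupt counter: start over
        if seen < cap:
            reply = SimpleCookie()
            reply[key] = str(seen + 1)  # views of this ad, incremented by 1
            return ad_id, reply.output(header="Set-Cookie:")
    return None, None  # every candidate has reached its cap
```

A visitor with no cookie gets the first candidate and a counter of 1; once a counter reaches the cap, that ad is skipped and the next candidate is considered.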

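In ad delivery, higher frequency is not always better. Too few exposures leave no impression on the user, while too many cause annoyance and dislike. In 1972 the American psychologist Herbert Krugman studied the question and described the psychology of three exposures to an ad: the first prompts curiosity ("What is it?"), the second recognition ("What is it for?"), and the third judgment ("What impression does the ad leave?"). Frequency settings naturally vary with the product, market, brand, competition, creative, and media, but effective frequency is generally set with three exposures as the minimum.

To show how a campaign performed, ad systems usually include average frequency and a frequency distribution chart in their reports.

Discussion: what does a frequency distribution chart look like, and what should its design take into account?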

11. Keyword targeting

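What we call keyword targeting is in fact the contextual advertising (Contextual) found in Google AdWords.

Keyword targeting requires the following capabilities:

The ability to crawl and analyze page content

The analysis must account for page structure, HTML tags, links, and similar factors, and analyze the page's main text to arrive at the keywords that best describe what the page is about. This is the bottleneck that decides whether keyword targeting works.

Note that analyzing pages quickly in real time is very demanding, and with enough pages the system becomes very slow, so the system must be able to crawl pages where ads may appear ahead of time.

Fast real-time analysis is of course just as important.

The ability to set ad keywords in the ad system

Operators must be able to set keywords in the system quickly and conveniently (positive selection and negative exclusion). Analysis of how previously used keywords performed, together with recommendations, is a welcome extra.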
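
Both capabilities can be illustrated at toy scale with the Python standard library. This is a deliberately naive sketch: keyword extraction is plain word counting over visible English text, the stop-word list is a tiny made-up sample, and the names `page_keywords` and `ad_matches` are assumptions. Real contextual systems weigh page structure and use far better language analysis, and Chinese text would additionally need word segmentation.

```python
import re
from collections import Counter
from html.parser import HTMLParser

class _TextExtractor(HTMLParser):
    """Collect visible text, skipping the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)

# Tiny illustrative stop-word list (an assumption, not a real linguistic resource).
STOPWORDS = {"the", "and", "for", "with", "that", "this", "are", "was"}

def page_keywords(html, top_n=5):
    """Return the most frequent non-stop-words in the page's visible text."""
    parser = _TextExtractor()
    parser.feed(html)
    words = re.findall(r"[a-z]+", " ".join(parser.parts).lower())
    counts = Counter(w for w in words if len(w) > 2 and w not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_n)]

def ad_matches(keywords, include, exclude):
    """Positive selection plus negative exclusion against the page keywords."""
    found = set(keywords)
    return bool(found & set(include)) and not (found & set(exclude))
```

An ad set up with `include=["running"]` and `exclude=["casino"]` would be eligible on a page whose extracted keywords contain "running" and none of the excluded terms.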

Source: submitted by Niu Guozhu (牛国柱); link to the original article.

Monday, May 14, 2012

10 Spelling Checker Secrets for Microsoft Word

http://www.computerworld.com/s/article/9225180/10_Spelling_Checker_Secrets_for_Microsoft_Word?source=rss_latest_content&utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+computerworld%2Fnews%2Ffeed+%28Latest+from+Computerworld%29&utm_content=Google+Feedfetcher

- Computerworld

These tips can prevent you from confusing 'advice' with 'advise,' stop Word from flagging acronyms, and make you look more literate.

Helen Bradley

March 14, 2012 (PC World)

You use Word's spelling checker every day, and probably just as often encounter some of the tool's puzzling behavior. Do you know how to get rid of a word that you mistakenly added to its dictionary, for instance, or how to hide the red wiggly lines that appear all over your document?

The following ten tricks will help you to work more efficiently in Word 2010, and they will even make you and your documents look smarter.

1. Control the 'Check Spelling as You Type' Feature

This default feature reviews spelling within your document as you work, indicating with a red wiggly line any words that are missing from the spelling checker's dictionary. The feature can be distracting, but it's easy to disable. To do so, choose File, Options, Proofing, click the Check spelling as you type checkbox to clear it and reverse the current setting, and then click OK.

2. Check Foreign-Language Spelling

Word isn't naturally bilingual, but you can train it to process more than one language at a time. Ordinarily, when you're working on a document that includes text in, say, French, Word likely won't recognize the other language if you've set your primary language to U.S. English; in this case, Word will add wiggly lines under the assorted foreign words, suggesting that they are all misspellings.

You can avoid that situation by setting Word to check the French text using a French word list. To arrange this, select the text in French (or whatever foreign language you're using), and click the Review tab on the Ribbon toolbar. Then click Language and choose Set Language in the Proofing group of buttons. The Language dialog box will appear. Here you should click the language to use for the selected text; the listed languages displaying checkmark icons are available for use in checking spelling. Click OK to finish.

3. Add Unusual Words to the Dictionary

If you know ahead of time that you will be using some unusual words, and if you do not want Word to report them as possible misspellings, you can add them to the dictionary.

Choose File, Options, Proofing, and click Custom Dictionaries. Click the custom.dic file--or the name of the dictionary to add the words to, if you are using a special dictionary--and click Edit Word List. Type a word, and click Add. When you're done, click OK to exit the dictionary.

Adding words one at a time is sensible if you have only a few. But if you have a long list of words to add, it's best to do so by editing the dictionary file itself.

First, from the Custom Dictionaries dialog box, make a note of the file-path entry that shows where the custom.dic file is located. Then launch a plain-text editor such as Notepad or WordPad, and use it to open the custom.dic file. Type or paste your words, one word per line, into the document and then save it. Word will automatically sort the items into alphabetical order when it next uses the file.
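
For a long list, the merge itself can be scripted. The sketch below works on text only: it combines the existing contents of a word list with new words, one per line, dropping blanks and duplicates. The function name `merge_word_list` is an assumption, and reading and writing the actual custom.dic file is left out on purpose, because the file's location and text encoding should be checked on your own machine first.

```python
def merge_word_list(existing_text, new_words):
    """Merge new words into a one-word-per-line list, skipping blanks and duplicates."""
    seen = set()
    merged = []
    for word in list(existing_text.splitlines()) + list(new_words):
        word = word.strip()
        if word and word not in seen:
            seen.add(word)
            merged.append(word)
    return "\n".join(merged) + "\n"
```

The result keeps the original order and appends only words that were not already present; sorting is unnecessary since Word reorders the list itself.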

4. Remove Misspellings in the Spelling Checker

If you add a misspelled word to the dictionary by accident, Word won't identify it as misspelled until you remove it.

Choose File, Options, Proofing, and click Custom Dictionaries. Select the default dictionary in the list; typically this is the custom.dic file. Click Edit Word List to open the custom.dic dialog box, which contains a list of words you have added to Word's custom dictionary. Scroll down the list, click the errant word, and then click Delete and Close. In the future, if you use this misspelling in a document, Word will properly flag it as a mistake.

5. Determine What the Spelling Checker Checks

Depending on the type of work you do, you may discover that Word either finds errors where none exist, or fails to catch the embarrassing errors you do make. For some terms, such as email addresses, URLs, or items containing numbers, you can decide whether Word checks their spelling or leaves them alone.

To see the preferences that Word is currently configured to use, choose File, Options, Proofing. Here you can set preferences, such as 'Ignore words in UPPERCASE' and 'Ignore words that contain numbers'. If you don't want Word to report email addresses and URLs as misspellings, for example, click to enable the Ignore Internet and file addresses checkbox.

You can also disable Flag repeated words if you find Word's highlighting of repeated words annoying. When you are done, click OK to return to editing the document. These changes apply instantly, and will remain in place even after you shut down and restart Word.

6. Hide the Wiggly Underlines, Just This Once

If you like to work with 'Check spelling as you type' enabled, but wish to hide the wiggly underlines for one document only to reduce distractions, you can do so. This feature lets you control the visibility of the wiggly lines on a document-by-document basis, without disabling the spelling checker itself.

Choose File, Options, Proofing. Within the 'Exceptions for:' group of options, make sure the current document name appears in the box, and click Hide spelling errors in this document only. Click OK, and the document will stop showing wiggly underlines. You can still spelling-check the document, of course, by clicking the Review tab on the Ribbon toolbar and selecting Spelling & Grammar, or by pressing the shortcut key, F7.

7. Configure Text So That Word Doesn't Check It

Computer programming code, scientific data, and other specialized text often includes words that don't live in Word's dictionary, so the spelling checker frequently flags them. To disable spelling checks for such situations, first select the text in question. Then click the Review tab on the Ribbon toolbar, and choose Language, Set Proofing Language. Click the Do not check spelling or grammar checkbox, and click OK. Word will no longer proof the selected text, now or at any time in the future.

8. Use Multiple Dictionaries for Different Projects

Many businesses have their own language. For example, a doctor's office uses medical terminology, and a mining office uses mining jargon. If your business uses certain industry terms, it's convenient to have a dictionary of those terms on hand, to prevent Word from flagging them as misspellings.

You can either add the special terminology to your own custom.dic file or create a second dictionary file of the specialized terms. Maintaining a second file can be beneficial, as you can share it with other users without sharing your own personal custom.dic or needing to overwrite the other user's custom.dic file with your version.

To create a second dictionary, choose File, Options, Proofing, and click Custom Dictionaries. Click New, type a name for your dictionary file, and click Save. Now you can add words to the dictionary as detailed in Tip 3 above.

If you are using two dictionaries--both custom.dic and a second one of specialized words--you'll want Word to use words from both files when it makes suggestions for correcting the items it has flagged as spelling errors. To make sure that Word is configured to do this, click File, Options, Proofing, and confirm that the option Suggest from main dictionary only is disabled. If not, disable it and click OK.

9. Share a Custom Dictionary With Other Users

Once you've created a dictionary file, you can share it with other users so that they can employ it in their version of Word. To do so, in Windows Explorer, locate the .dic file you created, and then send the recipient a copy. The other person, on their computer, will need to place the file in the same folder as their own custom.dic file.

Then, to add the file to Word, the recipient should launch Word and choose File, Options, Proofing, Custom Dictionaries and click Add. The user should then locate and select the new .dic file, which will be in the folder that the dialog box points to, and click Open to add it to Word's Dictionary list.

10. Flag Words Misspelled in Context Only

In some situations you may find yourself using a word that's correctly spelled but incorrect in the context. Homophones, such as stationary and stationery, or advice and advise, can be confusing--all the more so because the spelling checker won't always flag their misuse. In addition, if you tend to overuse a word, you may want Word to alert you so that you can change it on certain occasions. A solution to both issues is to exclude the problematic words so that the tool will flag them.

To exclude one or more words, you must add them to the Word exclusion file, which is already created for you and installed with Word 2007 and 2010.

Start by searching for ExcludeDictionaryEN*.lex using Windows Search. In the search results, you will find multiple files, one for each English variant. The four-digit code in each filename tells you which .lex file belongs to which language variant. For example, 0409 is for the United States, and 0809 is for the United Kingdom. See Microsoft's site for the IDs for each locale; look for the number in the LCID Hex column to identify the files for the language variants you use.

In Windows Explorer, open the folder containing the exclusion files, right-click the ExcludeDictionaryEN*.lex file for the first language variant you use, and choose Open With, then WordPad. Type the words to exclude, one per line, and click Save. Repeat for any other language variants that you use. Close and reopen Word.

In the future, when you type any word that's in the exclude dictionary file, Word will flag it as a spelling error. Take care to click only 'Ignore Once'--not 'Ignore All' or 'Add To Dictionary'--to move past the word when you're using the 'Spelling and Grammar' dialog box. Otherwise, the spelling checker won't flag the word as a misspelling in the future.

10 essential performance tips for MySQL

http://www.computerworld.com/s/article/9227128/10_essential_performance_tips_for_MySQL?source=rss_latest_content&utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+computerworld%2Fnews%2Ffeed+%28Latest+from+Computerworld%29&utm_content=Google+Feedfetcher

- Computerworld

From workload profiling to the three rules of indexing, these expert insights are sure to make your MySQL servers scream

Baron Schwartz

May 14, 2012 (Infoworld)

As with all relational databases, MySQL can prove to be a complicated beast, one that can crawl to a halt at a moment's notice, leaving your applications in the lurch and your business on the line.

The truth is, common mistakes underlie most MySQL performance problems. To ensure your MySQL server hums along at top speed, providing stable and consistent performance, it is important to eliminate these mistakes, which are often obscured by some subtlety in your workload or a configuration trap.

Luckily, many MySQL performance issues turn out to have similar solutions, making troubleshooting and tuning MySQL a manageable task.

Here are 10 tips for getting great performance out of MySQL.

MySQL performance tip No. 1: Profile your workload

The best way to understand how your server spends its time is to profile the server's workload. By profiling your workload, you can expose the most expensive queries for further tuning. Here, time is the most important metric because when you issue a query against the server, you care very little about anything except how quickly it completes.

The best way to profile your workload is with a tool such as MySQL Enterprise Monitor's query analyzer or the pt-query-digest from the Percona Toolkit. These tools capture queries the server executes and return a table of tasks sorted by decreasing order of response time, instantly bubbling up the most expensive and time-consuming tasks to the top so that you can see where to focus your efforts.

Workload-profiling tools group similar queries together into one row, allowing you to see the queries that are slow, as well as the queries that are fast but executed many times.
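
The grouping idea can be shown in miniature. This sketch is not how pt-query-digest or MySQL Enterprise Monitor work internally; it is a simplified illustration in which literals are replaced by `?` so that similar queries share one "fingerprint", and rows are then sorted by total time. The function names and the `(sql, seconds)` input format are assumptions.

```python
import re
from collections import defaultdict

def fingerprint(sql):
    """Collapse a query to a normalized shape: lowercase, literals replaced by '?'."""
    shape = sql.strip().lower()
    shape = re.sub(r"'[^']*'", "?", shape)     # quoted string literals
    shape = re.sub(r"\b\d+\b", "?", shape)     # numeric literals
    return re.sub(r"\s+", " ", shape)

def profile(entries):
    """entries: iterable of (sql, seconds). Returns (shape, calls, total_seconds) rows,
    most expensive first."""
    totals = defaultdict(lambda: [0, 0.0])
    for sql, seconds in entries:
        row = totals[fingerprint(sql)]
        row[0] += 1
        row[1] += seconds
    rows = [(shape, calls, total) for shape, (calls, total) in totals.items()]
    return sorted(rows, key=lambda r: r[2], reverse=True)
```

Sorting by total time rather than per-call time is what surfaces the fast-but-frequent queries alongside the slow ones.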

MySQL performance tip No. 2: Understand the four fundamental resources

To function, a database server needs four fundamental resources: CPU, memory, disk, and network. If any of these is weak, erratic, or overloaded, then the database server is very likely to perform poorly.

Understanding the fundamental resources is important in two particular areas: choosing hardware and troubleshooting problems.

When choosing hardware for MySQL, ensure good-performing components all around. Just as important, balance them reasonably well against each other. Often, organizations will select servers with fast CPUs and disks but that are starved for memory. In some cases, adding memory is a cheap way of increasing performance by orders of magnitude, especially on workloads that are disk-bound. This might seem counterintuitive, but in many cases disks are overutilized because there isn't enough memory to hold the server's working set of data.

Another good example of this balance pertains to CPUs. In most cases, MySQL will perform well with fast CPUs because each query runs in a single thread and can't be parallelized across CPUs.

When it comes to troubleshooting, check the performance and utilization of all four resources, with a careful eye toward determining whether they are performing poorly or are simply being asked to do too much work. This knowledge can help solve problems quickly.

MySQL performance tip No. 3: Don't use MySQL as a queue

Queues and queue-like access patterns can sneak into your application without your knowing it. For example, if you set the status of an item so that a particular worker process can claim it before acting on it, then you're unwittingly creating a queue. Marking emails as unsent, sending them, then marking them as sent is a common example.

Queues cause problems for two major reasons: They serialize your workload, preventing tasks from being done in parallel, and they often result in a table that contains work in process as well as historical data from jobs that were processed long ago. Both add latency to the application and load to MySQL.

MySQL performance tip No. 4: Filter results by cheapest first

A great way to optimize MySQL is to do cheap, imprecise work first, then the hard, precise work on the smaller, resulting set of data.

For example, suppose you're looking for something within a given radius of a geographical point. The first tool in many programmers' toolbox is the great-circle (Haversine) formula for computing distance along the surface of a sphere. The problem with this technique is that the formula requires a lot of trigonometric operations, which are very CPU-intensive. Great-circle calculations tend to run slowly and make the machine's CPU utilization skyrocket.

Before applying the great-circle formula, pare down your records to a small subset of the total, and trim the resulting set to a precise circle. A square that contains the circle (precisely or imprecisely) is an easy way to do this. That way, the world outside the square never gets hit with all those costly trig functions.
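
The square-then-circle idea looks like this in Python. It is a sketch under stated assumptions: points are `(name, latitude, longitude)` tuples, the Earth is treated as a sphere of radius 6371 km, the search radius is small relative to the planet, and the box does not handle wrapping across the 180° meridian. In MySQL the cheap step would be a range condition on indexed latitude and longitude columns rather than a list comprehension.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def within_radius(points, lat, lon, radius_km):
    """Cheap bounding-box filter first, precise haversine check second."""
    angular = radius_km / EARTH_RADIUS_KM
    dlat = math.degrees(angular)
    # Longitude half-width of a box that fully contains the circle at this latitude.
    ratio = math.sin(angular) / max(math.cos(math.radians(lat)), 1e-12)
    dlon = math.degrees(math.asin(ratio)) if ratio < 1 else 180.0
    # Step 1: simple comparisons only, no trigonometry per point.
    box = [p for p in points if abs(p[1] - lat) <= dlat and abs(p[2] - lon) <= dlon]
    # Step 2: the costly formula runs only on the survivors.
    return [p for p in box if haversine_km(lat, lon, p[1], p[2]) <= radius_km]
```

Points in the corners of the square pass the first step but fail the second, which is exactly the trimming to a precise circle that the tip describes.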

MySQL performance tip No. 5: Know the two scalability death traps

Scalability is not as vague as you may believe. In fact, there are precise mathematical definitions of scalability that are expressed as equations. These equations highlight why systems don't scale as well as they should.

Take the Universal Scalability Law, a definition that is handy in expressing and quantifying a system's scalability characteristics. It explains scaling problems in terms of two fundamental costs: serialization and crosstalk.

Parallel processes that must halt for something serialized to take place are inherently limited in their scalability. Likewise, if the parallel processes need to chat with each other all the time to coordinate their work, they limit each other.

Avoid serialization and crosstalk, and your application will scale much better. What does this translate into inside of MySQL? It varies, but some examples would be avoiding exclusive locks on rows. Queues, point No. 3 above, tend to scale poorly for this reason.
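
The Universal Scalability Law is commonly written as C(N) = N / (1 + σ(N − 1) + κN(N − 1)), where N is the number of parallel workers, σ is the serialization (contention) cost, and κ is the crosstalk (coherency) cost. A small sketch makes the two death traps visible; the coefficient values used in the note below are made up for illustration, not measurements of any MySQL server.

```python
def usl_capacity(n, sigma, kappa):
    """Relative capacity at concurrency n under the Universal Scalability Law.

    sigma: serialization (contention) coefficient
    kappa: crosstalk (coherency) coefficient
    """
    return n / (1 + sigma * (n - 1) + kappa * n * (n - 1))
```

With `sigma=0.05` and `kappa=0`, capacity can never exceed 1/sigma = 20 no matter how many workers are added. Adding a small crosstalk term such as `kappa=0.001` is worse: capacity peaks at around 31 workers and then declines, so more concurrency actively hurts.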

MySQL performance tip No. 6: Don't focus too much on configuration

DBAs tend to spend a huge amount of time tweaking configurations. The result is usually not a big improvement and can sometimes even be very damaging. I've seen a lot of "optimized" servers that crashed constantly, ran out of memory, and performed poorly when the workload got a little more intense.

The defaults that ship with MySQL are one-size-fits-none and badly outdated, but you don't need to configure everything. It's better to get the fundamentals right and change other settings only if needed. In most cases, you can get 95 percent of the server's peak performance by setting about 10 options correctly. The few situations where this doesn't apply are going to be edge cases unique to your circumstances.

In most cases, server "tuning" tools aren't recommended because they tend to give guidelines that don't make sense for specific cases. Some even have dangerous, inaccurate advice coded into them -- such as cache hit ratios and memory consumption formulas. These were never right, and they've gotten even less correct as time has passed.

MySQL performance tip No. 7: Watch out for pagination queries

Applications that paginate tend to bring the server to its knees. In showing you a page of results, with a link to go to the next page, these applications typically group and sort in ways that can't use indexes, and they employ a LIMIT and offset that causes the server to do a lot of work generating, then discarding rows.

Optimizations can often be found in the user interface itself. Instead of showing the exact number of pages in the results and links to each page individually, you can just show a link to the next page. You can also prevent people from going to pages too far from the first page.

On the query side, instead of using LIMIT with offset, you can select one more row than you need, and when the user clicks the "next page" link, you can designate that final row as the starting point for the next set of results. For example, if the user viewed a page with rows 101 through 120, you would select row 121 as well; to render the next page, you'd query the server for rows greater than or equal to 121, limit 21.
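
The "fetch one extra row" technique can be tried end to end with a small script. To stay self-contained the sketch uses Python's built-in SQLite rather than MySQL, and the `items` table, its columns, and the function names are assumptions; the `WHERE id >= ? ORDER BY id LIMIT ?` pattern itself carries over to MySQL unchanged apart from the placeholder syntax of the client library.

```python
import sqlite3

PAGE_SIZE = 20

def make_demo_db(row_count=130):
    """Create an in-memory table with ids 1..row_count (demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany(
        "INSERT INTO items (id, title) VALUES (?, ?)",
        [(i, "item %d" % i) for i in range(1, row_count + 1)],
    )
    return conn

def fetch_page(conn, start_id=0):
    """Return (rows for this page, start id for the next page or None)."""
    rows = conn.execute(
        "SELECT id, title FROM items WHERE id >= ? ORDER BY id LIMIT ?",
        (start_id, PAGE_SIZE + 1),  # one extra row marks where the next page starts
    ).fetchall()
    next_start = rows[PAGE_SIZE][0] if len(rows) > PAGE_SIZE else None
    return rows[:PAGE_SIZE], next_start
```

Starting from id 101, the page holds rows 101 through 120 and the extra row yields 121 as the next starting point; when no extra row comes back, there is no next page and the "next" link can be hidden. The server walks the primary-key index from the starting id instead of generating and discarding all the rows before the offset.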

MySQL performance tip No. 8: Save statistics eagerly, alert reluctantly

Monitoring and alerting are essential, but what happens to the typical monitoring system? It starts sending false positives, and system administrators set up email filtering rules to stop the noise. Soon your monitoring system is completely useless.

I like to think about monitoring in two ways: capturing metrics and alerting. It's very important to capture and save all the metrics you possibly can because you'll be glad to have them when you're trying to figure out what changed in the system. Someday, a strange problem will crop up, and you'll love the ability to point to a graph and show a change in the server's workload.

By contrast, there's a tendency to alert way too much. People often alert on things like the buffer hit ratio or the number of temporary tables created per second. The problem is that there is no good threshold for such a ratio. The right threshold is not only different from server to server, but from hour to hour as your workload changes.

As a result, alert sparingly and only on conditions that indicate a definite, actionable problem. A low buffer hit ratio isn't actionable, nor does it indicate a real issue, but a server that doesn't respond to a connection attempt is an actual problem that needs to be solved.

MySQL performance tip No. 9: Learn the three rules of indexing

Indexing is probably the most misunderstood topic in databases because there are so many ways to get confused about how indexes work and how the server uses them. It takes a lot of effort to really understand what's going on.

Indexes, when properly designed, serve three important purposes in a database server:

If you can design your indexes and queries to exploit these three opportunities, you can make your queries several orders of magnitude faster.

MySQL performance tip No. 10: Leverage the expertise of your peers

Don't try to go it alone. If you're puzzling over a problem and doing what seems logical and sensible to you, that's great. This will work about 19 times out of 20. The other time, you'll go down a rabbit hole that will be very costly and time-consuming, precisely because the solution you're trying seems to make a lot of sense.

Build a network of MySQL-related resources -- and this goes beyond toolsets and troubleshooting guides. There are some extremely knowledgeable people lurking on mailing lists, forums, Q&A websites, and so on. Conferences, trade shows, and local user group events provide valuable opportunities for gaining insights and building relationships with peers who can help you in a pinch.

For those looking for tools to complement these tips, you can check out my MySQL configuration tool, Query Advisor tool, and Percona Monitoring Plugins. The configuration tool can help you generate a baseline my.cnf file for a new server that's superior to the sample files that ship with the server. The Query Advisor analyzes your SQL to help detect potentially bad patterns such as pagination queries (No. 7). Percona Monitoring is a set of monitoring and graphing plugins to help you save statistics eagerly and alert reluctantly (No. 8). All three are freely available.

This story, "10 essential performance tips for MySQL," was originally published at InfoWorld.com.

Tuesday, March 13, 2012

16 Useful Boilerplates to Start Your Project Quickly | Queness

http://www.queness.com/post/10905/16-useful-boilerplates-to-start-your-project-quickly

Introduction

A boilerplate is a set of code that can be reused in many ways with little or no alteration. The boilerplates we are talking about here, however, are usually used as a base, a solid foundation for your projects. As an additional benefit, they are also a good place to learn coding tips and tricks.

Boilerplates are extremely useful because they usually comprise best coding practices and contain heaps of tips and tricks that would otherwise take years to learn. Take HTML Email Boilerplate as an example. Building an eDM isn't easy: it requires going back to HTML 1.0, with no div, span, or higher-level CSS such as float and position. What you need is tables for layout and inline CSS for simple styling. To make matters worse, email clients have strict restrictions and don't behave the same way, as if you were working with several legacy browsers that all render the page differently. The Email Boilerplate contains CSS settings, HTML structure, and tips and tricks that help you avoid rendering inconsistencies.

Without further ado, I have collected 16 boilerplates for different web technologies and platforms: HTML, CSS, jQuery, WordPress, and more. They enforce best practices and receive constant updates, and I'm pretty sure they will be a really good foundation for your projects.

HTML & Miscellaneous

HTML5 Boilerplate: HTML5 Boilerplate is the professional badass's base HTML/CSS/JS template for a fast, robust and future-safe site.
HTML Email Boilerplate: This website and its sample code create a template of sorts, absent of design or layout, that will help you avoid some of the major rendering problems with the most common email clients out there, such as Gmail, Outlook, and Yahoo Mail. This is good stuff; I use it in my work, and it contains a lot of tips and tricks that save you heaps of time you would otherwise spend fixing things yourself.
HTML5 Mobile Boilerplate: Mobile Boilerplate is your trusted template made custom for creating rich and performant mobile web apps. You get cross-browser consistency among A-grade smartphones, and fallback support for legacy Blackberry, Symbian, and IE Mobile.
Twitter Bootstrap: Simple and flexible HTML, CSS, and JavaScript for popular user interface components and interactions. Not really a boilerplate, but it has a lot of reusable components for fast prototyping or development.
Zend Framework Boilerplate: Zend Framework (ZF) Boilerplate is an all-in-one platform for development of enterprise-grade PHP applications based on the Zend Framework.

CSS

GetSkeleton: Skeleton is a small collection of CSS & JS files that can help you rapidly develop sites that look beautiful at any size, be it a 17" laptop screen or an iPhone. Skeleton is built on three core principles: Responsive Grid Down to Mobile, Fast to Start and Style Agnostic.
CSS Media Queries Boilerplate: Quick snippet for CSS Media Query setup.
Boilerplate for Responsive Mobile: YAMP is a small set of tools and best practices that allow web designers to build responsive websites faster.

jQuery

jQuery Boilerplate: This project won't seek to provide a perfect solution to every possible pattern, but will attempt to cover a simple template for beginners and above.
Stefan Gabos jQuery Plugin Boilerplate: A boilerplate for jump-starting jQuery plugin development.
Essential jQuery Plugin Patterns: A set of JavaScript patterns for jQuery plugin development. While well-known JavaScript patterns are useful, another side of development could benefit from its own set of design patterns: jQuery plugins. The official jQuery plugin authoring guide offers a great starting point for getting into writing plugins and widgets, but let's take it further.

WordPress

WordPress Widget Boilerplate: An organized, maintainable boilerplate for building WordPress widgets.
Root Theme: Roots is a starting WordPress theme based on HTML5 Boilerplate & Bootstrap from Twitter.
Bones: Bones is a boilerplate for WordPress theme development. It offers classic (fixed grid) and responsive layouts to choose from.
Starkers Theme: Starkers is a bare-bones WordPress theme created to act as a starting point for the theme designer.
TwentyTen Five HTML5 Base Theme: Bringing HTML5 to WordPress, you can use this TwentyTen Five WordPress template to build your own HTML themes.

About the Author

Kevin Liew is a web designer and developer who is keen on contributing to the web development industry. He loves frontend development and is absolutely amazed by jQuery. Feel free to say hi to him, or follow @quenesswebblog on Twitter.
