SQLite Explan: Why the Most Popular Accounts Query is Slow compa
- 时间:2020-09-08 11:08:55
- 分类:网络文摘
- 阅读:75 次
Similarly to Most Blocks Accounts on Steem Blockchain I was going to have a page that lists the most popular accounts (most followed).
However, after analysis, I decide to pause it because the query to get the most popular accounts is very slow – it takes more than 20 minutes to retrieve results.
On the other hand, the query to get the most blocked accounts is fast – less than a minute – which makes it feasible to do so.
When a query takes ages, the SQLite suffers a known database lock problem. SQLite is not very good at concurrency – when a thread is reading all rows, all other write operations (such as insertion) will be locked:
database is locked
Traceback (most recent call last):
File “test.py”, line 135, in
con.commit()
sqlite3.OperationalError: database is locked
Almost Same query – but with different parameters. For ‘most blocked’ the filtering is where what = ‘ignore’ and for ‘most popular’ the filtering is where what = ‘blog’
This is the table schema and indices:
sqlite> .schema CREATE TABLE mute ( who text, whom text, what text, time text, block integer, constraint pkey primary key (who, whom) ); CREATE INDEX index_from on mute(who); CREATE INDEX index_to on mute(whom); CREATE INDEX index_time on mute(time); CREATE INDEX index_what on mute(what); CREATE INDEX index_block on mute(block);
And if we run the query to see the number of the candidate rows:
sqlite> select count(rowid) from mute where what='ignore'; 332096 sqlite> select count(rowid) from mute where what='blog'; 100757608
It is a huge difference! Yes, for both queries, SQLite will utilize the index `what` and process the result set by grouping – the more rows – the more time it requires!
This can be confirmed by the `explain` statement which takes you how the SQLite will process your query.
sqlite> explain select whom,count(1) from mute where what='blog' group by whom order by count(1) desc limit 100; addr opcode p1 p2 p3 p4 p5 comment ---- ------------- ---- ---- ---- ------------- -- ------------- 0 Init 0 57 0 00 Start at 57 1 OpenEphemeral 1 4 0 k(1,-B) 00 nColumn=4 2 Integer 100 1 0 00 r[1]=100; LIMIT counter 3 SorterOpen 2 1 0 k(1,B) 00 4 Integer 0 5 0 00 r[5]=0; clear abort flag 5 Integer 0 4 0 00 r[4]=0; indicate accumulator empty 6 Null 0 8 8 00 r[8..8]=NULL 7 Gosub 7 49 0 00 8 OpenRead 0 2 0 3 00 root=2 iDb=0; mute 9 OpenRead 3 7 0 k(2,,) 02 root=7 iDb=0; index_what 10 String8 0 10 0 blog 00 r[10]='blog' 11 SeekGE 3 18 10 1 00 key=r[10] 12 IdxGT 3 18 10 1 00 key=r[10] 13 DeferredSeek 3 0 0 00 Move 0 to 3.rowid if needed 14 Column 0 1 11 00 r[11]=mute.whom 15 MakeRecord 11 1 12 00 r[12]=mkrec(r[11]) 16 SorterInsert 2 12 0 00 key=r[12] 17 Next 3 12 1 00 18 OpenPseudo 4 11 1 00 1 columns in r[11] 19 SorterSort 2 51 0 00 GROUP BY sort 20 SorterData 2 11 4 00 r[11]=data 21 Column 4 0 9 00 r[9]= 22 Compare 8 9 1 k(1,B) 00 r[8] <-> r[9] 23 Jump 24 28 24 00 24 Move 9 8 1 00 r[8]=r[9] 25 Gosub 6 37 0 00 output one row 26 IfPos 5 51 0 00 if r[5]>0 then r[5]-=0, goto 51; check abort flag 27 Gosub 7 49 0 00 reset accumulator 28 Integer 1 12 0 00 r[12]=1 29 AggStep0 0 12 3 count(1) 01 accum=r[3] step(r[12]) 30 Column 4 0 2 00 r[2]= 31 Integer 1 4 0 00 r[4]=1; indicate data in accumulator 32 SorterNext 2 20 0 00 33 Gosub 6 37 0 00 output final row 34 Goto 0 51 0 00 35 Integer 1 5 0 00 r[5]=1; set abort flag 36 Return 6 0 0 00 37 IfPos 4 39 0 00 if r[4]>0 then r[4]-=0, goto 39; Groupby result generator entry point 38 Return 6 0 0 00 39 AggFinal 3 1 0 count(1) 00 accum=r[3] N=1 40 Copy 2 15 0 00 r[15]=r[2] 41 Copy 3 13 0 00 r[13]=r[3] 42 Sequence 1 14 0 00 r[14]=cursor[1].ctr++ 43 MakeRecord 13 3 17 00 r[17]=mkrec(r[13..15]) 44 IdxInsert 1 17 13 3 00 key=r[17] 45 IfNotZero 1 48 0 00 if r[1]!=0 then r[1]--, goto 48 46 Last 1 0 0 00 47 Delete 1 0 0 00 48 Return 6 0 0 00 end groupby result generator 49 Null 0 2 3 00 r[2..3]=NULL 50 Return 7 0 0 00 51 Sort 1 56 0 00 52 Column 1 2 15 00 r[15]=whom 53 Column 1 0 16 00 r[16]=count(1) 54 ResultRow 15 2 0 00 output=r[15..16] 55 Next 1 52 0 00 56 Halt 0 0 0 00 57 Transaction 0 0 20 0 01 usesStmtJournal=0 58 Goto 0 1 0 00
SQLite is a relational database – which makes it inefficient in handling a large dataset. The steem blockchain has been running for years – the data accumulates quickly.
–EOF (The Ultimate Computing & Technology Blog) —
推荐阅读:常吃些鲜枣可抑制癌细胞提高免疫力 保健与养生:大枣的食疗功效和作用 红枣食疗三推荐:健脾安神降压降脂 男性养生:男人吃葡萄干有特别好处 食品营养之葡萄干具有的营养价值 女性养生:女人吃葡萄干有哪些好处? 对转基因食品必须采取最严格保护措施 《食品安全法》修订四大焦点引发关注 秋冬季节风寒感冒 葱白有治疗感冒功效 秋冬时节抗病毒防感冒的食疗养生方
- 评论列表
-
- 添加评论