SQLite Explan: Why the Most Popular Accounts Query is Slow compa

  • 时间:2020-09-08 11:08:55
  • 分类:网络文摘
  • 阅读:79 次

Similarly to Most Blocks Accounts on Steem Blockchain I was going to have a page that lists the most popular accounts (most followed).

However, after analysis, I decide to pause it because the query to get the most popular accounts is very slow – it takes more than 20 minutes to retrieve results.

On the other hand, the query to get the most blocked accounts is fast – less than a minute – which makes it feasible to do so.

When a query takes ages, the SQLite suffers a known database lock problem. SQLite is not very good at concurrency – when a thread is reading all rows, all other write operations (such as insertion) will be locked:

database is locked
Traceback (most recent call last):
File “test.py”, line 135, in
con.commit()
sqlite3.OperationalError: database is locked

Almost Same query – but with different parameters. For ‘most blocked’ the filtering is where what = ‘ignore’ and for ‘most popular’ the filtering is where what = ‘blog’

This is the table schema and indices:

sqlite> .schema
CREATE TABLE mute (
who text,
whom text,
what text,
time text,
block integer,
constraint pkey primary key (who, whom)
);
CREATE INDEX index_from on mute(who);
CREATE INDEX index_to on mute(whom);
CREATE INDEX index_time on mute(time);
CREATE INDEX index_what on mute(what);
CREATE INDEX index_block on mute(block);

And if we run the query to see the number of the candidate rows:

sqlite> select count(rowid) from mute where what='ignore';
332096
sqlite> select count(rowid) from mute where what='blog';
100757608

It is a huge difference! Yes, for both queries, SQLite will utilize the index `what` and process the result set by grouping – the more rows – the more time it requires!

This can be confirmed by the `explain` statement which takes you how the SQLite will process your query.

sqlite> explain select whom,count(1) from mute where what='blog' group by whom order by count(1) desc limit 100;
addr  opcode         p1    p2    p3    p4             p5  comment      
----  -------------  ----  ----  ----  -------------  --  -------------
0     Init           0     57    0                    00  Start at 57  
1     OpenEphemeral  1     4     0     k(1,-B)        00  nColumn=4    
2     Integer        100   1     0                    00  r[1]=100; LIMIT counter
3     SorterOpen     2     1     0     k(1,B)         00               
4     Integer        0     5     0                    00  r[5]=0; clear abort flag
5     Integer        0     4     0                    00  r[4]=0; indicate accumulator empty
6     Null           0     8     8                    00  r[8..8]=NULL 
7     Gosub          7     49    0                    00               
8     OpenRead       0     2     0     3              00  root=2 iDb=0; mute
9     OpenRead       3     7     0     k(2,,)         02  root=7 iDb=0; index_what
10    String8        0     10    0     blog        00  r[10]='blog'
11    SeekGE         3     18    10    1              00  key=r[10]    
12      IdxGT          3     18    10    1              00  key=r[10]    
13      DeferredSeek   3     0     0                    00  Move 0 to 3.rowid if needed
14      Column         0     1     11                   00  r[11]=mute.whom
15      MakeRecord     11    1     12                   00  r[12]=mkrec(r[11])
16      SorterInsert   2     12    0                    00  key=r[12]    
17    Next           3     12    1                    00               
18    OpenPseudo     4     11    1                    00  1 columns in r[11]
19    SorterSort     2     51    0                    00  GROUP BY sort
20      SorterData     2     11    4                    00  r[11]=data   
21      Column         4     0     9                    00  r[9]=        
22      Compare        8     9     1     k(1,B)         00  r[8] <-> r[9]
23      Jump           24    28    24                   00               
24      Move           9     8     1                    00  r[8]=r[9]    
25      Gosub          6     37    0                    00  output one row
26      IfPos          5     51    0                    00  if r[5]>0 then r[5]-=0, goto 51; check abort flag
27      Gosub          7     49    0                    00  reset accumulator
28      Integer        1     12    0                    00  r[12]=1      
29      AggStep0       0     12    3     count(1)       01  accum=r[3] step(r[12])
30      Column         4     0     2                    00  r[2]=        
31      Integer        1     4     0                    00  r[4]=1; indicate data in accumulator
32    SorterNext     2     20    0                    00               
33    Gosub          6     37    0                    00  output final row
34    Goto           0     51    0                    00               
35    Integer        1     5     0                    00  r[5]=1; set abort flag
36    Return         6     0     0                    00               
37    IfPos          4     39    0                    00  if r[4]>0 then r[4]-=0, goto 39; Groupby result generator entry point
38    Return         6     0     0                    00               
39    AggFinal       3     1     0     count(1)       00  accum=r[3] N=1
40    Copy           2     15    0                    00  r[15]=r[2]   
41    Copy           3     13    0                    00  r[13]=r[3]   
42    Sequence       1     14    0                    00  r[14]=cursor[1].ctr++
43    MakeRecord     13    3     17                   00  r[17]=mkrec(r[13..15])
44    IdxInsert      1     17    13    3              00  key=r[17]    
45    IfNotZero      1     48    0                    00  if r[1]!=0 then r[1]--, goto 48
46    Last           1     0     0                    00               
47    Delete         1     0     0                    00               
48    Return         6     0     0                    00  end groupby result generator
49    Null           0     2     3                    00  r[2..3]=NULL 
50    Return         7     0     0                    00               
51    Sort           1     56    0                    00               
52      Column         1     2     15                   00  r[15]=whom   
53      Column         1     0     16                   00  r[16]=count(1)
54      ResultRow      15    2     0                    00  output=r[15..16]
55    Next           1     52    0                    00               
56    Halt           0     0     0                    00               
57    Transaction    0     0     20    0              01  usesStmtJournal=0
58    Goto           0     1     0                    00      

SQLite is a relational database – which makes it inefficient in handling a large dataset. The steem blockchain has been running for years – the data accumulates quickly.

–EOF (The Ultimate Computing & Technology Blog) —

推荐阅读:
Counting Substrings with Only One Distinct Letter with Different  Ukranian Authorities Prevent Security Blogger From Getting Kidna  5 Instagram Tools to Help You Get More Blog Visitors  6 SEO Analysis Tools Every SEO Needs to Know For 2017  Top Tools to Create and Sell Content  Mauritania Clerics Call For ‘Apostasy’ Blogger To Be Killed  Automating WordPress Website Marketing With GetResponse  Travel Blogger Teaches People ‘How To Move To New Zealand’ After  50 Content Marketing Ideas to Go From Rookie to Super Hero [Info  What to Look For When Choosing a Web Host for Your Blog 
评论列表
添加评论