Debugging Memcached Server

In this post, we will discuss different methods and options for monitoring and debugging a memcached server. Memcached is one of the most popular open source, in-memory data stores.

Once a memcached server is started, it begins collecting a number of data points. We will discuss the most important ones to keep an eye on.

Connecting to memcached server

To connect to memcached, we can use telnet or nc (netcat).

In the example below, we use telnet. If memcached is running on a remote server, use that server's IP address or hostname instead of localhost.


telnet localhost 11211

Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.

Once we are connected to the memcached server, we can run a few commands to get more data.

The stats command gives statistics for the memcached server since it was last restarted.

stats

STAT pid 25576
STAT uptime 4396
STAT time 1611041032
STAT version 1.4.4
STAT pointer_size 64
STAT rusage_user 12.674073
STAT rusage_system 23.105487
STAT curr_connections 15
STAT total_connections 43821
STAT connection_structures 40
STAT cmd_get 1103178
STAT cmd_set 225769
STAT cmd_flush 0
STAT get_hits 982501
STAT get_misses 120677
STAT delete_misses 35
STAT delete_hits 153
STAT incr_misses 0
STAT incr_hits 0
STAT decr_misses 0
STAT decr_hits 0
STAT cas_misses 0
STAT cas_hits 0
STAT cas_badval 0
STAT auth_cmds 0
STAT auth_errors 0
STAT bytes_read 503877522
STAT bytes_written 1595543410
STAT limit_maxbytes 5368709120
STAT accepting_conns 1
STAT listen_disabled_num 0
STAT threads 4
STAT conn_yields 31
STAT bytes 361115217
STAT curr_items 121807
STAT total_items 225769
STAT evictions 0
END
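The same telnet session can be scripted. The following is a minimal sketch, assuming the server speaks the plain text protocol shown above; fetch_stats and parse_stats are illustrative names and not part of any memcached client library.

```python
import socket


def parse_stats(raw: str) -> dict:
    """Turn 'STAT name value' lines into a {name: value} dict of strings."""
    stats = {}
    for line in raw.splitlines():
        parts = line.strip().split(" ", 2)
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def fetch_stats(host: str = "localhost", port: int = 11211,
                command: str = "stats") -> dict:
    """Send a stats command and read until the terminating END line."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(command.encode("ascii") + b"\r\n")
        buf = b""
        # The reply ends with END; an unknown command is answered with ERROR.
        while not buf.endswith((b"END\r\n", b"ERROR\r\n")):
            chunk = conn.recv(4096)
            if not chunk:  # server closed the connection
                break
            buf += chunk
    return parse_stats(buf.decode("ascii", errors="replace"))
```

With a server running locally, fetch_stats()["curr_connections"] would return the current connection count as a string. Values are left as strings because some fields, such as version, are not numeric.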


We can also check the status using memcached-tool (the official Perl script installed by default with memcached):

memcached-tool localhost:11211 stats

#localhost:11211   Field       Value
         accepting_conns           1
               auth_cmds           0
             auth_errors           0
                   bytes   392450753
              bytes_read   548574492
           bytes_written  1746355242
              cas_badval           0
                cas_hits           0
              cas_misses           0
               cmd_flush           0
                 cmd_get     1196890
                 cmd_set      245214
             conn_yields          32
   connection_structures          40
        curr_connections          18
              curr_items      129905
               decr_hits           0
             decr_misses           0
             delete_hits         172
           delete_misses          35
               evictions           0
                get_hits     1068191
              get_misses      128699
               incr_hits           0
             incr_misses           0
          limit_maxbytes  5368709120
     listen_disabled_num           0
                     pid       25576
            pointer_size          64
           rusage_system   25.115181
             rusage_user   13.809900
                 threads           4
                    time  1611041450
       total_connections       47412
             total_items      245214
                  uptime        4814
                 version       1.4.4

Important statistics to focus on

curr_connections

This is the number of clients currently connected to the server. If it gets close to the maximum connections setting (the -c option, 1024 by default), you need to increase the connection limit.
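As a rough illustration, the check can be expressed as a percentage of the limit. This is a sketch only: the function names and the 80% warning threshold are assumptions made for the example, and max_connections should be set to whatever value was passed to -c.

```python
def connection_usage(curr_connections: int, max_connections: int = 1024) -> float:
    """Return current connections as a percentage of the configured limit."""
    if max_connections <= 0:
        raise ValueError("max_connections must be positive")
    return curr_connections * 100 / max_connections


def near_connection_limit(curr_connections: int, max_connections: int = 1024,
                          threshold_pct: float = 80.0) -> bool:
    """True when usage reaches the (assumed) warning threshold."""
    return connection_usage(curr_connections, max_connections) >= threshold_pct


# curr_connections from the stats output above, with the default limit
print(near_connection_limit(15))
```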

evictions

This is the number of valid items removed from the cache to free memory for new items. If the eviction count is high, you may want to increase the RAM allocated to the memcached server, but consider other factors first. For example, evictions may be high because the application is setting a large number of keys that it never reads back.

conn_yields

Connection yields count the number of times the memcached server has throttled a client to give other connections a turn. This is often caused by large multi-get requests. To lower connection yields, we can either increase -R (the maximum number of requests processed per event) or send smaller batches of multi-gets.

cache hit ratio / hitrate


cache hit ratio = get_hits * 100 / (get_hits + get_misses) 

A high cache hit ratio indicates that our application is benefiting from memcached.
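Applied to the get_hits and get_misses values from the stats output earlier in this post, the formula works out as follows. The function name is illustrative, and returning 0.0 for a server that has not yet received any get is a choice made for this sketch.

```python
def cache_hit_ratio(get_hits: int, get_misses: int) -> float:
    """Hit ratio as a percentage; 0.0 when no get has been served yet."""
    total = get_hits + get_misses
    if total == 0:
        return 0.0
    return get_hits * 100 / total


ratio = cache_hit_ratio(982501, 120677)
print(f"{ratio:.2f}%")  # prints 89.06%
```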

listen_disabled_num

This shows the number of times memcached has hit its connection limit. This number should stay at or close to zero.

limit_maxbytes

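This is the memory limit in bytes. It should match the value passed to -m at memcached startup, which is given in megabytes. In the output above, 5368709120 bytes corresponds to -m 5120.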
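Because limit_maxbytes is reported in bytes, a small conversion makes it easier to compare against the startup option. The sketch below also compares the bytes statistic (bytes currently used to store items) against the limit; both helper names are made up for this example.

```python
def limit_in_megabytes(limit_maxbytes: int) -> float:
    """Convert limit_maxbytes to megabytes for comparison with -m."""
    return limit_maxbytes / (1024 * 1024)


def memory_used_pct(bytes_used: int, limit_maxbytes: int) -> float:
    """Share of the memory limit currently used to store items."""
    return bytes_used * 100 / limit_maxbytes


# Values taken from the telnet stats output above
print(limit_in_megabytes(5368709120))
print(round(memory_used_pct(361115217, 5368709120), 2))
```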

Slabs

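Memcached stores data in pre-allocated memory. Memory is allocated in pages (1 MB by default), each page is assigned to a slab class, and the page is divided into fixed-size chunks. Every slab class has its own chunk size, and an item is stored in the slab class with the smallest chunk that fits it.

We can use the command below to get per-slab-class stats: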


stats slabs
STAT 1:chunk_size 96
STAT 1:chunks_per_page 10922
STAT 1:total_pages 1
STAT 1:total_chunks 10922
STAT 1:used_chunks 1
STAT 1:free_chunks 0
STAT 1:free_chunks_end 10921
STAT 1:mem_requested 89
STAT 1:get_hits 1396
STAT 1:cmd_set 1
STAT 1:delete_hits 0
STAT 1:incr_hits 0
STAT 1:decr_hits 0
STAT 1:cas_hits 0
STAT 1:cas_badval 0
STAT 3:chunk_size 152
STAT 3:chunks_per_page 6898
STAT 3:total_pages 16
STAT 3:total_chunks 110368
STAT 3:used_chunks 103571
STAT 3:free_chunks 1
STAT 3:free_chunks_end 6796
STAT 3:mem_requested 13923095
STAT 3:get_hits 1226527
STAT 3:cmd_set 249397
STAT 3:delete_hits 0
STAT 3:incr_hits 0
STAT 3:decr_hits 0
STAT 3:cas_hits 0
STAT 3:cas_badval 0

END
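The per-class lines are easier to work with once grouped by slab class id. The following sketch assumes the fields shown above are present in the output. group_slab_stats and chunk_overhead are hypothetical helper names, and treating chunk_size * used_chunks minus mem_requested as wasted space is an approximation, not an official metric.

```python
from collections import defaultdict


def group_slab_stats(raw: str) -> dict:
    """Group 'STAT <class>:<field> <value>' lines by slab class id."""
    classes = defaultdict(dict)
    for line in raw.splitlines():
        parts = line.split()
        if len(parts) != 3 or parts[0] != "STAT" or ":" not in parts[1]:
            continue  # skips END, blank lines and stats without a class id
        class_id, field = parts[1].split(":", 1)
        classes[int(class_id)][field] = int(parts[2])
    return dict(classes)


def chunk_overhead(slab: dict) -> int:
    """Bytes held by used chunks minus the bytes the items requested."""
    return slab["chunk_size"] * slab["used_chunks"] - slab["mem_requested"]
```

For slab class 3 in the output above, this gives 152 * 103571 - 13923095 = 1819697 bytes held by chunks but not requested by the items stored in them.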

Use the stats items command to get item statistics for each slab class on your server. The number after “items:” is the slab class id.


telnet localhost 11211 
stats items
STAT items:1:number 1
STAT items:1:age 6771
STAT items:1:evicted 0
STAT items:1:evicted_nonzero 0
STAT items:1:evicted_time 0
STAT items:1:outofmemory 0
STAT items:1:tailrepairs 0
STAT items:3:number 97998
STAT items:3:age 31
STAT items:3:evicted 0
STAT items:3:evicted_nonzero 0
STAT items:3:evicted_time 0
STAT items:3:outofmemory 0
STAT items:3:tailrepairs 0
STAT items:4:number 3159
STAT items:4:age 144
STAT items:4:evicted 0
STAT items:4:evicted_nonzero 0
STAT items:4:evicted_time 0
STAT items:4:outofmemory 0
STAT items:4:tailrepairs 0
END

To list the keys stored in a slab class, we use the stats cachedump command. The command below retrieves at most fifty keys from slab class 3.

stats cachedump 3 50
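The reply to this command is a list of ITEM lines followed by END, each in the form ITEM <key> [<size> b; <expiry> s]. A minimal sketch for parsing them is shown below; parse_cachedump is an illustrative name, and the key and values in the usage lines are made up for the example rather than taken from a real server.

```python
import re

# One reply line looks like: ITEM <key> [<size> b; <expiry> s]
ITEM_RE = re.compile(r"^ITEM (\S+) \[(\d+) b; (-?\d+) s\]$")


def parse_cachedump(raw: str) -> list:
    """Return (key, size_in_bytes, expiry) tuples from cachedump output."""
    items = []
    for line in raw.splitlines():
        match = ITEM_RE.match(line.strip())
        if match:
            items.append((match.group(1), int(match.group(2)), int(match.group(3))))
    return items


# Hypothetical sample reply, for illustration only
sample = "ITEM user:42 [120 b; 1611044632 s]\r\nEND\r\n"
print(parse_cachedump(sample))
```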