RIAK compact e/leveldb tombstones and reclaim disk space
The Problem
When attempting to reclaim disk space, deleting data may seem like the obvious first step. However, in Riak this is not necessarily the best thing to do if the disk is nearly full.
This is because deleting objects in Riak is complicated. As stated in the object-deletion section of the latest Riak documentation:
In single-server, non-clustered data storage systems, object deletion is a trivial process. In an eventually consistent, clustered system like Riak, however, object deletion is far less trivial because objects live on multiple nodes, which means that a deletion process must be chosen to determine when an object can be removed from the storage backend.
How deletion works
- Riak writes a “tombstone” value for the key to the N vnodes that contain it (this is a new record)
- Riak, by default, waits 3 seconds to verify all vnodes agree to the tombstone/delete
- Riak issues an actual delete operation against the key to leveldb
- leveldb creates its own tombstone
- the leveldb tombstone “floats” through level-0 and level-1 as part of normal compactions
- upon reaching level-2, leveldb will initiate immediate compaction and propagation of tombstones in .sst table files containing 1000 or more tombstones.
The consequence is that disk space is freed very slowly, if it is freed at all!
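To make that consequence concrete, here is a deliberately simplified C++ model of an append-only store. It is not how leveldb is implemented (there are no levels, no .sst files, and no 1000-tombstone threshold), and the class ToyLogStore and its methods are made up for illustration. It only shows that a delete first adds a record, and that space comes back only once a compaction rewrites the data.

```cpp
#include <cstddef>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Toy append-only store. This is NOT leveldb's implementation; it only
// illustrates why a delete initially costs space instead of freeing it.
class ToyLogStore {
public:
    // A write appends a new record; older versions of the key stay stored.
    void Put(const std::string& key, const std::string& value) {
        log_.push_back({key, value});
    }

    // A delete appends a tombstone record; nothing is removed yet.
    void Delete(const std::string& key) {
        log_.push_back({key, std::nullopt});
    }

    // Number of records currently occupying storage.
    std::size_t RecordCount() const { return log_.size(); }

    // Compaction keeps only the newest record per key and drops every key
    // whose newest record is a tombstone.
    void Compact() {
        std::map<std::string, std::optional<std::string>> latest;
        for (const auto& rec : log_) latest[rec.key] = rec.value;
        log_.clear();
        for (const auto& [key, value] : latest) {
            if (value) log_.push_back({key, value});
        }
    }

private:
    struct Record {
        std::string key;
        std::optional<std::string> value;  // nullopt marks a tombstone
    };
    std::vector<Record> log_;
};
```

In this model, putting two keys and then deleting one of them leaves three records stored. Only after Compact() does the count drop to one.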
Solution for e/leveldb
In short, there is a C++ function in leveldb that compacts the underlying storage. The function is called "CompactRange".
In particular, deleted and overwritten versions are discarded, and the data is rearranged to reduce the cost of operations needed to access the data.
This function is not exposed in the Erlang code that uses this C++ library. This means that we needed to build a standalone tool that calls this library function on every leveldb database that Riak stores on disk. The drawback is that your Riak node has to be offline while such a third-party tool is running.
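As a rough idea of what such a standalone tool has to do, here is a minimal C++17 sketch. It is not the code from our tool: the helper CompactEachSubdirectory is hypothetical, and the leveldb calls in the comment assume the stock leveldb C++ API (DB::Open and DB::CompactRange), which we assume the Basho fork shares. The leveldb part is left as a callback so the directory walk compiles without the library.

```cpp
#include <filesystem>
#include <functional>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: calls `compact` once for every direct subdirectory of
// `root` and returns the paths it visited.
//
// With the Riak node stopped, `compact` could open each directory as a
// leveldb database and compact the whole key range, for example:
//
//   leveldb::DB* db = nullptr;
//   leveldb::Options options;
//   if (leveldb::DB::Open(options, dir.string(), &db).ok()) {
//       db->CompactRange(nullptr, nullptr);  // nullptr, nullptr = all keys
//       delete db;
//   }
std::vector<fs::path> CompactEachSubdirectory(
    const fs::path& root,
    const std::function<void(const fs::path&)>& compact) {
    std::vector<fs::path> visited;
    for (const auto& entry : fs::directory_iterator(root)) {
        if (!entry.is_directory()) continue;  // skip stray files
        compact(entry.path());
        visited.push_back(entry.path());
    }
    return visited;
}
```

Passing the compaction step in as a callback keeps the walk easy to try out with a dummy function before pointing it at real data.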
Use case
We built such a tool. You can check it out from GitHub (RiakToolsCxx.git) and build it with CMake. The external dependency leveldb-basho is pulled in automatically by CMake.
The checkout and build process could look like this:
dwalter@knxwork:~/Projects$ git clone https://github.com/hw-dwalter/RiakToolsCxx.git RiakToolsCxx
Cloning into 'RiakToolsCxx'...
remote: Counting objects: 11, done.
remote: Total 11 (delta 0), reused 0 (delta 0), pack-reused 11
Unpacking objects: 100% (11/11), done.
dwalter@knxwork:~/Projects$ cd RiakToolsCxx/
dwalter@knxwork:~/Projects/RiakToolsCxx$ mkdir build
dwalter@knxwork:~/Projects/RiakToolsCxx$ cd build/
dwalter@knxwork:~/Projects/RiakToolsCxx/build$ cmake ..
-- The CXX compiler identification is GNU 6.3.0
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test COMPILER_SUPPORTS_CXX11
-- Performing Test COMPILER_SUPPORTS_CXX11 - Success
-- Performing Test COMPILER_SUPPORTS_CXX0X
-- Performing Test COMPILER_SUPPORTS_CXX0X - Success
-- The C compiler identification is GNU 6.3.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Boost version: 1.62.0
-- Found the following Boost libraries:
--   filesystem
--   system
-- Configuring done
-- Generating done
-- Build files have been written to: /home/dwalter/Projects/RiakToolsCxx/build
dwalter@knxwork:~/Projects/RiakToolsCxx/build$ make
Scanning dependencies of target leveldb-basho
[ 10%] Creating directories for 'leveldb-basho'
[ 20%] Performing download step (git clone) for 'leveldb-basho'
Cloning into 'leveldb-basho'...
Already on 'develop'
Your branch is up-to-date with 'origin/develop'.
[ 30%] No patch step for 'leveldb-basho'
[ 40%] No update step for 'leveldb-basho'
[ 50%] No configure step for 'leveldb-basho'
[ 60%] Performing build step for 'leveldb-basho'
ar: creating libleveldb.a
[ 70%] No install step for 'leveldb-basho'
[ 80%] Completed 'leveldb-basho'
[ 80%] Built target leveldb-basho
Scanning dependencies of target riakcompact
[ 90%] Building CXX object src/CMakeFiles/riakcompact.dir/main.cpp.o
[100%] Linking CXX executable riakcompact
[100%] Built target riakcompact
dwalter@knxwork:~/Projects/RiakToolsCxx/build$ ./src/riakcompact
usage: ./src/riakcompact [path]
Once the tool is built, you can use it like this. Keep in mind that your Riak node has to be stopped!
root@knxwork:/home/dwalter/Projects/RiakToolsCxx/build# ./src/riakcompact /var/lib/riak/leveldb/
"/var/lib/riak/leveldb/91343852333181432387730302044767688728495783936" [directory] compacting... done
"/var/lib/riak/leveldb/685078892498860742907977265335757665463718379520" [directory] compacting... done
Conclusion
After deleting a complete bucket in our Riak (key by key), we were able to reduce the consumed disk space from 75 GB to 25 GB with this tool!
Compaction freed about two thirds (roughly 67%) of the consumed disk space!