Today we will benchmark single-node versions of distributed databases (and some non-distributed databases for comparison), with all clients written in Go (using whatever driver is available). The judgement will be about performance (mostly writes, with some infrequent reads), not about distributed performance (I'll take a look at that some other time). I searched DbEngines for databases that could suit the needs of my next project. For the session kv-store the obvious first choice is Aerospike, but since it cannot run inside the server that I rent (which uses OpenVZ), I'll go with the second choice, Redis. Here's the list of today's contenders:
- CrateDB, highly optimized for huge amounts of data (they said), probably the best for updatable time series, and it has a built-in search engine, so it quite fits my use case, probably replacing [Riot (small scale) or Manticore (large scale)] and [InfluxDB or TimescaleDB]; it does not support auto increment
- CockroachDB, a self-healing database with a PostgreSQL-compatible connector
- MemSQL, which can also replace a kv-store; there's a 128GB limit for the free version, and the client/connector is MySQL-compatible
- MariaDB (MySQL), one of the most popular open source RDBMS, for the sake of comparison
- PostgreSQL, my favorite RDBMS, for the sake of comparison
- NuoDB, which in another benchmark was even faster than Google Spanner or CockroachDB
- TiDB, a work-in-progress take on the CockroachDB approach, but with a MySQL-compatible connector
What's the extra motivation for this post?
I almost never use a distributed database, since none of my projects have more than 200 concurrent users/sec. I've encountered bottlenecks before, and the culprit was multiple slow complex queries; that could be solved by offloading them to a message queue and processing them one by one, instead of bombarding the database with them all at the same time and hogging its memory.
The benchmark scenario would be like this:
1. 50k inserts of a single-column unique string value, 200k inserts of 2-column unique values, and 900k inserts of unique relations:
INSERT INTO users(id, uniq_str) -- x50k
INSERT INTO items(fk_id, typ, amount) -- x50k x4
INSERT INTO rels(fk_low, fk_high, bond) -- x900k
2. While inserting is at 5%+, there will be at least 100k random search queries of the unique value, and 300k random search queries; for every search query there are 3 random updates of the amount:
SELECT * FROM users WHERE uniq_str = ? -- x100k
SELECT * FROM items WHERE fk_id = ? AND typ IN (?) -- x100k x3
UPDATE items SET amount = amount + xxx WHERE id = ? -- x100k x3
3. While inserting is at 5%+, there will also be at least 100k random search queries:
SELECT * FROM items WHERE fk_id = ?
4. While inserting is at 5%+, there will also be at least 200k queries of relations, with a 50% chance to update the bond:
SELECT * FROM rels WHERE fk_low = ? OR fk_high = ? -- x200k
UPDATE rels SET bond = bond + xxx WHERE id = ? -- x200k / 2
This benchmark represents a simplified real use case of the game I'm currently developing.
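The scenario above implies roughly the following tables. Below is a minimal sketch in Go that creates them, assuming the lib/pq driver and the local b1 database used later; the actual column types and indexes in lib.go may differ:

package main

import (
	"database/sql"
	"log"

	_ "github.com/lib/pq" // swap for a MySQL driver when targeting the MySQL-compatible contenders
)

// assumed schemas, derived from the scenario queries above
var ddl = []string{
	`CREATE TABLE users (id BIGINT PRIMARY KEY, uniq_str VARCHAR(64) UNIQUE)`,
	`CREATE TABLE items (id BIGSERIAL PRIMARY KEY, fk_id BIGINT, typ INT, amount BIGINT)`,
	`CREATE TABLE rels (id BIGSERIAL PRIMARY KEY, fk_low BIGINT, fk_high BIGINT, bond BIGINT)`,
}

func main() {
	db, err := sql.Open("postgres", "host=127.0.0.1 user=b1 dbname=b1 sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	for _, q := range ddl {
		if _, err := db.Exec(q); err != nil {
			log.Fatal(err)
		}
	}
}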
Let's start with PostgreSQL 10.7 (the current one on Ubuntu 18.04.1 LTS), with the configuration generated by the pgtune website:
max_connections = 400
shared_buffers = 8GB
effective_cache_size = 24GB
maintenance_work_mem = 2GB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 100
random_page_cost = 1.1
effective_io_concurrency = 200
work_mem = 5242kB
min_wal_size = 2GB
max_wal_size = 4GB
max_worker_processes = 8
max_parallel_workers_per_gather = 4
max_parallel_workers = 8
Create the user and database first:
sudo su - postgres
createuser b1
createdb b1
psql
GRANT ALL PRIVILEGES ON DATABASE b1 TO b1;
\q
Add to
pg_hba.conf if required, then restart:
local all b1 trust
host all b1 127.0.0.1/32 trust
host all b1 ::1/128 trust
For slow databases, all operation counts are reduced by a factor of 20 (the SLOW FACTOR in the outputs below), except for the query-only task.
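To make the SLOW FACTOR in the outputs below concrete, here is how I read the op counts relative to the scenario targets (illustrative arithmetic, not actual lib.go code):

package main

import "fmt"

// with SLOW FACTOR = 20, the per-task op counts in the runs below become:
const slowFactor = 20

var (
	insertUsersItems   = 50000 / slowFactor  // 2500 ops
	updateItemsAmounts = 100000 / slowFactor // 5000 ops
	searchRelsAddBonds = 200000 / slowFactor // 10000 ops
	randomSearchItems  = 100000              // query-only, not reduced
)

func main() {
	fmt.Println(insertUsersItems, updateItemsAmounts, searchRelsAddBonds, randomSearchItems)
}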
$ go run pg.go lib.go
[Pg] RandomSearchItems (100000, 100%) took 24.62s (246.21 µs/op)
[Pg] SearchRelsAddBonds (10000, 100%) took 63.73s (6372.56 µs/op)
[Pg] UpdateItemsAmounts (5000, 100%) took 105.10s (21019.88 µs/op)
[Pg] InsertUsersItems (2500, 100%) took 129.41s (51764.04 µs/op)
USERS CR : 2500 / 4999
ITEMS CRU : 17500 / 14997 + 698341 / 14997
RELS CRU : 2375 / 16107 / 8053
SLOW FACTOR : 20
CRU µs/rec : 5783.69 / 35.26 / 7460.65
Next we'll try MySQL 5.7. Create the user and database first, then multiply all memory configs by 10 (is there an automatic config generator for MySQL?):
innodb_buffer_pool_size=4G
$ sudo mysql
CREATE USER 'b1'@'localhost' IDENTIFIED BY 'b1';
CREATE DATABASE b1;
GRANT ALL PRIVILEGES ON b1.* TO 'b1'@'localhost';
FLUSH PRIVILEGES;
sudo mysqltuner # not sure if this is useful
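On the Go side, here's a minimal connection sketch, assuming the common go-sql-driver/mysql driver and the b1 user/database created above (the actual benchmark code may build its DSN differently):

package main

import (
	"database/sql"
	"log"

	_ "github.com/go-sql-driver/mysql"
)

func main() {
	// user b1, password b1, database b1, as created above
	db, err := sql.Open("mysql", "b1:b1@tcp(127.0.0.1:3306)/b1")
	if err != nil {
		log.Fatal(err)
	}
	if err := db.Ping(); err != nil { // sql.Open is lazy; Ping forces a real connection
		log.Fatal(err)
	}
	log.Println("connected to MySQL")
}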
And here's the result:
[My] RandomSearchItems (100000, 100%) took 16.62s (166.20 µs/op)
[My] SearchRelsAddBonds (10000, 100%) took 86.32s (8631.74 µs/op)
[My] UpdateItemsAmounts (5000, 100%) took 172.35s (34470.72 µs/op)
[My] InsertUsersItems (2500, 100%) took 228.52s (91408.86 µs/op)
USERS CR : 2500 / 4994
ITEMS CRU : 17500 / 14982 + 696542 / 13485
RELS CRU : 2375 / 12871 / 6435
SLOW FACTOR : 20
CRU µs/rec : 10213.28 / 23.86 / 13097.44
Next we'll try MemSQL 6.7.16-55671ba478; while its insert and update performance is amazing, the query/read performance is 3-4x slower than a traditional RDBMS:
$ go run memsql.go lib.go # 4 sec before start RU
[Mem] InsertUsersItems (2500, 100%) took 4.80s (1921.97 µs/op)
[Mem] UpdateItemsAmounts (5000, 100%) took 13.48s (2695.83 µs/op)
[Mem] SearchRelsAddBonds (10000, 100%) took 14.40s (1440.29 µs/op)
[Mem] RandomSearchItems (100000, 100%) took 64.87s (648.73 µs/op)
USERS CR : 2500 / 4997
ITEMS CRU : 17500 / 14991 + 699783 / 13504
RELS CRU : 2375 / 19030 / 9515
SLOW FACTOR : 20
CRU µs/rec : 214.75 / 92.70 / 1255.93
$ go run memsql.go lib.go # 2 sec before start RU
[Mem] InsertUsersItems (2500, 100%) took 5.90s (2360.01 µs/op)
[Mem] UpdateItemsAmounts (5000, 100%) took 13.76s (2751.67 µs/op)
[Mem] SearchRelsAddBonds (10000, 100%) took 14.56s (1455.95 µs/op)
[Mem] RandomSearchItems (100000, 100%) took 65.30s (653.05 µs/op)
USERS CR : 2500 / 4998
ITEMS CRU : 17500 / 14994 + 699776 / 13517
RELS CRU : 2375 / 18824 / 9412
SLOW FACTOR : 20
CRU µs/rec : 263.69 / 93.32 / 1282.38
$ go run memsql.go lib.go # SLOW FACTOR 5
[Mem] InsertUsersItems (10000, 100%) took 31.22s (3121.90 µs/op)
[Mem] UpdateItemsAmounts (20000, 100%) took 66.55s (3327.43 µs/op)
[Mem] RandomSearchItems (100000, 100%) took 85.13s (851.33 µs/op)
[Mem] SearchRelsAddBonds (40000, 100%) took 133.05s (3326.29 µs/op)
USERS CR : 10000 / 19998
ITEMS CRU : 70000 / 59994 + 699944 / 53946
RELS CRU : 37896 / 300783 / 150391
SLOW FACTOR : 5
CRU µs/rec : 264.80 / 121.63 / 1059.16
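Since MemSQL speaks the MySQL wire protocol, the same driver as in the MySQL sketch above can be pointed at it; only the DSN changes (assuming the default port 3306 and illustrative credentials):

// MemSQL listens on the MySQL protocol (port 3306 by default),
// so only the address/credentials in the DSN differ
db, err := sql.Open("mysql", "root:@tcp(127.0.0.1:3306)/b1")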
Next we'll try CrateDB 3.2.7, with a setup similar to PostgreSQL's; the result:
$ go run crate.go lib.go
[Crate] SearchRelsAddBonds (10000, 100%) took 49.11s (4911.38 µs/op)
[Crate] RandomSearchItems (100000, 100%) took 101.40s (1013.95 µs/op)
[Crate] UpdateItemsAmounts (5000, 100%) took 246.42s (49283.84 µs/op)
[Crate] InsertUsersItems (2500, 100%) took 306.12s (122449.00 µs/op)
USERS CR : 2500 / 4965
ITEMS CRU : 17500 / 14894 + 690161 / 14895
RELS CRU : 2375 / 4336 / 2168
SLOW FACTOR : 20
CRU µs/rec : 13681.45 / 146.92 / 19598.85
Next is CockroachDB 2.1.3, the result:
$ go run cockroach.go lib.go
[Cockroach] SearchRelsAddBonds (10000, 100%) took 61.87s (6187.45 µs/op)
[Cockroach] RandomSearchItems (100000, 100%) took 93.12s (931.22 µs/op)
[Cockroach] UpdateItemsAmounts (5000, 100%) took 278.10s (55620.39 µs/op)
[Cockroach] InsertUsersItems (2500, 100%) took 371.76s (148704.47 µs/op)
USERS CR : 2500 / 4993
ITEMS CRU : 17500 / 14979 + 699454 / 14979
RELS CRU : 2375 / 5433 / 2716
SLOW FACTOR : 20
CRU µs/rec : 16615.02 / 133.14 / 20673.81
Next is
NuoDB 3.4.1, the storage manager and transaction engine
config and the benchmark result:
$ chown nuodb:nuodb /media/nuodb
$ nuodbmgr --broker localhost --password nuodb1pass
  start process sm archive /media/nuodb host localhost database b1 initialize true
  start process te host localhost database b1 --dba-user b2 --dba-password b3
$ nuosql b1 --user b2 --password b3
$ go run nuodb.go lib.go
[Nuo] RandomSearchItems (100000, 100%) took 33.79s (337.90 µs/op)
[Nuo] SearchRelsAddBonds (10000, 100%) took 72.18s (7218.04 µs/op)
[Nuo] UpdateItemsAmounts (5000, 100%) took 117.22s (23443.65 µs/op)
[Nuo] InsertUsersItems (2500, 100%) took 144.51s (57804.21 µs/op)
USERS CR : 2500 / 4995
ITEMS CRU : 17500 / 14985 + 698313 / 14985
RELS CRU : 2375 / 15822 / 7911
SLOW FACTOR : 20
CRU µs/rec : 6458.57 / 48.39 / 8473.22
Next is
TiDB 2.1.7, the
config and the result:
sudo sysctl -w net.core.somaxconn=32768
sudo sysctl -w vm.swappiness=0
sudo sysctl -w net.ipv4.tcp_syncookies=0
sudo sysctl -w fs.file-max=1000000
$ pd-server --name=pd1 \
    --data-dir=pd1 \
    --client-urls="http://127.0.0.1:2379" \
    --peer-urls="http://127.0.0.1:2380" \
    --initial-cluster="pd1=http://127.0.0.1:2380" \
    --log-file=pd1.log
$ tikv-server --pd-endpoints="127.0.0.1:2379" \
    --addr="127.0.0.1:20160" \
    --data-dir=tikv1 \
    --log-file=tikv1.log
$ tidb-server --store=tikv --path="127.0.0.1:2379" --log-file=tidb.log
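TiDB also speaks the MySQL protocol, listening on port 4000 by default, so the client side again only differs in the DSN (credentials here are illustrative; the actual tidb.go may differ):

// tidb-server exposes the MySQL protocol on :4000 by default
db, err := sql.Open("mysql", "root:@tcp(127.0.0.1:4000)/b1")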
$ go run tidb.go lib.go
[Ti] InsertUsersItems (125, 5%) took 17.59s (140738.00 µs/op)
[Ti] SearchRelsAddBonds (500, 5%) took 9.17s (18331.36 µs/op)
[Ti] RandomSearchItems (5000, 5%) took 10.82s (2163.28 µs/op)
# failed with a bunch of errors on tikv, such as:
[2019/04/26 04:20:11.630 +07:00] [ERROR] [endpoint.rs:452] [error-response] [err="locked LockInfo { primary_lock: [116, 128, 0, 0, 0, 0, 0, 0, 50, 95, 114, 128, 0, 0, 0, 0, 0, 0, 96], lock_version: 407955626145349685, key: [116, 128, 0, 0, 0, 0, 0, 0, 50, 95, 114, 128, 0, 0, 0, 0, 0, 0, 96], lock_ttl: 3000, unknown_fields: UnknownFields { fields: None }, cached_size: CachedSize { size: 0 } }"]
These benchmarks were performed on an i7-4720HQ with 32GB RAM and an SSD disk. At first there was a lot more that I wanted to add to this benchmark to make it huge '__'), such as:
- YugaByteDB, similar to CockroachDB/ScyllaDB but written in C++
- DGraph, a graph database written in Go; backups are local only, same as MemSQL (you cannot do something like: ssh foo@bar "pg_dump | xz - -c" | pv -r -b > /tmp/backup_`date +%Y%m%d_%H%M%S`.sql.xz)
- Cayley, a graph layer written in Go that supports many backend storages
- ScyllaDB, a C++ version of Cassandra
- ClickHouse, claimed to be the fastest OLAP database, but it doesn't support UPDATE.
- RQLite, a distributed SQLite
- Redis, obviously for the sake of comparison
- OrientDB, multi-model graph database
- RethinkDB, document-oriented database
- ArangoDB, multi-model database, with built-in Foxx Framework for creating REST APIs
- Tarantool, a Redis competitor with ArangoDB-like features but with Lua instead of JS; I want to see whether it's simpler to use yet has near-equal performance to Aerospike
- MongoDB, one of the most popular open source document databases, for the sake of comparison; I don't prefer this one because of its memory usage.
- Aerospike, the fastest distributed kv-store I've ever found, just for the sake of comparison; the free version is limited to 2 namespaces with 4 billion objects. Too bad this one cannot be installed on an OpenVZ-based VM.
As you probably know, I'm a huge fan of
Go for backend, not any other language, so I excluded any JVM-based databases (with the exception of OrientDB, because their screenshots look interesting). For now, I have found what I need, so I'll probably add the rest later. The code for this benchmark can be found here:
https://github.com/kokizzu/hugedbbench (send a pull request and I'll run it and update this post) and the spreadsheet here:
http://tiny.cc/hugedb The chart (lower is better) is shown below:
(benchmark results chart)